Muschamp Rd

Curation & Aggregation

March 29th, 2012

Curation is the next big thing on the Internet, with Pinterest being the latest media darling. Unfortunately for the talking heads in mainstream media, curation has been a feature of the world wide web since the beginning. A home page, something I’ve maintained online since 1995, is by its very definition a curated collection of personal interests. What is news, however, is how far technology has advanced, removing the barriers that kept people from easily building the curated collection of personal interests they always wanted to display but didn’t have the time, energy, or desire to learn how to build.

Pinterest IPO

Now in 2019 Pinterest has finally gone public. Before I moved to China I tried to get a job at Pinterest. Too many times I’ve just missed out on opportunities. I started the interview process with Tiny Speck; they too are set to go public in 2019, as Slack. I wrote too much PHP code while I was unemployed. I should have gotten another development job, but I kept trying for a PM position. Today I’m older and wiser and a senior business intelligence analyst.

Aggregation VS Curation

Aggregation is also almost as old as the world wide web itself. Aggregation is what the original portals like Yahoo and Excite tried to do, but it didn’t really take off until the widespread adoption of standards such as RSS, XML, and Atom. Aggregation is supposed to be algorithmic, putting the newest, freshest content first. Curation is often thought to be the opposite of aggregation: curation is personal, subjective, even random, while aggregation is deterministic, ordered, and structured. They aren’t opposites and they aren’t mutually exclusive, something that has been driven home while building my latest hobby side project, which in 2022, after a PHP upgrade, needs to be fixed.
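To make the "newest, freshest content first" idea concrete, here is a minimal sketch in Python. It is not the PHP code behind the aggregator mentioned above. The `FeedItem` class, the `aggregate` function, and the sample items are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FeedItem:
    """One entry pulled from a feed (hypothetical structure)."""
    source: str
    title: str
    published: datetime


def aggregate(feeds, limit=10):
    """Merge items from several feeds and return the newest first."""
    merged = [item for feed in feeds for item in feed]
    merged.sort(key=lambda item: item.published, reverse=True)
    return merged[:limit]


def utc(year, month, day, hour=0):
    """Small helper for building timezone-aware sample timestamps."""
    return datetime(year, month, day, hour, tzinfo=timezone.utc)


# Made-up sample data standing in for two already-parsed feeds.
blog_a = [
    FeedItem("blog-a", "Painting rust effects", utc(2012, 3, 27)),
    FeedItem("blog-a", "Basing with cork", utc(2012, 3, 29, 9)),
]
blog_b = [
    FeedItem("blog-b", "New sculpts previewed", utc(2012, 3, 28)),
]

front_page = aggregate([blog_a, blog_b])
titles = [item.title for item in front_page]
for item in front_page:
    print(item.published.date(), item.source, item.title)
```

Nothing here is personal or subjective: given the same feeds, the ordering is always the same. That is the deterministic side of the comparison.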

My miniature painting news aggregator in 2019

News Aggregators

News aggregators aren’t complete meritocracies. Many incorporate a voting mechanism, a human element to go along with their determinism. Slashdot and Reddit are two famous examples of news aggregators with a strong community that determines which content rises to the top, rather than relying purely on timestamps, algorithms, or editors/curators.
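One way to picture how votes and timestamps can combine is a score that rewards votes but decays with age. The sketch below is an assumption for illustration only: the formula, the `gravity` parameter, and the sample stories are hypothetical and are not the actual ranking used by Slashdot or Reddit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    """A submitted link with community votes (hypothetical structure)."""
    title: str
    votes: int
    age_hours: float


def score(story, gravity=1.5):
    """Votes push a story up, age pulls it down (illustrative formula)."""
    return story.votes / (story.age_hours + 2) ** gravity


def rank_by_votes(stories):
    """Order stories by the vote-and-age score, highest first."""
    return sorted(stories, key=score, reverse=True)


def rank_by_time(stories):
    """Purely chronological ordering, newest first, for comparison."""
    return sorted(stories, key=lambda story: story.age_hours)


# Made-up sample data.
stories = [
    Story("Old but loved", votes=100, age_hours=24.0),
    Story("Recent and liked", votes=10, age_hours=1.0),
    Story("Brand new, unseen", votes=0, age_hours=0.5),
]

voted = [story.title for story in rank_by_votes(stories)]
chronological = [story.title for story in rank_by_time(stories)]
print("By votes:", voted)
print("By time: ", chronological)
```

With these numbers the chronological list leads with the story nobody has voted on yet, while the voted list leads with the recent story the community liked. A larger `gravity` makes stories sink faster as they age.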

Niche Verticals

One of my hobbies, which I sometimes don’t have time for in real life but which is a large component of my online efforts, is miniature painting. I was re-introduced to painting miniatures in high school, and since I started to use the Internet and the world wide web shortly thereafter, it has long been something I’ve followed online. Although I don’t always have the time and energy to paint miniatures, I and others have often expended great time and effort discussing the hobby online. Miniatures can be used in tabletop games, but not everyone in the hobby is a hardcore gamer. I personally think the collecting gene is stronger in me than the gaming gene.

Web 1.0 to Web 2.0

The single most popular page on the domain, month in and month out, is a collection of miniature painting links for which I manually maintain the HTML. This collection is so not Web 2.0, but it is ordered and sorted and has been maintained at the same URL for over twenty years. I curate this collection. It reflects my biases, preferences, and interests. But obviously other people find it useful, and it did very well in Google and other search engines for some terms and keywords.

Facebook VS Google

Search engines are deterministic, algorithmic, and structured. It isn’t an accident that a lot of search engine companies do news aggregation. Google in particular has had a lot of success with products that are deterministic and algorithmic. They have had less success when they try to be personal and social. Facebook and Google are locked in a grudge match which isn’t necessarily for the best, especially for the average web surfer. Smaller, nimbler companies historically arise to fill niches neglected by larger corporate entities. Many of these companies are more personal, much more subjective about how best to curate the vast amount of content online. This is supposedly the next big growth opportunity.

Open VS Closed Web

You’d think this argument would be over by 2012, but just like curation VS aggregation, the “information wants to be free” crowd continues to clash with corporations and government agencies who have a vested interest in, and a strong desire to maintain, control. Lawyers are of course profiting from both sides. Consumers/web surfers are stuck in the middle. Google, as the owner of the largest index of information online and the most active search robot, is generally seen as a champion of openness and web standards. Facebook, which is supposedly home to 1/3 of the photos ever taken in the history of photography, is a much more closed system, which is ironic because they called their API/developer tools Open Graph.

Private VS Public Collections

Curation can take place in an open or closed system. There are private libraries and private art collections. These are controlled and viewing of them is restricted, but they are still curated. There are of course public libraries and public art galleries. These are also curated, but the general public can access these collections much more readily. Aggregation is usually public, as the information being aggregated is generally publicly available. There are specialty private information aggregators used in the finance industry, but you generally don’t hear about them on the 6 o’clock news.

The desire to be popular

Many consumers prefer open systems. Some people are obsessed with popularity. If you’re going to spend hours of your time curating a collection of information, often you want other people to see your hard work. Posting just to Facebook can limit who can see what you post. Popularity doesn’t equal influence, and it certainly doesn’t equal Quality. Some of the best curated collections of information aren’t popular. People obsessed with popularity often don’t make the best curators. What is more important, the collection of information or the person who collected the information?

Privacy & Copyright

I think Blood of Kittens hub is too much like content scraping

Also important when considering curation and aggregation, and an open VS a closed network, are privacy rights and copyright. For art galleries and libraries, whether they are private or public doesn’t matter; they own the material in their collections. The oldest aggregators, entities like the Associated Press newswire service, have a pretty open sharing policy, but they do insist on assigning proper credit and retaining intellectual property rights for creators. Curated collections online regularly do not respect intellectual property rights. There is also a fine line between aggregating news feeds and scraping content just to surround it with ads.

Blame the End User

New laws and legal precedents are continually being set. Pinterest is squarely placing the blame for posting copyrighted images on the end user, the curator, not on themselves. Tumblr is overrun with copyrighted material, which is a big factor in why it isn’t more popular with advertisers and why it was ultimately sold privately. YouTube might not have succeeded without Google’s billions and clout, as it was a haven for copyrighted material in the early days. Online social networks continue to alter their user agreements, and usually it is in regards to privacy and copyright.

Are closed networks easier to police?

Of the millions of images shared on Facebook, many are copyrighted by someone other than the poster, but because Facebook is a closed system this is harder to measure. On Pinterest, Tumblr, and YouTube, corporations and governments can determine much more easily when you share information they own the intellectual property rights to or otherwise want to control access to. Then there is the case of China, which has a heavily policed, monitored, and censored closed Internet.

The benefits of an Open Web

As someone who has spent time building a news aggregator, I consider open systems and web standards crucial. I can find cool images posted to Twitter or Flickr much more easily than content shared on Facebook. I’ve been searching the web for new information, and particularly for new RSS and Atom feeds related to miniature painting. There are a lot of niche sites and I’ve moved well beyond just blogs, but some information is not easy to access and certainly not readily available in a well-behaved RSS feed. I find myself doing more and more curation and considering other tools beyond just SimplePie and feeds. I’m going to join Pinterest and try my hand at curating some miniature painting content over there. Pinterest supports RSS feeds for pin boards, though it isn’t necessarily a widely advertised feature as it probably costs them ad revenue.
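To show why a well-behaved feed matters, here is a rough sketch of reading one with nothing but Python’s standard library. It is not SimplePie and not the aggregator’s actual code. The sample feed, its URLs, and the `parse_rss` function are hypothetical, and the sketch assumes every item carries a title, a link, and an RFC 822 style `pubDate`.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up RSS 2.0 document standing in for a downloaded feed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example miniature painting feed</title>
    <item>
      <title>Speed painting skeletons</title>
      <link>https://example.com/skeletons</link>
      <pubDate>Thu, 29 Mar 2012 10:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Wet blending basics</title>
      <link>https://example.com/wet-blending</link>
      <pubDate>Wed, 28 Mar 2012 08:30:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def parse_rss(xml_text):
    """Return a list of dicts, one per <item>, in document order.

    Assumes each item has a pubDate; a real feed may omit it.
    """
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter("item"):
        entries.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return entries


entries = parse_rss(SAMPLE_RSS)
for entry in entries:
    print(entry["published"].isoformat(), entry["title"], entry["link"])
```

When a site publishes a feed like this, a few lines of code are enough to pull out titles, links, and dates. When it doesn’t, the choice is between scraping pages and curating by hand.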

Do people or algorithms know best?

Aggregation is something computer scientists can grasp and excel at, but the human element, the social aspect, is still important, even vital. Efforts continue to use algorithms to determine which information people will find most interesting, but time and again it has been shown that people know best what other people similar to themselves like.

The Rise of Algorithms

Like FAANG, Pinterest switched from a purely chronological feed to a recommendation algorithm. I spent considerable time trying to train it but after Pinterest became blocked in China my usage declined dramatically. At some point Pinterest reversed their “no porn” policy. I used to rigorously report it but now when I see a new follower also posts porn I block them. I do not want the algorithm to recommend adult content.

Social Media Backlash

After Brexit, the 2016 Trump electoral victory, and too many scandals, there is a growing backlash against social media by consumers, governments, and even tech workers. The switch from human curated chronological content to algorithmic recommendations is seen as a symptom of what is wrong with social media, at least on Twitter. One advantage Pinterest and Instagram have is that people generally post only their best curated content. Twitter and Facebook, on the other hand, are both outlets for outrage, disgust, and even hate.

The Rise of Direct Messaging

Possibly to get around the recommendation algorithms, people are using private messages more to share news and other content. These messages, coming directly from your personal network, carry greater weight and presumably trustworthiness. The number of competing messaging systems globally is quite high. Facebook owns three and plans to allow Facebook users to message Instagram and WhatsApp accounts. In China, WeChat dominates, and although there are “moments”, some prefer to just share content by direct message or to invitation only groups. Regardless of how you choose to use WeChat, all communication is monitored by the Chinese government.

My Curated Collections

In 2019, my miniature painting advice is no longer my most popular curated content. But on Pinterest itself it represents the majority of my pins. This entire domain can be considered curated content but certain portions are more popular or more regularly updated. These are among my better curated collections:

My News Aggregator

Personal news aggregation is an idea that has merit. The code I wrote still worked until I upgraded PHP to 7.4, but I have not had the time to do much content curation due to my CFA studies. I hope to have more time for hobby projects in 2022, but currently I get the majority of my news from Twitter and Flipboard, my latest news app that I’ve been training.

The Future

I don’t know what the future holds. I never expected to spend so much time at the start of 2019 improving the Quality of my blog. I still have about a dozen more posts to edit and many other tasks on my “ta-do list”. I hope to work on my news aggregator and my web mashup code, but I also think about downsizing my web empire.

If you have thoughts on curation, aggregation, algorithms or social media you can leave them below.



Posts on Muskblog © Andrew "Muskie" McKay.
CFA Institute does not endorse, promote or warrant the accuracy or quality of Muskblog. CFA® and Chartered Financial Analyst® are registered trademarks owned by CFA Institute.