Muschamp Rd

Curation & Aggregation

March 29th, 2012

Curation is the next big thing on the Internet, with Pinterest being the latest media darling. Unfortunately for the talking heads in mainstream media, curation has been a feature of the world wide web since the beginning. A home page, something I’ve maintained online since 1995, is by its very definition a curated collection of personal interests. What is news, however, is how far technology has advanced, removing the barriers that kept people from easily building the collection of personal interests they always wanted but didn’t have the time, energy, or desire to learn how to build.

Aggregation is almost as old as the world wide web itself. Aggregation is what the original portals like Yahoo and Excite tried to do, but it didn’t really take off until the widespread adoption of standards such as RSS, XML, and Atom. Aggregation is supposed to be algorithmic, putting the newest, freshest content first. Curation is often thought to be the opposite of aggregation: curation is personal, subjective, even random, while aggregation is deterministic, ordered, and structured. They aren’t opposites and they aren’t mutually exclusive, something that has been driven home while building my latest hobby side project, a news aggregator.

News aggregators aren’t complete meritocracies. Many incorporate a voting mechanism, a human element to go along with their determinism. Slashdot and Reddit are two famous examples of news aggregators with a strong community that determines which content rises to the top, rather than relying purely on timestamps, algorithms, and/or editors (curators).
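To make the difference concrete, here is a minimal Python sketch that compares a purely chronological ordering with one that mixes in votes. The `Item` class, the sample data, and the scoring formula (votes divided by a power of the item's age) are all assumptions made up for illustration. This is not how Slashdot or Reddit actually rank stories.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    age_hours: float  # hours since the item was posted
    votes: int        # net votes from the community


def by_newest(items):
    """Pure aggregation: the freshest content comes first."""
    return sorted(items, key=lambda item: item.age_hours)


def by_score(items, gravity=1.5):
    """Votes push an item up; age pulls it back down.

    The formula and the `gravity` default are illustrative assumptions.
    """
    def score(item):
        return item.votes / (item.age_hours + 2) ** gravity

    return sorted(items, key=score, reverse=True)


# Hypothetical sample data
ITEMS = [
    Item("Fresh but unnoticed", age_hours=1, votes=1),
    Item("Day-old favourite", age_hours=24, votes=300),
    Item("Week-old classic", age_hours=168, votes=500),
]

if __name__ == "__main__":
    print([item.title for item in by_newest(ITEMS)])
    print([item.title for item in by_score(ITEMS)])
```

With this sample data the chronological list leads with the newest post, while the voted list leads with the day-old favourite. The human element changes what rises to the top even though the sort itself is still deterministic.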

One of my hobbies, which I sometimes don’t have time for in real life but which is a large component of my online efforts, is miniature painting. I was re-introduced to painting miniatures in high school, and since I started to use the Internet and the world wide web shortly thereafter, it has always been something I’ve followed online. Although I don’t always have the time and energy to paint miniatures, I and others have often expended great time and effort discussing the hobby online. Miniatures can be used in tabletop games, but not everyone in the hobby is a hardcore gamer. I personally think the collecting gene is stronger in me than the gaming gene. Plus, as I’ve gotten older I’ve become less competitive. Competition often brings out the worst in people, especially online.

The single most popular page on the domain, month in and month out, is a collection of miniature painting links which I maintain by hand. This collection is so not Web 2.0, but it is ordered and sorted and has been maintained at the same URL for over a decade. I curate this collection. It reflects my biases, preferences, and interests. But obviously other people find it useful and it does very well in Google and other search engines for some terms and keywords.

Search engines are deterministic, algorithmic, and structured. It isn’t an accident that a lot of search engine companies do news aggregation. Google in particular has had a lot of success with products that are deterministic and algorithmic. They have had less success when they try to be personal and social. Facebook and Google are locked in a grudge match which isn’t necessarily for the best, especially for the average web surfer. Smaller, nimbler companies historically arise to fill niches neglected by larger corporate entities. Many of these companies are more personal, much more subjective about how best to curate the vast amount of content online. This is supposedly the next big growth opportunity.

Open VS Closed Web

You’d think this argument would be over by 2012, but just like curation VS aggregation, the “information wants to be free” crowd continues to clash with corporations and government agencies who have a vested interest in, and a strong desire to maintain, control. Lawyers are of course profiting from both sides. Consumers/web surfers are stuck in the middle. Google, as the owner of the largest index of information online and the most active search robot, is generally seen as a champion of openness and web standards. Facebook, which is supposedly home to 1/3 of the photos ever taken in the history of photography, is a much more closed system, which is ironic because they called their API/developer tools Open Graph.

Curation can take place in an open or closed system. There are private libraries and private art collections. These are controlled and viewing of them is restricted, but they are still curated. There are of course public libraries and public art galleries. These are also curated, but the general public can access these collections much more readily. Aggregation is usually public, as the information being aggregated is generally publicly available. There are specialty private information aggregators, such as those used in the finance industry, but you generally don’t hear about them on the 6 o’clock news.

Consumers generally prefer open systems. Some people are obsessed with popularity. If you’re going to spend hours of your time curating a collection of information, often you want other people to see your hard work. Posting just to Facebook likely limits who can see what you post. Popularity doesn’t equal influence, and it certainly doesn’t equal quality. Some of the best curated collections of information aren’t popular. People obsessed with popularity often don’t make the best curators. What is more important, the collection of information or the person who collected the information? I think the Blood of Kittens hub is too much like content scraping.

Privacy & Copyright

Also important when considering curation and aggregation, and an open VS a closed network, are privacy rights and copyright. For art galleries and libraries, whether they are private or public doesn’t matter: they own the material in their collections. The oldest aggregators, entities like the Associated Press newswire service, have a pretty open sharing policy, but they do insist on assigning proper credit and retaining intellectual property rights for creators. Curated collections online regularly do not respect intellectual property rights. There is also a fine line between aggregating news feeds and scraping content just to surround it with ads.

New laws and legal precedents are continually being set. Pinterest is squarely placing the blame for posting copyrighted images on the end user, the curator, not on themselves. Tumblr is overrun with copyrighted material, which is a big factor in why it isn’t more popular with advertisers and why it is still privately held. YouTube might not have succeeded without Google’s billions and clout, as it was a haven for copyrighted material in the early days. Online social networks continue to alter their user agreements, and usually the changes concern privacy and copyright.

Of the millions of images shared on Facebook, many are copyrighted by someone other than the poster, but because it is a closed system it is hard to see who is sharing what. On Pinterest, Tumblr, and Twitter, corporations and governments can determine much more easily when you share information they own the intellectual property rights to or otherwise want to control access to.

As someone who has spent time building a news aggregator, I consider open systems and web standards crucial. I can find cool images posted to Twitter or Flickr much more easily than content shared on Facebook. I’ve been searching the web for new information, and particularly for new RSS and Atom feeds related to miniature painting. There are a lot of niche sites and I’ve moved well beyond just blogs, but some information is not easy to access, and certainly not readily available in a well-behaved RSS feed. I find myself doing more and more curation and considering other tools beyond just SimplePie and feeds. I’m going to join Pinterest and try my hand at curating some miniature painting content over there. Pinterest supports RSS feeds for pin boards, though it isn’t a widely advertised feature, probably because it costs them ad revenue.
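For readers curious about what feed aggregation looks like in practice, below is a minimal Python sketch using only the standard library. It is not my SimplePie setup. The two sample feeds, their titles and URLs, and the helper names `parse_feed` and `merge_feeds` are all hypothetical, and a real aggregator would also have to fetch feeds over HTTP and cope with Atom and malformed XML.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two tiny, made-up RSS 2.0 feeds standing in for real blogs.
FEED_A = """<rss version="2.0"><channel><title>Blog A</title>
<item><title>Painting faces</title><link>http://example.com/a1</link>
<pubDate>Mon, 26 Mar 2012 10:00:00 +0000</pubDate></item>
<item><title>Basing tutorial</title><link>http://example.com/a2</link>
<pubDate>Wed, 28 Mar 2012 09:00:00 +0000</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Blog B</title>
<item><title>New releases</title><link>http://example.com/b1</link>
<pubDate>Tue, 27 Mar 2012 12:00:00 +0000</pubDate></item>
</channel></rss>"""


def parse_feed(xml_text):
    """Return a list of (published, title, link) tuples from one RSS feed."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter("item"):
        # RSS dates use the RFC 822 format, which email.utils can parse.
        published = parsedate_to_datetime(item.findtext("pubDate"))
        entries.append((published, item.findtext("title"), item.findtext("link")))
    return entries


def merge_feeds(feeds):
    """Combine entries from every feed, newest first."""
    entries = []
    for xml_text in feeds:
        entries.extend(parse_feed(xml_text))
    return sorted(entries, key=lambda entry: entry[0], reverse=True)


if __name__ == "__main__":
    for published, title, link in merge_feeds([FEED_A, FEED_B]):
        print(published.date(), title, link)
```

This only works because the sites publish their posts in an open, standard format. Content locked inside a closed system has no equivalent of `pubDate` and `link` for a script like this to read.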

Aggregation is something computer scientists can grasp and excel at, but the human element, the social aspect, is still important, even vital. Efforts continue to use computers to determine which information people will find most interesting, but time and again it has been shown that people know best what other people similar to themselves ‘like’.



Posts on Muskblog © Andrew "Muskie" McKay.