Are Link-Sharing Services Irrelevant?

You can use RSS to easily follow a few high-profile websites, and link-sharing services like Slashdot or Digg to discover popular web content. But that’s like reading a classic newspaper and some magazines: the information provided may have a higher chance of being relevant to you, but there’s still a lot of noise that wastes your time.

In this article, I’ll discuss the shortcomings of link-sharing services using Dzone as an example. Dzone is a relatively small-scale service targeted at software developers and one of my most important sources of information.

On Dzone, users publish links along with a short description and a few tags from a limited set of ca. 50 predefined tags. Users rate entries using "up"/"down" votes and popular entries appear on the front page. Unlike with Slashdot, Reddit, or Digg, a web page being linked from Dzone typically receives just a few hundred, sometimes a few thousand page views. Because of this, there is little incentive for content producers to game the system. That’s one of the reasons I like Dzone.

Limiting the choice of tags is unusual for a Web 2.0 site, but it simplifies automatic processing (see my article on Yahoo! Pipes). A taxonomy would be more appropriate and more powerful, but you can reverse-engineer one based on the tags.
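One way to reverse-engineer such a taxonomy is to look at which tags occur together. The following is only a rough sketch of that idea, not something Dzone provides: the function name `guess_parents`, the 0.8 threshold, and the sample data are all made up for illustration. A tag is guessed to sit below another tag if nearly all entries carrying the first also carry the second, and the second is used more widely.

```python
from collections import Counter
from itertools import permutations


def guess_parents(entries, threshold=0.8):
    """Guess a {child_tag: parent_tag} mapping from tag co-occurrence.

    `entries` is a list of tag sets, one per link entry (hypothetical data).
    `threshold` is an assumed cut-off, not a value taken from any real system.
    """
    count = Counter()     # how many entries carry each tag
    together = Counter()  # how many entries carry each ordered pair of tags
    for tags in entries:
        tags = set(tags)
        count.update(tags)
        together.update(permutations(sorted(tags), 2))

    parents = {}
    for (child, parent), n in together.items():
        # The child must almost always appear with the parent,
        # and the parent must be the more widely used tag.
        if n / count[child] >= threshold and count[parent] > count[child]:
            best = parents.get(child)
            # Among several candidates, keep the least widely used parent.
            if best is None or count[parent] < count[best]:
                parents[child] = parent
    return parents


entries = [
    {"java", "frameworks"},
    {"java", "frameworks"},
    {"java", "tools"},
    {"java"},
    {"css", "web design"},
    {"web design"},
    {"web design", "css"},
]
taxonomy = guess_parents(entries)
print(taxonomy)
```

With only about 50 predefined tags, the pair counts stay small, which is part of why a limited tag set makes this kind of processing practical.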

There are two interesting metrics connected to each entry, only the first of which is supported by Dzone:

  • Quality of the entry (aka popularity)
  • Relevancy of the entry to me

Quality is determined by user votes, which works reasonably well for Dzone. I have seen little abuse, but as in many Web 2.0 sites (see MusicBrainz, for example), there is little incentive to vote. Typically, there is a dedicated minority of users who contribute most of the votes. Sites often try to create an incentive by building karma systems ("top voters of the week" etc.). Sometimes this works, most of the time it doesn’t.

The second metric, relevancy, is more interesting. It is a completely subjective measure and thus doesn’t work well with the "wisdom of the crowd" approach. To get around this, some link sharing sites build focused sub-communities ("programming", "politics", "cat pictures"). This is mostly a workaround though. What you really want is personalization: I expect the system to present entries that match my interests and are thus relevant to me.

In Dzone, you can subscribe to tag feeds ("java", "web design", etc.), but if you subscribe to multiple feeds you end up with duplicates because one entry may be present in more than one feed. Dzone currently doesn’t offer aggregated feeds based on user-selected tags. Because of this, I’m using Yahoo! Pipes to filter the front page feed. This leaves me with ca. 40% of the original stories, of which more than half are still irrelevant.
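An aggregated feed of the kind described above is not hard to build once the tag feeds have been fetched and parsed. Here is a minimal sketch that merges several tag feeds and drops duplicates. The entry structure (a dict with a "link" key), the function name `merge_tag_feeds`, and the example URLs are assumptions for illustration, not Dzone's actual feed format.

```python
def merge_tag_feeds(feeds, wanted_tags):
    """Combine several per-tag feeds into one list without duplicates.

    `feeds` maps a tag name to a list of entries. Each entry is a dict
    with at least a "link" key (a hypothetical structure).
    """
    seen = set()
    merged = []
    for tag in wanted_tags:
        for entry in feeds.get(tag, []):
            # The link identifies an entry, so it serves as the dedup key.
            if entry["link"] not in seen:
                seen.add(entry["link"])
                merged.append(entry)
    return merged


feeds = {
    "java": [
        {"link": "http://example.com/a", "title": "Java and the web"},
        {"link": "http://example.com/b", "title": "Garbage collection"},
    ],
    "web design": [
        {"link": "http://example.com/a", "title": "Java and the web"},
        {"link": "http://example.com/c", "title": "CSS layouts"},
    ],
}
merged = merge_tag_feeds(feeds, ["java", "web design"])
print([entry["link"] for entry in merged])
```

The entry that appears in both feeds shows up only once in the merged list. This solves the duplicate problem, but it says nothing about relevancy: every entry with a matching tag still gets through.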

While user-defined feeds would be a big step forward, there is only one real solution: The system has to learn from my actions (clicks and votes) and present only those entries that are interesting to me personally. A system like this is called a Recommender System. Basically, there are content-based recommender systems that select entries (called items in the literature) based on my user profile and properties of the entries. And there are collaborative filtering systems that recommend items based on what users with a similar profile found interesting. Recommender systems are an active research topic. See this survey paper for a good overview.
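To make the content-based variant more concrete, here is a toy sketch that learns a tag profile from a user's actions and scores new entries against it. Everything in it is an assumption for illustration: the action weights (1 for a click, 2 for an "up" vote, -2 for a "down" vote), the averaging score, the 0.5 threshold, and the function names. Real systems use much richer item properties than a handful of tags.

```python
from collections import Counter


def build_profile(actions):
    """Build a tag-weight profile from (tags, weight) pairs.

    The weights are assumed values, e.g. 1 for a click,
    2 for an "up" vote, and -2 for a "down" vote.
    """
    profile = Counter()
    for tags, weight in actions:
        for tag in tags:
            profile[tag] += weight
    return profile


def score(profile, tags):
    """Average profile weight over an entry's tags (unknown tags count as 0)."""
    if not tags:
        return 0.0
    return sum(profile.get(tag, 0) for tag in tags) / len(tags)


def recommend(profile, entries, threshold=0.5):
    """Return entries scoring above the threshold, best first."""
    ranked = sorted(entries, key=lambda e: score(profile, e["tags"]), reverse=True)
    return [e for e in ranked if score(profile, e["tags"]) > threshold]


actions = [
    (["java", "tools"], 2),  # "up" vote
    (["java"], 1),           # click
    (["php"], -2),           # "down" vote
]
profile = build_profile(actions)

entries = [
    {"title": "JVM tuning", "tags": ["java", "performance"]},
    {"title": "PHP tips", "tags": ["php"]},
    {"title": "Build tools", "tags": ["tools", "java"]},
    {"title": "Photoshop tricks", "tags": ["web design"]},
]
picks = [e["title"] for e in recommend(profile, entries)]
print(picks)
```

Note that the profile only ever rewards tags the user has already interacted with, so entries on unfamiliar topics never score above zero.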

While collaborative filtering is more difficult to scale and doesn’t handle item churn well (see the Google News paper for details), content-based recommenders are a lot easier to scale. However, content-based schemes often suffer from over-specialization and can be more difficult to implement.
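For comparison, a user-based collaborative filter can be sketched in a few lines. This is a deliberately naive illustration with made-up names and data, not the approach from the Google News paper: it compares the current user's votes with everybody else's using cosine similarity and recommends entries that similar users voted up.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two {item: vote} dicts."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend_for(user, votes, top_n=3):
    """Recommend unseen items that similar users voted up.

    `votes` maps a user name to {item: +1 or -1} (hypothetical data).
    """
    mine = votes[user]
    scores = {}
    for other, theirs in votes.items():
        if other == user:
            continue
        sim = cosine(mine, theirs)
        if sim <= 0:
            continue  # ignore users with dissimilar taste
        for item, vote in theirs.items():
            if item not in mine:
                scores[item] = scores.get(item, 0.0) + sim * vote
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, s in ranked[:top_n] if s > 0]


votes = {
    "me": {"a": 1, "b": 1, "c": -1},
    "alice": {"a": 1, "b": 1, "d": 1},
    "bob": {"a": -1, "c": 1, "e": 1},
}
suggestions = recommend_for("me", votes)
print(suggestions)
```

The sketch hints at both problems mentioned above. Every recommendation compares the user against all other users, which gets expensive as the user base grows, and an entry that nobody has voted on yet can never be recommended.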

Recommender systems have been in use on the web for more than a decade, so it’s surprising that none of the popular link-sharing services has implemented personalization features yet. The first service to offer this would likely have an advantage over its competitors. My time and attention are limited, so the other services would quickly become irrelevant. At least to me.

This entry was posted in computer science.

2 Responses to Are Link-Sharing Services Irrelevant?

  1. ihosama says:

    Actually, what you propose is the best way to lock oneself up to a SYNTHETICALLY narrowed information pool.

    Any “automatic” recommender is by definition destined to a limited set of arbitrary rules (compared to a human brain).

    While giving you an illusion you get all the data you need, you would get all the data THE SYSTEM believes you need.

    Without thought-monitoring for a recommender to be correct by using your own thoughts, and then preprocessing them via artificial intelligence, any recommender is destined to synthetic but arbitrary filtering.

    The best solution is to limit the number of feeds to a manageable amount and, very important, one needs to learn to disregard bloat without reading it in full or focusing his mind on it.
    That is the best filter one could have.

    • mafr says:

      As I wrote, content-based recommenders often suffer from over-specialization, which isn’t such a big problem with collaborative filtering. “The System” uses input from all users, particularly those with a taste similar to yours. With a good explain feature, it can be a quite transparent process.

      Following very few carefully selected feeds is a low-tech solution that may work for some. However, it narrows your “information pool” even more than a recommender system does.
