Thursday, August 26, 2010

‘Digg In The Future’ can predict Digg’s frontpage with 63% accuracy

Hard-core Digg users might spend all day digging their favorite links to the front page and burying unpopular ones, and argue and debate why certain links made it and others didn’t, but all it takes is a 17-year-old high school kid and some spare time to determine which stories will reach the frontpage 24 hours before they actually do.

Raj Vir, from Beverly Hills, CA, is a 17-year-old programmer who has come up with an algorithm he says can predict which links will make it to the front of Digg with a high degree of accuracy. The algorithm powers a site called Digg In The Future.

Because Digg has to wait on user feedback to promote links, sometimes news reaches Digg much later than it does mainstream outlets. Digg in the Future attempts to get a hold of stories before they are popular, in a way predicting the future. Not every story that appears on Digg in the Future will appear on Digg's homepage; nevertheless, Digg in the Future will give you a realistic glimpse of Digg's frontpage tomorrow.


The long-standing joke is that if you want to find out what Digg’s frontpage will look like two days from now, check out Reddit's frontpage today. The joke is not without basis: a large number of stories reach Reddit’s frontpage before they appear on Digg. Raj Vir said he even tried the “Reddit approach” but found that his present algorithm works better.

I have tried the reddit approach – and it is not nearly as successful as the current method. Yes, stories frequently hit reddit first, but a much larger number of stories reach reddit’s homepage and never even see a glimpse of Digg’s. The current algorithm takes into account popular URLs from twitter (which is even more predicting than reddit), but more importantly the effect of power users. Simply displaying reddit posts only would not yield results even close to the current 63%.

Vir, who has been working on his algorithm in his spare time for the past several months, says his algorithm is 63-percent accurate when it comes to predicting what will make the Digg front page. He also says it will still work even with the new version of Digg, which was unveiled yesterday.

Vir’s algorithm takes into account two main factors: “power submitters” (users who frequently submit future frontpage stories) and “power diggers” (users who frequently digg future frontpage stories). Digg In The Future keeps track of stories that have been dugg or submitted by successful users. The algorithm also relies on other factors including the time of day (since stories submitted in the early morning hours are unlikely to reach the front page) and whether the link comes from “preferred” sites that appeal to Digg users: a list that includes Cracked, Wired, The Huffington Post, The Daily Mail and The Telegraph.
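To make the idea concrete, here is a minimal Python sketch of how factors like these could be combined into a single score. It is not Vir's actual code, which has not been published; the weights, the midnight-to-6am "early morning" window, the domain spellings, and every name in the block (`Story`, `score_story`, `PREFERRED_DOMAINS`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

# Hypothetical domains standing in for the "preferred" sites named above.
PREFERRED_DOMAINS = {
    "cracked.com", "wired.com", "huffingtonpost.com",
    "dailymail.co.uk", "telegraph.co.uk",
}


@dataclass
class Story:
    url: str
    submitter: str
    diggers: list = field(default_factory=list)  # users who dugg the story
    submit_hour: int = 12                        # hour of day, 0-23


def score_story(story, power_submitters, power_diggers):
    """Return an illustrative score; all weights are made-up assumptions."""
    score = 0.0

    # Bonus when the submitter has a track record of frontpage stories.
    if story.submitter in power_submitters:
        score += 3.0

    # One point for each digg that comes from a "power digger".
    score += sum(1.0 for user in story.diggers if user in power_diggers)

    # Penalty for early-morning submissions (assumed here to mean 00:00-05:59).
    if 0 <= story.submit_hour < 6:
        score -= 2.0

    # Bonus when the link points at one of the preferred sites.
    host = urlparse(story.url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host in PREFERRED_DOMAINS:
        score += 1.5

    return score
```

Stories could then be sorted by this score, with the highest-scoring ones shown as likely frontpage candidates. How the real site builds its lists of power submitters and power diggers is not described.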

One interesting element of the algorithm is that it doesn’t just look at Digg or its users. Since many of the links that make it to the front page of the site have already been shared on other social networks, the Digg In The Future software looks at frequently shared URLs from Twitter and gives those added weight.
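The Twitter signal described above could be folded in roughly as follows. This is only a sketch under stated assumptions: the article says frequently shared URLs get "added weight" but not how much, so the logarithmic damping, the `weight` parameter, and the function names `twitter_boost` and `rank_candidates` are all hypothetical.

```python
import math
from collections import Counter


def twitter_boost(url, share_counts, weight=2.0):
    """Extra score for a URL based on how often it was shared (assumed formula)."""
    shares = share_counts.get(url, 0)
    # log1p damps the effect so one viral link does not swamp everything else.
    return weight * math.log1p(shares)


def rank_candidates(base_scores, shared_urls, weight=2.0):
    """Rank URLs by base score plus Twitter boost, best first.

    base_scores: dict mapping URL -> score from Digg-side signals.
    shared_urls: iterable of URLs seen on Twitter, one entry per share.
    """
    share_counts = Counter(shared_urls)
    totals = {
        url: score + twitter_boost(url, share_counts, weight)
        for url, score in base_scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)
```

For example, a link with a lower Digg-side score can overtake another one once it has been shared a few times on Twitter, which matches the "added weight" idea without claiming to reproduce the real numbers.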

Vir says he isn’t looking to build a business or sell his algorithm at this point, but if the algorithm is good enough, plenty of offers are likely to come his way. GigaOm writes:

… if his algorithm proves to be really good at predicting trending topics on Digg, it might be good at predicting what links and content will become popular elsewhere, and a lot of companies are very interested in doing that. Vir may wind up getting an offer he can’t refuse.

This kid is going to go a long way.
