Kaggle: the Future of Computing?

When I first read about Kaggle, I had to know more. The company is only a few years old, but it’s already one of the most interesting developments in data science, and one that’s sure to be influential.

Kaggle was founded in 2010 by Anthony Goldbloom. Back in 2008, Goldbloom was working as an intern at The Economist. When he was assigned a story on predictive modeling, he was surprised at how many people told him it was difficult to make sense of their data.

Goldbloom taught himself to code and launched Kaggle from his bedroom with a contest: $1,000 to whoever could come closest to predicting how countries would vote in the Eurovision Song Contest. Then Allstate offered $6,000 to anyone who could create an algorithm for calculating bodily-injury payments in certain types of accidents. More companies came up with challenges, and more data experts joined Kaggle.

Now the procedures are well established. A company proposes a problem, and Kaggle users take a shot at it. NASA, MasterCard, Allstate, and Facebook have all posted problems. Prizes have typically ranged from T-shirts to $250,000, though one recent competition, posted by Heritage Provider Network, offered $3 million. Facebook ran a contest in which the prize was a data scientist position at the company. For those interested in applying sometime, this was the contest:

This competition tests your text skills on a large dataset from the Stack Exchange sites. The task is to predict the tags (a.k.a. keywords, topics, summaries), given only the question text and its title. The dataset contains content from disparate stack exchange sites, containing a mix of both technical and non-technical questions.
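For readers curious what a first attempt at that task might look like, here is a minimal sketch in Python. It is not Facebook's or Kaggle's method, and it is nowhere near a competitive entry: it simply counts which words tend to appear alongside which tags, then suggests the tags whose words show up most in a new question. The function names and the three toy questions are invented for illustration and are not taken from the real dataset.

```python
import re
from collections import Counter, defaultdict

# Toy, made-up training data (NOT from the real competition dataset).
# Each entry is (title, body, tags).
TOY_EXAMPLES = [
    ("How do I merge two dictionaries?",
     "I have two dict objects in Python.",
     ["python", "dictionary"]),
    ("Sorting a list of tuples",
     "Need to sort a Python list by the second item.",
     ["python", "sorting"]),
    ("How long should bread dough rise?",
     "My sourdough bread is too dense.",
     ["baking", "bread"]),
]


def tokenize(text):
    """Lowercase the text and extract tokens made of letters, digits, '+' and '#'."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def train(examples):
    """Count, for every word, how often it appears in a question with each tag."""
    word_tag_counts = defaultdict(Counter)
    for title, body, tags in examples:
        # Use a set so a word counts once per question, however often it repeats.
        for word in set(tokenize(title + " " + body)):
            for tag in tags:
                word_tag_counts[word][tag] += 1
    return word_tag_counts


def predict_tags(model, title, body, top_k=2):
    """Score each tag by summing word/tag counts; return the top_k tags."""
    scores = Counter()
    for word in set(tokenize(title + " " + body)):
        scores.update(model.get(word, {}))
    return [tag for tag, _ in scores.most_common(top_k)]


model = train(TOY_EXAMPLES)
print(predict_tags(model, "Python dict question",
                   "How to loop over a dict in Python?"))
# prints ['python', 'dictionary']
```

A serious entry would need far more than this, such as weighting rare words more heavily than common ones like "how" and "a", and handling a dataset too large for simple counting. The sketch is only meant to show the shape of the problem: text goes in, a short list of tags comes out.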

Kaggle has over 100,000 data-scientist users, and it’s a fairly exclusive group. They come from all over the world, and what they have in common is their advanced degrees: over 80% of the top performers have master’s degrees, and 35% have Ph.D.s. Kaggle’s forums give these rarefied intellectuals opportunities to work together and exchange ideas.

Kaggle has just become even more interesting, and even more of a challenge to traditional ways of working, by introducing a new service, Kaggle Connect. Through Kaggle Connect, employers can hire data scientists for specific projects from among Kaggle’s top 500 participants. It’s a very interesting employment model: companies now have access to proven top talent from around the world. Kaggle charges a subscription fee, then matches a data scientist to the company based on expertise. Kaggle also provides a set of tools it calls Workbench, which takes raw datasets and turns them into instantly usable ones.

Employers such as American Express and the New York Times have begun listing a Kaggle score as a qualification in their help-wanted ads for data scientists. This may indicate a new trend: valuing actual skills rather than “paper” skills. Is this the brave new labor market, where expertise is measurable and really makes a difference?

Lani Carroll lives in Colorado Springs with her bees, chickens, and horses. She can be found at her Google+ Profile.