Which languages should a programmer know? There’s a surprising degree of consensus. C. C++. C#. But how about visiting an entirely different part of the alphabet? How about learning R?
R was released in 1996. It’s open-source software developed by statistics professors Ross Ihaka and Robert Gentleman of the University of Auckland in New Zealand, who wanted a programming language that would be easy for their statistics students to use. R quickly attracted a large following. Now it’s a big player not just in academia, but in big business too.
R belongs to a comparatively new breed: open-source software that business giants are embracing. Because R is open source, interesting packages are always being written for it. For social scientists, R has packages for social network analysis. It has packages for analyzing language. And, probably needless to say, there are packages for biology and physics. Finance is a big user of R, and there are lots of finance packages, including ones for derivatives analysis.
In a survey conducted a couple of months ago by KDnuggets, R was the most popular language for data analytics, data science, and data mining. R was the preferred language of 61% of readers, having grown 16% in popularity in the past year.
Google is a big user. It uses R to understand advertising trends and to analyze search patterns. Google Developers has even released a series of instructional videos for R developers.
Can R compete with the giants? SAS is a statistical analysis language that dates back to the 1960s and is the cornerstone of SAS Institute. SAS’s statistical muscle makes it a natural fit for big data. But in an interview with TechRepublic, data scientist David Smith of Revolution Analytics questions whether older languages are really suitable for today’s data needs.
He says, “SAS is one of the legacy systems from the 1970s with an enormous user base, so it is a major big data ‘incumbent.’ SAS is widely used, but the analytics it delivers originated in a different era that pre-dated parallel processing, server clusters, and Hadoop. Consequently, SAS is not suited for many modern and emerging big data requirements.” Smith says R was specifically developed to work with big data that’s being parallel processed, and “what might take you one whole week to do with SAS, can take just half a day with R.”
How do R’s developers feel about its success? Mr. Ihaka says, “R is a real demonstration of the power of collaboration, and I don’t think you could construct something like this any other way. We could have chosen to be commercial, and we would have sold five copies of the software.”
Computerworld has a fabulous introduction to R with links to all kinds of interesting goodies.
Lani Carroll lives in Colorado Springs with her bees, chickens, and horses. She can be found at her Google+ Profile.