Renjin is a JVM-based interpreter for the R language for statistical computing. This project is an initiative of BeDataDriven, a company providing consulting in analytics and decision support systems.
Over the past two decades, the R language for statistical computing has emerged as the de facto standard for analysts, statisticians, and scientists. Today, a wide range of enterprises, from pharmaceuticals to insurance, depend on R for key business uses. Renjin is a new implementation of the R language and environment for the Java Virtual Machine (JVM). Its goal is to enable transparent analysis of big data sets and seamless integration with other enterprise systems such as databases and application servers.
Renjin is still under development, but it is already being used in production for a number of our client projects, and supports most CRAN packages, including some with C/Fortran dependencies.
We built Renjin, a new interpreter for the JVM, because we wanted the beauty, flexibility, and power of R with the performance of the Java Virtual Machine.
R has traditionally been limited by the need to fit data sets into memory, and working with even modest data sets can quickly exhaust memory due to historical limitations in the GNU R interpreter’s implementation.
Renjin will allow R scripts to transparently interact with data wherever it’s stored, whether that’s on disk, in a remote database, or in the cloud.
While there have been attempts to bring big data to the original interpreter, these have generally provided a parallel set of data structures and algorithms, threatening a fragmentation of the language and platform. Renjin, in contrast, will allow existing R code to run on larger datasets with no modification, using R’s familiar and standard data structures and algorithms.
Renjin offers performance improvements in executing R code on several fronts:
These improvements make it possible to perform real-time analyses using complex models.
Renjin enables R developers to deploy their code to Platform-as-a-Service providers like Google App Engine, Amazon Elastic Beanstalk, or Heroku without worrying about scale or infrastructure. Renjin is pure Java, so it can run anywhere a JVM is available.
You can try the live demo of the interpreter on Google App Engine to see Renjin in action.
Built on the JVM, Renjin allows R code to interact directly with JVM libraries and data structures, without the need for expensive data transfer or brittle inter-process communication.
Java and Scala developers can expose their data directly to R scripts by implementing a simple interface. To the R developer, the data looks no different than a normal R data structure.
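The simplest form of this in-process exchange can be sketched with the standard javax.script API, which lets a host program bind a Java object to a name that the script can then read. The sketch below is illustrative only: the class and method names are hypothetical, the engine name "Renjin" and the way a double[] appears on the R side are assumptions, and it does not show the interface mentioned above.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical helper class, not part of Renjin.
class SharedDataExample {

    /**
     * Binds a Java array under the given name and evaluates rCode against it.
     * Returns null when no engine is available.
     */
    static Object evalWithData(ScriptEngine engine, String name, double[] data, String rCode) {
        if (engine == null) {
            return null;
        }
        // The array is handed over inside the same JVM: no files, sockets, or IPC.
        engine.put(name, data);
        try {
            return engine.eval(rCode);
        } catch (ScriptException e) {
            throw new IllegalStateException("R evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the engine is registered under the name "Renjin".
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("Renjin");
        double[] measurements = {1.5, 2.5, 4.0};
        Object result = evalWithData(engine, "measurements", measurements, "mean(measurements)");
        System.out.println(result == null ? "Renjin not found on the classpath" : result);
    }
}
```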
Renjin runs safely in multithreaded application servers, allowing embedders to create as many independent “apartments” as they need to run existing single-threaded R code, or to allow multithreaded access if the user’s R code is written in a pure, functional manner. Renjin allows developers to choose the right language for the different components of an enterprise system, without worrying about how they’ll interact.
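One common JVM idiom for giving each worker thread its own isolated interpreter is a ThreadLocal. The following is a minimal sketch of that idea, not Renjin's prescribed API: the Apartments class is hypothetical, and the engine name "Renjin" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

// Hypothetical holder: lazily creates one instance per thread and reuses it.
class Apartments<T> {
    private final ThreadLocal<T> perThread;

    Apartments(Supplier<T> factory) {
        this.perThread = ThreadLocal.withInitial(factory);
    }

    /** Returns the calling thread's own instance. */
    T current() {
        return perThread.get();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the engine is registered under the name "Renjin".
        Apartments<ScriptEngine> engines =
            new Apartments<>(() -> new ScriptEngineManager().getEngineByName("Renjin"));
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<Future<Object>> results = new ArrayList<>();
        try {
            for (int i = 0; i < 4; i++) {
                final int job = i;
                Callable<Object> task = () -> {
                    ScriptEngine engine = engines.current();
                    if (engine == null) {
                        return "job " + job + ": Renjin not found on the classpath";
                    }
                    // Variables assigned here stay inside this thread's engine.
                    return engine.eval("x <- " + job + "; x * 2");
                };
                results.add(pool.submit(task));
            }
            for (Future<Object> result : results) {
                System.out.println(result.get());
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Because each thread only ever touches its own engine, single-threaded R code needs no locking of its own in this arrangement.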
Many enterprises prototype analyses and models in R because of its ease of use, but then port the results to Java or C++ for production use, either because of performance limitations or because of poor interoperability between the original R VM and other enterprise systems.
With Renjin, analysts can move directly from prototype to production. R scripts can be packaged as JARs and called directly from JVM-based environments and referenced directly by other JVM languages like Java, Scala, JRuby, or Clojure.
Like R itself, the Renjin source code is available under an open-source license. That means you get to stand on the shoulders of giants at little or no cost.
Check out the support page if you need (commercial) support for Renjin.
The current version of Renjin is 0.7.0-RC7. Choose your flavor below and hit the download button!
Add to your Maven Project:
<dependencies>
  <dependency>
    <groupId>org.renjin</groupId>
    <artifactId>renjin-script-engine</artifactId>
    <version>0.7.0-RC7</version>
  </dependency>
</dependencies>
<repositories>
  <repository>
    <id>bedatadriven</id>
    <name>bedatadriven public repo</name>
    <url>http://nexus.bedatadriven.com/content/groups/public/</url>
  </repository>
</repositories>
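With the renjin-script-engine artifact on the classpath, R code can be evaluated from Java through the standard javax.script API. The following is a minimal sketch, assuming the engine is registered under the name "Renjin"; the class and method names are illustrative and not part of Renjin.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical quick-start class, not part of Renjin.
class RenjinQuickStart {

    /**
     * Evaluates an R snippet and returns the result as a string, or a
     * diagnostic message if the named engine is not on the classpath.
     */
    static String evalOrExplain(String engineName, String rCode) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(engineName);
        if (engine == null) {
            return "script engine not found: " + engineName;
        }
        try {
            return String.valueOf(engine.eval(rCode));
        } catch (ScriptException e) {
            throw new IllegalStateException("R evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the engine is registered under the name "Renjin".
        System.out.println(evalOrExplain("Renjin", "sum(1:10)"));
    }
}
```

If the lookup fails, the usual cause is that the dependency, or the repository that hosts it, is missing from the build.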
Because Renjin is a Java application, it runs on any platform with a JVM installed. To run Renjin, you need version 6 or later of the Java SE Runtime Environment (JRE). If you don't have it yet, download and install it from http://www.oracle.com/technetwork/java/javase/downloads/index.html.
The Swing-based Studio GUI is a very simple demo for the moment. Download the .jar file and double-click to execute, or start with the following command:
java -jar renjin-studio-0.7.0-RC7-jar-with-dependencies.jar
The Debian package can be installed on Ubuntu Linux using the following command:
sudo dpkg -i renjin-debian-package-0.7.0-RC7.deb
After this, you can start Renjin by typing the following on your command line:
renjin
After a long period of silence surrounding Renjin, we are now back with a completely revised guide for Java and R developers as well as some new functionality for creating packages for Renjin.
The DALI workshop brought together an amazing group of scientists, language implementors, and VM-wonks in one room last week. So many great ideas!
Radford Neal started a great conversation with his comparison of Renjin, pqR and Riposte. I want to continue this conversation by going into a bit more depth about how Renjin's vector pipeliner works, and where we want to go with it.