IPython Documentation

Warning

This documentation is for an old version of IPython. You can find docs for newer versions here.

rmagic

Magic command interface for interactive work with R via rpy2

Note

The rpy2 package needs to be installed separately. It can be obtained using easy_install or pip.

You will also need a working copy of R.

Usage

To enable the magics below, execute %load_ext rmagic.

%R

%R [-i INPUT] [-o OUTPUT] [-w WIDTH] [-h HEIGHT] [-d DATAFRAME]
[-u {px,in,cm,mm}] [-r RES] [-p POINTSIZE] [-b BG] [-n] [code [code ...]]

Execute code in R, and pull some of the results back into the Python namespace.

In line mode, this will evaluate an expression and convert the returned value to a Python object. The return value is determined by rpy2’s behaviour of returning the result of evaluating the final line.

Multiple R lines can be executed by joining them with semicolons:

In [9]: %R X=c(1,4,5,7); sd(X); mean(X)
Out[9]: array([ 4.25])

As a cell, this will run a block of R code, without bringing anything back by default:

In [10]: %%R
   ....: Y = c(2,4,3,9)
   ....: print(summary(lm(Y~X)))
   ....:

Call:
lm(formula = Y ~ X)

Residuals:
    1     2     3     4
 0.88 -0.24 -2.28  1.64

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.0800     2.3000   0.035    0.975
X             1.0400     0.4822   2.157    0.164

Residual standard error: 2.088 on 2 degrees of freedom
Multiple R-squared: 0.6993,Adjusted R-squared: 0.549
F-statistic: 4.651 on 1 and 2 DF,  p-value: 0.1638

In the notebook, plots are published as the output of the cell.

%R plot(X, Y)

will create a scatter plot of X vs Y.

If cell is not None and line has some R code, it is prepended to the R code in cell.

Objects can be passed back and forth between rpy2 and Python via the -i and -o flags on the magic line:

In [14]: Z = np.array([1,4,5,10])

In [15]: %R -i Z mean(Z)
Out[15]: array([ 5.])

In [16]: %R -o W W=Z*mean(Z)
Out[16]: array([  5.,  20.,  25.,  50.])

In [17]: W
Out[17]: array([  5.,  20.,  25.,  50.])

The return value is determined by these rules:

  • If the cell is not None, the magic returns None.
  • If the cell evaluates as False, the resulting value is returned unless the final line prints something to the console, in which case None is returned.
  • If the final line results in a NULL value when evaluated by rpy2, then None is returned.
  • No attempt is made to convert the final value to a structured array. Use the --dataframe flag or %Rget to push / return a structured array.
  • If the -n flag is present, there is no return value.
  • A trailing ‘;’ will also result in no return value as the last value in the line is an empty string.
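
The rules above can be read as a small decision table. The following is a minimal sketch in plain Python, not rmagic's own code: the function name and its parameters are hypothetical, None stands in for R's NULL, and treating an empty cell body like line mode is an assumption.

```python
def magic_return_value(cell, line, final_value, printed=False, noreturn=False):
    """Toy model of what %R hands back to Python.

    Hypothetical helper: none of these parameter names belong to rmagic.
    `final_value` stands in for the already converted result of the last
    R statement; None models R's NULL.
    """
    if cell:                         # a non-empty cell body (%%R) returns nothing
        return None
    if noreturn:                     # -n / --noreturn was given
        return None
    if line.rstrip().endswith(";"):  # trailing ';' leaves an empty last statement
        return None
    if printed:                      # the final line printed to the console
        return None
    return final_value               # may itself be None when R produced NULL


print(magic_return_value(None, "mean(X)", 4.25))    # 4.25
print(magic_return_value(None, "mean(X);", 4.25))   # None
```

The order of the checks is a design choice of this sketch; the rules in the text do not state a precedence between them.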

The --dataframe argument will attempt to return structured arrays. This is useful for dataframes with mixed data types. Note also that a data.frame returned as a plain ndarray is transposed, so each column of the data.frame becomes a row of the array:

In [18]: dtype=[('x', '<i4'), ('y', '<f8'), ('z', '|S1')]

In [19]: datapy = np.array([(1, 2.9, 'a'), (2, 3.5, 'b'), (3, 2.1, 'c'), (4, 5, 'e')], dtype=dtype)

In [20]: %%R -o datar
   ....: datar = datapy
   ....:

In [21]: datar
Out[21]: 
array([['1', '2', '3', '4'],
       ['2', '3', '2', '5'],
       ['a', 'b', 'c', 'e']], 
      dtype='|S1')

In [22]: %%R -d datar
   ....: datar = datapy
   ....:

In [23]: datar
Out[23]: 
array([(1, 2.9, 'a'), (2, 3.5, 'b'), (3, 2.1, 'c'), (4, 5.0, 'e')], 
      dtype=[('x', '<i4'), ('y', '<f8'), ('z', '|S1')])

The --dataframe argument first tries colnames, then names. If both are NULL, it returns an ndarray (i.e. unstructured):

In [1]: %R mydata=c(4,6,8.3); NULL

In [2]: %R -d mydata

In [3]: mydata
Out[3]: array([ 4. ,  6. ,  8.3])

In [4]: %R names(mydata) = c('a','b','c'); NULL

In [5]: %R -d mydata

In [6]: mydata
Out[6]: 
array((4.0, 6.0, 8.3), 
      dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8')])

In [7]: %R -o mydata

In [8]: mydata
Out[8]: array([ 4. ,  6. ,  8.3])
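
The lookup order described above (colnames first, then names, otherwise an unstructured array) can be sketched in plain Python. This is an illustration only: structured_field_names is a hypothetical helper, not a function in rmagic or rpy2, and None stands in for R's NULL.

```python
def structured_field_names(colnames, names):
    """Pick field names in the order the text describes.

    Hypothetical helper; None models R's NULL. A return value of None
    means the caller falls back to a plain (unstructured) array.
    """
    if colnames is not None:
        return list(colnames)
    if names is not None:
        return list(names)
    return None


# A bare numeric vector has neither attribute, so no field names exist.
print(structured_field_names(None, None))             # None
# After names(mydata) = c('a','b','c') the names attribute supplies the fields.
print(structured_field_names(None, ["a", "b", "c"]))  # ['a', 'b', 'c']
```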

positional arguments:
code

optional arguments:
-i INPUT, --input INPUT
 Names of input variables from shell.user_ns to be assigned to R variables of the same names after calling self.pyconverter. Multiple names can be passed separated only by commas with no whitespace.
-o OUTPUT, --output OUTPUT
 Names of variables to be pushed from rpy2 to shell.user_ns after executing cell body and applying self.Rconverter. Multiple names can be passed separated only by commas with no whitespace.
-w WIDTH, --width WIDTH
 Width of png plotting device sent as an argument to png in R.
-h HEIGHT, --height HEIGHT
 Height of png plotting device sent as an argument to png in R.
-d DATAFRAME, --dataframe DATAFRAME
 Convert these objects to data.frames and return as structured arrays.
-u {px,in,cm,mm}, --units {px,in,cm,mm}
 Units of png plotting device sent as an argument to png in R. One of [“px”, “in”, “cm”, “mm”].
-r RES, --res RES
 Resolution of png plotting device sent as an argument to png in R. Defaults to 72 if units is one of [“in”, “cm”, “mm”].
-p POINTSIZE, --pointsize POINTSIZE
 Pointsize of png plotting device sent as an argument to png in R.
-b BG, --bg BG
 Background of png plotting device sent as an argument to png in R.
-n, --noreturn
 Force the magic to not return anything.
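
To make the option list concrete, here is a rough sketch that rebuilds the same command line with the standard argparse module. It is not how IPython defines the magic internally. The integer types for the size options, the use of action="append" for -i, -o and -d, and the helper names build_r_parser and split_names are all assumptions made for illustration.

```python
import argparse


def build_r_parser():
    # add_help=False frees -h so it can mean --height, as in the usage line.
    parser = argparse.ArgumentParser(prog="%R", add_help=False)
    parser.add_argument("-i", "--input", action="append")
    parser.add_argument("-o", "--output", action="append")
    parser.add_argument("-w", "--width", type=int)
    parser.add_argument("-h", "--height", type=int)
    parser.add_argument("-d", "--dataframe", action="append")
    parser.add_argument("-u", "--units", choices=["px", "in", "cm", "mm"])
    parser.add_argument("-r", "--res", type=int)
    parser.add_argument("-p", "--pointsize", type=int)
    parser.add_argument("-b", "--bg")
    parser.add_argument("-n", "--noreturn", action="store_true")
    parser.add_argument("code", nargs="*")
    return parser


def split_names(values):
    """Flatten ['X,Y', 'Z'] into ['X', 'Y', 'Z']; commas only, no whitespace."""
    return ",".join(values).split(",") if values else []


line = "-i X,Y -o W -w 400 -h 300 -u px W=X*mean(Y)"
args = build_r_parser().parse_args(line.split())
print(split_names(args.input))              # ['X', 'Y']
print(args.width, args.height, args.units)  # 400 300 px
print(args.code)                            # ['W=X*mean(Y)']
```

Splitting the line on whitespace is why variable names passed to -i and -o must be joined by commas alone: a space would start a new argument.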

%Rpush

A line-level magic for R that pushes variables from Python to rpy2. The line should be made up of whitespace-separated variable names in the IPython namespace:

In [7]: import numpy as np

In [8]: X = np.array([4.5,6.3,7.9])

In [9]: X.mean()
Out[9]: 6.2333333333333343

In [10]: %Rpush X

In [11]: %R mean(X)
Out[11]: array([ 6.23333333])

%Rpull

%Rpull [-d] [outputs [outputs ...]]

A line-level magic for R that pulls variables from rpy2 into the IPython namespace:

In [18]: _ = %R x = c(3,4,6.7); y = c(4,6,7); z = c('a',3,4)

In [19]: %Rpull x  y z

In [20]: x
Out[20]: array([ 3. ,  4. ,  6.7])

In [21]: y
Out[21]: array([ 4.,  6.,  7.])

In [22]: z
Out[22]:
array(['a', '3', '4'],
      dtype='|S1')

If --as_dataframe is given, each object is first passed through “as.data.frame” in R and then through self.Rconverter, and is returned as a structured array. This is useful when a structured array is desired as output, or when the object in R has mixed data types. See the %%R docstring for more examples.

Notes

Beware that R names can contain ‘.’, which is not allowed in a Python variable name, so this is not foolproof. To avoid problems, don’t name your R objects with ‘.’s.
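
One way to spot troublesome names before pulling them is to test whether each one is a legal Python identifier. The check below is only a sketch using the standard library; usable_in_python is a hypothetical helper and is not part of rmagic.

```python
import keyword


def usable_in_python(r_name):
    """True when an R object name can also be typed as a Python variable name."""
    return r_name.isidentifier() and not keyword.iskeyword(r_name)


for name in ["x", "my.data", "data_frame"]:
    print(name, usable_in_python(name))
# x True
# my.data False
# data_frame True
```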

positional arguments:
outputs

optional arguments:
-d, --as_dataframe
 Convert objects to data.frames before returning to ipython.

%Rget

%Rget [-d] output

Return an object from rpy2, as a structured array where that is possible. Similar to %Rpull, except that only one argument is accepted and the value is returned rather than pushed to self.shell.user_ns:

In [3]: dtype=[('x', '<i4'), ('y', '<f8'), ('z', '|S1')]

In [4]: datapy = np.array([(1, 2.9, 'a'), (2, 3.5, 'b'), (3, 2.1, 'c'), (4, 5, 'e')], dtype=dtype)

In [5]: %R -i datapy

In [6]: %Rget datapy
Out[6]: 
array([['1', '2', '3', '4'],
       ['2', '3', '2', '5'],
       ['a', 'b', 'c', 'e']], 
      dtype='|S1')

In [7]: %Rget -d datapy
Out[7]: 
array([(1, 2.9, 'a'), (2, 3.5, 'b'), (3, 2.1, 'c'), (4, 5.0, 'e')], 
      dtype=[('x', '<i4'), ('y', '<f8'), ('z', '|S1')])

positional arguments:
output

optional arguments:
-d, --as_dataframe
 Convert objects to data.frames before returning to ipython.