Magic command interface for interactive work with R via rpy2
%R
- %R [-i INPUT] [-o OUTPUT] [-w WIDTH] [-h HEIGHT] [-d DATAFRAME]
- [-u UNITS] [-p POINTSIZE] [-b BG] [-n] [code [code ...]]
Execute code in R, and pull some of the results back into the Python namespace.
In line mode, this will evaluate an expression and convert the returned value to a Python object. The return value is determined by rpy2’s behaviour of returning the result of evaluating the final line.
Multiple R lines can be executed by joining them with semicolons:
In [9]: %R X=c(1,4,5,7); sd(X); mean(X)
Out[9]: array([ 4.25])
As a cell, this will run a block of R code, without bringing anything back by default:
In [10]: %%R
....: Y = c(2,4,3,9)
....: print(summary(lm(Y~X)))
....:
Call:
lm(formula = Y ~ X)
Residuals:
1 2 3 4
0.88 -0.24 -2.28 1.64
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0800 2.3000 0.035 0.975
X 1.0400 0.4822 2.157 0.164
Residual standard error: 2.088 on 2 degrees of freedom
Multiple R-squared: 0.6993,Adjusted R-squared: 0.549
F-statistic: 4.651 on 1 and 2 DF, p-value: 0.1638
In the notebook, plots are published as the output of the cell.
%R plot(X, Y)
will create a scatter plot of X bs Y.
If cell is not None and line has some R code, it is prepended to the R code in cell.
Objects can be passed back and forth between rpy2 and python via the -i -o flags in line:
In [14]: Z = np.array([1,4,5,10])
In [15]: %R -i Z mean(Z)
Out[15]: array([ 5.])
In [16]: %R -o W W=Z*mean(Z)
Out[16]: array([ 5., 20., 25., 50.])
In [17]: W
Out[17]: array([ 5., 20., 25., 50.])
The return value is determined by these rules:
unless the final line prints something to the console, in which case None is returned.
by rpy2, then None is returned.
Use the –dataframe flag or %Rget to push / return a structured array.
value in the line is an empty string.
The –dataframe argument will attempt to return structured arrays. This is useful for dataframes with mixed data types. Note also that for a data.frame, if it is returned as an ndarray, it is transposed:
In [18]: dtype=[('x', '<i4'), ('y', '<f8'), ('z', '|S1')]
In [19]: datapy = np.array([(1, 2.9, 'a'), (2, 3.5, 'b'), (3, 2.1, 'c'), (4, 5, 'e')], dtype=dtype)
In [20]: %%R -o datar
datar = datapy
....:
In [21]: datar
Out[21]:
array([['1', '2', '3', '4'],
['2', '3', '2', '5'],
['a', 'b', 'c', 'e']],
dtype='|S1')
In [22]: %%R -d datar
datar = datapy
....:
In [23]: datar
Out[23]:
array([(1, 2.9, 'a'), (2, 3.5, 'b'), (3, 2.1, 'c'), (4, 5.0, 'e')],
dtype=[('x', '<i4'), ('y', '<f8'), ('z', '|S1')])
The –dataframe argument first tries colnames, then names. If both are NULL, it returns an ndarray (i.e. unstructured):
In [1]: %R mydata=c(4,6,8.3); NULL
In [2]: %R -d mydata
In [3]: mydata
Out[3]: array([ 4. , 6. , 8.3])
In [4]: %R names(mydata) = c('a','b','c'); NULL
In [5]: %R -d mydata
In [6]: mydata
Out[6]:
array((4.0, 6.0, 8.3),
dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8')])
In [7]: %R -o mydata
In [8]: mydata
Out[8]: array([ 4. , 6. , 8.3])
-i INPUT, --input INPUT | |
Names of input variable from shell.user_ns to be assigned to R variables of the same names after calling self.pyconverter. Multiple names can be passed separated only by commas with no whitespace. | |
-o OUTPUT, --output OUTPUT | |
Names of variables to be pushed from rpy2 to shell.user_ns after executing cell body and applying self.Rconverter. Multiple names can be passed separated only by commas with no whitespace. | |
-w WIDTH, --width WIDTH | |
Width of png plotting device sent as an argument to png in R. | |
-h HEIGHT, --height HEIGHT | |
Height of png plotting device sent as an argument to png in R. | |
-d DATAFRAME, --dataframe DATAFRAME | |
Convert these objects to data.frames and return as structured arrays. | |
-u UNITS, --units UNITS | |
Units of png plotting device sent as an argument to png in R. One of [“px”, “in”, “cm”, “mm”]. | |
-p POINTSIZE, --pointsize POINTSIZE | |
Pointsize of png plotting device sent as an argument to png in R. | |
-b BG, --bg BG | Background of png plotting device sent as an argument to png in R. |
-n, --noreturn | Force the magic to not return anything. |
%Rpush
A line-level magic for R that pushes variables from python to rpy2. The line should be made up of whitespace separated variable names in the IPython namespace:
In [7]: import numpy as np In [8]: X = np.array([4.5,6.3,7.9]) In [9]: X.mean() Out[9]: 6.2333333333333343 In [10]: %Rpush X In [11]: %R mean(X) Out[11]: array([ 6.23333333])
%Rpull
%Rpull [-d] [outputs [outputs ...]]
A line-level magic for R that pulls variables from python to rpy2:
In [18]: _ = %R x = c(3,4,6.7); y = c(4,6,7); z = c('a',3,4)
In [19]: %Rpull x y z
In [20]: x
Out[20]: array([ 3. , 4. , 6.7])
In [21]: y
Out[21]: array([ 4., 6., 7.])
In [22]: z
Out[22]:
array(['a', '3', '4'],
dtype='|S1')
If –as_dataframe, then each object is returned as a structured array after first passed through “as.data.frame” in R before being calling self.Rconverter. This is useful when a structured array is desired as output, or when the object in R has mixed data types. See the %%R docstring for more examples.
Beware that R names can have ‘.’ so this is not fool proof. To avoid this, don’t name your R objects with ‘.’s...
-d, --as_dataframe | |
Convert objects to data.frames before returning to ipython. |
%Rget
%Rget [-d] output
Return an object from rpy2, possibly as a structured array (if possible). Similar to Rpull except only one argument is accepted and the value is returned rather than pushed to self.shell.user_ns:
In [3]: dtype=[('x', '<i4'), ('y', '<f8'), ('z', '|S1')]
In [4]: datapy = np.array([(1, 2.9, 'a'), (2, 3.5, 'b'), (3, 2.1, 'c'), (4, 5, 'e')], dtype=dtype)
In [5]: %R -i datapy
In [6]: %Rget datapy
Out[6]:
array([['1', '2', '3', '4'],
['2', '3', '2', '5'],
['a', 'b', 'c', 'e']],
dtype='|S1')
In [7]: %Rget -d datapy
Out[7]:
array([(1, 2.9, 'a'), (2, 3.5, 'b'), (3, 2.1, 'c'), (4, 5.0, 'e')],
dtype=[('x', '<i4'), ('y', '<f8'), ('z', '|S1')])
-d, --as_dataframe | |
Convert objects to data.frames before returning to ipython. |