torch for R


Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <arXiv:1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.

Version: 0.6.0
Imports: Rcpp, R6, withr, rlang, methods, utils, stats, bit64, magrittr, tools, coro, callr, cli, ellipsis
LinkingTo: Rcpp
Suggests: testthat (≥ 3.0.0), covr, knitr, rmarkdown, glue, palmerpenguins, mvtnorm, numDeriv, katex
Published: 2021-10-07
Author: Daniel Falbel [aut, cre, cph], Javier Luraschi [aut], Dmitriy Selivanov [ctb], Athos Damiani [ctb], Christophe Regouby [ctb], Krzysztof Joachimiak [ctb], RStudio [cph]
Maintainer: Daniel Falbel
In views: MachineLearning
CRAN checks: torch results

How to create neural networks with Torch in R

In this tutorial we are going to learn how to use Torch in R. Torch is one of the most widely used frameworks for building neural networks and deep learning models, and it has recently been released for R.

In the tutorial we will cover everything you need to know to code your neural networks from start to finish. More specifically we are going to see:

  • How to install Torch. This might seem silly, but for Torch to work correctly you have to have the correct version of R and certain plugins installed. It is not difficult, but it is very important.
  • Different ways to create the structures of our neural networks. Torch offers several ways to create our networks. We will see what they are and when it is advisable to use each of them.
  • How to load our data into Torch. Whether we use numerical data or images, you will learn how to load your data in a format that Torch understands. In addition, for images, I also show you how to use Torchvision, a Torch package that lets you apply transformations and load images in batches.
  • How to train our networks with Torch in R. I will explain what types of activation functions and optimizers there are, how they are implemented, and the characteristics of each of them.
  • How to save your Torch models. After all, a model is useless if we do not put it into production, so I will explain how to save and load models and the conventions that exist.
  • Comparison with Tensorflow and Keras. As you can see, Torch offers functionalities that are very similar to the ones offered by Tensorflow and Keras. In this section, I explain, from my point of view, the advantages of each of them for Python users.

As you can see, this is a very extensive tutorial on how to use Torch in R, so let’s get to it!

Torch in R: first steps

Install Torch in R

Although Torch works correctly with version 3.6.3 of R, if your goal is to Docker-ize the model and put it into production (as I explained in this post), I would recommend using R version 4.0.3, since previous versions have produced errors. (Note: you can check your version of R by printing the R.version object).

Besides, if you install version 4.0.3 and you are a Windows user, you will also have to install RTools40. This tool lets you compile the C++ code in which Torch is written. It is very easy: just follow this tutorial.

Once you do this, we can download and install the library. If you have version 3.6.3 it will ask you to install the binaries. If you have any problem, try upgrading your R version to 4.0.3.
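A minimal sketch of the installation (torch downloads its libtorch backend on first use; install_torch() forces that download):

```r
install.packages("torch")

library(torch)
install_torch()  # downloads the libtorch binaries (only needed once)
```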

Use Torch on CPU or on GPU

Now that we have R installed, we have to decide whether we want the models to train on GPU (if we have one compatible with the installed drivers) or on CPU.

If you don’t know whether you have CUDA available to use Torch on the GPU, you can run the following command:
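The check is a single function call, assuming torch is loaded:

```r
library(torch)

cuda_is_available()  # TRUE if a compatible GPU and CUDA drivers are found
```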

If we want to use Torch on GPU we have to indicate that the tensors run on GPU. To do this, you have to call the $cuda() method every time you create a tensor. While this is what Torch recommends, it is not optimal and is a bit tedious.
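Both styles, sketched (the tensor contents are arbitrary examples):

```r
library(torch)

# Style 1: move each tensor to the GPU with $cuda()
x <- torch_tensor(matrix(1:6, nrow = 2))
if (cuda_is_available()) x <- x$cuda()

# Style 2: define a device once and create tensors on it
device <- if (cuda_is_available()) "cuda" else "cpu"
y <- torch_tensor(rnorm(10), device = device)
```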

Now that we have Torch installed and we know how to run it on a GPU, we can dive right into creating our neural networks. Let’s go for it!

How to code the structure of a neural network in R with Torch

As I mentioned at the beginning of this tutorial, there are two different ways of coding a neural network in Torch (in both R and Python):

  1. Create a sequential model. This way of working is very similar to the way of doing it in Keras. We simply have to indicate the input and output tensors and add the layers to the model. Simple and fast.
  2. Create a model from scratch. This case is quite similar to using Tensorflow, since we must indicate all the elements of the network and program the forward pass manually. There are two ways of creating a model from scratch: we can either code everything or we can create a class for our model. This last way of coding the network is the most frequent one.

Of course, let’s see how each of these ways of programming a neural network in R with Torch works:

Creating the structure of the sequential network

To program a sequential neural network, we must include the different layers of the network within the nn_sequential() function. In this sense, an important difference from Keras is that the activation function is defined outside the declaration of the layer type, as a layer of its own.

Besides, in each layer, we must indicate both the input and output dimensions of the data. Although this is something very simple in dense or fully-connected neural networks, in the case of convolutional neural networks, you have to be clear about how padding, kernels, and others work.

If this is not your case, don’t worry, you will find everything you need to know in my post about convolutional neural networks 😉

In any case, we are going to program a fully-connected neural network with 4 layers of 4, 8, 16, and 3 neurons, respectively. Without a doubt, it is a very large network for the problem we will solve, but it serves as an example.
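A sketch of that network with nn_sequential() (an input of 4 features and an output of 3 classes, matching the iris problem used later):

```r
library(torch)

model <- nn_sequential(
  nn_linear(4, 8),   # 4 input features -> 8 neurons
  nn_relu(),         # activation declared as its own layer
  nn_linear(8, 16),
  nn_relu(),
  nn_linear(16, 3)   # 3 output classes
)
```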

As you can see, this way of creating a neural network in Torch with R is very very simple and it is very similar to the way of doing it in Keras.

Having seen the first way to create a neural network in Torch, we are going to see the second classic way to create a neural network in Torch: creating a neural network through a custom class.

Creating the structure of the neural network using a class in R with Torch

Another way of coding a neural network in R with Torch is by defining our neural network as a custom class. In this case, the model is much more flexible, but we will have to define all the layers that we are going to use and do the forward-pass manually.
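A sketch of the same network written as a class with nn_module() (the module name and layer sizes are illustrative):

```r
library(torch)

net <- nn_module(
  "my_net",
  initialize = function() {
    # declare every layer we are going to use
    self$fc1 <- nn_linear(4, 8)
    self$fc2 <- nn_linear(8, 16)
    self$fc3 <- nn_linear(16, 3)
  },
  forward = function(x) {
    # the forward pass is coded manually
    x <- nnf_relu(self$fc1(x))
    x <- nnf_relu(self$fc2(x))
    self$fc3(x)
  }
)

model <- net()  # instantiate the module
```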

Now, we can call our model as if it was a function:
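For example, assuming the class-based model above and a small batch of inputs:

```r
x <- torch_tensor(as.matrix(iris[1:5, 1:4]), dtype = torch_float())

pred <- model(x)  # calling the module runs its forward method
pred$shape        # 5 observations x 3 class scores
```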

When to use each type of neural network in Torch

At first glance we may think that modules (or classes) do not have great advantages over sequential models, but this is not the case.

The thing is, when we start to create networks with hundreds of layers, using the sequential model can get a bit chaotic. A much better way to approach these networks is to split the network into different sections and use a module for each section. The final network is then a module of modules, which makes the code much more readable and maintainable.

Besides, another use case is when we create a network that is composed of several networks, as we saw in the post on how to create a GAN in Python. In these cases, the simplest thing is that each network is a module itself and that the entire network is a module of modules.

In summary, as a rule of thumb: for a simple neural network, a sequential model is the fastest option, since with modules we have to write more code. For more complex networks, it is better to create classes.

Now that you know how to create the structure of our network … let’s see how to convert our data into tensors that Torch understands!

Convert data to Tensors

Now we have the structure of our network created, but of course, with this, we are not even able to get a prediction. To do this, we must convert our data into Tensors, the data format that Torch understands.

As in the rest of frameworks, in Torch there are two main types of data that we can pass: numerical data and images. This is a complete tutorial for Torch in R, so don’t worry, we’ll see both. Let’s start with the numerical data!

Numeric input data

If we work with numerical data we will have to convert our input data and our labels into tensors.

To do this, Torch offers us the torch_tensor() function, to which we must pass two parameters:

  1. The data that we want to convert into a tensor.
  2. The type of data to be interpreted by Torch. Each data type has its own function; since we will generally work with numbers, we will mostly use torch_float().

As you can see, it is very easy, although there are two important details that, if you know, will save you time:

Detail 1. Data types that you can convert into Tensors

Generally, when working with R we tend to work with data frames. However, the torch_tensor() function does not support data frames; it only supports three types of data: matrices, vectors, and arrays (for images).

So if you have the data stored in a data frame, make sure to convert your data to these data types first.

Detail 2. Data types according to the cost function

As I said before, we are most likely working with numeric data, so we define our tensors as torch_float().

However, different cost functions may require different types of data, and if you don’t have the data in the right type, Torch will return an error (with a pretty clear message, though). In my case, for example, the cost function is categorical cross-entropy, which requires the labels to be of type long (torch_long()).

Luckily, this is something that is clearly explained in the error message, so if this is the case, it will be easy for you to detect it.

That being said, let’s put it into practice with an example.

Example of Converting Numeric Input Data to Tensor

In our case we are going to simulate a real case with the Iris dataset. To do this, we will:

  1. Split our data between train and test.
  2. Convert our input data to matrices and labels to vectors.
  3. Convert our input data and labels into tensors.

A good way to check that the network is created correctly is to verify that the prediction and the output data have the same size. In our case, when using cross-entropy loss, this will not hold exactly (the prediction has one column per class), but in the vast majority of cases it does:
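The three steps sketched with iris (the 80/20 split ratio is an arbitrary choice):

```r
library(torch)

set.seed(123)

# 1. Split between train and test
idx   <- sample(nrow(iris), 0.8 * nrow(iris))
train <- iris[idx, ]
test  <- iris[-idx, ]

# 2. Input data as matrices, labels as vectors
x_train <- as.matrix(train[, 1:4])
y_train <- as.integer(train$Species)

# 3. Convert to tensors (labels as long, as required by cross-entropy)
x_train_t <- torch_tensor(x_train, dtype = torch_float())
y_train_t <- torch_tensor(y_train, dtype = torch_long())
```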

Now that we know how to handle numeric input data, let’s see how to use images as input data!

Use images as input data in Torch R

Applying transformations on Images

Regarding image processing, Torch has an additional package called torchvision that helps to load and transform images, among other things. So the first thing to do is install and load the package.
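Installing and loading it looks like this:

```r
install.packages("torchvision")
library(torchvision)
```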

Now that we have the package installed, we are going to download a simple image to see the transformations it offers:

Now that we know the image, we can see everything we can do with the Torchvision transformations, among which we find:

  • transform_crop and transform_center_crop: allow you to crop the image. With transform_crop we indicate the pixel at which to start cropping, while transform_center_crop crops starting from the middle.
  • transform_resize and transform_resized_crop: allow you to re-scale the image to the desired size. In case we also want to crop it, transform_resized_crop carries out both steps in one.
  • transform_hflip and transform_vflip: flip the image horizontally and vertically, respectively. This is a very common technique to significantly increase the number of images in the dataset.
  • transform_adjust_*: not a single function, but rather a family of functions that modify different aspects of the image, such as brightness, contrast, gamma, or saturation.
  • transform_random_*: as in the previous case, a family of functions that perform the transformations mentioned above, but only on a random sample of images. In this way, we can increase the number of images while preventing the network from learning to “undo” the generated transformations.

That being said, let’s see an example:
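A sketch of a few transformations, assuming torchvision is loaded and the image is read with magick_loader() (the file name is illustrative):

```r
library(torchvision)

img <- magick_loader("my_image.jpg")  # read the image from disk

img_flip   <- transform_hflip(img)                     # horizontal flip
img_small  <- transform_resize(img, c(224, 224))       # re-scale
img_center <- transform_center_crop(img, c(100, 100))  # crop from the middle
img_bright <- transform_adjust_brightness(img, 1.5)    # 50% brighter
```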

Things to take into account about the transformations

As we can see, the transformations are very simple. However, version 0.1.0 of torchvision currently does not allow all the operations on all the object types indicated in the documentation. For example, the documentation of the image-adjustment functions says they work on certain image classes when, in fact, some of them return an error.

Luckily, when working with images with Torch in R, we generally do not want to display the images we have created. Instead, these images are passed to a neural network to train it. For this, Torch uses dataloaders.

Load images into Torch with dataloaders

Dataloaders allow images to be processed in groups. For example, when we train a CNN, we generally train the network with groups of images called batches. Dataloaders let us define the batch size, whether each batch should be a random sample (shuffling), etc.

In addition, to see how the dataloaders work, we are going to take the opportunity to see how to download a torchvision dataset, as well as learn how to load a dataset that is saved in a local folder. To do this, we will:

  1. Download a preloaded dataset from torchvision.
  2. Load the images from the folder we just downloaded.
  3. Create the dataloader that allows us to load the images in batches with a certain size.
  4. Load a batch.
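The four steps sketched with the MNIST dataset that torchvision can download (the batch size is an arbitrary choice; for images in a local folder, image_folder_dataset() works analogously):

```r
library(torch)
library(torchvision)

# 1. Download a preloaded dataset from torchvision
train_ds <- mnist_dataset(
  root = "./data",
  download = TRUE,
  transform = transform_to_tensor  # 2. load the images as tensors
)

# 3. Create the dataloader with a given batch size
train_dl <- dataloader(train_ds, batch_size = 32, shuffle = TRUE)

# 4. Load one batch
batch <- dataloader_next(dataloader_make_iter(train_dl))
batch[[1]]$shape  # 32 images of 1 x 28 x 28
```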

As we can see, with Torch, downloading datasets, loading our own datasets, and loading those datasets in batches is very simple, although it is true that there are still things to polish.

Now, we have the structure of the network, the input data, and we already know how to forward-pass to obtain a prediction. There is only one thing left: train the network. Let’s go for it!

How to train a neural network with Torch in R

Regardless of how we created the network, training a neural network in Torch consists of four steps:

  1. Set gradients to zero.
  2. Define and calculate the cost and optimizer
  3. Propagate the error on the network.
  4. Apply gradient optimizations.

Set gradients to zero

In order to apply the optimizations, Torch accumulates the gradients of the backward passes. We do not want the gradients to accumulate, but in each backward pass, we want to take a step in the direction towards the minimum, that is, at each iteration, apply a single gradient. Therefore, in each pass of the training, we must start by setting the gradients to zero.

We can do this very easily by calling the zero_grad() method of the optimizer that we are going to define. In our case, this step looks like this:
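Assuming an optimizer like the one defined later in this section, the call is:

```r
optimizer$zero_grad()  # reset the accumulated gradients before each pass
```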

Yes, I know it seems silly, but if you skip this step, your network will not train well.

Define and calculate the cost and optimizer

As in the case of Keras, in Torch we have many functions to be used as cost functions, although basically, the three main ones are:

  • nn_cross_entropy_loss(): used for classification networks with more than 2 possible categories.
  • nn_bce_loss(): used in binary classification algorithms, that is, when we predict between two classes.
  • nn_mse_loss(): used for the algorithms in regression problems.

Once we have defined the cost function, we must calculate the cost. To do this, we simply have to pass the prediction and the actual values to our cost function.

Note: in Torch it is standard practice to assign the cost function to a variable called criterion. It is not mandatory, but you will surely see it written like this in the documentation (especially PyTorch’s).

In this sense, it is important to highlight an issue I have already mentioned: different cost functions require the output data or the labels to be in a specific format. For example, since I use nn_cross_entropy_loss, the labels must be of type long; otherwise it will not work.

So, let’s see how to create and calculate the error:
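A sketch, where y_pred is the model’s prediction and y the true labels as a long tensor (both names are placeholders):

```r
criterion <- nn_cross_entropy_loss()

loss <- criterion(y_pred, y)  # prediction first, true labels second
```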

Also, we have to define the optimizer. In this case we also have many options, although generally the Adam optimizer will be used, which gets good results.

To create our optimizer, we simply have to pass the parameters of our model to the optimizer function we want. In our case, the Adam optimizer is created with the optim_adam() function.
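For example (the 0.01 learning rate is an arbitrary choice):

```r
optimizer <- optim_adam(model$parameters, lr = 0.01)
```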

Propagate the error on the network

Now that we have calculated the error in the last layer and our optimizer, we have to propagate the error through the network and then apply the gradients.

To propagate the error through the network, simply apply the backward method of our error.
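In code:

```r
loss$backward()  # backpropagate the error through the network
```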

Apply gradient optimizations

Finally, now that we have the error in each neuron, we must apply the gradients. To do this, you simply have to apply the step method of our optimizer.
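In code:

```r
optimizer$step()  # update the weights using the computed gradients
```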

These would be all the steps to train a neural network with Torch. But, I think it will be clearer if we see it all together in a practical example, don’t you think? Well let’s get to it!

Example of training a neural network in R with Torch

Now that we know all the steps separately, let’s see how everything would be done together. As you will see, you simply have to “put together” the different parts that we have used so far.
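A sketch of the whole loop with iris and the sequential model from before (200 epochs and the learning rate are arbitrary choices):

```r
library(torch)

model <- nn_sequential(
  nn_linear(4, 8), nn_relu(),
  nn_linear(8, 16), nn_relu(),
  nn_linear(16, 3)
)

x <- torch_tensor(as.matrix(iris[, 1:4]), dtype = torch_float())
y <- torch_tensor(as.integer(iris$Species), dtype = torch_long())

criterion <- nn_cross_entropy_loss()
optimizer <- optim_adam(model$parameters, lr = 0.01)

for (epoch in 1:200) {
  optimizer$zero_grad()          # 1. set gradients to zero
  y_pred <- model(x)             # forward pass
  loss <- criterion(y_pred, y)   # 2. calculate the cost
  loss$backward()                # 3. propagate the error
  optimizer$step()               # 4. apply the gradients

  if (epoch %% 50 == 0)
    cat("Epoch:", epoch, "- Loss:", loss$item(), "\n")
}
```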

We have just created a neural network with Torch in R! Pretty easy, right? There is only one thing left to learn: learn to save our models in order to put them into production.

Save and Load Models with Torch

Now that we have our model trained, we will have to put it into production (which I explained in this post). To do this, you have to know how to save and load Torch models, which has a couple of important issues.

First of all, if your model uses a dropout or batch normalization layer, we have to put those layers in evaluation mode before making predictions; otherwise the results will be inconsistent. To do this, we apply the eval() method of our model. If our network does not have these kinds of layers, we can skip this step.

Now that we have our layers in “prediction mode”, we are going to save the model. PyTorch models are usually saved with the .pt or .pth extension. In R, we can save our Torch models with the same extensions.

Finally, to load the model we simply have to call the torch_load() function and assign the result to the variable we want:
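The full save/load cycle, sketched (the file name is illustrative):

```r
model$eval()                       # only needed with dropout/batchnorm layers

torch_save(model, "model.pt")      # save the trained model

model2 <- torch_load("model.pt")   # load it back
```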

And this would be all! You already know how to create neural networks with Torch in R (in two different ways), how to create the input data (even applying batch transformations in case they are images), how to train your neural networks and, finally, how to save them.

Finally, I would like to make a small comparison between Torch and Keras / Tensorflow, since I think it is worthwhile.

Comparison creating neural networks with Torch vs Keras / Tensorflow in R

If we want to compare Keras / Tensorflow and Torch in R, I think there are two things that we have to consider:

  • Torch is currently (v0.1.1) in a maturing state. Broadly speaking, it works well, but its immaturity shows in two ways:
    • Lack of documentation. A lot of documentation and real examples are still missing in R. It is true that the framework is the same as in Python, so you can consult the PyTorch documentation, as long as you know some Python. Without a doubt, I think this is very relevant.
    • Incomplete functions: not all functions have the characteristics indicated in the documentation. While this is something that will likely be fixed in the short term, as of this writing it is somewhat annoying.
  • Keras and Tensorflow require Python, while Torch works directly with C++. In terms of putting models into production this makes a big difference: if you want to put a Tensorflow / Keras model into production with R code, the Docker image will weigh almost twice as much as with Torch.

That being said, there are certain things I personally like better about each of them. In my opinion, Torch is somewhat more intuitive to use, as it feels more integrated into the programming language. This also makes it more flexible for me.

On the other hand, Tensorflow / Keras have functionalities not yet developed in Torch that save you from writing more lines of code, such as Tensorboard.

In any case, I think both frameworks are very interesting, useful and, conceptually, they work with quite similar abstractions. So if you can, I would encourage you to learn both.

As always, if this complete Torch in R tutorial has been interesting to you, I encourage you to subscribe so as not to miss the posts that I am uploading. And if you have any questions, do not hesitate to write to me. See you in the next one!


A first look at torch for R

In this post, I explore torch, a package for R that mirrors the PyTorch framework for deep learning.


I’ve been a bit reluctant to join in on the deep learning hype for some time. Much of this I attribute to my lack of enthusiasm toward Python frameworks for deep learning. Don’t get me wrong. Tensorflow + Keras offers an intuitive API for neural nets, but can I just be frank and say I like R better for everything else?

A few years back, I was briefly tantalized by the keras package for R, but I couldn’t establish a solid workflow with its Python backend constantly getting in the way. (Many an infuriating hour was spent fruitlessly trying to configure my GPU).

Jump ahead to autumn of 2020 and the torch package for R is announced. My enthusiasm is rekindled when I hear that torch is a native R package that uses a C++ backend instead of Python. The clouds began to part.

However, torch is still somewhat in its infancy, and although it is capable of most of what mature deep learning frameworks can do, it doesn’t yet offer the kind of high-level API that makes it intuitive for beginners. The purpose of this post is to help the reader get familiar with the torch package. It focuses more on the programming aspects of torch than on the mathematical and theoretical aspects of deep learning.

Road Map

This post will be broken into 4 steps:

  1. Get and explore the data
  2. Create a torch Dataset object
  3. Build a network
  4. Run the model

1. Get and explore the data

In this tutorial, we’ll use the Star Type Classification Data from NASA, shared on Kaggle. The data contains details about stars and their type, which we will attempt to predict.

I like to import data directly from Kaggle into R. The code below shows how to get the data, assuming you have an account with Kaggle. You can also just go to the link and download the data as a .csv if you prefer.

The data includes the following features to use as predictors:

  • Relative luminosity (L/Lo)
  • Relative radius (R/Ro)
  • Absolute magnitude
  • Temperature in Kelvin
  • Star color

We want to use these to predict star type, which is encoded in the star type variable. There are six different types of stars: 0 = Red Dwarf, 1 = Brown Dwarf, 2 = White Dwarf, 3 = Main Sequence, 4 = Super Giants, and 5 = Hyper Giants.

There are some categorical features here. Color is one we’ll take a closer look at.

Lots of red and blue, and that’s ok. But what’s the deal with separate categories that appear to be the same color written differently? We see a similar pattern elsewhere. We’d do well to implement some string handling as well as factor lumping. For this, we’ll leverage the stringr and forcats packages, respectively, both loaded as part of the tidyverse.

Instead of directly changing the data, I’m going to create a function that we can include in a preprocessing pipeline.
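A sketch of such a function using stringr and forcats (the Color column name and the lumping threshold are assumptions about this dataset):

```r
library(stringr)
library(forcats)

preprocess <- function(df) {
  # unify case, separators, and stray whitespace in the color labels
  df$Color <- df$Color |>
    str_to_lower() |>
    str_replace_all("[-_]", " ") |>
    str_squish()

  # lump rarely occurring colors into an "Other" level
  df$Color <- fct_lump_min(factor(df$Color), min = 5)
  df
}
```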

2. torch Datasets

Before we dig into the code we should familiarize ourselves with some semantics.


If you know a little about deep learning, you know that it works with tensors. A tensor is like a matrix but it can have more dimensions. As well, tensors can be on a GPU which makes for much faster learning.

In torch, our data must be represented as a torch_tensor object. torch provides a number of functions for creating tensors from scratch or transforming native R objects to tensors and back:
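A few examples:

```r
library(torch)

# creating tensors from scratch
torch_zeros(2, 3)  # a 2 x 3 tensor of zeros
torch_rand(2, 2)   # uniform random values

# converting R objects to tensors and back
x <- torch_tensor(matrix(1:4, nrow = 2))
as_array(x)        # back to a plain R matrix
```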


In torch, a Dataset is kind of like a traditional data frame, but it has a few special features that make it better suited to deep learning. Instead of thinking of it as a static spreadsheet, think of it as a function that will feed data into our network in bite-sized chunks, or batches.

We create a Dataset object using the dataset() function. It’s essentially a collection of attributes and methods that we can access from the Dataset. A Dataset should have the following attributes:

  1. A name for the Dataset

  2. An initialize() method that takes in the data (here, a data frame)

  3. A .getitem() method that allows us to index the Dataset (e.g., to get the first row of data)

  4. A .length() method to get the number of rows in the Dataset.

It’s also in here that we run our preprocessing on the data and convert everything to tensors. Moreover, it’s wise to split our predictors into separate tensors for numeric and categorical types because we’ll treat these differently in our neural net.
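A sketch of such a Dataset (the column names, and the assumption that the categorical columns are already integer-encoded, are illustrative; the +1 shifts the 0-based star types to torch’s 1-based class indices):

```r
library(torch)

star_dataset <- dataset(
  name = "star_dataset",

  initialize = function(df) {
    # numeric and categorical predictors go into separate tensors
    self$x_num <- torch_tensor(as.matrix(df[, c("Temperature", "L", "R", "A_M")]),
                               dtype = torch_float())
    self$x_cat <- torch_tensor(as.matrix(df[, c("Color", "Spectral_Class")]),
                               dtype = torch_long())
    self$y <- torch_tensor(df$Type + 1L, dtype = torch_long())
  },

  .getitem = function(i) {
    list(x_num = self$x_num[i, ], x_cat = self$x_cat[i, ], y = self$y[i])
  },

  .length = function() {
    self$y$size()[[1]]
  }
)
```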


We have our training and validation Datasets. The last thing we need to do with the data is create dataloader objects. A dataloader feeds batches of data through the network. We shuffle the training set so that the data is reshuffled at every epoch (each iteration of the learning phase).
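For example (the batch size is an arbitrary choice, and the Dataset names are placeholders):

```r
train_dl <- dataloader(train_ds, batch_size = 16, shuffle = TRUE)
valid_dl <- dataloader(valid_ds, batch_size = 16, shuffle = FALSE)
```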

3. Building the neural net

With the data spoken for, we can move on to creating the neural net architecture. torch represents network architectures as one or more modules. As we’ll see, modules can be combined to make more complex models out of separate, reusable chunks. We’ll start this section by creating a small module to handle the categorical features in the data, and then a larger module representing our full neural net.

Embedding Categorical Features

Our data has two categorical features, Color and Spectral Class. Because these features don’t have an inherent ordering to them, we can’t use the raw numeric values. The solution is to use embeddings. This means we represent each level of the categorical feature in some n-dimensional space. The space is represented by a trainable vector. In other words, the embeddings are parameters in the model.

We use nn_embedding() to create the embedding layer. In torch, nn_modules have a forward method. The forward method details how the module will feed data through the network when the network is making predictions (i.e., feed-forward). This also means that the nn_module contains parameters that can learn.

There’s a lot going on here. Taking it piece by piece: we create a list of embedding layers using the nn_module_list() function. For each set of levels (there are two sets because there are two categorical features) we instantiate an embedding layer with dimension roughly half the number of levels.

The forward method is responsible for taking each of the embedding layers and applying it to the corresponding categorical predictor. torch_cat() then combines everything together into one tensor along the second dimension.
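The embedding module might be sketched like this (following the pattern from the RStudio AI Blog post credited at the end of this post):

```r
library(torch)

embedding_module <- nn_module(
  initialize = function(cardinalities) {
    # one embedding layer per categorical feature,
    # each with dimension roughly half the number of levels
    self$embeddings <- nn_module_list(lapply(
      cardinalities,
      function(x) nn_embedding(num_embeddings = x, embedding_dim = ceiling(x / 2))
    ))
  },
  forward = function(x) {
    embedded <- vector("list", length(self$embeddings))
    for (i in seq_along(self$embeddings)) {
      # apply each embedding layer to its categorical column
      embedded[[i]] <- self$embeddings[[i]](x[, i])
    }
    torch_cat(embedded, dim = 2)  # combine along the second dimension
  }
)
```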

Neural net architecture

Our neural net will consist of a series of layers and activation functions. In the initialize method we create the layers. Because we’re dealing with fairly simple tabular data, we’ll stick with fully-connected layers (customarily named fc1, fc2, and so on).

I follow the advice in the torch for tabular blog post and use the number of levels + the number of numeric columns as the first input layer dimension. For a fully-connected network, the number of features we input in each layer should equal the number of features outputted by the preceding layer.

In the forward method, we pass the predictors through each layer in the network. Because we are predicting a categorical variable with more than two possible values, the final layer outputs one score per class, which a softmax activation turns into class probabilities (when using cross-entropy loss, the softmax is applied internally, so the forward method can return the raw scores).

Ok, so we have defined the network, but we still need to instantiate it. Our net needs a vector of levels and the number of numeric features.
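A sketch of the net and its instantiation (layer sizes and level counts are illustrative; embedding_module refers to the categorical-feature module described in the previous section):

```r
library(torch)

net_module <- nn_module(
  "star_net",
  initialize = function(cardinalities, n_numeric) {
    self$embedder <- embedding_module(cardinalities)
    # first layer input: total embedding size + number of numeric columns
    n_embed <- sum(ceiling(cardinalities / 2))
    self$fc1 <- nn_linear(n_embed + n_numeric, 32)
    self$output <- nn_linear(32, 6)  # six star types
  },
  forward = function(x_num, x_cat) {
    x <- torch_cat(list(self$embedder(x_cat), x_num), dim = 2)
    x <- nnf_relu(self$fc1(x))
    self$output(x)  # raw scores; cross-entropy applies softmax internally
  }
)

net <- net_module(cardinalities = c(8, 6), n_numeric = 4)
```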

The network can run on a GPU if you have one, otherwise a CPU is fine. (For this example the model will run fast enough on CPU anyway).
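Sketched, assuming the instantiated net from above:

```r
device <- if (cuda_is_available()) "cuda" else "cpu"
net$to(device = device)
```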

4. Run the model

To run the model we need to choose an optimizer. This is the algorithm that will modify the weights so as to minimize the loss function. I’ll use optim_adam(), but you can experiment with others like optim_sgd().

We also need to set up the training and evaluation loop. This part looks a bit daunting, but a lot of it is repetition (we do almost the same thing for both training and evaluation). Perhaps the most awkward part is where we assign the output: we are simply passing in the numeric and categorical features separately, and the model uses the weights to make a prediction.

Once we have a prediction, which we assign to output, we need to compute the loss. For multi-class classification, we use cross-entropy, passing it the output and the true label. The rest is boilerplate to backpropagate the loss and update the weights.
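Condensed, the training half of the loop might look like this (batch element names follow the Dataset described earlier; the epoch count and learning rate are arbitrary):

```r
optimizer <- optim_adam(net$parameters, lr = 0.01)

for (epoch in 1:100) {
  net$train()
  coro::loop(for (b in train_dl) {
    optimizer$zero_grad()
    output <- net(b$x_num, b$x_cat)         # numeric and categorical inputs
    loss <- nnf_cross_entropy(output, b$y)  # multi-class loss
    loss$backward()                         # backpropagate
    optimizer$step()                        # update the weights
  })
}
```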

We can understand better how the model performs with a confusion matrix.

In the end, it’s the small amount of data that keeps us from doing much more with this model. We could probably do just as well with multinomial regression. Still, the goal was to explore torch on a fairly tame data set.


I’m extremely grateful for the torch documentation and tutorials for helping me get started with this post. I’m also indebted to the brilliant posts on the RStudio AI Blog. In particular, I leaned heavily on this post by Sigrid Keydana for insights, as well as for help with the embedding module.


introducing torch for R

As of this writing, two deep learning frameworks are widely used in the Python community: TensorFlow and PyTorch. TensorFlow, together with its high-level API Keras, has been usable from R since 2017, via the tensorflow and keras packages. Today, we are thrilled to announce that you can now use torch natively from R!

This post addresses three questions:

  • What is deep learning, and why might I care?
  • What’s the difference between torch and tensorflow / keras?
  • How can I participate?

If you are already familiar with deep learning – or all you can think right now is “show me some code” – you might want to head directly over to the more technical introduction on the AI blog. Otherwise, you may find it more useful to hear about the context first, and then play with the step-by-step example in that complementary post.

What is deep learning, and why might I care?

If you’re a data scientist, and your data normally comes in tabular, mostly-numerical form, a toolbox of linear and non-linear methods like those presented in James et al.’s Introduction to Statistical Learning may be all you need. This holds even more strongly if the number of data points is limited, as tends to be the case in some academic fields, such as anthropology or ethnology. In this case, Bayesian modeling, as taught by Richard McElreath’s Statistical Rethinking, may be the best approach. Carrying the argument to the extreme: Yes, we can construct deep learning models to predict penguin species based on biometric attributes, and doing this may be very useful in teaching, but this type of task is not really where deep learning shines.

In contrast, deep learning has seen its greatest successes when there are lots of data of a type that is often (misleadingly) called “unstructured” – images, text, heterogeneous data resisting unification. Over the last decade, public triumphs have spread from image classification and related tasks, like segmentation and detection (important in many sciences), to natural language processing (NLP); prominent examples are translation, summarization, and dialogue generation. Beyond these areas of benchmark datasets and official, academically organized competitions, deep learning is pervasively employed in generative art, recommendation systems, and probabilistic modeling. Needless to say, current research is working to expand its limits even more, striving to integrate capabilities for e.g. concept learning or causal inference.

Many readers are likely to work in a field that could benefit from deep learning. But even if you don’t, learning about how a technology works yields power, power to look behind appearances and make up your own mind and decisions.

What’s the difference between torch and tensorflow/keras?

In the Python world, as of 2020, which framework you end up using for a project may be largely a matter of chance and context. (Admittedly, to say so takes the fun out of “TensorFlow vs. PyTorch” debates, but that’s no different from other popular “comparison games”. Take vim vs. emacs, for example. How many people, among those who use one of them preferentially, have come to do so because “that’s what I learned first” or “that’s what was used in my first company”?).

Not too long ago, there was a big difference, though. Before the introduction of TensorFlow 2 (the current release is 2.3), TensorFlow code was compiled to a static graph, and raw TensorFlow code was hard to write. Many users didn’t have to write low-level code, however: The high-level API Keras provided concise, declarative idioms to define, train, and evaluate a neural network. On the other hand, Keras did not, at that time, offer a way to easily customize the training process. Ease of customization, then, used to be PyTorch’s competitive advantage, relevant to researchers in particular. On the other hand, PyTorch did not, initially, excel in production and deployment facilities. Historically, thus, the respective strengths used to be seen as ease of experimentation on the one side, and production readiness on the other.

Today, however, with TensorFlow having become more flexible and PyTorch being increasingly employed in production settings, the traditional dichotomy has weakened. For the R user, this means that practical considerations are likely to prevail.

One such practical consideration that, for some users, may be of tremendous importance, is the following. tensorflow and keras are based on reticulate, that helpful genie which lets you use Python packages seamlessly from R. In other words, they do not replace Python TensorFlow/Keras; instead, they wrap its functionality and in many cases, add syntactic sugar, resulting in more R-like, aesthetically-pleasing (to the R user) code.

torch is different. It is built directly on libtorch, PyTorch’s C++ backend. There is no dependency on Python, resulting in a leaner software stack and more straightforward installation. This should make a huge difference, especially in environments where users have no control over, or are not allowed to modify, the software their organization provides.

Otherwise, at the current point in time, maturity of the ecosystem (on the R side) naturally constitutes a major difference. As of this writing, a lot more functionality – as well as documentation – is available in the TensorFlow ecosystem than in the torch one. But time doesn’t stand still, and we’ll get to that in a second.

To wrap up, let’s quickly mention another aspect, to be explained in more detail in a dedicated article. Due to its in-built facility to do automatic differentiation, torch can also be used as an R-native, high-performing, highly-customizable optimization tool, beyond the realm of deep learning. For now though, back to our hopes for the future.
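As a taste of what that could look like, here is a minimal sketch (assuming the torch package is installed) that uses autograd plus a built-in optimizer to minimize an arbitrary function – no neural network involved. The function, the choice of `optim_adam`, the learning rate, and the iteration count are all illustrative:

```r
library(torch)

# Minimize f(x) = (x - 3)^2 by gradient descent, using autograd
# to obtain the gradient automatically.
x <- torch_tensor(0, requires_grad = TRUE)
opt <- optim_adam(list(x), lr = 0.1)

for (i in 1:200) {
  opt$zero_grad()          # clear gradients accumulated so far
  loss <- (x - 3)^2        # the objective, built from tensor ops
  loss$backward()          # compute d(loss)/dx via autograd
  opt$step()               # update x in the descent direction
}

x  # should have moved close to the minimizer, 3
```

The same pattern – define an objective out of tensor operations, call `$backward()`, let an optimizer update the parameters – applies to any differentiable objective, which is what makes this usable as a general-purpose optimization tool.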

How can I participate?

As with other projects, we sincerely hope that the R community will find the new functionality useful. But that is not all. We also hope that you, many of you, will take part in the journey. There is not just a whole framework to be built. There is not just a whole “bag of data types” to be taken care of (images, text, audio…), each of which requires their own pre-processing functionality. There is also the expanding, flourishing ecosystem of libraries built on top of PyTorch: PySyft and CrypTen for privacy-preserving machine learning, PyTorch Geometric for deep learning on graphs and other irregularly structured data, and Pyro for probabilistic programming, to name just a few.

Whether small PRs for torch or torchvision, or model implementations, or help with porting some of the PyTorch ecosystem – we welcome any participation and support from the R community!

Thanks for reading, and have fun with torch!


torch for R


Lifecycle: experimental


torch can be installed from CRAN with:

install.packages("torch")

You can also install the development version with:

remotes::install_github("mlverse/torch")

On first package load, additional software will be installed.

Installation with Docker

If you would like to install with Docker, please read the following document.


You can create torch tensors from R objects with the torch_tensor() function and convert them back to R objects with as_array().

library(torch)
x <- array(runif(8), dim = c(2, 2, 2))
y <- torch_tensor(x, dtype = torch_float64())
y
#> torch_tensor
#> (1,.,.) =
#>   0.5955  0.3436
#>   0.4946  0.4344
#>
#> (2,.,.) =
#>   0.9322  0.7824
#>   0.6503  0.7516
#> [ CPUDoubleType{2,2,2} ]
identical(x, as_array(y))
#> [1] TRUE
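Tensors live on a device, and the same code runs on CPU or GPU. As a minimal sketch (assuming a CUDA-capable installation of torch; the tensor shape is arbitrary), you can check for a GPU at runtime and move a tensor to it:

```r
library(torch)

# Create a tensor on the CPU, then move it to the GPU if one is available.
x <- torch_randn(2, 2)
if (cuda_is_available()) {
  x <- x$to(device = torch_device("cuda"))
}
x$device  # reports where the tensor currently lives
```

Guarding the transfer with cuda_is_available() keeps the snippet portable: on a CPU-only machine it simply leaves the tensor where it is.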

Simple Autograd Example

In the following snippet we let torch, using the autograd feature, calculate the derivatives:

x<- torch_tensor(1, requires_grad=TRUE) w<- torch_tensor(2, requires_grad=TRUE) b<- torch_tensor(3, requires_grad=TRUE) y<-w*x+by$backward() x$grad#> torch_tensor#> 2#> [ CPUFloatType{1} ]w$grad#> torch_tensor#> 1#> [ CPUFloatType{1} ]b$grad#> torch_tensor#> 1#> [ CPUFloatType{1} ]


No matter your current skills, it’s possible to contribute to torch’s development. See the contributing guide for more information.
