Automating R scripts on Linux with cron

By Rich

January 21, 2020


Hieronymus Bosch, “The Visions of Tondal” 1479.

Introduction

cron is a task scheduler that comes baked into Linux.

The heart of cron is the crontab file that you can add tasks to.

To edit the crontab file type:

crontab -e

This opens the crontab file in your default editor, which on many systems is vi.

To save and exit vi, press Esc, type :wq, then press Enter. Intuitive, right? I know.
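Before editing, it can help to check whether you already have a crontab. The sketch below assumes the standard crontab command, whose -l flag lists the current user's entries; the crontab_status variable is just a name chosen for this illustration.

```shell
# Report whether the current user already has a crontab.
# `crontab -l` prints the installed entries and typically exits
# non-zero when the user has none.
if ! command -v crontab >/dev/null 2>&1; then
  crontab_status="crontab command not found"
elif crontab -l >/dev/null 2>&1; then
  crontab_status="existing crontab found"
else
  crontab_status="no crontab yet"
fi
echo "$crontab_status"
```

If vi is not your thing, crontab -e honors the EDITOR environment variable on most systems, so EDITOR=nano crontab -e opens the same file in nano (assuming nano is installed).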

Comments in the crontab file start with #, and tasks take the form:

# Check out this cool task below!
MIN HOUR DOM MON DOW CMD

Comments and tasks can’t live on the same line.

Allowable values for each parameter are detailed in this table that I copied from Geeks for Geeks:

Field   Description     Allowed values
MIN     Minute field    0-59
HOUR    Hour field      0-23
DOM     Day of month    1-31
MON     Month field     1-12
DOW     Day of week     0-6
CMD     Command         Any command to be executed

You can use a * in any of the date-time fields to indicate all values. Therefore, * * * * * CMD executes CMD every minute of every hour of every day of every month. By contrast, 1 * * * * CMD runs only once an hour, at minute 1.
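To make the field order concrete, here is a small sketch that writes a few example schedules to a scratch file so they can be reviewed before being pasted into crontab -e. Nothing is installed into cron, CMD is the same placeholder used in the table, and schedules_file is a name made up for this example.

```shell
# Write a few example schedules to a scratch file for review.
# Nothing here is installed into cron; CMD is a placeholder command.
schedules_file="$(mktemp)"
cat > "$schedules_file" <<'EOF'
# every minute
* * * * * CMD
# at minute 1 of every hour
1 * * * * CMD
# every day at 06:30
30 6 * * * CMD
# every Monday at 09:00 (day of week 0 is Sunday, so 1 is Monday)
0 9 * * 1 CMD
EOF
cat "$schedules_file"
```

Reading each line left to right against MIN HOUR DOM MON DOW is usually the quickest way to check a schedule before trusting it.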


But how do we use this to automate R scripts?

First, the CMD is Rscript (note the lowercase s, since command names are case-sensitive on Linux). Next, we pass Rscript the .R script we want to run (see the docs).

Let’s pretend we have a script (my_script.R) that we want to run once per minute. This script generates 100 random samples from a normal distribution with mean=0 and sd=1 and writes them to a csv called my_file.csv:

library(readr)

d <- rnorm(100)

write_csv(data.frame(num = d), "my_file.csv")

Now we locate Rscript. In your favorite R development environment, run R.home().

On my Mac it’s:

> R.home()
[1] "/Library/Frameworks/R.framework/Resources"

Whereas on the EC2 I’m running on AWS it’s:

> R.home()
[1] "/usr/lib/R"

You can navigate to the bin subdirectory of this path to verify that Rscript lives there (R.home("bin") returns that subdirectory directly), or believe me.
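The same check can be done from a terminal. This is a minimal sketch that assumes Rscript may or may not be on the PATH of your interactive shell; rscript_path is a variable name chosen for this example.

```shell
# Look for Rscript on the PATH of the current (interactive) shell.
rscript_path="$(command -v Rscript || true)"

if [ -n "$rscript_path" ]; then
  echo "Rscript found at: $rscript_path"
  # Ask R itself where its bin directory is, for comparison.
  "$rscript_path" -e 'cat(file.path(R.home("bin"), "Rscript"), "\n")'
else
  echo "Rscript is not on this shell's PATH; use R.home() from inside R instead"
fi
```

The two paths may differ (one is often a symlink to the other); either should work in a crontab entry as long as it is written out in full.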


Putting it all together

Let’s create a crontab entry that runs my_script.R once every minute. We use Rscript to run my_script.R. We add the following lines to the crontab file we opened with crontab -e:

# once every minute, run `my_script.R`
* * * * * Rscript "my_script.R"

Note that the first line is just a comment, whereas the second line is the command. Moreover, cron typically runs jobs with a minimal PATH and from your home directory, so for the example above to actually work you need to:

  1. specify the full path of Rscript
  2. specify the full path of my_script.R

I’ve found that on the AWS EC2 I’m using, ~/my_script.R doesn’t work, whereas /home/richpauloo/my_script.R does.
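With both full paths filled in, the entry for the EC2 instance might look like the sketch below. It assumes Rscript sits in the bin subdirectory of the R.home() path shown earlier, and the log file is a hypothetical addition (not part of the original setup) so that any errors from the script end up somewhere you can read them. The snippet only prints the lines; paste them into crontab -e yourself.

```shell
# Build the crontab entry from its parts and print it for review.
rscript_bin="/usr/lib/R/bin/Rscript"         # assumed: R.home() on the EC2 instance, plus bin/Rscript
script_path="/home/richpauloo/my_script.R"   # full path to the script
log_path="/home/richpauloo/my_script.log"    # hypothetical log file for output and errors

# Five * fields mean "every minute"; 2>&1 sends errors to the same log.
cron_line="* * * * * $rscript_bin $script_path >> $log_path 2>&1"

printf '%s\n' "# once every minute, run my_script.R" "$cron_line"
```

Because my_script.R writes to the relative path my_file.csv, the CSV will land in whatever directory cron runs the job from, which is usually your home directory.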


Here are some resources I found helpful in writing this short summary:

  1. Steven Mortimer’s blog
  2. Geeks for Geeks