# Automating R scripts on Linux with cron

cron is a task scheduler that comes baked into Linux.

The heart of cron is the crontab file that you can add tasks to.

To edit the crontab file type:

crontab -e


This opens your crontab in the default editor, which on many systems is vi.

To save and exit vi, press Esc, type :wq, then press Enter. Intuitive, right? I know.

Comments in the crontab file start with #, and tasks take the form:

# Check out this cool task below!
MIN HOUR DOM MON DOW CMD


Allowable values for each parameter are detailed in this table that I copied from Geeks for Geeks:

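| Field | Description | Allowed Value |
|-------|--------------|----------------|
| MIN | Minute field | 0-59 |
| HOUR | Hour field | 0-23 |
| DOM | Day of Month | 1-31 |
| MON | Month field | 1-12 |
| DOW | Day of Week | 0-6 |
| CMD | Command | Any command to be executed. |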

You can use a * in any of the date-time fields to indicate all values. Therefore, * * * * * CMD executes CMD every minute of every hour of every day of every month, whereas 1 * * * * CMD executes CMD only at minute 1 of every hour.
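To make the allowed ranges concrete, here is a minimal shell sketch that checks the five date-time fields against the table. The helpers `field_ok` and `schedule_ok` are hypothetical names made up for illustration, and the check only understands `*` and plain integers, not the ranges, lists, or step values that cron also accepts.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a field is "*" or an integer in [lo, hi].
field_ok() {
    value=$1
    lo=$2
    hi=$3
    [ "$value" = "*" ] && return 0
    case $value in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$value" -ge "$lo" ] && [ "$value" -le "$hi" ]
}

# Hypothetical helper: check MIN HOUR DOM MON DOW against the table's ranges.
schedule_ok() {
    field_ok "$1" 0 59 &&   # MIN
    field_ok "$2" 0 23 &&   # HOUR
    field_ok "$3" 1 31 &&   # DOM
    field_ok "$4" 1 12 &&   # MON
    field_ok "$5" 0 6       # DOW
}

schedule_ok '*' '*' '*' '*' '*' && echo "every minute: valid"
schedule_ok 1 '*' '*' '*' '*' && echo "minute 1 of every hour: valid"
schedule_ok 60 '*' '*' '*' '*' || echo "minute 60: invalid"
```

The asterisks are quoted so the shell passes them through literally instead of expanding them to file names.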

## But how do we use this to automate R scripts?

First, the CMD is Rscript (note the lowercase "s": the name is case-sensitive on Linux). Next, we pass Rscript the .R script we want to run (see the docs).

Let's pretend we have a script (my_script.R) that we want to run once per minute. This script generates 100 random samples from a normal distribution with mean=0 and sd=1 and writes them to a csv called my_file.csv:

library(readr)

d <- rnorm(100)

write_csv(data.frame(num = d), "my_file.csv")


Now we locate Rscript. In your favorite R development environment, run R.home().

On my Mac it's:

> R.home()
[1] "/Library/Frameworks/R.framework/Resources"


Whereas on the EC2 instance I'm running on AWS, it's:

> R.home()
[1] "/usr/lib/R"


Rscript lives in the bin subdirectory of this directory (for example, /usr/lib/R/bin/Rscript on the EC2 instance). You can navigate there to verify, or believe me.
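If you would rather find the executable from the terminal, the sketch below asks the shell where a program lives. `locate_program` is a hypothetical helper name, and it only finds Rscript if R is already on your interactive PATH. Inside R, `file.path(R.home("bin"), "Rscript")` should give the same kind of answer.

```shell
#!/bin/sh
# Hypothetical helper: print the path of a program, or fail with a message.
locate_program() {
    found=$(command -v "$1") || {
        echo "$1 not found on PATH" >&2
        return 1
    }
    echo "$found"
}

# Print the path to copy into the crontab entry.
locate_program Rscript || echo "install R or add it to PATH first"
```

The absolute path matters because cron typically runs jobs with a much shorter PATH than your login shell, so a bare command name that works in the terminal may not be found.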

## Putting it all together

Let's create a crontab entry that runs my_script.R once every minute. We use Rscript to run my_script.R. We add the following lines to the crontab file we opened with crontab -e:

# once every minute, run my_script.R
* * * * * Rscript "my_script.R"


Note that the first line is just a comment, whereas the second line is the task itself. Moreover, the example above is abbreviated. For it to actually run, you need to:

1. specify the full path of Rscript
2. specify the full path of my_script.R

I've found that on the AWS EC2 instance I'm using, ~/my_script.R doesn't work, whereas /home/richpauloo/my_script.R does.
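As a sketch of what the fully spelled-out entry might look like, the snippet below assembles a crontab line from absolute paths. `make_cron_line` is a hypothetical helper, the Rscript path assumes the /usr/lib/R location reported by R.home() on the EC2 instance, and the log file is my own addition (the entry works without it, but then errors are easy to miss).

```shell
#!/bin/sh
# Hypothetical helper: build a crontab line that runs an R script every minute.
# $1 = absolute path to Rscript
# $2 = absolute path to the .R script
# $3 = log file for stdout and stderr (an assumption, not part of the original entry)
make_cron_line() {
    case $1 in
        /*) ;;
        *) echo "Rscript path must be absolute: $1" >&2; return 1 ;;
    esac
    case $2 in
        /*) ;;
        *) echo "script path must be absolute: $2" >&2; return 1 ;;
    esac
    printf '* * * * * %s "%s" >> "%s" 2>&1\n' "$1" "$2" "$3"
}

# Print the line; paste it into the file opened by `crontab -e`.
make_cron_line /usr/lib/R/bin/Rscript \
    /home/richpauloo/my_script.R \
    /home/richpauloo/my_script.log
```

One more thing to watch: my_script.R writes to the relative path "my_file.csv", which cron resolves against the job's working directory (usually your home directory), so an absolute output path inside the script is the safer choice.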


##### Rich Pauloo, PhD
###### Data Scientist

My interests include data science, hydrology, geology, physical simulation, building simple solutions to complex problems, and expedition behavior.