Using cron/crontab on a Linux server has always been a good way to control the scheduling of repeating background tasks, especially in the days before cloud, containers and virtual machines. There are plenty of websites that offer advice on “Crontab Best Practices” and many forum posts that start with “I have inherited a sprawling crontab”. In a complex application, crontab suffers from “organic complexity”: the crontab becomes confusing to new sysadmins one entry at a time.
There are lots of posts about how to get a listing of all the cron jobs for all users on a machine. It is very easy to accidentally schedule resource-intensive tasks at the same time and kill performance. Manually moving cron jobs to different servers can also create conflicting or duplicate tasks. Cron tasks running on multiple servers can also make debugging difficult, since debugging cron tasks usually means tailing and grepping text logs. There is also no easy way to know that a task that used to take 1 minute is now taking 10 minutes.
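As a rough illustration of the "same time" problem, here is a minimal Python sketch that groups crontab entries by their schedule and reports any schedule shared by more than one command. The function name `find_schedule_collisions`, the sample crontab and the script paths are all made up for this example. It only catches identical schedule strings, so overlapping schedules such as `*/5 * * * *` and `0 * * * *` would not be flagged, and special entries like `@reboot` are skipped.

```python
from collections import defaultdict


def find_schedule_collisions(crontab_text):
    """Group crontab commands by their five-field schedule and return
    only the schedules that are shared by more than one command."""
    by_schedule = defaultdict(list)
    for raw_line in crontab_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 5)
        # Skip anything that is not "five schedule fields + command",
        # such as environment assignments (MAILTO=...) or @reboot lines.
        if len(parts) < 6 or "=" in parts[0]:
            continue
        schedule = " ".join(parts[:5])
        by_schedule[schedule].append(parts[5])
    return {s: cmds for s, cmds in by_schedule.items() if len(cmds) > 1}


# Hypothetical crontab contents, for illustration only.
sample = """\
# nightly jobs
MAILTO=ops@example.com
0 2 * * * /opt/scripts/backup.sh
0 2 * * * /opt/scripts/reindex.sh
30 4 1 * * /opt/scripts/monthly_report.sh
"""

collisions = find_schedule_collisions(sample)
for schedule, commands in collisions.items():
    print(schedule, "->", ", ".join(commands))
```

Running something like this against the output of `crontab -l` for each user is a quick way to spot the most obvious pile-ups before deciding what to move or reschedule.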
The crontab scheduling syntax is elegant but leaves lots of room for errors by new sysadmins, or even experienced ones. Personally, I like to add a comment above each entry that spells out in English when the task runs and describes what the task does. There are lots of crontab generators and decoders online, for example https://crontab-generator.org/ and https://freeformatter.com. Just paste a mysterious cron entry into the decoder to get an explanation. When remediating sprawling crontabs, it is not unusual to find tasks that should run once a month unintentionally running every day.
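To make the "monthly task running every day" mistake concrete, here is a small Python sketch that labels the five schedule fields and checks whether a schedule fires every day. The helper names `label_fields` and `runs_every_day` are invented for this post, and the check is deliberately naive: it only looks for plain `*` wildcards, so equivalent forms such as `*/1` or `1-31` are not recognized.

```python
FIELD_NAMES = ("minute", "hour", "day_of_month", "month", "day_of_week")


def label_fields(schedule):
    """Map the five schedule fields of a crontab entry to their names."""
    fields = schedule.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 schedule fields, got {len(fields)}")
    return dict(zip(FIELD_NAMES, fields))


def runs_every_day(schedule):
    """Return True when day-of-month, month and day-of-week are all plain
    wildcards, which means the job fires at least once every day."""
    labeled = label_fields(schedule)
    return all(
        labeled[name] == "*"
        for name in ("day_of_month", "month", "day_of_week")
    )


# "0 3 * * *" runs at 03:00 every day.
# "0 3 1 * *" runs at 03:00 on the first day of each month.
daily = runs_every_day("0 3 * * *")
monthly = runs_every_day("0 3 1 * *")
print(label_fields("0 3 1 * *"))
print("daily:", daily, "monthly:", monthly)
```

The two example schedules differ by a single character in the third field, which is exactly why a plain-English comment next to each entry is worth the effort.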
Another annoyance with cron jobs is pausing tasks for maintenance. If the devops team is in the middle of an upgrade, we want to ensure that cron tasks are not running at the same time, and we need to know which ones have to be run once the maintenance event is over.
Fortunately, there is a terrific tool to solve all these problems with cron: Rundeck (https://www.rundeck.com/). Rundeck can be deployed as a free open source version or as a paid enterprise version with support.
Rundeck provides a web-based portal where less technical users can manage the scheduling of tasks. The web portal allows users to “self-service” tasks: they can create jobs, run jobs, edit jobs and view the output in the job logs. This is terrific for audit purposes.
Rundeck allows the user to define worker nodes. In the age of cloud, this means that you can spin up multiple worker nodes (EC2 instances) to run specific tasks, each sized for the task it runs. The nodes can be destroyed after the task is completed. This makes it easy to avoid idle cloud instances that cost money while ensuring that each instance has resources that match the task running on it.
The old crontab issue of not knowing that a task failed to run, and then manually having to figure out why, also goes away. Notifications are easy to set up so that if a job fails, all the relevant users will know immediately. This is especially useful for things like Salesforce integrations, where your Talend task dies because too many rows were returned and Salesforce killed your extract.
Jobs can also be defined to pull the scripts for the task from a git repository. This ensures that there is only one source-managed copy of each cron task and that every time it runs it pulls the current version of the script.
With crontab, the tasks are listed under a user’s crontab, and in the past the user running the task likely had more access than the task needed. Rundeck supports LDAP authentication as well as LDAP synchronization. I will save integrating JumpCloud with Rundeck for another blog post.
Rundeck also elegantly solves the problem of storing secrets and passwords for integrations with its Key Storage Facility. I will need to explore the potential to integrate JumpCloud’s SSH key storage with Rundeck’s Key Storage Facility.
There are plenty of resources on YouTube, Rundeck itself publishes many resources online, and the Rundeck community provides plenty of support. I will not repeat the instructions for installing or using Rundeck here.
Your devops team will be happy when all of the crontab tasks are migrated to Rundeck jobs. Visibility into running tasks, notifications and self-service will make your technical users happier and more productive.