Kelson Martins Blog

Recently, when working with Python scripts scheduled on cron, a requirement appeared to check whether the previous script execution had finished before allowing the next one to start.
This post presents a possible solution for that requirement, which I believe deserves a post.

Linux Scenario

Consider a scenario where we have a simple cron job such as:
*/5 * * * * /usr/bin/python /home/kelson/scripts/python_task.py
This schedules the script to be executed every 5 minutes.
Let’s now assume that the script performs a series of tasks.
Let’s also assume that the execution time of these tasks varies based on external factors which are out of the control of the script.
That is, one execution may take 10 seconds, while another may take 30 seconds, a minute, 2 minutes, 10 minutes and so on.
As you may have noticed, if our script is scheduled to run every 5 minutes and, for any external reason, an execution takes more than 5 minutes, cron will start another instance while the previous one is still running. This is exactly what we want to avoid.

Using a PIDFILE

The approach presented here is to make use of a PIDFILE to control whether the process is already running.
#!/usr/bin/python
import os
import time

# Get process PID
PID = str(os.getpid())
PIDFILE = "RUNNING.pid"

def can_it_run():
  # Check whether the lock PIDFILE exists
  return not os.path.isfile(PIDFILE)

def run():
  # Write lock file containing process PID
  with open(PIDFILE, 'w') as pidfile:
    pidfile.write(PID)
  print("Executing under PID " + PID)
  try:
    # Simulating miscellaneous tasks
    time.sleep(60)
  finally:
    # Removing lock file once script execution finishes
    os.unlink(PIDFILE)

if __name__ == '__main__':
  if can_it_run():
    run()
  else:
    # Retrieving PID of previous execution
    with open(PIDFILE) as pidfile:
      old_pid = pidfile.read().strip()
    print("Script already running under PID %s, skipping execution." % old_pid)
This simple script demonstrates the use of a PIDFILE to decide whether the script may run, depending on whether the previous execution has completed.
On each execution, we check whether the PIDFILE exists.
If it does not, we create a lock file containing the PID of the process and proceed with the execution.
If it does, we skip the current execution and print the PID of the previous one.
Note the finally block on the try statement, which removes the lock file once execution finishes, even if the tasks raise an exception.
Let’s try it.
Open a terminal to execute the script to get an output similar to:
$ python python_task.py
Executing under PID 320
While the script is running, listing the files in the script folder will show the PIDFILE:
$ ls -l /home/kelson/scripts/
total 8
-rwxrwxr-x. 1 kelson kelson 523 Sep 19 11:00 python_task.py
-rw-rw-r--. 1 kelson kelson   4 Sep 19 13:19 RUNNING.pid
Let’s cat the contents of the file, which will show us the PID of the python script process:
$ cat RUNNING.pid
320
Great, now if you open a second terminal and start another instance of the script within those 60 seconds, your output will be:
$ python python_task.py
Script already running under PID 320, skipping execution.
After 60 seconds, the first execution ends and removes the PIDFILE, which allows subsequent executions to run.
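One caveat with checking for the file and then writing it in two separate steps: two instances starting at nearly the same moment could both find no PIDFILE and both proceed. A minimal sketch of one way to narrow that gap, assuming Python 3 and a hypothetical helper named acquire_lock (not part of the script above), is to create the file with os.O_EXCL so that the check and the creation happen in a single call:

```python
import os

PIDFILE = "RUNNING.pid"  # same file name used by the script above


def acquire_lock(pidfile=PIDFILE):
    """Hypothetical helper: create the PIDFILE atomically.

    Returns True if this process created the file,
    False if the file already existed.
    """
    try:
        # O_CREAT | O_EXCL makes os.open fail if the file already exists
        fd = os.open(pidfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))
    return True
```

With something like this, can_it_run() and the write at the top of run() could be folded into a single acquire_lock() call, while the finally block keeps removing the file as before.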

Terminating old processes

For demonstration purposes, let’s provide a (not recommended) alternative. Instead of skipping the execution of the script, let’s kill the old process and initiate a new one.
For that, let’s update the __main__ block to extract the process PID from the PIDFILE and kill that process before starting a new one:
import signal  # add alongside the other imports at the top of the script

if __name__ == '__main__':
  if can_it_run():
    run()
  else:
    # Retrieving PID of previous execution
    with open(PIDFILE) as pidfile:
      old_pid = pidfile.read().strip()
    print("Script already running under PID %s, which is now being terminated." % old_pid)
    # Forcing a new execution by killing the old process
    os.kill(int(old_pid), signal.SIGTERM)
    run()
Note that we retrieve old_pid from the PIDFILE before terminating the old process.
We then start a new execution.
Let’s now try again.
Open a terminal to execute the script to get an output similar to:
$ python python_task.py
Executing under PID 2022
Let’s now force the execution of a second instance and note the results.
$ python python_task.py
Script already running under PID 2022, which is now being terminated.
Executing under PID 2024
Great, we have now terminated the old Python process before spawning a new one.
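If the PID stored in the PIDFILE no longer belongs to a running process, os.kill raises an error and the new execution never starts. A rough sketch of a more defensive variant, assuming Python 3 on a POSIX system and a hypothetical helper named terminate_previous, could look like this:

```python
import os
import signal


def terminate_previous(pidfile="RUNNING.pid"):
    """Hypothetical helper: send SIGTERM to the PID stored in pidfile.

    Returns True if the signal was delivered. Returns False if the
    file is missing, its contents are not a number, or no process
    with that PID exists.
    """
    try:
        with open(pidfile) as handle:
            old_pid = int(handle.read().strip())
        os.kill(old_pid, signal.SIGTERM)
    except (FileNotFoundError, ValueError, ProcessLookupError):
        return False
    return True
```

Keep in mind that a Python process killed by an unhandled SIGTERM ends without running its finally block, so the old PIDFILE stays on disk until the new execution overwrites it.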

Final Considerations

If for any reason the PIDFILE exists, but the PID inside it is not running, this indicates that the script did not shut down gracefully.
This will block any further attempts to execute the script until the PIDFILE is removed.
This may or may not be the behavior that you are expecting, depending on each use case. A non-graceful shutdown may require log investigation to detect the real culprit.
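For cases where a stale PIDFILE should not block the script forever, one option is to check whether the recorded PID still belongs to a live process. The following is only a sketch under assumptions: Python 3, a POSIX system, and two hypothetical helpers (pid_is_running and pidfile_is_stale) that are not part of the script above.

```python
import os


def pid_is_running(pid):
    """Hypothetical helper: report whether a process with this PID exists."""
    if pid <= 0:
        return False
    try:
        # Signal 0 only performs error checking; no signal is delivered
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user
        return True
    return True


def pidfile_is_stale(pidfile="RUNNING.pid"):
    """Return True if pidfile exists but its PID is not a running process."""
    try:
        with open(pidfile) as handle:
            pid = int(handle.read().strip())
    except FileNotFoundError:
        return False  # no PIDFILE, so nothing is stale
    except ValueError:
        return True  # unreadable contents, treated as stale here
    return not pid_is_running(pid)
```

A script could remove the PIDFILE and continue when pidfile_is_stale() returns True. Since the operating system can reuse PIDs, a live PID does not strictly prove that the same script is still running, so this check is a heuristic rather than a guarantee.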
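To have the PIDFILE removed whenever the interpreter exits normally, the Python module atexit may be used. Note that atexit handlers do not run when the process is killed by a signal that Python does not handle, so a SIGTERM or SIGKILL would still leave the PIDFILE behind. This is out of the scope of this post but may be a topic of a future post =).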
The full code snippet is available on Gist.

Software engineer, geek, traveler, wannabe athlete and a lifelong learner. Works at @IBM
