Several moments ago, I was wondering if we have multi cron job that are running same binaries, but we need to make sure that each job / task process is being ran only by one instance, is there a way to achieve that. I know this is weird, why do we bother having multiple binaries? just create one for the cron job! It is right though, but wondering about the problem, I was triggered if we could do it with MYSQL, and it can.
Given Problem Statement and Solution
To sum up, here are the challenge
- You have multiple binaries that is running a cron, it can run multiple jobs at your designated time (every 5' clock, every two minutes, every one hour, etc)
- You need to make sure each job / task if it’s “in the same time” can is only being processed by one binary / process
- Utilizing Mysql for the “distributed lock” like mechanism
Solution that comes up in mind
- Every pair of job and the time it is running, should be marked / labeled with a unique key
- Use these key by utilizing unique key constraint in Mysql
- Any job / task triggered by another cron with same key will be rejected (not processed), only the first one ignited will be processed.
Disclaimer first, that this is not exactly a distributed lock mechanism, the focus is that only one thing can run the same task in the same time. Therefore only one can get “a lock”, but no need to “release the lock”. So let’s change the calling of “acquiring lock” to “Get permission”. The one who “got permission” can run the process, otherwise cannot.
For instance, let say we have a cron that run these:
- RemoveExpiredUserCart (run every day at 01:00 AM)
- ReminderUserBuyNotification (run every 1 hour)
and for some reason maybe you run the binary consisting those cron in a multiple pods or multiple VM instances.
Get Permission Process
So how are we going to do it? The process contains this steps
- Create a table (ie “job_list”) that contains one column only. Column “job_key” with type of VARCHAR (100) as an example. Set this key to be unique (add constraint for the column)
- For every task / job paired with the time it runs, create a unique combination key in string form
- Insert this key to our “jobs_list” table, and if it’s detected as duplicate, do nothing instead
With those two steps, we’re ensuring that only one key will be inserted, therefore only one logic / binary will run the task.
In example, these are the keys that might be suitable as a unique identifier (unique key):
- RemoveExpiredUserCart process | key (string) = “exp_usr_cart:<date_pattern>” |example: “exp_usr_cart:15–01–2021”
- ReminderUserBuyNotification process | key (string) = “rmnd_usr_buy_notif:<timestamp_until_hour>” |example: “rmn_usr_buy_notif:15–01–2021:15” (if at 15.00, or 15.20 or 15.13, basically all 3PM converted into 15) or “rmn_usr_buy_notif:15–01–2021:23” if it’s at 11PM (regardless the minutes).
User cart unique key is chosen due to it’s being ran once in a day, so a date (year-month-date) should be sufficient to determine its uniqueness (if we have 3 crons for example, all will use the the same date when run on the same day).
For the User reminder notification, since it’s every hour, we need to make the uniqueness down to the hour detail (15 → 3 PM, or 23 → 11PM) or so. The minutes are discarded since let say Cron A runs on 15.01 and Cron B runs on 15.02 → Both will be considered as one process == cron running at 3 PM, therefore, using the same key exp_usr_cart:15–01–2021:15.
After we got our unique key, then we will do insertion upon our table (“job_list”) with this query.
INSERT INTO job_list (job_key) VALUES (‘<your_cron_task_key>’) ON DUPLICATE KEY UPDATE job_key = job_key
Detail of the process:
- We’re going to insert the key into the table to “mark” that this process is asking a permission to run and record it into the table
- If the key is duplicate (same job key already inserted by another process first), update the job_key to its own value (aka do no update its value at all in MYSQL)
- Mysql will return the number of updated / affected / inserted rows, if it’s greater than zero, then it means that we’re the first job that successfully get permission then run the process, else (zero rows affected) means that we’re failed to get permission, therefore just return or do not run the next process.
The simplified flow can be draw into this one:
Only one Job A will run with the other job A (cron 1 and 2) failed to get the permission. It works because MYSQL prevent the duplication on the key insertion and update, it’s safe even if you have so many instance / pods running the same binaries (take note that choosing the right unique key is very important to prevent more than 1 machine running same job at the same “time range”). Also keep in mind that we need to delete old key in the tables for any unused old keys.
Hope it’s useful, thanks for reading and have a good day :)