Asynchronous execution of timer jobs with multiple Job Executors

erikvoorbraak · January 20, 2020, 1:22pm

I have a setup where I want to run multiple Job Executor instances in parallel. Actually, that’s how it is recommended in this article: http://www.mastertheboss.com/jboss-jbpm/activiti-bpmn/clustering-activiti-bpmn.

I had a look at the source code, and it looks like Flowable does it like this:

1. Query all not-yet-locked timer jobs from ACT_RU_TIMER_JOB (the condition for lock existence is “LOCK_EXP_TIME_ is null”);
2. Lock these jobs so that other Job Executors cannot start these jobs in parallel.

If this is indeed how it works, then there is a small time gap between step 1 and step 2. This could introduce a race condition when two threads do this at exactly the same time: thread B could find jobs that thread A has also found but has not yet persisted the locks for.

Could this really happen, or is there another mechanism at work that prevents this race condition?

erikvoorbraak · January 21, 2020, 9:45am

I found a hint that there is indeed another mechanism at work, which is ‘optimistic locking’, in this code of AcquireTimerJobsCmd:
protected void lockJob(CommandContext commandContext, TimerJobEntity job, int lockTimeInMillis) {
// This will trigger an optimistic locking exception when two concurrent executors
// try to lock, as the revision will not match.

This gives me some hope that we won’t run into race conditions. Digging in further, I found code in DbSqlSession.flushUpdates() that throws FlowableOptimisticLockingException when optimistic locking fails.

Hence the race condition is not really avoided, but the effects are mitigated when it occurs.

filiphr · January 21, 2020, 3:10pm

Hey @erikvoorbraak,

Your analysis is spot on.

In theory, you cannot really avoid such a race condition. That is why there are multiple layers of protection to ensure that once a lock is acquired, the job can be executed. Failing to acquire a lock (via optimistic locking) is actually the mechanism that ensures the timer job is executed only once, which is what you need.
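
To make the mechanism concrete, here is a minimal in-memory sketch of revision-based optimistic locking. It is not Flowable's actual code: the names OptimisticLockSketch, JobRow, updateLock and tryLock are hypothetical, and the synchronized method only stands in for a conditional database update that checks the revision. It shows two executors reading the same revision (step 1) and only one of them succeeding in the lock update (step 2).

```java
public class OptimisticLockSketch {

    /** Hypothetical stand-in for a timer job row; not Flowable's TimerJobEntity. */
    static final class JobRow {
        private int revision = 1;
        private String lockOwner; // null means "not locked"

        /**
         * Mimics a conditional update of the form
         * "set lock owner, revision = revision + 1 where revision = expected".
         * Returns the number of rows updated (0 or 1).
         */
        synchronized int updateLock(String owner, int expectedRevision) {
            if (revision != expectedRevision) {
                return 0; // another executor updated the row first
            }
            lockOwner = owner;
            revision++;
            return 1;
        }

        synchronized int readRevision() {
            return revision;
        }

        synchronized String readLockOwner() {
            return lockOwner;
        }
    }

    /** Returns true if the executor got the lock, false if it lost the race. */
    static boolean tryLock(JobRow row, String executor, int revisionSeenAtQueryTime) {
        return row.updateLock(executor, revisionSeenAtQueryTime) == 1;
    }

    public static void main(String[] args) {
        JobRow job = new JobRow();

        // Step 1: both executors query the unlocked job and see the same revision.
        int seenByA = job.readRevision();
        int seenByB = job.readRevision();

        // Step 2: both try to lock; only the first update matches the revision.
        boolean aWon = tryLock(job, "executor-A", seenByA);
        boolean bWon = tryLock(job, "executor-B", seenByB);

        System.out.println("A acquired: " + aWon + ", B acquired: " + bWon
                + ", owner: " + job.readLockOwner());
    }
}
```

In this sketch the losing executor simply gets false back and would skip the job. In the engine, as described above, the equivalent failure surfaces as a FlowableOptimisticLockingException.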
