We have a process definition that goes like this: Script Task (async=true) → Exclusive Gateway → Script Task. At the beginning of each month we have to run about 20K process instances of this process definition.
Because the first Script Task is marked as asynchronous, we end up with 20K entries in `act_ru_job`.
When I repeatedly run `SELECT COUNT(*) FROM act_ru_job` I observe this pattern: the count drops quite rapidly for a while, which means the jobs are being processed quickly, but then it stalls and takes forever for the remaining jobs to finish.
Is this a matter of misconfiguration, i.e. should I adjust some settings to match this workload? At this point I have run out of ideas. If anyone has come across a similar scenario and a similar issue, I would appreciate it if you could share how you solved it.
I would suggest setting the core pool size and the max pool size to the same value.
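The reason this matters is that the underlying `java.util.concurrent.ThreadPoolExecutor` only grows beyond its core size once its internal queue is full, so with core < max the extra threads rarely get created. In an XML engine configuration this could look like the following sketch (the value 8 is purely illustrative, not a recommendation):

```xml
<bean id="processEngineConfiguration"
      class="org.flowable.engine.impl.cfg.StandaloneProcessEngineConfiguration">
  <!-- keep core and max pool size equal so the executor always has
       its full number of worker threads available -->
  <property name="asyncExecutorCorePoolSize" value="8"/>
  <property name="asyncExecutorMaxPoolSize" value="8"/>
</bean>
```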
I would also suggest measuring the runtime: how many jobs are getting rejected, how fast each job executes, etc. If the work done in the first and the second script task runs at comparable speed, you should not see a drop in execution throughput.
You said you are using Flowable 6.4.1; that version is from January 2019, which is quite old. I would suggest migrating to a newer version; we've made numerous improvements around job execution.
And another question, @filiphr: there is this parameter called `asyncExecutorDefaultQueueSizeFullWaitTime` that is described as follows:
The time (in milliseconds) the async job (both timer and async continuations) acquisition thread will wait when the internal job queue is full to execute the next query. By default set to 0 (for backwards compatibility). Setting this property to a higher value allows the async executor to hopefully clear its queue a bit.
Does it help, in general, to speed things up if I set it to a non-zero value, e.g. 10000 (10 seconds)?
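For reference, this is how I would be setting it inside the `processEngineConfiguration` bean definition (the 10000 ms value below just mirrors my question; I don't know yet whether it is sensible):

```xml
<!-- wait 10 s before the acquisition thread runs its next query
     when the internal job queue is full -->
<property name="asyncExecutorDefaultQueueSizeFullWaitTime" value="10000"/>
</bean>
```

(Property shown in isolation; it goes alongside the other `asyncExecutor*` properties.)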
In the open source version we don't really have anything out-of-the-box for this. We do have different listeners that you could, in theory, hook into to measure it.
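For what it's worth, here is a minimal sketch of the bookkeeping such a listener could delegate to. This is plain Java with no Flowable types; the class name and methods are my own invention, and the idea is that a `FlowableEventListener` reacting to job success/failure events would call `recordSuccess()` / `recordFailure()`:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: thread-safe counters that a job event listener
// could update, plus a rough throughput figure derived from them.
public class JobThroughputRecorder {
    private final AtomicLong successes = new AtomicLong();
    private final AtomicLong failures = new AtomicLong();
    private final long startNanos = System.nanoTime();

    public void recordSuccess() { successes.incrementAndGet(); }
    public void recordFailure() { failures.incrementAndGet(); }

    public long successCount() { return successes.get(); }
    public long failureCount() { return failures.get(); }

    /** Successfully completed jobs per second since this recorder was created. */
    public double jobsPerSecond() {
        double elapsedSeconds = (System.nanoTime() - startNanos) / 1_000_000_000.0;
        return elapsedSeconds > 0 ? successCount() / elapsedSeconds : 0.0;
    }

    public static void main(String[] args) {
        JobThroughputRecorder recorder = new JobThroughputRecorder();
        recorder.recordSuccess();
        recorder.recordSuccess();
        recorder.recordFailure();
        System.out.println("ok=" + recorder.successCount()
                + " failed=" + recorder.failureCount());
    }
}
```

Logging a snapshot of these counters once a minute would show you exactly when the throughput drop you described starts.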
Now that you mention this, I see that you also have `asyncExecutorMaxAsyncJobsDuePerAcquisition` set to 5. This means there is a lot of contention between the different nodes: they keep trying to acquire and lock the same jobs. We've made a lot of improvements in this area in 6.7.
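If you stay on your current version for a while, raising that setting reduces the number of acquisition round-trips each node makes against the job table. Inside the `processEngineConfiguration` bean definition, that would be something like the following (64 is just an example value; tune it against your own workload):

```xml
<!-- acquire more async jobs per query to reduce
     acquisition contention between nodes -->
<property name="asyncExecutorMaxAsyncJobsDuePerAcquisition" value="64"/>
```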
I would suggest that you read the following blog series: