Concurrent variable addition in a cluster (assistance needed)

Hi,

I am currently working on a project that includes flowable as an orchestrator.

We have chosen to use the embedded version (6.4.2), running on Spring Boot (2.1.6) with Java 8 and an Oracle database.

We are facing a bit of an issue with some optimistic locking exceptions in one of our sub processes.

The issue only started to happen in the test environment (2 nodes, clustered), not locally (1 node, H2 DB). From what I’ve gathered from your documentation and my own research, though, clustering should not be part of the problem.

I am looking for advice / a possible solution to this but any tip / useful advice / best practices about flowable integration is welcome as well.

Here’s a sample of the process that causes the issue, with a brief description of what’s under the hood.

The use case:
The Task “Calls” sends a varying number of requests to another app, which then triggers asynchronous returns.

Every “WaitX” conditional receive task is used to wait on an asynchronous return

These returns include some data that we want to persist in the flowable context (process instance)

The doStuff task uses the “raw” data put in the flowable context by the receive task to properly modify our business object

The error:
An optimistic locking exception, presumably caused by the way we add this data to the flowable context
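
To make the suspected failure mode concrete, here is a minimal, generic sketch of revision-based optimistic locking. It is not Flowable code: the class and method names (OptimisticLockDemo, Row, StaleRevisionException) are hypothetical, and the in-memory compare-and-set only stands in for a database update guarded by a revision column. It shows how two writers that read the same revision cannot both succeed.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Generic illustration of revision-based optimistic locking (hypothetical, not Flowable code). */
class OptimisticLockDemo {

    /** Immutable snapshot of a row: a revision counter plus some data. */
    static final class Row {
        final int revision;
        final String data;

        Row(int revision, String data) {
            this.revision = revision;
            this.data = data;
        }
    }

    /** Thrown when the row changed between the read and the write. */
    static final class StaleRevisionException extends RuntimeException {
        StaleRevisionException(String message) {
            super(message);
        }
    }

    private final AtomicReference<Row> stored = new AtomicReference<>(new Row(1, "initial"));

    Row read() {
        return stored.get();
    }

    /** Stands in for "UPDATE ... SET rev = rev + 1 WHERE rev = :revisionReadEarlier". */
    void update(Row readEarlier, String newData) {
        Row next = new Row(readEarlier.revision + 1, newData);
        // The write only succeeds if nobody replaced the row since it was read.
        if (!stored.compareAndSet(readEarlier, next)) {
            throw new StaleRevisionException("expected revision " + readEarlier.revision
                    + " but found " + stored.get().revision);
        }
    }

    public static void main(String[] args) {
        OptimisticLockDemo table = new OptimisticLockDemo();
        Row seenByA = table.read(); // both callers read revision 1
        Row seenByB = table.read();
        table.update(seenByA, "from A"); // A wins and bumps the revision
        try {
            table.update(seenByB, "from B"); // B still holds the old revision
        } catch (StaleRevisionException e) {
            System.out.println("B lost the race: " + e.getMessage());
        }
    }
}
```

If several asynchronous returns touch the same process instance at nearly the same moment, they would play the roles of A and B here.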

The implementation:
We use the method setVariable (with the pid, a unique varname, and a value) of the runtimeService to add our data to the flowable context

We have this “modifications to the variables of a process instance from outside of the flowable context” use case in other places.

Is it possible that our use of setVariable is the cause of the problem?
Is there any advice or best practice for adding variables to a process instance in a transaction-safe way?

I’ve had a few ideas :

  • Look into variableService and see if variable insertion raises the same type of issue
  • Use a Subprocess for the reception and an out variable
  • Persist the payload of the asynchronous return in a non-flowable table and fetch it from a flowable task
  • Put a local variable from the runtimeService and make the doStuff task synchronous to see if the shared transaction would allow us to retrieve the local var while hopefully avoiding locking

These either require time-consuming research or seem like hacks rather than solutions, so I was hoping for some good advice.

I’ve read about asynchronous / exclusive tasks in your documentation, as well as about transaction management but I remain rather new to flowable in general.

It is my understanding that inclusive gateways generate executions for every path and that could prove useful but I haven’t found enough info about this.

If you need any additional info about our understanding / implementation / configuration before diving in, feel free to ask.
I’m bound to keep the business specifics to myself but will happily provide further detail concerning the hows and whys of some of our choices, or code samples (with altered naming).

Thanks in advance,
Lionel

We’ve moved on to using a triggerAsync to trigger our task and add our payload but are still facing this issue.
For the moment we’re catching the exception and retrying after a small random time but this seems like a terrible hack.
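
For reference, the catch-and-retry workaround described above could look roughly like the following sketch. It is plain Java with no Flowable dependency: JitteredRetry and its parameters are hypothetical names, and in the real application the predicate would presumably match the engine's optimistic locking exception while the action would wrap the triggerAsync call.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper: retries an action after a short random pause. */
class JitteredRetry {

    /**
     * Runs the action up to maxAttempts times. Only exceptions accepted by
     * the predicate are retried; anything else is rethrown immediately.
     */
    static <T> T retry(Supplier<T> action,
                       Predicate<RuntimeException> retryable,
                       int maxAttempts,
                       long maxDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (!retryable.test(e)) {
                    throw e;
                }
                last = e;
                if (attempt < maxAttempts) {
                    sleepRandom(maxDelayMillis);
                }
            }
        }
        throw last; // every attempt failed with a retryable exception
    }

    /** Sleeps between 0 and maxDelayMillis so competing callers spread out. */
    private static void sleepRandom(long maxDelayMillis) {
        if (maxDelayMillis <= 0) {
            return;
        }
        try {
            Thread.sleep(ThreadLocalRandom.current().nextLong(maxDelayMillis + 1));
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", ie);
        }
    }
}
```

Bounding the number of attempts keeps a persistent conflict from looping forever, and the random delay reduces the chance that the same callers collide again on the next try.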

The problem (without knowing which tasks are async) is most likely that multiple calls return at the same time, trying to save data on the same instance at the same moment.

Some questions (which can’t be seen from the diagram):

  • Is the joining gateway also async/exclusive? This would solve the problem of the tasks trying to complete at the same time (exclusive prevents them from all running at the same time)
  • Can you share how you implemented the Triggerable task? Did you make it async too?
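
As a rough analogy for what "exclusive" buys you, the sketch below serializes work per process instance id inside a single JVM. It is not how the engine implements exclusive jobs (the engine coordinates them through its job executor and the database); PerInstanceSerializer and runExclusively are hypothetical names used only to illustrate the idea that work for the same instance runs one item at a time.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical single-JVM analogy: one lock per process instance id. */
class PerInstanceSerializer {

    // Locks are never removed in this sketch; a real implementation would evict them.
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Runs the work while holding the lock that belongs to the given instance id. */
    void runExclusively(String processInstanceId, Runnable work) {
        ReentrantLock lock = locks.computeIfAbsent(processInstanceId, id -> new ReentrantLock());
        lock.lock();
        try {
            work.run();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PerInstanceSerializer serializer = new PerInstanceSerializer();
        int[] counter = {0};
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    // Same id, so the increments never interleave.
                    serializer.runExclusively("pid-1", () -> counter[0]++);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println("updates applied: " + counter[0]);
    }
}
```

Work for different instance ids still runs in parallel; only updates aimed at the same instance queue up behind each other.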

Thanks for your answer!

The issue has occurred on a standalone node during one of our tests so, as expected, this is not cluster-related.

The catch and retry hack is working in test env for the moment but I’m still interested in a more elegant solution.

The joining gateway is not async, nor is it exclusive.

All the “Update business with notification about X” tasks are exclusive & async. They are not triggerable. We did look into that, but after understanding that the triggering occurs as the last operation on the task, it seemed to be an improper solution for our business need.

The task we trigger is the receiveTask that occurs before every update and that is used as a wait state.
The runtimeService.triggerAsync(activityId, map) call seems to be the sole source of our problem.

I could take some time to remove business specifics from my bpmn and attach it if you’d like.
Regarding the receive task, is there any additional info that would be relevant?