How to model an efficient countdown latch in BPMN

Hey,

I want to model a process that waits for all of its participants to reach a certain final state. I designed this as a process with a process variable that counts the number of participants not yet in that state. Each time a participant reaches the final state, it sends a message to the process instance. The process handles such messages with a non-interrupting event subprocess that runs a simple service task, which decrements the counter and checks whether it has reached zero. Simple enough.
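
For illustration, this is roughly what the count delegate boils down to (simplified sketch; the variable name "remaining", the Spring wiring, and firing the signal directly from the delegate are illustrative choices, not necessarily how it has to be done):

import org.flowable.engine.RuntimeService;
import org.flowable.engine.delegate.DelegateExecution;
import org.flowable.engine.delegate.JavaDelegate;
import org.flowable.engine.runtime.Execution;
import org.springframework.stereotype.Component;

// Simplified sketch of the count delegate: decrement the counter variable and,
// when it hits zero, fire the CounterZero signal the main flow waits on.
// The variable name "remaining" and the Spring wiring are illustrative.
@Component("countServiceTask")
public class CountServiceTask implements JavaDelegate {

    private final RuntimeService runtimeService;

    public CountServiceTask(RuntimeService runtimeService) {
        this.runtimeService = runtimeService;
    }

    @Override
    public void execute(DelegateExecution execution) {
        // Decrement the counter stored as a process variable on the instance.
        Integer remaining = (Integer) execution.getVariable("remaining");
        remaining = remaining - 1;
        execution.setVariable("remaining", remaining);

        if (remaining == 0) {
            // Wake up the execution waiting at the CounterZero intermediate catch event.
            Execution waiting = runtimeService.createExecutionQuery()
                    .processInstanceId(execution.getProcessInstanceId())
                    .signalEventSubscriptionName("CounterZero")
                    .singleResult();
            if (waiting != null) {
                runtimeService.signalEventReceived("CounterZero", waiting.getId());
            }
        }
    }
}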

However, I want to make sure this runs efficiently in situations where many of the participants reach the final state at the same time. A naive implementation will result in many optimistic locking exceptions with associated retries.

So I made the service task async. That works and avoids the optimistic locking exceptions, but it is also horribly slow, because the service is only executed when Flowable queries for async jobs (by default once every 10 s).
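
(Side note: the 10 s is the async executor's default job acquire wait time, which can be lowered when building the engine. A sketch below; the setter names are the ones I know from the Flowable 6 engine configuration and may differ in other versions.)

import org.flowable.engine.ProcessEngine;
import org.flowable.engine.impl.cfg.StandaloneInMemProcessEngineConfiguration;

// Sketch: lower the async job acquire interval so async jobs are picked up sooner.
// Setter names are taken from the Flowable 6 engine configuration and may differ per version.
public class FasterAsyncExecutorConfig {
    public static void main(String[] args) {
        StandaloneInMemProcessEngineConfiguration cfg = new StandaloneInMemProcessEngineConfiguration();
        cfg.setAsyncExecutorActivate(true);
        cfg.setAsyncExecutorDefaultAsyncJobAcquireWaitTime(1000); // default is 10000 ms
        ProcessEngine engine = cfg.buildProcessEngine();
        engine.close();
    }
}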

Is there a better way to express this in BPMN? The alternative is to implement some synchronization mechanism (e.g. based on database locks) in the Java code of the participants so that they don't send messages concurrently, and then not declare the service as async.

PS: Here’s my process def.

<?xml version="1.0" encoding="UTF-8"?>
<definitions xmlns="http://www.omg.org/spec/BPMN/20100524/MODEL"
             xmlns:flowable="http://flowable.org/bpmn"
             xmlns:dc="http://www.omg.org/spec/DD/20100524/DC"
             xmlns:di="http://www.omg.org/spec/DD/20100524/DI"
             xmlns:bpmndi="http://www.omg.org/spec/BPMN/20100524/DI"
             targetNamespace="Examples">
  <signal id="counterZeroSignal" name="CounterZero" />
  <message id="countDownMessage" name="CountDown" />
  <process id="counting-process">
    <startEvent id="start-main">
      <outgoing>Flow_0jsk20j</outgoing>
    </startEvent>
    <subProcess id="count-activity" triggeredByEvent="true">
      <startEvent id="start-decrement" **isInterrupting="false"**>
        <outgoing>Flow_0csobbx</outgoing>
        <messageEventDefinition messageRef="countDownMessage" />
      </startEvent>
      <sequenceFlow id="Flow_0csobbx" sourceRef="start-decrement" targetRef="count-service" />
      <serviceTask id="count-service" name="Count" flowable:delegateExpression="${countServiceTask}" **flowable:async="true"**>
        <incoming>Flow_0csobbx</incoming>
        <outgoing>Flow_1sry4ac</outgoing>
      </serviceTask>
      <endEvent id="end-decrement">
        <incoming>Flow_1sry4ac</incoming>
      </endEvent>
      <sequenceFlow id="Flow_1sry4ac" sourceRef="count-service" targetRef="end-decrement" />
    </subProcess>
    <intermediateCatchEvent id="catch-signal">
      <incoming>Flow_0jsk20j</incoming>
      <outgoing>Flow_1dxixqo</outgoing>
      <signalEventDefinition signalRef="counterZeroSignal" />
    </intermediateCatchEvent>
    <sequenceFlow id="Flow_0jsk20j" sourceRef="start-main" targetRef="catch-signal" />
    <sequenceFlow id="Flow_1dxixqo" sourceRef="catch-signal" targetRef="complete-service" />
    <serviceTask id="complete-service" name="Complete" flowable:delegateExpression="${completeServiceTask}">
      <incoming>Flow_1dxixqo</incoming>
      <outgoing>Flow_13cbiu0</outgoing>
    </serviceTask>
    <endEvent id="main-end">
      <incoming>Flow_13cbiu0</incoming>
    </endEvent>
    <sequenceFlow id="Flow_13cbiu0" sourceRef="complete-service" targetRef="main-end" />
  </process>
</definitions>

Hi Bert,

Isn’t multi-instance behavior enough for your case?

Regards
Martin

Wouldn’t multi-instance suffer from the exact same optimistic concurrency rollback failures, because the shared parent execution is updated after each instance finishes?

I’ve seen multi-instance subflows generate a lot of retries because of this. In my business problem (of which the counter is a simplification), a whole bunch of them will finish at the same time.

My understanding:
You want to execute many parallel jobs, and the outputs are aggregated when all results are ready.

Proposal:
Create one table to store all outputs and create all jobs with their inputs. Each job takes its inputs and stores its outputs in one table row. There is no clash with optimistic locking exceptions, because no process variable is updated. Once the results are stored, you can check whether all jobs are completed and decide whether the execution should continue.
Maybe you can use org.flowable.batch.api.BatchService as an implementation.
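
A rough sketch of that idea (the job_result table and its columns are made up for illustration; nothing here is Flowable API):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Rough sketch of the output-table idea. The table job_result(batch_id, job_id, output)
// and its column names are hypothetical.
public class JobResultStore {

    /** Store one job's output; no process variable is touched, so no optimistic locking. */
    public void storeResult(Connection con, String batchId, String jobId, String output) throws Exception {
        try (PreparedStatement ps = con.prepareStatement(
                "INSERT INTO job_result (batch_id, job_id, output) VALUES (?, ?, ?)")) {
            ps.setString(1, batchId);
            ps.setString(2, jobId);
            ps.setString(3, output);
            ps.executeUpdate();
        }
    }

    /** Check whether all expected jobs of a batch have stored their result. */
    public boolean allCompleted(Connection con, String batchId, int expectedJobs) throws Exception {
        try (PreparedStatement ps = con.prepareStatement(
                "SELECT COUNT(*) FROM job_result WHERE batch_id = ?")) {
            ps.setString(1, batchId);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getInt(1) >= expectedJobs;
            }
        }
    }
}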

Regards
Martin

Yes, that’s what I want to do. The parallel jobs are actually separate process instances that run their own process definition and keep living afterwards. So that’s very similar to the job setup you propose.

When a job finishes, it remains a bit tricky to check whether everything is completed. A simple query will not suffice, because different jobs might be ending at the same time and will not see each other’s (uncommitted) results in the database. So I guess I’ll end up with a counter record in the database that I update using pessimistic locking. In other words, I’ll do the synchronization outside of the BPMN processes and make sure that only a single message is sent to wake up the parent process.
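
Something like this sketch is what I have in mind (the latch_counter table and its columns are made up; the message correlation uses the standard RuntimeService API):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import org.flowable.engine.RuntimeService;
import org.flowable.engine.runtime.Execution;

// Sketch of the plan: a counter row locked pessimistically, so only the participant
// that decrements it to zero sends the wake-up message to the parent process.
// Table and column names are made up for illustration.
public class CountDownLatchDao {

    private final RuntimeService runtimeService;

    public CountDownLatchDao(RuntimeService runtimeService) {
        this.runtimeService = runtimeService;
    }

    /** Runs inside the participant's transaction; the connection belongs to that transaction. */
    public void participantFinished(Connection con, String latchId, String parentProcessInstanceId) throws Exception {
        // Lock the counter row so concurrent participants serialize here instead of
        // colliding on Flowable's optimistic locking.
        int remaining;
        try (PreparedStatement ps = con.prepareStatement(
                "SELECT remaining FROM latch_counter WHERE latch_id = ? FOR UPDATE")) {
            ps.setString(1, latchId);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                remaining = rs.getInt(1) - 1;
            }
        }
        try (PreparedStatement ps = con.prepareStatement(
                "UPDATE latch_counter SET remaining = ? WHERE latch_id = ?")) {
            ps.setInt(1, remaining);
            ps.setString(2, latchId);
            ps.executeUpdate();
        }

        // Only the last participant triggers the parent, so exactly one CountDown
        // message is delivered to the waiting process instance.
        if (remaining == 0) {
            Execution waiting = runtimeService.createExecutionQuery()
                    .processInstanceId(parentProcessInstanceId)
                    .messageEventSubscriptionName("CountDown")
                    .singleResult();
            if (waiting != null) {
                runtimeService.messageEventReceived("CountDown", waiting.getId());
            }
        }
    }
}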

Thanks for helping me explore this,

Bert

Hi Bert,

BatchService has the same problem. It uses one job to monitor the status of the batch parts; a batch part is a job which processes the task. Maybe you can use it. BatchService is used internally in Flowable to perform long-running parallel tasks, e.g. housekeeping (deleting history when it is not needed anymore).

Regards
Martin