Error: Process Instance Cannot Complete an End Event

Dear experts,

We are experiencing a problem where, under certain conditions, our process instance does not complete and remains running with only one active element: an End Event.

In our test environment, a business process is used which looks as follows:

A non-cancelling boundary timer event is followed by a message catching event.

In the test, we interact with the running process instance through the API as follows:

  1. Wait for the process instance to have the message event subscription:

runtimeService.createExecutionQuery().messageEventSubscriptionName(MESSAGE_NAME).processInstanceId(processInstanceId).count() > 0

  2. Send the message to the process instance:

runtimeService.messageEventReceived(MESSAGE_NAME, runtimeService.createExecutionQuery().messageEventSubscriptionName(MESSAGE_NAME).processInstanceId(processInstanceId).singleResult().getId())

  3. Wait for a process instance variable to have the value expected once the corresponding script task has executed:

VALUE_SET_BY_SCRIPT_TASK.equals(historicDataService.createHistoricVariableInstanceQuery().processInstanceId(processInstanceId).variableName(variableName).singleResult())

  4. Complete the user task:

taskService.complete(userTaskId)
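
The two "wait for" steps above amount to polling a condition until it holds or a timeout expires. A minimal sketch of such a helper, assuming plain Java with no engine dependency (`Await` and `until` are hypothetical names, not part of the Activiti API and not taken from the original test):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Activiti or Flowable.
final class Await {
    private Await() {}

    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    static boolean until(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition becoming true
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in condition: becomes true on the third poll.
        AtomicInteger polls = new AtomicInteger();
        boolean met = until(() -> polls.incrementAndGet() >= 3, 1_000, 10);
        System.out.println("condition met: " + met + ", polls: " + polls.get());
    }
}
```

In a test like the one described, the supplier would wrap the queries from steps 1 and 3, for example the `count() > 0` check on the execution query.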

We observe that if steps 2 and 3 happen with no delay between them, the process instance does not complete after step 4, even though it should (both tasks execute successfully). A related job remains that keeps trying to process an End Event (it can be the one from either the top or the bottom branch, but in our observations it was more often the bottom one).

However, adding a delay of 500 ms between steps 2 and 3 lets the process instance complete successfully:

Thread.sleep(500)

Do you have an idea of what could be going wrong in our case?

Admittedly, we are still on Activiti v5.21, but an upgrade to Flowable is happening in the very near future.

Thank you very much in advance!

Best regards,
Vasil Tsimashchuk

The logic looks all right, so I’m guessing it’s a race condition between the two ends being reached (the one from the timer and the one from the user task), which does sound like a bug.

Do you happen to have the process XML (and a unit test if we’re really lucky ;-)) so we can dig into it?