ORA-00060 deadlock in Flowable with Oracle DB as datasource

Hi,

We are getting the following deadlock in a Flowable application using Oracle DB as its datasource:

org.apache.ibatis.exceptions.PersistenceException:

Error updating database. Cause: java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource

The error may involve org.flowable.engine.impl.persistence.entity.ExecutionEntityImpl.updateExecution-Inline

The error occurred while setting parameters

SQL:

     update ACT_RU_EXECUTION
     SET REV_ = ?, SUPER_EXEC_ = ?      
     where ID_ = ? and REV_ = ?

Cause: java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource

org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
org.apache.ibatis.session.defaults.DefaultSqlSession.update(DefaultSqlSession.java:200)
org.flowable.common.engine.impl.db.DbSqlSession.flushUpdates(DbSqlSession.java:500)
org.flowable.common.engine.impl.db.DbSqlSession.flush(DbSqlSession.java:289)
org.flowable.common.engine.impl.interceptor.CommandContext.flushSessions(CommandContext.java:191)
org.flowable.common.engine.impl.interceptor.CommandContext.close(CommandContext.java:61)
org.flowable.common.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:80)
org.flowable.idm.spring.SpringTransactionInterceptor$1.doInTransaction(SpringTransactionInterceptor.java:49)
org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133)
org.flowable.idm.spring.SpringTransactionInterceptor.execute(SpringTransactionInterceptor.java:46)
org.flowable.common.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:30)
org.flowable.common.engine.impl.cfg.CommandExecutorImpl.execute(CommandExecutorImpl.java:56)
org.flowable.common.engine.impl.cfg.CommandExecutorImpl.execute(CommandExecutorImpl.java:51)
org.flowable.engine.impl.RuntimeServiceImpl.setVariable(RuntimeServiceImpl.java:285)
es.sd.inventario2.workflow.commons.protom.call.task.ProtomListenerAnswer.processProtomResponse(ProtomListenerAnswer.java:226)
es.sd.inventario2.workflow.commons.protom.call.task.ProtomListenerAnswer.processProtomAnswer(ProtomListenerAnswer.java:186)
es.sd.inventario2.workflow.commons.protom.call.task.ProtomListenerAnswer.getSoftSMSProtomAnswer(ProtomListenerAnswer.java:110)
sun.reflect.GeneratedMethodAccessor444.invoke(Unknown Source)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.springframework.messaging.handler.invocation.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:180)
org.springframework.messaging.handler.invocation.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:112)
org.springframework.cloud.stream.binding.StreamListenerMessageHandler.handleRequestMessage(StreamListenerMessageHandler.java:55)
org.springframework.integration.handler.AbstractReplyProducingMessageHandler.handleMessageInternal(AbstractReplyProducingMessageHandler.java:109)
org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:127)
org.springframework.integration.dispatcher.AbstractDispatcher.tryOptimizedDispatch(AbstractDispatcher.java:116)
org.springframework.integration.dispatcher.UnicastingDispatcher.doDispatch(UnicastingDispatcher.java:148)
org.springframework.integration.dispatcher.UnicastingDispatcher.dispatch(UnicastingDispatcher.java:121)
org.springframework.integration.channel.AbstractSubscribableChannel.doSend(AbstractSubscribableChannel.java:89)
org.springframework.integration.channel.AbstractMessageChannel.send(AbstractMessageChannel.java:425)
org.springframework.integration.channel.AbstractMessageChannel.send(AbstractMessageChannel.java:375)
org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:115)
org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:45)
org.springframework.messaging.core.AbstractMessageSendingTemplate.send(AbstractMessageSendingTemplate.java:105)
org.springframework.integration.handler.AbstractMessageProducingHandler.sendOutput(AbstractMessageProducingHandler.java:360)
org.springframework.integration.handler.AbstractMessageProducingHandler.produceOutput(AbstractMessageProducingHandler.java:271)
org.springframework.integration.handler.AbstractMessageProducingHandler.sendOutputs(AbstractMessageProducingHandler.java:188)
org.springframework.integration.handler.AbstractReplyProducingMessageHandler.handleMessageInternal(AbstractReplyProducingMessageHandler.java:115)
org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:127)
org.springframework.integration.channel.FixedSubscriberChannel.send(FixedSubscriberChannel.java:70)
org.springframework.integration.channel.FixedSubscriberChannel.send(FixedSubscriberChannel.java:64)
org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:115)
org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:45)
org.springframework.messaging.core.AbstractMessageSendingTemplate.send(AbstractMessageSendingTemplate.java:105)
org.springframework.integration.endpoint.MessageProducerSupport.sendMessage(MessageProducerSupport.java:188)
org.springframework.integration.amqp.inbound.AmqpInboundChannelAdapter.access$1100(AmqpInboundChannelAdapter.java:57)
org.springframework.integration.amqp.inbound.AmqpInboundChannelAdapter$Listener.processMessage(AmqpInboundChannelAdapter.java:250)
org.springframework.integration.amqp.inbound.AmqpInboundChannelAdapter$Listener.onMessage(AmqpInboundChannelAdapter.java:205)
org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:856)
org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:779)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.access$001(SimpleMessageListenerContainer.java:105)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$1.invokeListener(SimpleMessageListenerContainer.java:208)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.invokeListener(SimpleMessageListenerContainer.java:1381)
org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.executeListener(AbstractMessageListenerContainer.java:760)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.doReceiveAndExecute(SimpleMessageListenerContainer.java:1324)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.receiveAndExecute(SimpleMessageListenerContainer.java:1294)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.access$1800(SimpleMessageListenerContainer.java:105)
org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1550)
java.lang.Thread.run(Thread.java:748)

Caused by: java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource
oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:494)
oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:446)
oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:1054)
oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:623)
oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:252)
oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:612)
oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:226)
oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:59)
oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:910)
oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1119)
oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3780)
oracle.jdbc.driver.T4CPreparedStatement.executeInternal(T4CPreparedStatement.java:1343)
oracle.jdbc.driver.OraclePreparedStatement.execute(OraclePreparedStatement.java:3887)
oracle.jdbc.driver.OraclePreparedStatementWrapper.execute(OraclePreparedStatementWrapper.java:1079)
com.zaxxer.hikari.pool.ProxyPreparedStatement.execute(ProxyPreparedStatement.java:44)
com.zaxxer.hikari.pool.HikariProxyPreparedStatement.execute(HikariProxyPreparedStatement.java)
org.apache.ibatis.executor.statement.PreparedStatementHandler.update(PreparedStatementHandler.java:46)
org.apache.ibatis.executor.statement.RoutingStatementHandler.update(RoutingStatementHandler.java:74)
org.apache.ibatis.executor.SimpleExecutor.doUpdate(SimpleExecutor.java:50)
org.apache.ibatis.executor.BaseExecutor.update(BaseExecutor.java:117)
org.apache.ibatis.executor.CachingExecutor.update(CachingExecutor.java:76)
org.apache.ibatis.session.defaults.DefaultSqlSession.update(DefaultSqlSession.java:198)… 57 more
Caused by: Error : 60, Position : 0,
Sql = update ACT_RU_EXECUTION
SET REV_ = :1 ,
SUPER_EXEC_ = :2
where ID_ = :3 and REV_ = :4 ,
OriginalSql = update ACT_RU_EXECUTION
SET REV_ = ?,
SUPER_EXEC_ = ?
where ID_ = ? and REV_ = ?,
Error Msg = ORA-00060: deadlock detected while waiting for resource
oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:498)… 78 more

As the Oracle DB schema is located in a cluster database together with several other schemas, we can’t change the isolation level as suggested here:

https://forum.flowable.org/t/ms-sqlserver-deadlock-thrown-on-finding-process-instances/508
https://github.com/Activiti/Activiti/issues/1977 (Activiti GitHub, but it can apply here)

These are the only posts I have found relating to this issue, by the way.

I’m not an Oracle user myself, but going from this comment it looks like there may be a missing index.

Looking at flowable.oracle.create.engine.sql, the create script appears to be missing an index that is present in the update scripts.

Can you check whether there is an index on EXECUTION_ID_ in ACT_RU_EVENT_SUBSCR? If not, you can add one with:

create index ACT_IDX_EVENT_SUBSCR on ACT_RU_EVENT_SUBSCR(EXECUTION_ID_);
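To check whether the index already exists, you can query the Oracle data dictionary. A minimal sketch (assuming you connect as the schema owner; otherwise use ALL_IND_COLUMNS with an OWNER filter):

```sql
-- List any index that covers EXECUTION_ID_ on ACT_RU_EVENT_SUBSCR
SELECT index_name, column_name, column_position
FROM   user_ind_columns
WHERE  table_name  = 'ACT_RU_EVENT_SUBSCR'
AND    column_name = 'EXECUTION_ID_';
```

If the query returns no rows, the column is unindexed and the CREATE INDEX above applies.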

Hi @wwitt,

Thanks for the fast response. As soon as we try this out, we’ll let you know.

Greetings

Hi @wwitt,

It seems like that index is already created.

I have been looking through the GitHub commit logs since version 6.3.1 (the one we use), and I have seen some interesting new features and fixes:

SQL queries on batch: https://github.com/flowable/flowable-engine/commit/89b8a121b8a0b39f97d4ade5f7b44fa248df01f7#diff-7709168244fdf53602bc14feb5f5cf7f

JPA variables visibility fix: https://github.com/flowable/flowable-engine/commit/cdc5bb78601dff594cbf78ac23fe9e678346a230#diff-7709168244fdf53602bc14feb5f5cf7f

Change of locking policy: https://github.com/flowable/flowable-engine/commit/7981ba28827dae3e2ab473d36e1aad0e426720bf#diff-7709168244fdf53602bc14feb5f5cf7f

Do you think these may improve performance and avoid the deadlock?

Greetings

I’m not certain any of these specific items would address this deadlock, so I don’t know whether upgrading will solve it. If the root cause is/was something fixed in the Flowable source, it is unlikely that we’d backport the fix to 6.3.1, so an upgrade would likely be involved anyway.

But let’s talk about your process and what’s going on in this section of the stack trace:

org.flowable.engine.impl.RuntimeServiceImpl.setVariable(RuntimeServiceImpl.java:285)
es.sd.inventario2.workflow.commons.protom.call.task.ProtomListenerAnswer.processProtomResponse(ProtomListenerAnswer.java:226)
es.sd.inventario2.workflow.commons.protom.call.task.ProtomListenerAnswer.processProtomAnswer(ProtomListenerAnswer.java:186)
es.sd.inventario2.workflow.commons.protom.call.task.ProtomListenerAnswer.getSoftSMSProtomAnswer(ProtomListenerAnswer.java:110)
...
org.springframework.cloud.stream.binding.StreamListenerMessageHandler.handleRequestMessage(StreamListenerMessageHandler.java:55)

It looks like you are receiving a message via RabbitMQ using a Spring Cloud Stream binding, doing some processing, and attempting to set a variable. Is it possible that two messages are being received for the same process at the same time?

Side note: this upcoming feature may be interesting to you:

Hi @wwitt,

Great video! I guess you linked it because of the send-receive problem they talk about at 38:30. You are correct in your guesses. We have an event-driven architecture in which we send a SOAP request to an external system. That external system performs an asynchronous process and, as soon as it has a response, sends it to a service that publishes it to a RabbitMQ exchange. We have a listener on a Rabbit queue. Using a process variable, we select the waiting Flowable process and restart it. As far as I know, a race condition in Flowable produces a FlowableOptimisticLockingException, not a deadlock in the database, but I may be mistaken. But to answer your question: yes, it’s possible.

As the system has been in production for several months without deadlock problems, and we have only been getting them recently and occasionally, our bet right now is on a slow network. I guess a slow network can cause a transaction to wait on a lock long enough for Oracle to interpret it as a deadlock.

Greetings

I actually included it because it may allow you to remove some of your custom wiring of the messages into the process.

You are correct that most of the time a race condition will trigger a FlowableOptimisticLockingException, but clearly something is getting around the locking we do have. I would expect a slow database connection to lead to connection timeouts rather than a deadlock, though I suppose it increases the time a transaction takes, and with it the window in which the application is vulnerable to this occurring.

I’d look at the following:

  • Foreign keys without indexes (there are scripts out there to help find them)
  • Somehow preventing work on the same execution id at the same time
  • Perhaps trying an upgrade to 6.4.2 to see if it solves the problem
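For the first point, here is a minimal sketch of such a script against the Oracle data dictionary. It is an assumption-laden simplification: it only handles single-column foreign keys, while the widely circulated versions also compare composite keys column by column:

```sql
-- Foreign key constraints whose referencing column has no index starting on it
SELECT c.table_name, c.constraint_name, cc.column_name
FROM   user_constraints  c
JOIN   user_cons_columns cc
       ON cc.constraint_name = c.constraint_name
WHERE  c.constraint_type = 'R'          -- 'R' = referential (foreign key)
AND    cc.position = 1                  -- leading column of the FK
AND NOT EXISTS (
        SELECT 1
        FROM   user_ind_columns ic
        WHERE  ic.table_name      = c.table_name
        AND    ic.column_name     = cc.column_name
        AND    ic.column_position = 1   -- index must lead with this column
);
```

Any rows returned are candidates for a supporting index; unindexed foreign keys are a classic source of table-level locking (and hence ORA-00060) in Oracle.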

Another upcoming feature that may help once 6.5.0 is released is Logging Sessions:

Shameless plug: Flowable does offer support and consultancy services

Hi,

We have finally found what the problem was and solved it. We had the following flowable process with an event gateway:

[Captura1: process diagram of the subprocess with the event gateway]

This process is a common part called by other Flowable processes. In some cases, the timeout and the answer were firing at the same time, and both execution branches continued until the end of the calling Flowable process, where the Flowable Engine was apparently trying to commit both transactions and getting a deadlock on one of them. We solved it by adding an async point (wait state) at the end of the subprocess:

[Captura2: the same subprocess with the async point added at the end]

This async point (wait state) forces the transaction to commit before returning to the calling Flowable process. The first execution branch to commit returns to the calling process, and the second one throws a FlowableOptimisticLockingException and ends.
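In BPMN XML, such an async point is typically an activity flagged with flowable:async="true", which makes the engine commit the transaction and hand continuation to the async executor. A sketch with illustrative ids and a no-op expression (the actual process model is not shown in this thread, and the fragment assumes the usual xmlns:flowable="http://flowable.org/bpmn" namespace declaration on the definitions element):

```xml
<!-- Hypothetical async point at the end of the subprocess: the engine commits
     here before control returns to the calling process. Ids are illustrative. -->
<serviceTask id="asyncPoint" name="Async point"
             flowable:async="true"
             flowable:expression="${true}" />
<sequenceFlow id="toSubprocessEnd" sourceRef="asyncPoint" targetRef="subprocessEnd" />
```

With this in place, concurrent branches serialize at the commit: whichever branch commits first wins, and the loser fails fast with an optimistic locking exception instead of deadlocking in the database.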

Hope this helps someone.

Thanks for your help, @wwitt

Greetings!

Thanks @Void for posting this follow-up!