Uploaded image for project: 'camunda BPM'
  1. camunda BPM
  2. CAM-10360

Deadlock exception when set removal time batch is executed

XMLWordPrintable

    • Icon: Bug Report Bug Report
    • Resolution: Fixed
    • Icon: L3 - Default L3 - Default
    • 7.13.0, 7.12.5, 7.13.0-alpha4
    • 7.11.0
    • engine
    • db2 11.1, WebSphere 9

      Steps:

      • setup websphere 9 + db2 11.1
      • In cockpit open Batch operations
      • Run set removal time to process/decision instance, please mark some amount of instances (at least > 5) to absolute time

      Expected:
      The batch operation is performed successfully.

      Observed:
      An exception is thrown for some of the jobs. However, after rerun the jobs finish successfully. Please find the exception below:

      org.camunda.bpm.engine.ProcessEngineException: ENGINE-03004 Exception while executing Database Operation 'UPDATE_BULK updateByteArraysByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}' with message '
      ### Error updating database.  Cause: com.ibm.db2.jcc.am.SqlTransactionRollbackException: DB2 SQL Error: SQLCODE=-911, SQLSTATE=40001, SQLERRMC=2, DRIVER=4.22.29
      ### The error may involve org.camunda.bpm.engine.impl.persistence.entity.VariableInstanceEntity.updateByteArraysByProcessInstanceId-Inline
      ### The error occurred while setting parameters
      ### SQL: update ACT_GE_BYTEARRAY       set REMOVAL_TIME_ = ?       where ID_ IN (                  SELECT BYTEARRAY_ID_         FROM ACT_HI_VARINST         WHERE PROC_INST_ID_ = ?       ) or ID_ IN (                  SELECT BYTEARRAY_ID_         FROM ACT_HI_DEC_IN I         INNER JOIN ACT_HI_DECINST D         ON I.DEC_INST_ID_ = D.ID_         WHERE D.PROC_INST_ID_ = ?       ) or ID_ IN (                  SELECT BYTEARRAY_ID_         FROM ACT_HI_DEC_OUT O         INNER JOIN ACT_HI_DECINST D         ON O.DEC_INST_ID_ = D.ID_         WHERE D.PROC_INST_ID_ = ?       ) or ID_ IN (                  SELECT JOB_EXCEPTION_STACK_ID_         FROM ACT_HI_JOB_LOG         WHERE PROCESS_INSTANCE_ID_ = ?       ) or ID_ IN (                  SELECT ERROR_DETAILS_ID_         FROM ACT_HI_EXT_TASK_LOG         WHERE PROC_INST_ID_ = ?       ) or ID_ IN (                  SELECT CONTENT_ID_         FROM ACT_HI_ATTACHMENT A         LEFT JOIN ACT_HI_TASKINST T ON A.TASK_ID_ = T.ID_         WHERE A.PROC_INST_ID_ = ?           OR T.PROC_INST_ID_ = ?       )
      ### Cause: com.ibm.db2.jcc.am.SqlTransactionRollbackException: DB2 SQL Error: SQLCODE=-911, SQLSTATE=40001, SQLERRMC=2, DRIVER=4.22.29'. Flush summary: 
       [
        INSERT HistoricJobLogEventEntity[c6251010-7e03-11e9-891c-0242ac110003]
        DELETE MessageEntity[be7d392d-7e03-11e9-891c-0242ac110003]
        DELETE_BULK deleteByteArrayNoRevisionCheck c4edef99-7e03-11e9-891c-0242ac110003
        DELETE_BULK deleteByteArrayNoRevisionCheck be7d392c-7e03-11e9-891c-0242ac110003
        UPDATE_BULK updateHistoricActivityInstancesByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateHistoricTaskInstancesByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateHistoricVariableInstancesByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateHistoricDetailsByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateHistoricIncidentsByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateExternalTaskLogByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateJobLogByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateUserOperationLogByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateIdentityLinkLogByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateCommentsByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateAttachmentsByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateByteArraysByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateHistoricProcessInstanceByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateHistoricDecisionInstancesByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateHistoricDecisionInputInstancesByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
        UPDATE_BULK updateHistoricDecisionOutputInstancesByProcessInstanceId {processInstanceId=4af766b1-7d64-11e9-b07b-0242ac110003, removalTime=Tue Sep 24 11:09:00 CEST 2019}
      ]
      	at org.camunda.bpm.engine.impl.db.EnginePersistenceLogger.flushDbOperationException(EnginePersistenceLogger.java:131)
      	at org.camunda.bpm.engine.impl.db.entitymanager.DbEntityManager.flushDbOperations(DbEntityManager.java:346)
      	at org.camunda.bpm.engine.impl.db.entitymanager.DbEntityManager.flushDbOperationManager(DbEntityManager.java:325)
      	at org.camunda.bpm.engine.impl.db.entitymanager.DbEntityManager.flush(DbEntityManager.java:297)
      	at org.camunda.bpm.engine.impl.interceptor.CommandContext.flushSessions(CommandContext.java:208)
      	at org.camunda.bpm.engine.impl.interceptor.CommandContext.close(CommandContext.java:137)
      	at org.camunda.bpm.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:116)
      	at org.camunda.bpm.engine.spring.SpringTransactionInterceptor$1.doInTransaction(SpringTransactionInterceptor.java:46)
      	at org.springframework.transaction.jta.WebSphereUowTransactionManager$UOWActionAdapter.run(WebSphereUowTransactionManager.java:345)
      	at com.ibm.ws.uow.embeddable.EmbeddableUOWManagerImpl.runUnderNewUOW(EmbeddableUOWManagerImpl.java:798)
      	at com.ibm.ws.uow.embeddable.EmbeddableUOWManagerImpl.runUnderUOW(EmbeddableUOWManagerImpl.java:377)
      	at org.springframework.transaction.jta.WebSphereUowTransactionManager.execute(WebSphereUowTransactionManager.java:289)
      	at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:125)
      	at org.camunda.bpm.engine.spring.SpringTransactionInterceptor.execute(SpringTransactionInterceptor.java:44)
      	at org.camunda.bpm.engine.impl.interceptor.ProcessApplicationContextInterceptor.execute(ProcessApplicationContextInterceptor.java:70)
      	at org.camunda.bpm.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:33)
      	at org.camunda.bpm.engine.impl.jobexecutor.ExecuteJobHelper.executeJob(ExecuteJobHelper.java:51)
      	at org.camunda.bpm.engine.impl.jobexecutor.ExecuteJobHelper.executeJob(ExecuteJobHelper.java:44)
      	at org.camunda.bpm.container.impl.ejb.JobExecutionHandlerMDB.executeJob(JobExecutionHandlerMDB.java:41)
      	at sun.reflect.GeneratedMethodAccessor239.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
      	at java.lang.reflect.Method.invoke(Method.java:508)
      	at com.ibm.ejs.container.EJSContainer.invokeProceed(EJSContainer.java:5323)
      	at com.ibm.ejs.container.interceptors.InvocationContextImpl.proceed(InvocationContextImpl.java:652)
      	at com.ibm.ws.cdi.ejb.impl.InterceptorChain.proceed(InterceptorChain.java:115)
      	at com.ibm.ws.cdi.ejb.impl.EJBCDIInterceptorWrapper.invokeInterceptors(EJBCDIInterceptorWrapper.java:140)
      	at com.ibm.ws.cdi.ejb.impl.EJBCDIInterceptorWrapper.aroundInvoke(EJBCDIInterceptorWrapper.java:56)
      	at sun.reflect.GeneratedMethodAccessor130.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
      	at java.lang.reflect.Method.invoke(Method.java:508)
      	at com.ibm.ejs.container.interceptors.InterceptorProxy.invokeInterceptor(InterceptorProxy.java:201)
      	at com.ibm.ejs.container.interceptors.InvocationContextImpl.proceed(InvocationContextImpl.java:632)
      	at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:73)
      	at com.ibm.ws.cdi.ejb.impl.WeldSessionBeanInterceptorWrapper.aroundInvoke(WeldSessionBeanInterceptorWrapper.java:58)
      	at sun.reflect.GeneratedMethodAccessor129.invoke(Unknown Source)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
      	at java.lang.reflect.Method.invoke(Method.java:508)
      	at com.ibm.ejs.container.interceptors.InterceptorProxy.invokeInterceptor(InterceptorProxy.java:201)
      	at com.ibm.ejs.container.interceptors.InvocationContextImpl.proceed(InvocationContextImpl.java:632)
      	at com.ibm.ejs.container.interceptors.InvocationContextImpl.doAroundInterceptor(InvocationContextImpl.java:306)
      	at com.ibm.ejs.container.interceptors.InvocationContextImpl.doAroundInvoke(InvocationContextImpl.java:273)
      	at com.ibm.ejs.container.EJSContainer.invoke(EJSContainer.java:5215)
      	at org.camunda.bpm.container.impl.ejb.MDBProxyJobExecutionHandlerMDB_53d2ac45.executeJob(MDBProxyJobExecutionHandlerMDB_53d2ac45.java)
      	at org.camunda.bpm.container.impl.threading.ra.inflow.JcaInflowExecuteJobsRunnable.executeJob(JcaInflowExecuteJobsRunnable.java:76)
      	at org.camunda.bpm.engine.impl.jobexecutor.ExecuteJobsRunnable.run(ExecuteJobsRunnable.java:61)
      	at org.camunda.bpm.container.impl.threading.ra.commonj.CommonjWorkRunnableAdapter.run(CommonjWorkRunnableAdapter.java:36)
      	at com.ibm.ws.asynchbeans.J2EEContext.run(J2EEContext.java:1046)
      	at com.ibm.ws.asynchbeans.WorkWithExecutionContextImpl.go(WorkWithExecutionContextImpl.java:199)
      	at com.ibm.ws.asynchbeans.CJWorkItemImpl.run(CJWorkItemImpl.java:236)
      	at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1892)
      Caused by: org.apache.ibatis.exceptions.PersistenceException: 
      ### Error updating database.  Cause: com.ibm.db2.jcc.am.SqlTransactionRollbackException: DB2 SQL Error: SQLCODE=-911, SQLSTATE=40001, SQLERRMC=2, DRIVER=4.22.29
      ### The error may involve org.camunda.bpm.engine.impl.persistence.entity.VariableInstanceEntity.updateByteArraysByProcessInstanceId-Inline
      ### The error occurred while setting parameters
      ### SQL: update ACT_GE_BYTEARRAY       set REMOVAL_TIME_ = ?       where ID_ IN (                  SELECT BYTEARRAY_ID_         FROM ACT_HI_VARINST         WHERE PROC_INST_ID_ = ?       ) or ID_ IN (                  SELECT BYTEARRAY_ID_         FROM ACT_HI_DEC_IN I         INNER JOIN ACT_HI_DECINST D         ON I.DEC_INST_ID_ = D.ID_         WHERE D.PROC_INST_ID_ = ?       ) or ID_ IN (                  SELECT BYTEARRAY_ID_         FROM ACT_HI_DEC_OUT O         INNER JOIN ACT_HI_DECINST D         ON O.DEC_INST_ID_ = D.ID_         WHERE D.PROC_INST_ID_ = ?       ) or ID_ IN (                  SELECT JOB_EXCEPTION_STACK_ID_         FROM ACT_HI_JOB_LOG         WHERE PROCESS_INSTANCE_ID_ = ?       ) or ID_ IN (                  SELECT ERROR_DETAILS_ID_         FROM ACT_HI_EXT_TASK_LOG         WHERE PROC_INST_ID_ = ?       ) or ID_ IN (                  SELECT CONTENT_ID_         FROM ACT_HI_ATTACHMENT A         LEFT JOIN ACT_HI_TASKINST T ON A.TASK_ID_ = T.ID_         WHERE A.PROC_INST_ID_ = ?           OR T.PROC_INST_ID_ = ?       )
      ### Cause: com.ibm.db2.jcc.am.SqlTransactionRollbackException: DB2 SQL Error: SQLCODE=-911, SQLSTATE=40001, SQLERRMC=2, DRIVER=4.22.29
      	at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
      	at org.apache.ibatis.session.defaults.DefaultSqlSession.update(DefaultSqlSession.java:200)
      	at org.camunda.bpm.engine.impl.db.sql.DbSqlSession.executeUpdate(DbSqlSession.java:259)
      	at org.camunda.bpm.engine.impl.db.sql.DbSqlSession.updateBulk(DbSqlSession.java:285)
      	at org.camunda.bpm.engine.impl.db.AbstractPersistenceSession.executeDbOperation(AbstractPersistenceSession.java:58)
      	at org.camunda.bpm.engine.impl.db.entitymanager.DbEntityManager.flushDbOperations(DbEntityManager.java:341)
      	... 48 more
      Caused by: com.ibm.db2.jcc.am.SqlTransactionRollbackException: DB2 SQL Error: SQLCODE=-911, SQLSTATE=40001, SQLERRMC=2, DRIVER=4.22.29
      	at com.ibm.db2.jcc.am.ld.a(ld.java:797)
      	at com.ibm.db2.jcc.am.ld.a(ld.java:66)
      	at com.ibm.db2.jcc.am.ld.a(ld.java:140)
      	at com.ibm.db2.jcc.am.up.c(up.java:2796)
      	at com.ibm.db2.jcc.t4.bb.v(bb.java:1076)
      	at com.ibm.db2.jcc.t4.bb.m(bb.java:450)
      	at com.ibm.db2.jcc.t4.bb.a(bb.java:62)
      	at com.ibm.db2.jcc.t4.p.a(p.java:50)
      	at com.ibm.db2.jcc.t4.wb.b(wb.java:220)
      	at com.ibm.db2.jcc.am.vp.vc(vp.java:3568)
      	at com.ibm.db2.jcc.am.vp.a(vp.java:4613)
      	at com.ibm.db2.jcc.am.vp.b(vp.java:4151)
      	at com.ibm.db2.jcc.am.vp.pc(vp.java:2871)
      	at com.ibm.db2.jcc.am.vp.execute(vp.java:2846)
      	at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.execute(WSJdbcPreparedStatement.java:632)
      	at org.apache.ibatis.executor.statement.PreparedStatementHandler.update(PreparedStatementHandler.java:46)
      	at org.apache.ibatis.executor.statement.RoutingStatementHandler.update(RoutingStatementHandler.java:74)
      	at org.apache.ibatis.executor.SimpleExecutor.doUpdate(SimpleExecutor.java:50)
      	at org.apache.ibatis.executor.BaseExecutor.update(BaseExecutor.java:117)
      	at org.apache.ibatis.executor.CachingExecutor.update(CachingExecutor.java:76)
      	at org.apache.ibatis.session.defaults.DefaultSqlSession.update(DefaultSqlSession.java:198)
      	... 52 more
      

      Root Cause:

      • When the referenced byte arrays (like job's configuration and job's exception stack trace) are deleted, the respective rows are exclusively locked.
      • When executing the UPDATE statement to update the byte array's removal time, then DB2 performs a full table scan of ACT_GE_BYTEARRAY.
      • When doing the full table scan, DB2 decides to lock all existing rows in ACT_GE_BYTEARRAY exclusively.
      • If two commands try to set the removal time (for different process instance), the deadlock happens in the following sequence of events:
        1. Participant 1 deletes the byte arrays first. Therefore, participant 1 will acquire exclusively locks on those rows in ACT_GE_BYTEARRAY.
        2. Participant 2 deletes the byte arrays first. Therefore, participant 2 will acquire exclusively locks on those rows in ACT_GE_BYTEARRAY.
        3. Participant 2 is a bit faster and starts to execute the query updateByteArraysByProcessInstanceId. Participant 2 exclusively acquires locks on each row in ACT_GE_BYTEARRAY in sequential order.
        4. Participant 2 tries to acquire an exclusive lock on the rows which participant 1 wants to delete. Because participant 1 already holds the exclusive locks, participant 2 waits for participant 1 to release these locks.
        5. Participant 1 starts to execute the query updateByteArraysByProcessInstanceId.
        6. Participant 1 tries to acquire an exclusive lock on the first row in ACT_GE_BYTEARRAY. Because participant 2 already holds the exclusive lock on that row, participant 1 waits for participant 2 to release that lock.

      Solution:

      • Divide the UPDATE statement updateByteArraysByProcessInstanceId into six dedicated UPDATE statements to update the corresponding byte arrays accordingly.
      • The solution has been already validated with an existing customer.
      • For further details please check CAM_10360.diff.

        This is the controller panel for Smart Panels app

              thorben.lindhauer Thorben Lindhauer
              yana.vasileva Yana Vasileva
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Created:
                Updated:
                Resolved: