Uploaded image for project: 'camunda BPM'
  1. camunda BPM
  2. CAM-12904

Deadlock exception with DB2 when executing set removal time batch for decisions

XMLWordPrintable

    • Icon: Bug Report Bug Report
    • Resolution: Fixed
    • Icon: L3 - Default L3 - Default
    • 7.14.5, 7.15.0-alpha4, 7.15.0
    • None
    • None

      Environment

      • Camunda BPM 7.12.5
      • Database: DB2 v11.1.4.4
        • Isolation Level: READ_COMMITTED

      Description

      • Setting the removal time for decisions via the corresponding batch operation results in a deadlock exception
      • Deadlock victim query, as well as the other involved query, is:
        -- VariableInstanceEntity#updateByteArraysByDecisionInstanceId
        UPDATE act_ge_bytearray 
        SET    removal_time_ = ? 
        WHERE  id_ IN (SELECT bytearray_id_ 
                       FROM   act_hi_dec_in I 
                              INNER JOIN act_hi_decinst D 
                                      ON I.dec_inst_id_ = D.id_ 
                       WHERE  D.id_ = ?) 
                OR id_ IN (SELECT bytearray_id_ 
                           FROM   act_hi_dec_out O 
                                  INNER JOIN act_hi_decinst D 
                                          ON O.dec_inst_id_ = D.id_ 
                           WHERE  D.id_ = ?) 
        

      Steps to reproduce

      • Setup an application server with DB2 v11.1 and Camunda BPM 7.12
      • Go to Cockpit and open the batch operation page
      • Choose the "Set removal time to decision instances" batch operation, select some decision instances (at least two), and set an absolute removal time
      • Execute the batch operation
      • Go to the batches page

      Observed behavior

      The batch operation cannot be successfully completed and has failed jobs due to a deadlock exception: exception_stacktrace.txt

      Expected behavior

      The batch operation can be completed successfully without failed jobs.

      Root cause

      • The query optimizer rewrites the SQL query as follows:
        (SELECT
           Q6.BYTEARRAY_ID_
         FROM
           CBPM_INPUT.ACT_HI_DECINST AS Q5,
           CBPM_INPUT.ACT_HI_DEC_IN AS Q6
         WHERE
           (Q6.DEC_INST_ID_ = 'bc6132f1-79dd-11e8-a021-005056bd9c8c') AND
           (Q5.ID_ = 'bc6132f1-79dd-11e8-a021-005056bd9c8c')
        ) AS Q7
        RIGHT OUTER JOIN CBPM_INPUT.ACT_GE_BYTEARRAY AS Q8
        ON (Q8.ID_ = Q7.BYTEARRAY_ID_)
        
      • The rows of the ACT_GE_BYTEARRAY table are exclusively locked as part of the RIGHT OUTER JOIN table scan:
        ....
        6) TBSCAN: (Table Scan)
        ....
        ROWLOCK: (Row Lock intent)
        EXCLUSIVE
        
      • When the above-mentioned query is executed in parallel (i.e., more than one removal time batch jobs run in parallel), a deadlock situation might occur as follows:
        • Participant 1
          • Holds exclusive lock on a row of ACT_GE_BYTEARRAY
          • Waits to set an exclusive lock on a row of ACT_GE_BYTEARRAY, however, the lock is held by participant 2
        • Participant 2
          • Holds exclusive lock on a row of ACT_GE_BYTEARRAY
          • Waits to set an exclusive lock on a row of ACT_GE_BYTEARRAY, however, the lock is held by participant 3
        • Participant 3
          • Holds exclusive lock on a row of ACT_GE_BYTEARRAY
          • Waits to set an exclusive lock on a row of ACT_GE_BYTEARRAY, however, the lock is held by participant 1

      Solution idea

      1. Split up the VariableInstanceEntity#updateByteArraysByDecisionInstanceId query into two queries
        • Since the same error pattern is likely to occur for the query VariableInstanceEntity#updateByteArraysByRootDecisionInstanceId (hierarchical flag true), we should split up this query as well
        • The solution has already been validated with a customer
        • There is already a pull request that fixes the problem: https://github.com/camunda/camunda-bpm-platform/pull/1201

        This is the controller panel for Smart Panels app

              tassilo.weidner Tassilo Weidner
              tassilo.weidner Tassilo Weidner
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Created:
                Updated:
                Resolved: