What are the steps to reproduce your problem?
- Setup the engine to use an empty, supported SqlServer instance
- Start attached Spring Boot application
- Change History Time To Live to 0 in cockpit
- Create/Start 5000 Instances for cleanup
- Change History Time To Live to 2 in cockpit
- Start constantly 2 running processes (with a dalay of 50ms) + Start Cleanup in cockpit
What is the problem?
- Some running instances will end up failing in the end event due to deadlock exceptions when deleting variables (due to a deadlock with the deletion from byte array table by removal time triggered by the history cleanup)
What would be the expected behavior:
- All process instances finish successfully
- When deleting data from ACT_GE_BYTEARRAY during history cleanup (e.g., VariableInstance.xml#deleteByteArraysByRemovalTime), then SQL server must ensure that any foreign key constraints are still satisfied after the deletion, i.e., no other tables reference the byte array rows.
- Currently, the tables ACT_RU_VARIABLE, ACT_RU_JOB and ACT_RU_EXT_TASK have such a constraint.
- In order to ensure that, SQL server joins all the tables together during the delete (left join) and verifies that all rows from ACT_GE_BYTEARRAY have no join partners.
- When there is no or only a small amount of data in the tables, SQL server uses merge joins with index scans because it considers them to be faster than using a loop join with index seek (that is usually used when more data is contained in the tables as this is usually faster then).
- The index scans are locking substantial amounts of data in the tables as opposed to the index seek.
- Different transactions block different tables first and start creating deadlocks for the remaining tables to update.
- Using loop joins with index seeks can be forced with SqlServer by using the query hint OPTION (LOOP JOIN).
- Forcing the use of this hint would mean to create SqlServer-specific mappings in the following mapping files: