-
Bug Report
-
Resolution: Fixed
-
L3 - Default
-
None
Environment
- Oracle 12 and higher
- Default block size of 8,000 byte
Description
Creating the Camunda schema fails at the index ACT_IDX_JOB_HANDLER, if Oracle >= 12 is configured to create extended data types for character columns (including NVARCHAR2). This lifts the historic limitation of varchar fields to be restricted to 4,000 bytes, which depending on the encoding in use could not hold every possible value. With the larger field values, index keys can become too large for Oracle to handle, as one key must fit into a single block.
Steps to reproduce
- Enable extended data types (https://docs.oracle.com/en/database/oracle/oracle-database/18/spuss/enabling-the-new-extended-data-type-capability.html#GUID-88FB7FFD-4392-49C6-843A-45B49F8A1821). It lifts the limit of the size of NVARCHAR2 fields from 4,000 bytes to 32,767 bytes
- Create the Camunda schema, in particular index ACT_IDX_JOB_HANDLER
Observed Behavior
- Index creation fails, explaining that the maximum key length for the index is exceeded (maximum length is 6398)
Expected behavior
- The index and the entire schema can be created without problems
Root Cause
- On Oracle, an index key must fit into one block
- With the default block size of 8,000 bytes, the keys of the index may not exceed 6,398 bytes
- The index is defined on ACT_RU_JOB(HANDLER_TYPE_,HANDLER_CFG_). HANDLER_TYPE_ is NVARCHAR2(255), HANDLER_CFG_(2000)
- For example with UTF-8 encoding, HANDLER_CFG_ will consume up to 8,000 bytes
Solution Ideas
Oracle's recommendations for solutions can be found at https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/CREATE-INDEX.html -> Creating an Index on an Extended Data Type Column
- Instead of indexing the raw column, we can create a function-based index that hashes the column value on ACT_RU_JOB(HANDLER_TYPE_, STANDARD_HASH(HANDLER_CFG_)). According to Oracle docs, the index can still be used by the query optimizer if HANDLER_CFG_ is used in equality or IN predicates
- Instead of indexing the raw column, we can create an index on a substring of the column so that the index is overall below the mark of 6,398 bytes. In that case, the index can still be used for equality, IN and range predicates.
Option 1 looks preferable. With option 2, we would need to carefully evaluate that the field values never actually exceeds the chosen substring length (this may be true as of now, see hints for what we currently store in the field).
Hints
Function-based indexes
- Function-based indexes cannot index null values. STANDARD_HASH of null returns a non-null value. Also, since the index is composite and HANDLER_TYPE_ is never null, this should probably be ok either way
Query
- HANDLER_CFG_ is used in one query (besides the select by id and update queries): Job.xml selectJobsByConfiguration.
- Usages:
- If there is an async job for the triggering of an event subscription and that event subscription is deleted while jobs exist
- Deleting the timer start jobs when a process definition is deleted via API
- Replacing the timer start event jobs when a new version of a process is deployed
- The HANDLER_TYPE_ and HANDLER_CFG_ fields are only used in equality predicates
- Usages:
- Extended data types are only supported with Oracle >= 12
- Function-based indexes also work on Oracle 11
Usage of HANDLER_CFG_
- Maps to JobEntity#jobHandlerConfigurationRaw
- Written from JobHandlerConfiguration#toCanonicalString
- JobHandlerConfiguration has the following implementations (second part is the content that the configurations write):
- AsyncContinuationConfiguration: <atomic operation>$<transitionId>
- BatchJobConfiguration: <configurationByteArrayId>
- BatchMonitorJobConfiguration: <batchId>
- BatchSeedJobConfiguration: <batchId>
- EventSubscriptionJobConfiguration: <eventSubscriptionId>
- HistoryCleanupJobHandlerConfiguration: {"countEmptyRuns": <int>, "immediatelyDue": <boolean>, "minuteFrom": <int>, "minuteTo": <int>}
- JobDefinitionSuspensionStateConfiguration: {"by": <string: currently max 22 chars>, "jobDefinitionId": <string: id NVARCHAR2(64)>, "processDefinitionId": <string: id NVARCHAR2(64)>, "processDefinitionKey": <string: id NVARCHAR2(255)>, "processDefinitionTenantId": <string: custom id; NVARCHAR2(64)>, "includeJobs": <boolean>}
- ProcessDefinitionSuspensionStateConfiguration {"by": <string: currently max 22 chars>, "processDefinitionId": <string: id, NVARCHAR2(64)>, "processDefinitionKey": <string: id, NVARCHAR2(255)>, "processDefinitionTenantId": <string: custom id, NVARCHAR2(64)>, "includeProcessInstances": <boolean>}
- TimerJobConfiguration <timerElementKey; activity id of any size>$taskListener~<timerElementSecondaryKey; task listener id of any size>$followUpJobCreated
Additional sources