Bug Report
Resolution: Fixed
L3 - Default
Our large import performance test has recently been failing because the imported data values don't match the expected values. The output suggests that the engine data is correct, but something is going wrong on import. Interestingly, the process instance count is lower than expected while some other fields have higher counts than expected, as shown below. The failing counts are identical on every run. The test passes with the stage dataset.
NUMBER_OF_PROCESS_INSTANCES
[2021-04-19T09:45:01.421Z] + test 9993529 '=' 10000000
[2021-04-19T09:45:01.421Z] + error=true
[2021-04-19T09:45:01.421Z] + echo NUMBER_OF_ACTIVITY_INSTANCES
[2021-04-19T09:45:01.421Z] NUMBER_OF_ACTIVITY_INSTANCES
[2021-04-19T09:45:01.421Z] + test 52020830 '=' 52011434
[2021-04-19T09:45:01.421Z] + error=true
[2021-04-19T09:45:01.421Z] + echo NUMBER_OF_USER_TASKS
[2021-04-19T09:45:01.421Z] NUMBER_OF_USER_TASKS
[2021-04-19T09:45:01.421Z] + test 1989771 '=' 1989756
[2021-04-19T09:45:01.421Z] + error=true
[2021-04-19T09:45:01.421Z] + echo NUMBER_OF_VARIABLES
[2021-04-19T09:45:01.421Z] NUMBER_OF_VARIABLES
[2021-04-19T09:45:01.421Z] + test 147235423 '=' 147235412
[2021-04-19T09:45:01.421Z] + error=true
[2021-04-19T09:45:01.421Z] + echo NUMBER_OF_DECISION_INSTANCES
[2021-04-19T09:45:01.421Z] NUMBER_OF_DECISION_INSTANCES
[2021-04-19T09:45:01.421Z] + test 12857142 '=' 12857142
[2021-04-19T09:45:01.421Z] + '[' true ]
Notes:
It appears that the higher counts are caused by an issue we've noticed in the past: the "expected counts" taken from the engine are incorrect by the time we use them. The engine appears to keep processing some process instance data after we've taken the expected count, so when we later compare our imported data with those counts, they don't match. However, if we evaluate the expected count on demand at the time the test needs it, the count is correct. I'm creating a subtask to adjust the pipeline to evaluate the expected counts on demand rather than read them from the metadata fields.
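For reference, a minimal sketch of what on-demand evaluation could look like. The engine base URL and the naive response parsing are assumptions, not the actual pipeline code; it uses the standard engine history count REST endpoint.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/**
 * Sketch: evaluate the expected count against the engine at assertion time
 * instead of reading a value that was recorded earlier during data generation.
 */
public class OnDemandExpectedCount {

    private static final String ENGINE_REST = "http://localhost:8080/engine-rest"; // assumed setup

    static long historicProcessInstanceCount() throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(ENGINE_REST + "/history/process-instance/count"))
            .GET()
            .build();
        String body = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString())
            .body();
        // Body looks like {"count":10000000}; parsed naively here to avoid extra dependencies.
        return Long.parseLong(body.replaceAll("\\D", ""));
    }

    /** Compare the imported count against a freshly evaluated expected count. */
    static void assertImportedCount(String label, long importedCount) throws Exception {
        long expected = historicProcessInstanceCount();
        if (importedCount != expected) {
            throw new AssertionError(label + ": imported " + importedCount + " != expected " + expected);
        }
    }

    public static void main(String[] args) throws Exception {
        // Example: the NUMBER_OF_PROCESS_INSTANCES check, with the expectation
        // evaluated now rather than taken from stored metadata.
        long importedCount = Long.parseLong(args[0]); // count read from the import target
        assertImportedCount("NUMBER_OF_PROCESS_INSTANCES", importedCount);
    }
}
```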
The lower process instance count appears to be caused by the maximum page size limit applied when our importers fetch data. When fetching, we query for all entities with a timestamp equal to or larger than the timestamp of the last imported entity, and we cap maxResults at 10,000 (via the ENGINE_IMPORT_PROCESS_INSTANCE_MAX_PAGE_SIZE config). Because of how we manipulate the startDates of process instances, we sometimes end up with more than 10k instances sharing the same timestamp, and in that case some instances are missed by the import.
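To make the failure mode concrete, here is a minimal, self-contained simulation of the paging behaviour described above (fetch entities with timestamp >= the last imported timestamp, capped at 10,000 per page). The cursor and deduplication handling is a simplifying assumption rather than the real importer code, but it shows how instances beyond the page limit that share one timestamp become unreachable.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Simulation of the timestamp-based paging pitfall; not the real importer code. */
public class TimestampPagingPitfall {

    record Entity(long id, long timestamp) {}

    // Mirrors ENGINE_IMPORT_PROCESS_INSTANCE_MAX_PAGE_SIZE
    static final int MAX_PAGE_SIZE = 10_000;

    /** Simulated engine query: timestamp >= cursor, stable ordering, capped page size. */
    static List<Entity> fetchPage(List<Entity> all, long cursor) {
        return all.stream()
            .filter(e -> e.timestamp() >= cursor)
            .limit(MAX_PAGE_SIZE)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // 25,000 instances whose adjusted start dates collapsed onto ONE shared
        // timestamp, plus 5,000 instances with later, distinct timestamps.
        long sharedTs = 1_618_800_000_000L;
        List<Entity> all = new ArrayList<>();
        for (long id = 0; id < 25_000; id++) all.add(new Entity(id, sharedTs));
        for (long id = 25_000; id < 30_000; id++) all.add(new Entity(id, sharedTs + id));
        all.sort(Comparator.comparingLong(Entity::timestamp).thenComparingLong(Entity::id));

        Set<Long> imported = new HashSet<>();
        long cursor = 0L;
        while (true) {
            List<Entity> page = fetchPage(all, cursor);
            if (page.isEmpty()) break;
            boolean anythingNew = false;
            for (Entity e : page) anythingNew |= imported.add(e.id());
            long lastTsInPage = page.get(page.size() - 1).timestamp();
            if (anythingNew) {
                // Continue from the timestamp of the last imported entity.
                cursor = lastTsInPage;
            } else {
                // Page contained only already-imported entities: the only way to
                // make progress is to move past this timestamp, skipping whatever
                // shares it beyond the first MAX_PAGE_SIZE entities.
                cursor = lastTsInPage + 1;
            }
            if (cursor > all.get(all.size() - 1).timestamp()) break;
        }

        System.out.printf("generated=%d imported=%d missed=%d%n",
            all.size(), imported.size(), all.size() - imported.size());
        // Prints missed=15000: the instances sharing the stuck timestamp beyond
        // the first page are never returned by any query the cursor makes.
    }
}
```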
Since we don't need the date adjustments for this dataset, we would like to add a flag to our data generation that specifies whether dates should be adjusted. For the generation of this large dataset, we can then turn off the adjustment of the startDate data.
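A sketch of what the proposed generator parameter could look like; the parameter name adjustProcessInstanceDates and the argument format are hypothetical, since the actual generator interface isn't shown here.

```java
/** Hypothetical sketch of the proposed data-generation parameter. */
public class DataGenerationFlagSketch {

    public static void main(String[] args) {
        // Date adjustment stays on by default so existing runs keep their behaviour.
        boolean adjustProcessInstanceDates = true;
        for (String arg : args) {
            if (arg.startsWith("--adjustProcessInstanceDates=")) {
                adjustProcessInstanceDates =
                    Boolean.parseBoolean(arg.substring(arg.indexOf('=') + 1));
            }
        }

        if (adjustProcessInstanceDates) {
            System.out.println("Manipulating startDate values as before.");
        } else {
            // For the large dataset: keep the engine-assigned start dates, so no
            // batch of more than the max page size ends up sharing one timestamp.
            System.out.println("Skipping startDate adjustment.");
        }
    }
}
```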
AT:
- The expected Operate process instances and their properties are imported from the large dataset
Subtasks:
| # | Subtask | Status | Assignee |
| 1 | Add parameter to data generation to turn date adjustments off/on | Done | Unassigned |
| 2 | Evaluate expected counts on demand during import performance tests | Done | Unassigned |