Camunda Optimize / OPT-2845 Report on number of incidents / OPT-4341

Handle incidents with missing process instance ids



      Context:
      We import all incidents from the engine into Optimize. In our artificially generated test dataset, an incident was created that is not related to any process instance:

      [
        {
          "id": "3416703e-fe61-11ea-8728-9e597284c1f5",
          "processDefinitionKey": null,
          "processDefinitionId": null,
          "processInstanceId": null,
          "executionId": null,
          "rootProcessInstanceId": null,
          "createTime": "2020-09-24T14:26:42.739+0200",
          "endTime": null,
          "removalTime": null,
          "incidentType": "failedJob",
          "activityId": null,
          "failedActivityId": null,
          "causeIncidentId": "3416703e-fe61-11ea-8728-9e597284c1f5",
          "rootCauseIncidentId": "3416703e-fe61-11ea-8728-9e597284c1f5",
          "configuration": "7d27ed1c-83b1-11ea-9b3a-76db8af5f89c",
          "historyConfiguration": "3415fb0d-fe61-11ea-8728-9e597284c1f5",
          "incidentMessage": "An exception occurred in the persistence layer. Please check the server logs for a detailed message and the entire exception stack trace.",
          "tenantId": null,
          "jobDefinitionId": null,
          "open": true,
          "deleted": false,
          "resolved": false
        }
      ]
      

      The failed job, referenced by the incident's configuration field, looks like this:

      {
      "SELECT * FROM act_ru_job where id_ = '7d27ed1c-83b1-11ea-9b3a-76db8af5f89c'": [
      	{
      		"id_" : "7d27ed1c-83b1-11ea-9b3a-76db8af5f89c",
      		"rev_" : 35,
      		"type_" : "ever-living",
      		"lock_exp_time_" : null,
      		"lock_owner_" : null,
      		"exclusive_" : true,
      		"execution_id_" : null,
      		"process_instance_id_" : null,
      		"process_def_id_" : null,
      		"process_def_key_" : null,
      		"retries_" : 0,
      		"exception_stack_id_" : "87344c1d-fe7d-11ea-8774-fa32a6929e25",
      		"exception_msg_" : "An exception occurred in the persistence layer. Please check the server logs for a detailed message and the entire exception stack trace.",
      		"failed_act_id_" : null,
      		"duedate_" : "2020-04-21T15:48:08.146Z",
      		"repeat_" : null,
      		"repeat_offset_" : 0,
      		"handler_type_" : "history-cleanup",
      		"handler_cfg_" : "{\"countEmptyRuns\":14,\"immediatelyDue\":false,\"minuteFrom\":0,\"minuteTo\":59}",
      		"deployment_id_" : null,
      		"suspension_state_" : 1,
      		"job_def_id_" : null,
      		"priority_" : 0,
      		"sequence_counter_" : 4,
      		"tenant_id_" : null,
      		"create_time_" : "2020-04-21T09:21:32.044Z"
      	}
      ]}
      

      The handler_type_ value history-cleanup shows that the failed job is the history cleanup job.

      The engine team (Thorben) also gave an interesting response:

      I guess this can happen if the history cleanup job or a timer start event job (that one should reference a process definition though) run out of retries.

      AT:

      • incidents that aren't associated with a process instance id are skipped during the import and a message is logged (info level) to inform the user
      • the documentation mentions that certain incidents are not imported to Optimize
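
The first item above could be sketched roughly as follows. This is a minimal illustration, not the Optimize implementation: the class IncidentImportFilter, the Incident stand-in type, and the method name are all hypothetical, and java.util.logging is used only to keep the example self-contained (the project's actual logging framework may differ).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical helper for illustration; not part of the Optimize code base.
class IncidentImportFilter {

    private static final Logger LOG = Logger.getLogger(IncidentImportFilter.class.getName());

    // Minimal stand-in for an engine incident; only the fields this sketch needs.
    public static final class Incident {
        private final String id;
        private final String processInstanceId;
        private final String incidentType;

        public Incident(String id, String processInstanceId, String incidentType) {
            this.id = id;
            this.processInstanceId = processInstanceId;
            this.incidentType = incidentType;
        }

        public String getId() { return id; }
        public String getProcessInstanceId() { return processInstanceId; }
        public String getIncidentType() { return incidentType; }
    }

    // Keeps incidents that reference a process instance; logs each skipped one at info level.
    public static List<Incident> skipIncidentsWithoutProcessInstance(List<Incident> fetched) {
        List<Incident> importable = new ArrayList<>();
        for (Incident incident : fetched) {
            if (incident.getProcessInstanceId() == null) {
                LOG.info(String.format(
                    "Skipping incident [%s] of type [%s]: it is not associated with a process instance.",
                    incident.getId(), incident.getIncidentType()));
                continue;
            }
            importable.add(incident);
        }
        return importable;
    }

    public static void main(String[] args) {
        List<Incident> fetched = Arrays.asList(
            // The incident from this ticket: no process instance id.
            new Incident("3416703e-fe61-11ea-8728-9e597284c1f5", null, "failedJob"),
            // Made-up incident that does reference a process instance.
            new Incident("hypothetical-incident-id", "hypothetical-process-instance-id", "failedJob"));

        List<Incident> importable = skipIncidentsWithoutProcessInstance(fetched);
        System.out.println("Importable incidents: " + importable.size());
    }
}
```

Filtering right after the fetch, before any mapping to process instances, keeps the null check in one place, so later import steps can assume a process instance id is present.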


      Assignee: Unassigned
      Reporter: johannes.heinemann