Type: Feature Request
Resolution: Unresolved
Priority: L3 - Default
Fix Version/s: None
Affects Version/s: None
Component/s: engine
Labels: None
Epic Link: Support for engine monitoring

User Story (Required on creation):

As an operator, I can monitor and alert on engine metrics with my operations monitoring tool of choice.

Functional Requirements (Required before implementation):

Make the following engine metrics available via a Prometheus interface (authenticated and authorized):
- Performance metrics
  - Threads active
  - Threads idle/available in the pool
  - Threads blocking
  - Job Backlog (pending jobs that are due but not yet executed)
- Usage metrics
  - See https://docs.camunda.org/manual/latest/reference/rest/metrics/get-metrics-sum/#path-parameters
Document the metrics so that operators unfamiliar with Camunda terminology can monitor the system (similar to what MongoDB does)

Technical Requirements (Required before implementation):

Create an internal representation of tagged metric samples (counters and gauges, a sample can have multiple tags)
Enable the job executor implementations to expose thread performance (active threads, available threads, blocked threads) where possible
Create an internal metrics collector that tracks samples
- Should be pluggable in the engine configuration
- Collects reported usage metrics from ACT_RU_METER_LOG
- Collects performance metrics from the job executor (access via engine configuration)
- Fetches number of jobs in backlog from the ACT_RU_JOB table
- (optional) Makes the metrics collection interval configurable in the engine configuration
  - Used by the DbMetricsReporter to fetch the known metrics and to store them in ACT_RU_METER_LOG
  - We might optimize the reporter to store only non-zero metrics in the database, to avoid writing unnecessary data
- Makes the list of collected usage and performance metrics configurable in the engine configuration
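The collector requirements above could be sketched roughly as follows. This is a minimal illustration, not the engine's API: `MetricsCollectorSketch`, `InMemoryCollectorSketch`, and all method names are hypothetical, and tags, database access (ACT_RU_METER_LOG, ACT_RU_JOB), and the collection interval are left out for brevity. It only shows a pluggable contract with a configurable list of enabled metrics, counters for usage metrics, and gauges read on demand for performance metrics.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongSupplier;

/** Hypothetical collector contract; an engine configuration could hold any implementation. */
interface MetricsCollectorSketch {
  /** Adds delta to a counter (e.g. a usage metric). */
  void increment(String name, long delta);

  /** Registers a gauge whose value is read on every snapshot (e.g. active threads). */
  void registerGauge(String name, LongSupplier supplier);

  /** Returns the current value of every enabled metric that has been reported. */
  Map<String, Long> snapshot();
}

/** Assumed in-memory implementation that ignores metrics not in the configured list. */
class InMemoryCollectorSketch implements MetricsCollectorSketch {
  private final Set<String> enabled;
  private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();
  private final Map<String, LongSupplier> gauges = new ConcurrentHashMap<>();

  InMemoryCollectorSketch(Set<String> enabledMetricNames) {
    this.enabled = Set.copyOf(enabledMetricNames);
  }

  @Override
  public void increment(String name, long delta) {
    if (enabled.contains(name)) {
      counters.computeIfAbsent(name, k -> new LongAdder()).add(delta);
    }
  }

  @Override
  public void registerGauge(String name, LongSupplier supplier) {
    if (enabled.contains(name)) {
      gauges.put(name, supplier);
    }
  }

  @Override
  public Map<String, Long> snapshot() {
    // TreeMap gives a stable, name-sorted order for reporters and exporters.
    Map<String, Long> out = new TreeMap<>();
    counters.forEach((name, adder) -> out.put(name, adder.sum()));
    gauges.forEach((name, supplier) -> out.put(name, supplier.getAsLong()));
    return out;
  }
}
```

In this sketch a job executor would register gauges for its thread counts once, while engine code would call `increment` whenever a usage metric occurs; a reporter or exporter would then read `snapshot()` at its own interval.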
Documentation
- What metrics do we collect (usage vs. performance)?
- How do we collect metrics?
- How can you influence this with configuration?
- How can you provide your own metrics collector?
Create a REST endpoint (preferably under “/metrics/prometheus”) that exposes collected metrics in Prometheus format
- Normalize metric names so they conform to the Prometheus naming format (snake_case)
- Serve Prometheus format based on requested result format (see MetricsServlet and Exporter)
- Fetch metrics from configured engine collector via API
- Transform internal metrics samples to Prometheus samples
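To make the last steps more concrete, here is a rough sketch of a tagged sample, name normalization, and rendering in the Prometheus text exposition format. All names (`PrometheusFormatSketch`, `Sample`, `Kind`, `normalize`, `render`) are hypothetical and not taken from the engine code base. The sketch assumes names never start with a digit and does not add the conventional `_total` suffix to counters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

class PrometheusFormatSketch {

  /** The ticket names counters and gauges only. */
  enum Kind { COUNTER, GAUGE }

  /** Hypothetical internal sample: name, kind, value, and any number of tags. */
  static final class Sample {
    final String name;
    final Kind kind;
    final double value;
    final Map<String, String> tags;

    Sample(String name, Kind kind, double value, Map<String, String> tags) {
      this.name = name;
      this.kind = kind;
      this.value = value;
      this.tags = tags;
    }
  }

  /** Converts names such as "activity-instance-start" or "jobBacklog" to snake_case. */
  static String normalize(String name) {
    return name
        .replaceAll("([a-z0-9])([A-Z])", "$1_$2") // split camelCase boundaries
        .replaceAll("[^A-Za-z0-9_]", "_")         // replace characters not allowed in names
        .toLowerCase(Locale.ROOT);
  }

  /** Escapes backslash, double quote, and newline in a label value. */
  static String escape(String value) {
    return value.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
  }

  /** Renders tags as {key="value",...}, or an empty string when there are none. */
  static String labels(Map<String, String> tags) {
    if (tags.isEmpty()) {
      return "";
    }
    return tags.entrySet().stream()
        .map(e -> normalize(e.getKey()) + "=\"" + escape(e.getValue()) + "\"")
        .collect(Collectors.joining(",", "{", "}"));
  }

  /** Groups samples by normalized name so each metric gets exactly one TYPE line. */
  static String render(List<Sample> samples) {
    Map<String, List<Sample>> byName = new LinkedHashMap<>();
    for (Sample s : samples) {
      byName.computeIfAbsent(normalize(s.name), k -> new ArrayList<>()).add(s);
    }
    StringBuilder out = new StringBuilder();
    for (Map.Entry<String, List<Sample>> entry : byName.entrySet()) {
      String type = entry.getValue().get(0).kind.name().toLowerCase(Locale.ROOT);
      out.append("# TYPE ").append(entry.getKey()).append(' ').append(type).append('\n');
      for (Sample s : entry.getValue()) {
        out.append(entry.getKey()).append(labels(s.tags)).append(' ').append(s.value).append('\n');
      }
    }
    return out.toString();
  }
}
```

With this sketch, a gauge named `jobBacklog` would be exposed as `job_backlog`, and two samples of the same metric with different tags would share a single `# TYPE` line. A REST resource would only need to call `render` on the collector's samples and return the result as plain text.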

Limitations of Scope (Optional):

Metrics not in scope

Job configuration (can be part of the diagnostics interface)
- core-pool-size
- max-pool-size
- queue-capacity
- wait-time-in-millis
- max-wait
- max-jobs-per-acquisition
Process metrics (should be monitored in Optimize)
- Process Instances Started (by process definition)
- Process Instances Ended (by process definition)
- Process Instance Cycle Time
- Number of incidents
Troubleshooting metrics
- These can decrease performance.
- They can be implemented as needed as a custom extension of the metrics collector.
Health metrics (SUPPORT-10327)

Hints (optional):


Issue Links:
- is depended on by: CAM-14729 Integrate Camunda metrics into Spring Boot Actuator registry (Ready)
- is related to: CAM-9401 Spring Boot: exposing of Metrics via Micrometer (Open)
- links to: https://github.com/camunda/camunda-bpm-platform/issues/2304
- links to: https://github.com/camunda/camunda-bpm-platform/issues/2771

Assignee: Unassigned
Reporter: Tobias Metzke-Bernstein
Votes: 0
Watchers: 2
Created: 22/Jun/22 12:09 PM
Updated: 12/Oct/22 12:07 PM
