Uploaded image for project: 'camunda BPM'
  1. camunda BPM
  2. CAM-12992

Telemetry command data should not include dynamically generated class names

    XMLWordPrintable

Details

    Description

      Environment

      • Any Camunda installation with telemetry enabled

      Description

      • In order to send command usage, we record the class name of the command. Some commands are implemented as anonymous classes or lambdas
      • The name of anonymous classes is deterministic for one Camunda release (e.g. DbMetricsReporter$1), but can vary between releases (e.g. we might add a second anonymous class so that the index changes)
      • The class name of a lamda is dynamic at runtime
      • In both cases, these names are hard/impossible to interpret by people who evaluate the data in Kibana
      • Furthermore, we index all of these fields in Elasticsearch. Due to the dynamic nature, over time the total amount of indexed fields grows in an unbounded way and the performance of elasticsearch decreases.

      Steps to reproduce

      • Start Camunda
      • Activate telemetry
      • Run an API call that uses a lambda or anonymous class command

      Observed Behavior

      • These classes are included in the telemetry data

      Expected behavior

      • The classes are not included in the telemetry data (we can debate if we want the non-lamda ones, but I don't think so)

      Root Cause

      • We do not distinguish the cases in the telemetry reporter

      Solution Ideas

      • Use e.g. Class#isAnonymousClass for distinction

      Hints

      Full context (original report):

      As part of camunda/et#194 I figured out that telemetry data sent by Camunda BPM runtime contains field names like MetricsRestService$$Lambda$268/0x00000001009c4040 . These seem to identify classes created for Java lambda expressions on-the-fly and I assume that there is very few data for each of these identifiers as these identifiers might change at least when the JVM is restarted. Because our E.T. service uses dynamic mapping in ElasticSearch, our ElasticSearch index configuration was growing each time we received a new field with such an identifier which is why we assume that our Elasticsearch indexing failed to map further fields at some point.

      As part of my slack time activities I started experimenting with contract tests for the integration of Camunda BPM Runtime with E.T. and therefore had a deeper look into the BPM Runtime code. (Thank you for helping me get started in https://camunda.slack.com/archives/C6M6X3EKZ/p1608283160195000.) I'm happy I had the opportunity to have the look and this is by no means meant to preempt your analysis or anything bad - I highly value your integration of E.T.! I'd like to share what I found:

      What causes such identifiers is that CommandCounterInterceptor uses ClassNameUtil.getClassNameWithoutPackage(Object) that can return a class name that includes "$" signs e.g. for nested classes or classes that seem to be created for lamdba expressions at runtime. E.g. TelemetrySendingTask line 154 and line 335 contain commands that are lambda expressions and cause telemetry field names that include object IDs.

      We switched our Elasticsearch mapping strategy and are now excluding fields that have characters that are not in the character range [\p

      {IsLatin}

      0-9-_]. This has the effect that in the indices that have been created with the updated mapping in the last 2 days, the following command-related fields have been excluded from mapping (i.e. cannot be used in search or analysis):

      DbMetricsReporter$1
      FailedJobListener$1
      HistoricTaskInstanceReportImpl$2
      HistoryCleanupHandler$1
      MetricsRestService$$Lambda$268/0x00000001009c4040 (+ 2 similar occurrences)
      RepeatingFailedJobListener$CreateNewTimerJobCommand
      TaskReportImpl$1
      TelemetrySendingTask$$Lambda$1049/202207662 (+ 166 similar occurrences)
      TelemetrySendingTask$2

      Fields like DbMetricsReporter$1 are also covered by the new exclusion. We could allow these fields and just exclude fields whose name contains $$Lambda$. We would prefer if field names are only in the character range [\p\{IsLatin}0-9-_], but we leave this up to you: we could also index these fields or your could stop reporting telemetry for them in case this is not necessary for you.

      For fields like MetricsRestService$$Lambda$268/0x00000001009c4040 you need to stop reporting in the future - or change to names that do not contain object identifiers or so. Not tomorrow, as we are ignoring them right now, but this should happen with your next release (not knowning when that is scheduled), because even ignoring them creates a mapping entry that notes that these fields are ignored for each of these fields. This is going to grow to thousands of mapping entries in the close future.

      Our ping endpoint might in the future reject POST requests that contain such invalid data, but we do not have concrete plans about this yet.

      Please excuse that we did not analyze and fix the received data earlier and thereby missed the chance to make you aware of the issue earlier.

      mgm-controller-panel

        This is the controller panel for Smart Panels app

        Attachments

          Issue Links

            Activity

              People

                nikola.koevski Nikola Koevski
                sebastian.lohmeier Sebastian
                Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                  Created:
                  Updated:
                  Resolved:

                  Salesforce