Camunda Optimize / OPT-1101

Optimize performance is tested against large data

    • Type: Feature Request
    • Resolution: Fixed
    • Priority: L3 - Default
    • Fix Version: 2.0.0

      • given
        • a database with a large dataset
      • when
        • the daily trigger is reached
      • then
        • the performance job runs against the large dataset
          • starts engine-7.8
        • the job succeeds
        • I can see how long it took (see the sketch below)
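
      A minimal sketch of what these criteria imply, with invented helper names (the real run is the Jenkins "Performance Large Dataset" job linked in the comments): trigger the import against the large dataset, let it finish, and report the elapsed time so the duration is visible in the job output.

          import java.time.Duration;
          import java.time.Instant;

          public class LargeDatasetPerformanceCheck {

            public static void main(String[] args) {
              Instant start = Instant.now();

              runOptimizeImport(); // hypothetical: kicks off the full import from engine-7.8

              Duration elapsed = Duration.between(start, Instant.now());
              // "I can see how long it took": surface the duration in the job output
              System.out.println("Import of large dataset took " + elapsed.toMinutes() + " min");
            }

            private static void runOptimizeImport() {
              // placeholder for triggering the actual Optimize import run
            }
          }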



            Askar added a comment:

            see https://hq2.camunda.com/jenkins/optimize/view/All/job/Performance%20Large%20Dataset/

            Askar added a comment (edited):

            Q: How can the start index be bigger than the total count?

            22:35:31.694 [ElasticsearchImportJobExecutor-pool-thread-2] DEBUG o.c.o.s.es.writer.ImportIndexWriter - Writing all entities based import index type [variable-process-instance-tracking] to elasticsearch. Starting from [1942635] and having a max entity count [4920]
            

            A: maxEntityCount is written on init from ES, while calculating progress (which should probably never happen), or while resetting the index. Import progress is independent of this mechanism, as it relies on a runtime query against the PI ids.
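
            A hedged sketch of the mechanics described above, with invented class and method names: the import index advances on every page, while maxEntityCount is only refreshed on init or reset, so a stale maxEntityCount can lag behind an index that kept moving. Progress bypasses it entirely via a runtime count of PI ids.

                public class ImportIndexState {

                  private long importIndex;    // "Starting from [...]" in the log above
                  private long maxEntityCount; // only refreshed on init or reset

                  public void initFromStoredState(long storedIndex, long storedMaxCount) {
                    this.importIndex = storedIndex;
                    this.maxEntityCount = storedMaxCount; // may already be stale
                  }

                  public void advance(long fetchedEntities) {
                    // maxEntityCount is NOT touched here, so the index can overtake it
                    importIndex += fetchedEntities;
                  }

                  public double progress(long currentProcessInstanceCount) {
                    // runtime query against PI ids; independent of maxEntityCount
                    return currentProcessInstanceCount == 0
                        ? 1.0
                        : Math.min(1.0, (double) importIndex / currentProcessInstanceCount);
                  }
                }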

            Q: Why is the fetch size 1000 if the page size in the scroll is 2M? And why does the scroll return 2M ids in the first place?

            22:52:15.514 [main] DEBUG o.c.o.s.e.i.i.h.i.VariableInstanceImportIndexHandler - Scroll search query got [2000000] results
            22:52:15.606 [EngineImportJobExecutor-pool-thread-1] DEBUG o.c.o.s.e.i.f.i.ActivityInstanceFetcher - Fetched [10000] historic activity instances within [189] ms
            22:52:16.257 [EngineImportJobExecutor-pool-thread-1] DEBUG o.c.o.s.e.i.f.i.FinishedProcessInstanceFetcher - Fetched [1000] historic process instances within [94] ms
            22:52:16.309 [EngineImportJobExecutor-pool-thread-1] DEBUG o.c.o.s.e.i.f.i.ProcessDefinitionFetcher - Fetched [1] process definitions within [39] ms
            22:52:16.350 [EngineImportJobExecutor-pool-thread-1] DEBUG o.c.o.s.e.i.f.i.ProcessDefinitionFetcher - Fetched [1] process definitions within [40] ms
            22:52:16.409 [EngineImportJobExecutor-pool-thread-1] DEBUG o.c.o.s.e.i.f.i.ProcessDefinitionXmlFetcher - Fetched [1] process definition xmls within [59] ms
            22:52:16.410 [EngineImportJobExecutor-pool-thread-1] DEBUG o.c.o.s.e.i.f.i.VariableInstanceFetcher - fetching variables for [1000] PIs
            

            A: The debug statement used the totalHitCount of the scroll instead of the actual hits length, which leads to ambiguous statements in the log file.
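
            To make the fix concrete, a minimal sketch assuming the pre-7.x Elasticsearch Java API (where getTotalHits() returns a long): log the length of the page's actual hits rather than the scroll's total hit count.

                import org.elasticsearch.action.search.SearchResponse;
                import org.slf4j.Logger;
                import org.slf4j.LoggerFactory;

                public class ScrollPageLogging {

                  private static final Logger logger =
                      LoggerFactory.getLogger(ScrollPageLogging.class);

                  public void logScrollPage(SearchResponse scrollResponse) {
                    // Misleading: totalHits counts every document matching the query
                    // (the 2M above), not the documents contained in this scroll page.
                    long totalHits = scrollResponse.getHits().getTotalHits();

                    // Unambiguous: the number of hits actually returned with this page.
                    int pageHits = scrollResponse.getHits().getHits().length;

                    logger.debug("Scroll search query got [{}] results on this page ([{}] total)",
                        pageHits, totalHits);
                  }
                }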


            Johannes added a comment:

            Review hints:

            • Should we maybe adjust the debug statements if they are easy to misunderstand?
            • Please add the following properties to the default config, as it seems we need those to import a lot of data:
              import:
                handler:
                  backoff:
                    #Interval which is used for the backoff time calculation.
                    interval: 10000
                    #If all jobs are backing off at the moment, this interval is used
                    #to trigger general backoff
                    value: 60000
                    #Once all pages are consumed, the import scheduler component will
                    #start scheduling fetching tasks in increasing periods of time,
                    #controlled by "backoff" counter.
                    max: 15
                    #Tells if the backoff is enabled or not.
                    isEnabled: true
                  pages:
                    resetInterval:
                      unit: Hours
                      value: 12
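
            For reviewers, a minimal sketch of how these values plausibly interact (field names mirror the config keys above; the actual Optimize implementation may differ): the backoff period grows linearly with a counter that is capped at "max", and resets as soon as a fetch returns new data.

                public class BackoffCalculator {

                  private final long interval;   // import.handler.backoff.interval, in ms
                  private final long maxCounter; // import.handler.backoff.max
                  private final boolean enabled; // import.handler.backoff.isEnabled

                  private long counter = 0;

                  public BackoffCalculator(long interval, long maxCounter, boolean enabled) {
                    this.interval = interval;
                    this.maxCounter = maxCounter;
                    this.enabled = enabled;
                  }

                  /** Called when a fetch returned no new data: wait a bit longer next time. */
                  public long nextBackoffMs() {
                    if (!enabled) {
                      return 0;
                    }
                    counter = Math.min(counter + 1, maxCounter);
                    // with the values above: 10s, 20s, ... capped at 150s
                    return interval * counter;
                  }

                  /** Called when new data arrived: resume polling immediately. */
                  public void reset() {
                    counter = 0;
                  }
                }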
              


            Askar added a comment:

            I have adjusted the debug statements in one of the previous commits. The max count logic is untouched.


              Assignee: Unassigned
              Reporter: askar.akhmerov (Askar)
              Votes: 0
              Watchers: 2

                Created:
                Updated:
                Resolved: