Split import jobs that could exceed max size limitations


    • Type: Task
    • Resolution: Unresolved
    • Priority: L3 - Default
    • Affects Version/s: None
    • Component/s: backend
    • 2
    • M

      Since the introduction of object variable flattening, the import pipeline can write far more documents to Elasticsearch than the number of entities fetched from the data source (engine or Zeebe). As a result, bulk requests can fail, and in extreme cases this cannot be fixed even by reducing the batch size.

      One way we could handle this is to identify potentially large import batches (across all documents, not just variables) and split them into multiple smaller jobs.

      Hints:

      • This might not be a straightforward refactor, given that the imports are dependent on callbacks completed by the import jobs (a rough sketch follows this list)
      • We should consider making the max import batch size configurable
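
      Below is a minimal sketch of how the splitting could look. All names (ImportBatchSplitter, maxJobSize, bulkWriter, importCompleted) are hypothetical placeholders rather than existing Optimize classes; the point is to bound each sub-job by a configurable maximum document count and to fire the original completion callback only after the last sub-job has finished.

      import java.util.ArrayList;
      import java.util.List;
      import java.util.concurrent.atomic.AtomicInteger;
      import java.util.function.Consumer;

      // Hypothetical helper, not an existing Optimize class: illustrates splitting one
      // import batch into several smaller jobs while preserving the single
      // "import completed" callback the rest of the pipeline relies on.
      public final class ImportBatchSplitter {

        private ImportBatchSplitter() {}

        // Partition the documents of one batch into chunks of at most maxJobSize.
        // The document count can greatly exceed the fetched entity count (e.g. due to
        // object variable flattening), so we split on documents, not on entities.
        public static <T> List<List<T>> split(List<T> documents, int maxJobSize) {
          if (maxJobSize <= 0) {
            throw new IllegalArgumentException("maxJobSize must be positive");
          }
          List<List<T>> chunks = new ArrayList<>();
          for (int from = 0; from < documents.size(); from += maxJobSize) {
            int to = Math.min(from + maxJobSize, documents.size());
            chunks.add(new ArrayList<>(documents.subList(from, to)));
          }
          return chunks;
        }

        // Wrap each chunk into its own runnable job. The importCompleted callback is
        // invoked exactly once, after the last sub-job has run, so downstream logic
        // that depends on the callback (see hint above) keeps working.
        public static <T> List<Runnable> toJobs(
            List<List<T>> chunks, Consumer<List<T>> bulkWriter, Runnable importCompleted) {
          List<Runnable> jobs = new ArrayList<>();
          if (chunks.isEmpty()) {
            importCompleted.run(); // nothing to write, complete immediately
            return jobs;
          }
          AtomicInteger remaining = new AtomicInteger(chunks.size());
          for (List<T> chunk : chunks) {
            jobs.add(() -> {
              bulkWriter.accept(chunk); // placeholder for the actual ES bulk request
              if (remaining.decrementAndGet() == 0) {
                importCompleted.run();
              }
            });
          }
          return jobs;
        }
      }

      The maximum sub-job size would come from configuration (second hint above), and the counter-based completion keeps the existing callback semantics intact even though one fetched batch now maps to several write jobs.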

      Justification:

      Reduce the chance of missing or skipping data during import; at worst, it mitigates the chance of the import becoming blocked entirely.

            Assignee:
            Unassigned
            Reporter:
            Joshua Windels
            Votes:
            0
            Watchers:
            2

              Created:
              Updated: