-
Task
-
Resolution: Fixed
-
L3 - Default
See https://camunda.slack.com/archives/CNSMT82SJ/p1656407103942019
The migration stage is flaky and sometimes fails with:
Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://localhost:9200], URI [/_snapshot/my_backup/snapshot_1/_restore?master_timeout=30s&wait_for_completion=true], status line [HTTP/1.1 500 Internal Server Error] [2022-06-28T08:34:54.403Z] {"error":{"root_cause":[{"type":"snapshot_restore_exception","reason":"[my_backup:snapshot_1/Q9cTtPZsTfCbs3mhjfekRg] cannot restore index [optimize-process-instance-reviewinvoice_v8] because an open index with same name already exists in the cluster. Either close or delete the existing index or restore the index under a different name by providing a rename pattern and replacement name"}],"type":"snapshot_restore_exception","reason":"[my_backup:snapshot_1/Q9cTtPZsTfCbs3mhjfekRg] cannot restore index [optimize-process-instance-reviewinvoice_v8] because an open index with same name already exists in the cluster. Either close or delete the existing index or restore the index under a different name by providing a rename pattern and replacement name"},"status":500}
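This may originate from a race condition between the shutdown of the old Optimize process and the deletion of its indices (see https://github.com/camunda/camunda-optimize/blob/master/qa/upgrade-tests/src/test/groovy/com/camunda/optimize/test/upgrade/UpgradeEsSchemaIT.groovy#L43-L44). The stop call does not wait for the process to terminate, so the process may still be running, and may recreate an index, after the indices have been deleted and before the snapshot is restored.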
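
One possible way to close such a race is to block until the process has actually exited before touching the indices. The following is only a minimal sketch using the plain java.lang.Process API. The class name ProcessStopper, the method stopAndWait, and the timeout handling are hypothetical and are not taken from the upgrade test code, which may manage its process differently.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the Optimize code base.
public final class ProcessStopper {

    private ProcessStopper() {
    }

    /**
     * Requests termination and blocks until the process has exited.
     * Returns true if the process is gone when the method returns.
     */
    public static boolean stopAndWait(Process process, long timeoutSeconds)
            throws InterruptedException {
        // Ask the process to terminate; this call returns immediately.
        process.destroy();
        // Wait for the process to actually exit.
        if (process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            return true;
        }
        // Escalate if the process ignored the termination request.
        process.destroyForcibly();
        return process.waitFor(timeoutSeconds, TimeUnit.SECONDS);
    }
}
```

With a helper like this, the index deletion and snapshot restore would only run once stopAndWait has returned true, so the old process can no longer write to Elasticsearch in between.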