- Bug Report
- Resolution: Done
- L3 - Default
- None
- Not defined
Situation
Optimize running in Camunda Cloud SaaS needs to know the other members of the organization. For the time being, this is implemented by periodically fetching the member details from the accounts backend using an M2M client.
Complication
Starting with Camunda Cloud 1.2, Optimize is part of the Stable channel, so every newly created cluster also gets Optimize. According to accounts-stats, at least 433 clusters are already running with Optimize.
Unfortunately, every Optimize cluster seems to use exactly the same cron config for periodically fetching the member details, so the requests presumably all arrive at the accounts backend at the same time.
Because the queried accounts endpoint has to request the information from Auth0, Auth0 hands back a 429 (Too Many Requests) to the accounts backend, which results in a 500 towards the caller (Optimize).
I don't know whether you have already observed this error or what the user impact is.
Resolution
Before going into possible solutions, let me first state that the accounts backend will be changed to pass a 429 from Auth0 through to the initial caller. This might take some days to reach production, but we will do it.
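Once the 429 is passed through, the caller can tell "rate limited, try again later" apart from a real server error. A minimal sketch of how a caller might pick its wait time is shown below. The function name and the `base`/`cap` values are hypothetical, and whether the accounts backend will forward a `Retry-After` header is an assumption, not something confirmed in this ticket.

```python
import random
from typing import Optional


def retry_delay_seconds(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,    # assumed base delay in seconds
    cap: float = 300.0,   # assumed upper bound in seconds
    rng: Optional[random.Random] = None,
) -> float:
    """Return how long to wait before retrying after a 429.

    If a Retry-After value in seconds is available, honor it (clamped to
    [0, cap]). Otherwise fall back to exponential backoff with full jitter,
    so that many clients do not retry in lockstep.
    """
    if retry_after is not None:
        try:
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            # The HTTP-date form of Retry-After is not handled in this sketch.
            pass
    rng = rng or random.Random()
    upper = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, upper)
```

The jitter matters here for the same reason as in the scheduling problem: a fixed backoff would only move the synchronized burst to a later point in time.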
The right solution
As already stated in Q3, the current approach of using an M2M client to fetch the memberships on a periodic schedule is not the right way to do it.
The right way would be to use the JWT provided by the Optimize user towards the accounts backend. This would flatten the requests to an on-demand pattern and also harden the interface (the current M2M client has the "read members from all organizations" permission).
sebastian.bathke already stated that this will be tackled in Q4.
Stop the bleeding
One way to distribute the periodic fetching more evenly would be to stop using a hardcoded cron config and instead use something like the start date of the pod to calculate the interval. Or just pick that start date randomly...
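A minimal sketch of both variants follows. It is written in Python only for illustration and is not Optimize's actual scheduler code. The 10-minute interval and all function names are assumptions, since the real cron config is not stated in this ticket.

```python
import random
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed fetch interval; the real value is not given in this report.
FETCH_INTERVAL = timedelta(minutes=10)


def next_fetch(
    now: datetime,
    pod_start: datetime,
    interval: timedelta = FETCH_INTERVAL,
) -> datetime:
    """Next fetch time strictly after `now`, on a grid anchored at pod start.

    Pods that started at different times get different phases, so their
    fetches no longer line up on the same wall-clock minute.
    """
    elapsed = (now - pod_start).total_seconds()
    periods = int(elapsed // interval.total_seconds()) + 1
    return pod_start + periods * interval


def random_initial_delay(
    interval: timedelta = FETCH_INTERVAL,
    rng: Optional[random.Random] = None,
) -> timedelta:
    """Variant: pick a random phase within one interval at startup."""
    rng = rng or random.Random()
    return timedelta(seconds=rng.uniform(0.0, interval.total_seconds()))


if __name__ == "__main__":
    started = datetime(2021, 10, 1, 12, 3, 17, tzinfo=timezone.utc)
    current = datetime(2021, 10, 1, 12, 30, 0, tzinfo=timezone.utc)
    print(next_fetch(current, started))
```

The pod-start variant needs no random source, but pods that are restarted together (for example during a rollout) would still share a phase. The random variant avoids that at the cost of a non-reproducible schedule.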