On February 28th, 2022, between 12:30AM and 1:30AM UTC, and on March 7th, 2022, between 12:26AM and 12:42AM UTC, Atlassian customers using Jira native apps and apps developed by partners were unable to refresh access tokens. The event was triggered by a faulty deployment of an OAuth2 service in the Atlassian Identity Platform. The incident was detected within 1 minute by automated monitoring and mitigated by a rollback of the offending service, which put Atlassian systems into a known good state. The total time to resolution was about 42 minutes.
The two incidents are tracked here and this incident review applies to both:
Client requests against the /oauth/token endpoint that expected a GZIP’d response body received an unexpected response, so their refresh grant flows failed.
A build of an OAuth2 service with broken content negotiation was autodeployed to production. This build mishandled compression and decompression during content negotiation in refresh_token grant flows. Any request to the /oauth/token endpoint that expected a GZIP’d response body while refreshing an access token would have failed, because the server did not honor the request to send a compressed response.
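To illustrate the kind of mismatch described above, here is a minimal client-side sketch in Python. It is not Atlassian's code or any partner's actual client. The function name `decode_token_response` and the sample payload are hypothetical, and the sketch assumes the failure presents as a body whose actual encoding disagrees with what the client expected. Rather than trusting the `Content-Encoding` header alone, it inspects the first two bytes of the body for the gzip magic number.

```python
import gzip
import json

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def decode_token_response(headers: dict, body: bytes) -> dict:
    """Decode a token response body whether or not it is really gzipped.

    Hypothetical helper: it tolerates a server that declares gzip but sends
    plain JSON, and a server that sends gzip without declaring it.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    declared = normalized.get("content-encoding", "identity").strip().lower()

    if body[:2] == GZIP_MAGIC:
        # The bytes are a gzip stream, regardless of what the header claims.
        body = gzip.decompress(body)
    elif declared == "gzip":
        # Header claims gzip but the bytes are not gzip: fall through and
        # treat the body as uncompressed instead of failing outright.
        pass

    return json.loads(body.decode("utf-8"))


if __name__ == "__main__":
    payload = {"access_token": "example", "token_type": "Bearer"}
    raw = json.dumps(payload).encode("utf-8")
    # Declared gzip, but the body is plain JSON: still decodes.
    print(decode_token_response({"Content-Encoding": "gzip"}, raw))
```

Sniffing the magic bytes is safe here only because a JSON token response cannot begin with those two bytes. A client that takes this approach would probably also want to log the mismatch, since it indicates a server-side fault.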
We are prioritizing the following improvement actions to avoid repeating this type of incident:
Furthermore, we deploy our changes progressively to avoid broad impact, but in this case our continuous tests did not work as expected. To minimize the impact of breaking changes to our environments, we will implement additional preventative measures such as:
We apologize to customers and partners whose services were impacted during this incident; we are taking immediate steps to improve the platform’s performance and availability.