Post-mortem: incident of 25 January
On Wednesday, 25 January, a major outage occurred on the HelloID platform. The outage caused long loading times, repeated redirects, and pages that failed to load.
The outage was caused by an issue at one of our cloud partners, Microsoft. A detailed description of the issue can be found on Microsoft's Azure status history page (Azure status history | Microsoft Azure).
How did we respond?
Tools4ever engineers started investigating the issue at 06:15 UTC. The investigation was hindered by loading problems in the administrative interface of the Microsoft cloud environment. Several communication issues between HelloID services were identified, and mitigation actions were implemented. At 08:00 UTC, Microsoft confirmed that it was experiencing a major incident that was causing communication issues between Microsoft cloud services.
At 10:00 UTC, most services were restored. However, several databases remained unavailable until 13:45 UTC because of the connection issue in the Microsoft datacenters.
At 13:45 UTC, all issues were resolved.