Performance issues
Incident Report for HelloID
Postmortem

Status
Resolved, action items in progress.

Impact

End users of customers using HelloID in West Europe experienced intermittent performance issues.

Summary

Users of HelloID experienced intermittent performance issues for two hours. The cause was a misconfigured setting for polling responses from Active Directory authentication. The misconfiguration led to longer wait times, which cascaded into longer server response times for all European customers.
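As a rough illustration of how a longer wait time can cascade, Little's law states that the average number of requests in flight equals the arrival rate multiplied by the time each request is held. The sketch below is not HelloID code; the function names, request rates, wait times, and worker counts are all hypothetical and chosen only to show the effect.

```python
def in_flight_requests(arrival_rate_per_s: float, wait_s: float) -> float:
    """Little's law: average requests in flight = arrival rate x time held."""
    return arrival_rate_per_s * wait_s


def is_saturated(arrival_rate_per_s: float, wait_s: float, worker_slots: int) -> bool:
    """True when the requests being held exceed the available worker slots.

    Once this happens, unrelated requests start to queue behind the waiting
    ones, so every caller sees slower responses.
    """
    return in_flight_requests(arrival_rate_per_s, wait_s) > worker_slots


# Hypothetical numbers: 50 requests/s against 200 worker slots.
normal = is_saturated(arrival_rate_per_s=50, wait_s=2.0, worker_slots=200)
misconfigured = is_saturated(arrival_rate_per_s=50, wait_s=10.0, worker_slots=200)
```

With these assumed numbers, the same traffic fits comfortably at a 2 second wait but exceeds capacity at a 10 second wait, which is why both adding capacity and shortening the wait time reduce the pressure.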

A separate issue prevented the plugin and the mobile app from loading any applications. It was caused by a change to how the end user's language is determined.

Resolution

To mitigate the longer server response times, we scaled up our services. We also reconfigured the setting that controls the wait time for communication between HelloID and the HelloID Agent in Europe. The server response time then gradually returned to its normal average.

A hotfix for the plugin and mobile app will be released. The hotfix reverts the changes to language determination, which will make the plugin and mobile app usable again.

Detection

Our support department detected high response times through our monitoring software and via notifications from our customers.

Action items

Done

Mitigation – Scaled up our services in Europe

Mitigation – Scaled up our services in America

Resolution – Reconfigured the communication setting between HelloID and the Agent in Europe and America

Resolution – Reverted the change of language determination

In progress

Prevention – Add new client-side monitoring so that we can measure all response times and investigate active sessions.
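A minimal sketch of what client-side response-time measurement could look like is shown below. It is an assumption for illustration only and does not describe the monitoring HelloID will actually deploy; the class name `ResponseTimeRecorder`, its methods, and the threshold value are hypothetical.

```python
import time


class ResponseTimeRecorder:
    """Collects client-side response times and reports how many were slow."""

    def __init__(self, slow_threshold_s: float):
        self.slow_threshold_s = slow_threshold_s
        self.samples = []

    def record(self, elapsed_s: float) -> None:
        """Store one measured response time, in seconds."""
        self.samples.append(elapsed_s)

    def measure(self, request_fn):
        """Run request_fn and record how long it took, even if it raises."""
        start = time.perf_counter()
        try:
            return request_fn()
        finally:
            self.record(time.perf_counter() - start)

    def slow_fraction(self) -> float:
        """Fraction of recorded responses slower than the threshold."""
        if not self.samples:
            return 0.0
        slow = sum(1 for s in self.samples if s > self.slow_threshold_s)
        return slow / len(self.samples)


# Hypothetical usage: wrap each client request and alert on the slow fraction.
recorder = ResponseTimeRecorder(slow_threshold_s=1.0)
result = recorder.measure(lambda: "ok")  # stand-in for a real request
```

Measuring on the client captures the full time a user waits, including network and queueing delays that server-side metrics alone can miss.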

Posted Jan 06, 2020 - 11:20 UTC

Resolved
This incident has been resolved.
Posted Jan 06, 2020 - 11:19 UTC
Update
We are monitoring the affected components. A hotfix will be released within thirty minutes to resolve the issues that plugins and mobile apps are experiencing (not loading applications).
Posted Jan 06, 2020 - 11:00 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Jan 06, 2020 - 09:49 UTC
Investigating
We are currently investigating issues related to degraded performance and timeouts.
Posted Jan 06, 2020 - 08:42 UTC