App Timeouts and System Performance
Incident Report for Fleetio
Postmortem

On the morning of January 14th, Fleetio began to suffer from performance degradation that affected all users of our web app, API and mobile apps. We realize that our customers rely heavily on Fleetio to complete their work, and any unexpected downtime or performance degradation leads to frustration. Please accept our apologies. We strive to make Fleetio a friendly and pain-free experience for you, but we missed the mark.

The performance problems peaked at 8:30 AM CST and lasted several hours. During this time, 4% of all page requests failed with a “500” error. Average response times increased eightfold, making Fleetio feel sluggish for everyone.

After careful review, we found that our database was struggling to process a subset of queries written for the new offline sync feature introduced in Fleetio Go version 2.4.0. For some users, the database could not retrieve the sync data quickly, and it would hold on to these queries for several minutes while trying to complete them. Eventually, most of the database’s resources were allocated to these requests, leaving few resources for anything else. This compromised performance system-wide.
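As a rough illustration of this failure mode (not Fleetio's actual code, schema, or traffic figures), the Python sketch below models a database with a fixed number of workers. The `Query` type, the `simulate` function, the pool size, and the workload numbers are all hypothetical. It shows how a handful of multi-minute queries can occupy every worker so that fast page queries are turned away, and how a per-query timeout, one common mitigation and not necessarily the fix we applied, keeps workers available.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Query:
    arrival: int    # second at which the query arrives
    duration: int   # seconds the query would run if never cancelled
    kind: str       # "sync" (slow) or "page" (fast); labels are illustrative


def simulate(queries: List[Query], pool_size: int,
             timeout: Optional[int] = None) -> Dict[str, int]:
    """Count rejected queries per kind for a fixed pool of database workers.

    A query is rejected if every worker is busy when it arrives.
    If `timeout` is set, any query is cancelled after that many seconds.
    """
    busy_until: List[int] = []          # finish time of each running query
    rejected = {"sync": 0, "page": 0}
    for q in sorted(queries, key=lambda q: q.arrival):
        # Free the workers whose queries have finished by now.
        busy_until = [t for t in busy_until if t > q.arrival]
        if len(busy_until) >= pool_size:
            rejected[q.kind] += 1
            continue
        run_for = q.duration if timeout is None else min(q.duration, timeout)
        busy_until.append(q.arrival + run_for)
    return rejected


def example_workload() -> List[Query]:
    # Hypothetical numbers: one slow sync query per second for 10 seconds,
    # each needing 300 seconds, plus one 1-second page query every second
    # for a minute.
    slow = [Query(arrival=t, duration=300, kind="sync") for t in range(10)]
    fast = [Query(arrival=t, duration=1, kind="page") for t in range(60)]
    return slow + fast


if __name__ == "__main__":
    print(simulate(example_workload(), pool_size=5))
    print(simulate(example_workload(), pool_size=5, timeout=2))
```

With these made-up numbers and no timeout, the five workers fill with slow sync queries within the first five seconds, and 56 of the 60 page queries are rejected. With a 2-second timeout, no query of either kind is rejected, although the cancelled sync queries would still need a faster query or a retry strategy to succeed.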

The critical problems have been resolved, and we’re now focused on addressing the underlying cause. This work includes database optimizations, a reimagining of how Fleetio Go’s offline sync functions, and better monitoring to ensure such long periods of degradation don’t occur again.

The biggest lesson learned was that our new offline sync feature in Fleetio Go is not up to our standards. We focused on too narrow a use case and failed to consider our user base as a whole. This not only led to system performance issues but also to frustratingly long sync times for a subset of our users, most of whom were not interested in offline mode.

Since our initial implementation of offline mode doesn’t meet the standards we set out to achieve at Fleetio, we’ve made the decision to temporarily remove offline functionality from Fleetio Go. Starting in version 2.5.0, Go users will no longer be able to use the offline mode feature. For users who were ready to use offline mode, we’re hard at work reimagining how these features work, and you can be confident that we’ll have a much more robust implementation in the near future.

Download the latest version of Fleetio Go in the App Store or on Google Play

Posted Jan 29, 2019 - 11:43 CST

Resolved
After monitoring the fix we implemented earlier, we are marking this issue as resolved.

We again apologize for the inconvenience this issue caused and thank you for your patience while we worked on a resolution.
Posted Jan 11, 2019 - 20:35 CST
Monitoring
We’ve identified a slow-running query that has been affecting performance in Fleetio. Our Engineering Team has implemented a fix and is monitoring system performance. Thank you for your patience while we investigated this issue.
Posted Jan 11, 2019 - 14:46 CST
Investigating
Our Engineering Team is currently investigating decreased system performance in Fleetio. We apologize for any errors or system slowness you may be experiencing. We will update this status page as we have more information.
Posted Jan 11, 2019 - 09:25 CST
This incident affected: Fleetio Web Application & API.