One big reason I started implementing DevOps culture and practices at my company in 2018 was that we didn't want to fall behind our competitors in software delivery.
That same year, the book Accelerate: The Science of Lean Software and DevOps was released, claiming that:
“Year over year, the best [software teams] keep getting better, and those who fail to improve fall further and further behind.”
Backed by studies spanning six years and over 2,000 companies from DORA's State of DevOps research program, the Accelerate book makes the case that organizations with the right mindset can achieve better speed and stability. It explains how high performers understand that they don't have to trade speed for stability — or vice versa — because they will get both by building in quality.
The book urges engineering organizations to first understand their current progress by establishing metrics. Determining which metrics to use is an opinionated decision, of course, but this is where the book shines: its authors homed in on four key metrics that the research shows affect software delivery performance.
Known as DORA metrics, or Accelerate metrics thanks to the book that popularized them, they are deployment frequency, lead time for changes, change failure rate, and mean time to recovery (MTTR).
If you're not familiar, check out our explainer on what DORA metrics are and how to improve on them.
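To make the four metrics concrete, here is a minimal Python sketch of how they could be computed from a list of deployment records. The `Deployment` record shape and the `dora_metrics` function are assumptions made for illustration, not the formulas used by DORA or by any tool in this comparison. In particular, time to restore is measured from the moment of deployment, which is a simplification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    first_commit_at: datetime               # earliest commit included in the deploy
    deployed_at: datetime                   # when the change reached production
    failed: bool = False                    # did it degrade service in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deploys: list[Deployment], window_days: int) -> dict:
    """Summarize the four key metrics over a reporting window."""
    if not deploys:
        raise ValueError("need at least one deployment")
    lead_times = [d.deployed_at - d.first_commit_at for d in deploys]
    failures = [d for d in deploys if d.failed]
    # Simplification: restore time is counted from the deploy itself.
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]
    mttr = sum(restore_times, timedelta()) / len(restore_times) if restore_times else None
    return {
        "deploys_per_day": len(deploys) / window_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deploys),
        "mean_time_to_restore": mttr,
    }
```

Two of the values describe speed (deployment frequency and lead time) and two describe stability (failure rate and time to restore), which is why a tool that reports only the first pair tells half the story.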
Even right now, you can use DORA's quick online test to see how you perform on the four metrics compared to other teams in your industry. More recently, LaunchDarkly partnered with Sleuth to publish an industry survey report on the adoption of frequent deployments and continuous delivery, from which you can glimpse how often other teams deploy to production.
However, to drive your improvement journey, you'll ultimately need to automate the data tracking of such metrics. That’s why we’re on this quest to find the best engineering metrics trackers in the market. In this article, we will compare five engineering productivity tools that claim to deliver the four key metrics from DORA and Accelerate: Sleuth, LinearB, Haystack, Faros, and Velocity.
We'll organize the comparison criteria into three sections.
The first section considers which tools track which Accelerate / DORA metrics, and how those metrics are calculated and displayed. We want to know if all four metrics are present and accurately measured. We also want the tools to show the metrics clearly and comprehensively. Accurate measurement leads to better insights, conclusions, and results. Seeing the big picture and trends is also essential.
The second section tackles developer friendliness. We’re not talking about ease of use, but rather about whether the tools track individual metrics or dubious proxies for productivity that are "unfriendly" to developers because they could increase stress and be counterproductive.
Additionally, a good tool for developers does not solely display metrics. Effective tools should also provide actionable feedback to speed up development and reduce deployment pain.
To speed up development, we expect features that can help developers spend more time coding and less time in meetings, encourage devs to work in small batches, improve or simplify developer communications, make the flow of work more visible, and bring insights into bottlenecks and where they occur.
To speed up deployment and make deploys less painful, we expect features that include tracking deployments across environments, automation of deploy workflow and approval, correlation of deploys and their impact on application or service health, and anomaly detection.
In the last section of our comparison, we consider integration and customization. The more integrations a tool supports, the lower the implementation effort and the better the tracking. It’s also vital to assess how flexible the tool is in meeting your needs, so you can avoid wasting time later.
Now that we have a clear picture of our approach and strategy, let's dive into our comparison of DORA metrics trackers.
Only three out of five products calculate all four Accelerate metrics: Sleuth, Faros, and Haystack.
LinearB and Velocity mainly focus on throughput, represented by cycle time and deployment frequency. They omit the failure rate metric. While LinearB does display MTTR, it doesn't seem as mature as the other two metrics.
Haystack lacks accuracy in calculating lead time because it infers deployment based solely on Git tags, merges, or branches. Haystack also lacks accuracy when accounting for failure, because it does not interface with monitoring systems; instead, it infers failures based on fixes and rollbacks.
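To see why inferring deployments from Git alone can skew lead time, consider this small Python sketch. It is a hypothetical illustration, not Haystack's actual algorithm: it assumes a tool that treats the merge as the deployment, and compares that with the lead time you get when the real deploy timestamp is known. The function name and arguments are made up for the example.

```python
from datetime import datetime, timedelta


def lead_time_gap(first_commit_at: datetime, merged_at: datetime, deployed_at: datetime):
    """Compare lead time inferred from the merge with lead time observed at deploy."""
    inferred = merged_at - first_commit_at    # what a Git-only tool would report
    observed = deployed_at - first_commit_at  # what actually happened in production
    return inferred, observed, observed - inferred


# A change committed on Monday, merged on Tuesday, but deployed on Thursday.
inferred, observed, gap = lead_time_gap(
    datetime(2024, 1, 1, 10, 0),
    datetime(2024, 1, 2, 10, 0),
    datetime(2024, 1, 4, 10, 0),
)
```

In this made-up case, the inferred lead time misses the two days the change spent waiting between merge and production.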
Conversely, Sleuth and Faros offer complete and accurate metrics. Both interface with your CI/CD to incorporate deployment data as well as health monitoring data for quantifying failures.
In terms of user experience, the UI for Faros is bulky and generic. It offers a lot of visualization, but its UI does not specifically focus on Accelerate metrics. Velocity primarily displays its own set of metrics.
In contrast, LinearB, Sleuth, and Haystack carefully design their UI around Accelerate metrics. Haystack has a polished UI that focuses on the essentials. It is nicely color-coded, focusing your attention on the important elements.
Sleuth has two main dashboards, Project Metrics and Project Status, which are specifically designed to present DORA metrics. Both dashboards provide rich insight into your processes, recent deployments, and performance compared to industry benchmarks.
When it comes to developer friendliness, it is crucial to avoid ranking or measuring individual performance. This is in line with fostering a DevOps culture.
Unfortunately, LinearB, Faros, and Velocity provide individual per-developer metrics such as lines of code, commit frequency, and the number of pull requests. While LinearB and Faros do not necessarily encourage using these metrics, the metrics exist nonetheless and can be used. Velocity provides many options to track individual performance.
According to Accelerate, these particular individual metrics are questionable proxies for productivity, and can negatively impact culture and increase employee burnout.
Sleuth and Haystack chose not to track individual performance, but rather focus on team and organization performance. They carefully select the metrics made available in their products.
In terms of guidance, LinearB, Sleuth, and Haystack provide sufficient tools to help teams improve at delivering software. All of them provide Slack integrations with alerts and reminders.
LinearB focuses solely on cycle time and deployment frequency. Although LinearB has added tools to speed up delivery, it does not provide insights for reducing deployment pain.
Sleuth, on the other hand, provides both at the same time — an excellent environment to improve development speed and a mechanism to make deployments easier and less painful. Notably, Sleuth helps developers coordinate and track the deployment process. Rather than watching developer activity from a distance, Sleuth integrates with the development team workflow and approval process.
In comparison, Faros and Velocity have significantly less impact on developers. Faros encourages you to create your alerts with a no-code platform. Velocity does have a stand-up recap to help Scrum teams, but we can’t imagine developers engaging daily with Velocity. It’s more likely that a manager will glance at the metrics periodically to get a general sense of project progress and struggles.
All of the tools we considered put in a lot of effort to automate the data capture process. However, we find stark differences when we consider each tool’s ability to integrate with key systems in the modern software supply chain: issue tracking, the CI/CD toolchain, and monitoring. Stronger integration with these systems builds a richer overall picture of your metrics and performance.
Every one of the tools we evaluated — except for Haystack — has an issue tracker integration. Issue trackers are the backbone of collaboration between business and development. They enable traceability from an idea (the moment the ticket is created) all the way to the feature’s delivery to production.
Neither LinearB nor Haystack collects data from the CI/CD toolchain. Rather, they only infer information from Git or issue tracking systems, and this affects their accuracy. Faros, Sleuth, and Velocity integrate seamlessly with any CI/CD system. Each of these three products has an API that you call to signal when events, such as deployments or rollbacks, occur.
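As a rough idea of what signaling an event from a pipeline could look like, here is a minimal Python sketch that builds an HTTP request for a deployment event. The endpoint URL, the payload field names, and the bearer-token header are all hypothetical. None of them is taken from the Sleuth, Faros, or Velocity APIs, so check each vendor's documentation for the real contract.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Hypothetical endpoint; replace with the URL from your tracker's documentation.
TRACKER_URL = "https://tracker.example.com/api/deployments"


def build_deploy_event(
    service: str,
    commit_sha: str,
    environment: str,
    api_key: str,
    event_type: str = "deployment",  # could also be "rollback"
) -> urllib.request.Request:
    """Build (but do not send) a POST request describing a deploy event."""
    payload = {
        "type": event_type,
        "service": service,
        "sha": commit_sha,
        "environment": environment,
        "occurred_at": datetime.now(timezone.utc).isoformat(),
    }
    return urllib.request.Request(
        TRACKER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# In a pipeline step, the request would be sent after a successful deploy:
# urllib.request.urlopen(build_deploy_event("checkout", sha, "production", key), timeout=10)
```

Because the pipeline reports the event itself, the tracker records the actual deploy time and commit rather than guessing them from Git activity.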
Only Sleuth and Faros provide integration with monitoring systems. Monitoring metrics are as important as delivery metrics. Delivery and monitoring metrics offer an actual feedback loop about the system's health and potential causes of failure. It’s important to remember that monitoring metrics are the source of truth when it comes to system health. Therefore, capturing monitoring metrics will impact how well you track MTTR and failure rate.
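The following Python sketch shows one way monitoring data could feed failure rate and MTTR. It assumes a simple heuristic, namely that each incident is blamed on the most recent deployment before it started. That heuristic, the function name, and the input shapes are assumptions for illustration and do not describe how Sleuth or Faros attribute failures.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def failure_metrics(deploy_times, incidents):
    """Derive change failure rate and MTTR from deploys and monitoring incidents.

    deploy_times: sorted list of deploy datetimes
    incidents: list of (started_at, resolved_at) pairs from a monitoring system
    """
    if not deploy_times:
        raise ValueError("need at least one deployment")
    failed_deploys = set()
    durations = []
    for started_at, resolved_at in incidents:
        # Index of the latest deploy at or before the incident start.
        idx = bisect_right(deploy_times, started_at) - 1
        if idx < 0:
            continue  # incident predates every known deploy
        failed_deploys.add(idx)
        durations.append(resolved_at - started_at)
    failure_rate = len(failed_deploys) / len(deploy_times)
    mttr = sum(durations, timedelta()) / len(durations) if durations else None
    return failure_rate, mttr
```

Note that two incidents following the same deploy count as one failed deploy for the failure rate, while both still contribute to the mean time to restore.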
In terms of customization, Faros and Velocity are the leaders. Faros is the most customizable tool, designed to ingest data and create all kinds of metrics. Velocity offers a customizable dashboard.
In contrast, Sleuth and Haystack integrate very well within your ecosystem, but they’re not as customizable because they focus on a strong user experience around DORA's four key metrics. Because they limit the scope of the metrics they gather, having a highly customizable dashboard is not required.
On the journey toward becoming an elite performer in the software industry, both fast delivery and high stability are immensely important. Because the research is so clear, sacrificing even one of the four metrics from DORA/Accelerate is unacceptable.
In seeking a tool that tracks all four DORA metrics, we would necessarily eliminate LinearB and Velocity. Both focus only on speed. While they deliver good features to support their chosen metrics, there is a grave risk of sacrificing quality for speed.
This leaves us with Sleuth, Faros, and Haystack. Sleuth stands out for combining high accuracy on the metrics with an excellent UI that focuses squarely on the Accelerate metrics. Haystack lacks accuracy, while the UI from Faros is too generic and cluttered with too many different metrics.
Embracing DevOps processes entails cultivating DevOps culture. Because the metrics from Velocity, LinearB, and Faros can be used for (or misconstrued as being used for) spying on and micromanaging developers, we have further reasons to consider other tools besides these. All three of these tools may foster a counterproductive mindset that works against Agile or Accelerate.
Sleuth and Haystack have adopted the opposite approach, gathering and displaying only those proven metrics as supported by DORA's research.
Only Sleuth provides features to support actionable feedback that can both speed up development and reduce deployment pain. This makes Sleuth more than just a metrics dashboard; it’s also a companion for developers.
In the end, we’re left with Sleuth and Haystack. Overall, however, Sleuth seems to win because of the added accuracy and actionable feedback it gives to developers.
We started our journey by wanting to find a tool that moves development teams toward greater engineering productivity. We aimed for a product that provides DORA / Accelerate metrics, because these are the prevailing metrics in the software industry backed by reliable research. While comparing several of the leading tools that provide all or some of these metrics, we focused on the quality of metrics, developer friendliness — increasing development speed and reducing deployment pain — and lastly, the ease of integrations and customization. Our comparison research led us to Sleuth as the overall leader of the pack among DORA metrics trackers.