If you're a DevOps team looking for ways to improve your performance, you've probably heard of the Accelerate and DORA metrics. Tracking your performance with these metrics is a challenge, requiring good tooling. You might be tempted to build your own tracking solution, but you don’t need to! Due to the growing popularity of DORA metrics, several tracker tools are available on the market.
But how do you choose the best DORA metrics tracker for you? This handy comparison guide will help!
In 2021, I reviewed five of the most popular tools out there at the time (Faros, Haystack, LinearB, Sleuth and Velocity). In less than a year, the number of competing products has exploded. This year, we’ll review four more trackers, for a total of nine candidates:
- Faros
- Haystack
- Jellyfish
- LinearB
- Propelo
- Sleuth
- Swarmia
- Uplevel
- Velocity
For each of these tools, we'll discuss their different features and help you decide which one works for your team. Also, we’ll focus particularly on customization, so you can find the tracker that best fits your unique needs.
Before jumping into our comparisons, let's review some basics, clarifying what the four DORA metrics are and why we care.
What are DORA metrics?
A fundamental claim of the DevOps approach is that we can achieve the fast delivery of reliable software. This may seem counter-intuitive; if you’re constantly updating your software to make it more reliable, then how can you deliver it quickly? However, when implemented well, continuous integration and delivery practices can help you achieve this goal.
To determine how well a team is implementing DevOps practices, we look to DORA metrics:
- Change Lead Time tracks the time from when a developer starts writing code for a feature or a change to when that change is released to end users.
- Deployment Frequency tracks how often code is deployed to production or otherwise released to end users.
- Mean Time to Recovery is the time it takes to restore service after an incident that impacts users, averaged across all incidents in an organization.
- Change Failure Rate is the ratio of the number of deployments that caused a failure to the total number of deployments.
The first two metrics—Change Lead Time and Deployment Frequency—are temporal metrics, and their objective is to measure speed or throughput. The last two—Mean Time to Recovery and Change Failure Rate—are quality metrics, and their objective is to measure the system's reliability.
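To make these definitions concrete, here is a minimal Python sketch that computes all four metrics from toy deployment and incident records. The field names and numbers are invented for illustration; a real tracker derives this data from your CI/CD and incident-management tooling.

```python
from datetime import datetime, timedelta

# Toy records with invented fields; real trackers pull these from
# your pipeline and incident-management systems.
deployments = [
    {"first_commit": datetime(2023, 2, 1, 9), "deployed": datetime(2023, 2, 1, 17), "caused_failure": False},
    {"first_commit": datetime(2023, 2, 2, 10), "deployed": datetime(2023, 2, 3, 12), "caused_failure": True},
    {"first_commit": datetime(2023, 2, 3, 9), "deployed": datetime(2023, 2, 3, 15), "caused_failure": False},
]
incidents = [
    {"started": datetime(2023, 2, 3, 12), "resolved": datetime(2023, 2, 3, 14)},
]
window_days = 30  # length of the reporting window

# Change Lead Time: average time from first commit to release.
lead_times = [d["deployed"] - d["first_commit"] for d in deployments]
change_lead_time = sum(lead_times, timedelta()) / len(lead_times)

# Deployment Frequency: deployments per day over the window.
deployment_frequency = len(deployments) / window_days

# Mean Time to Recovery: average time from incident start to resolution.
recovery_times = [i["resolved"] - i["started"] for i in incidents]
mean_time_to_recovery = sum(recovery_times, timedelta()) / len(recovery_times)

# Change Failure Rate: failed deployments over total deployments.
change_failure_rate = sum(d["caused_failure"] for d in deployments) / len(deployments)

print(change_lead_time, deployment_frequency, mean_time_to_recovery, change_failure_rate)
```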
Tracking these metrics enables your organization to see where it stands compared to other organizations in the industry. Every year, DORA (DevOps Research and Assessment) surveys and categorizes companies based on their performance in these metrics, and that research is summarized in an annual State of DevOps Report.
Tracker tools are important because they help you automate the process of measuring your performance. They help you stay on track, improve your process, and release fast and reliable software.
Evaluating DORA metrics trackers
In order to evaluate the numerous tracker tools available, we’ve divided our survey into three broad categories:
1. Metrics measurement
This first category validates whether or not a tool tracks each of the four DORA metrics and whether that tracking is performed accurately. Tracking the metrics alone is not enough, as the ideal tracker should show your metrics in an easy-to-read dashboard and provide proper reporting to identify trends and problems in your process. And it can only do that with an accurate model of actual work being done.
Modeling how a team works is not one size fits all, but basic components include the following (see the sketch after this list):
- How you group code, infrastructure, feature flag, and manual changes together
- How change flows through your different environments
- The time spent and the real work done in the different phases of your software development life cycle
- Your overall developer deployment workflow, or how an individual takes a piece of work from concept through to successful launch
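As a rough illustration of what such a model might capture, here is a minimal Python sketch (3.9+). The type names and fields are hypothetical; a real tracker would derive them from your issue tracker, repositories, and pipelines.

```python
from dataclasses import dataclass, field
from enum import Enum

# Hypothetical model of changes flowing through environments.
class ChangeKind(Enum):
    CODE = "code"
    INFRASTRUCTURE = "infrastructure"
    FEATURE_FLAG = "feature_flag"
    MANUAL = "manual"

@dataclass
class Change:
    kind: ChangeKind
    description: str

@dataclass
class Deployment:
    environment: str  # e.g. "staging", "production"
    changes: list[Change] = field(default_factory=list)

# One piece of work promoted from staging to production, grouping the
# code and feature-flag changes that ship together.
pipeline = [
    Deployment("staging", [Change(ChangeKind.CODE, "new checkout flow")]),
    Deployment("production", [
        Change(ChangeKind.CODE, "new checkout flow"),
        Change(ChangeKind.FEATURE_FLAG, "enable checkout for 10% of users"),
    ]),
]
```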
For more on how to accurately model and understand your engineering efficiency and DORA metrics, check out our Accuracy Matters white paper.
2. Developer friendliness
The next broad category of evaluation is developer friendliness. Developers are at the core of your business operations, so a tracker tool must make the DORA metrics useful to them, not just to managers and executives viewing dashboards and reports. The ideal tracker should empower developers with feedback in the development and deployment process, focusing on team performance over individual performance.
3. Integrations and customizations
This last category in our evaluation aims to help you find the tracker that fits your unique needs. The ideal tool integrates across the full DevOps loop (plan ➤ code ➤ build ➤ test ➤ release ➤ deploy ➤ monitor). At the same time, we prefer tools that can be customized to fit how we already work over those that force an organization to change its processes just to calculate metrics accurately. A tracker tool should serve the organization, not the other way around.
Now that we’ve provided a brief overview of our approach, let’s dive into the results.
Metrics measurement
When it comes to this category, we’re looking for those trackers that capture all of the DORA metrics accurately and display those metrics compellingly. We asked the following questions when we reviewed each tracker:
- Does this tool track all four DORA metrics?
- Does this tool track these metrics accurately?
- Does this tool provide dashboards to visualize an organization’s performance?
- Does this tool provide reporting to identify trends and issues?
The table below shows our assessment of how each tracker tool stands up to each of these questions. To read the results, use the following key:
- ✅ = Meets the criteria
- 🟧 = Partially meets the criteria but has some minor issues
- ❌ = Does not meet the criteria
After answering the questions for each of the tracker tools, we assigned a grade based on how well each tool meets the different criteria overall.
Let’s explain how we arrived at the above assessment.
Top scorers: Sleuth, Jellyfish, Propelo, and Faros
The standouts in this category were Sleuth, Jellyfish, and Propelo, which scored A+ grades. They offer excellent features and care deeply about providing accurate DORA metrics.
Faros, coming in slightly behind, oversimplifies Change Failure Rate, calculating it as the ratio of incidents to deployments or bugs to releases. Determining an accurate Change Failure Rate with Faros is difficult because it requires filtering incidents and bugs by cause. In addition, Faros encourages you to build your own analytics, so reporting in Faros is not as straightforward as in the other tools.
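A toy calculation with invented numbers shows why this matters: counting every incident against deployments overstates the failure rate compared with counting only the incidents a change actually caused.

```python
# Invented numbers for a single reporting window.
total_deployments = 50
all_incidents = 10           # includes outages not caused by any change
change_caused_incidents = 4  # after filtering incidents by cause

naive_cfr = all_incidents / total_deployments               # 0.20
filtered_cfr = change_caused_incidents / total_deployments  # 0.08
print(f"naive: {naive_cfr:.0%}, filtered: {filtered_cfr:.0%}")
```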
Less focus on DORA Metrics: LinearB, Swarmia, and Haystack
LinearB, Swarmia, and Haystack all scored lower in this category, missing out on some of the key features that make a good DORA metrics tracker. While they all track and display the four DORA metrics, it’s clear that the DORA metrics are not the main focus of these tools.
Swarmia almost gets the metrics right, except that its Change Lead Time is measured from pull requests. Moreover, Swarmia’s main emphasis is on calculating development Cycle Time rather than on the DORA metrics. The DORA metrics are displayed in a separate dashboard, and there is no consolidated report for them.
Haystack counts deployments based on Git events, which can lead to inaccurate metrics calculations. Although Haystack provides an interesting dashboard, they promote their own metrics instead of those from DORA.
LinearB calculates Mean Time to Recovery based on open and closed production bug tickets, an approach with some limitations when accounting for failures. Like Faros, LinearB also oversimplifies its calculation of Change Failure Rate. Lastly, LinearB provides a good dashboard, but its analysis and reporting of the metrics seem oversimplified compared to the other tracker tools.
Little focus on DORA Metrics: Uplevel and Velocity
Finally, we have Uplevel and Velocity, which seem to be popular engineering metrics tools. However, they do not focus on DORA metrics but rather on Cycle Time, emphasizing the speed of your delivery process over the reliability of your software.
As we evaluated the nine different trackers for this category, we found three tiers of tools. The top tier focuses on the DORA metrics and aims to provide the most accurate representation. Tools in the middle tier incorporate the DORA metrics in their system but don’t emphasize them. Lastly, we have those tracker tools that focus primarily on DevOps process speed rather than on the DORA metrics.
With this broad category covered, let’s proceed to consider how each tool scored regarding developer friendliness.
Developer friendliness
All the tools in our survey focus on providing development feedback. Collecting metrics about your development lifecycle is not enough. We expect those tools to deliver actionable feedback to developers. Here are the questions we asked:
- Does the tool provide actionable feedback for developers regarding the development process?
- Does the tool provide actionable feedback for developers regarding the deployment process?
- Does the tool refrain from providing individual metrics?
- Does the tool refrain from using proxy metrics?
Some engineering metrics tools track individual developer performance. With such metrics, it is tempting for managers to reduce team problems to a single individual. However, we recommend focusing on team performance to bring overall improvement. This approach fosters a blameless culture that nurtures team morale, avoiding an unhealthy focus on individual performance.
In addition, we look for tools that avoid burdening developers with questionable performance measures, which we call “proxy metrics,” such as the number of lines of code changed or pull requests opened. These proxy metrics distort the view of your DevOps process and can lead to decisions that are not in your team’s best interests.
We start with the table showing our evaluation of each tool in this category, and then we’ll follow it up with a detailed explanation.
Development feedback
Every tool we evaluated uses email and/or Slack notifications to keep developers up-to-date. However, the way this is achieved differs from tool to tool.
Sleuth, Haystack, and Velocity provide an interesting Slack standup feature that captures for developers the significant events that happened the previous day. Similarly, Uplevel provides a daily update on sprint health and potential blockers.
Swarmia and LinearB focus on finding bottlenecks and notifying teams when issues and pull requests are idle for too long, helping teams to collaborate. Swarmia has an interesting feature called “working agreements” that lets you select limits and improvement targets to improve collaboration.
Propelo and Faros take the same approach, letting users create their own notification workflows. While this can be seen as a positive in terms of customization, developers don’t get the out-of-the-box experience that the other tools offer.
Deployment feedback
While we see a lot of effort made to help developers ship code and close pull requests, only a few tools provide a concrete solution to help developers feel engaged with the deployment process. Propelo and Faros can deliver this through their configurable, flexible ChatOps systems. However, Sleuth is the only platform that integrates directly with the deployment process, offering both approval workflow integrations and deployment notifications.
Individual and proxy metrics
It’s important to note that not all tools track individual metrics. Some tools, such as Sleuth, Jellyfish, and Swarmia, focus exclusively on tracking team and organization metrics. This matters because developers can measure their performance and improve their processes without the tool intruding on their individual privacy.
Propelo and Haystack are noteworthy in that individual metrics are not available out of the box but can be enabled upon request.
Regarding proxy metrics, our assessment runs parallel to that of individual metrics. If a tool provides individual metrics, then it will undoubtedly provide every imaginable way to measure those individual performances, and that includes proxy metrics.
From our evaluation, it seems that every tracker tool provides actionable feedback to help developers during the development process. We could even say that most tools in our survey focus on development feedback. However, only a few provide concrete solutions for developers to feel engaged with the deployment process.
Additionally, managers should be careful not to reduce team problems to a single individual when looking at performance metrics. Instead, they should focus exclusively on the team's performance as a whole, avoiding tools that do not share that vision.
Let’s finish our comparison by looking at what these tools offer regarding integrations. In other words, how well would they fit in your stack?
Integrations and customization
We evaluated aspects in this final category according to the following criteria:
- Issue tracking integration: Does the tool help to bridge the gap between issues/stories/epics and the actual work behind them?
- Codebase integration: Does the tool integrate with your code repository (whether you prefer a monorepo or microservices, Git flow, or a trunk-based approach)?
- CI/CD integration: Does the tool integrate with your CI/CD pipeline to accurately account for deployments?
- Monitoring integration: Does the tool help identify deployment issues and improve metrics accuracy?
- Automated data collection: How simple is the data ingestion process for this tool?
- Customization: Can we import additional data and build dashboards the way we want?
Let’s look at how our tools performed for each of these criteria.
Issue tracking, codebase, and CI/CD
One of the main reasons teams use issue trackers is to manage the development process; collecting metrics about this process can help you measure performance. It’s worthwhile to note that every metrics tracker tool we evaluated integrates with the most popular issue trackers.
The codebase and CI/CD integrations are the most important criteria to consider, as they can directly impact data gathering for DORA metrics calculations. Ideally, you want a tool that can adapt to your workflow.
For instance, Swarmia only works with GitHub and assumes you are using what is commonly called GitHub Flow. That is an important consideration: trunk-based development is one of the key recommendations of DORA’s State of DevOps Report, yet many tools don’t support it, which should be a red flag. Tools like LinearB, Velocity, and Uplevel all integrate with your codebase but are very rigid, basing metrics on pull requests and offering no support for a trunk-based development approach.
Top integration performers: Propelo, Faros, and Sleuth
Overall, for integrations and customization, three tools stand out: Propelo, Faros, and Sleuth. They all support trunk-based development. They all have CI/CD integration via webhooks or plugins, letting you choose which jobs in your pipeline represent deployments. They also all offer integration with a monitoring and alerting system. Finally, they can all ingest data from the monitoring system to provide an accurate representation of the different sources of failure.
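For a sense of what a webhook-based CI/CD integration looks like, here is a hypothetical sketch of a pipeline step notifying a tracker that a deployment job finished. The endpoint URL and payload fields are invented for illustration; each vendor defines its own webhook contract.

```python
import json
import urllib.request

# Hypothetical: tell a metrics tracker that a pipeline job completed a
# production deployment. URL and fields are placeholders, not a real API.
payload = {
    "job": "deploy-production",  # the job you registered as a deployment
    "environment": "production",
    "sha": "1a2b3c4",
    "status": "success",
}
request = urllib.request.Request(
    "https://tracker.example.com/webhooks/deployments",  # placeholder endpoint
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
urllib.request.urlopen(request)  # the tracker records one deployment event
```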
Jellyfish can also collect metrics from different systems; however, it does not provide as many out-of-the-box integrations as its competitors.
Propelo, Faros, and Sleuth all have their respective strengths and weaknesses. Propelo’s support for trunk-based development seems like it is still evolving. For Faros, integration with monitoring is available but not straightforward, as it forces you to define the data structure before ingesting the data into the platform. Meanwhile, Sleuth intentionally does not support the creation of custom dashboards or ingesting other types of metrics besides DORA metrics.
Conclusion
In order to accurately measure the performance of your DevOps team, it is important to use a tracker that integrates well with your codebase and your development process. Integration with CI/CD and monitoring systems is essential to provide accurate measurement of the DORA metrics.
The final table below summarizes the grades for each tool within each category.
The trackers reviewed in this article offer a variety of features, but some are more suited for DORA metrics calculations than others. Sleuth and Propelo are good choices for teams that believe in using those metrics to improve their adoption of the DevOps approach.
Faros and Jellyfish are close contenders; however, both fall short when it comes to providing actionable feedback. Other tools may claim to track DORA metrics because of the popularity of those metrics, but they don’t give the metrics the place they deserve within the tool.
Propelo and Faros could be considered data platforms designed to process, ingest, and display your engineering metrics, leaving the configuration of alerts and developer interactions up to you. However, both of those tools provide individual performance metrics, and the misuse of those metrics can kill morale.
Sleuth alone has adopted the approach of gathering and displaying only those proven metrics as supported by DORA's research. This enables Sleuth to provide actionable feedback that can both speed up development and reduce deployment pain.