Does accuracy matter for tracking DORA metrics?
You might be excited about tracking DORA metrics, but have you ever thought about the ways in which you track them, and how important accuracy is in your methods?
You might think that Sleuth, as a DORA metrics tracking platform that talks about accuracy, is a bit biased on this topic, but it's not as simple as that — because, as any seasoned developer would say, “it depends.”
Below, you can follow an in-depth discussion between Sleuth’s CTO and co-founder, Don Brown, and our new head of engineering, Kate Bierbaum, on how much accuracy matters when using DORA metrics to track and improve engineering efficiency.
Let's start at the beginning. To summarize the DORA metrics, they come from the State of DevOps report and a book called “Accelerate.” They’re a measurement of engineering efficiency.
The DORA organization gathered 30,000+ survey responses from technology professionals and found that performance on four key metrics correlates statistically with how well organizations achieve their business goals. These academic insights showed that if you do well on these metrics, you do better as a company.
As a leader of an organization, if you ask your managers, team leads, and tech leads the same questions published in the State of DevOps Reports, you’ll generally be able to see how your team compares to others according to the DORA metrics. The questions are:
- How often do you deploy code to production or release it to end users (Deployment Frequency)?
- How long does it take to go from code committed to code successfully running in production (Change Lead Time)?
- How long does it take to restore service when an unplanned outage or service impairment occurs (Mean Time to Recovery)?
- What percentage of changes made or released require remediation (Change Failure Rate)?
The point is that if you only want to know where you sit with DORA metrics compared to other companies, you don't need anything more complicated than asking your team the DORA survey questions. Their responses give you a baseline against the definitions published in the State of DevOps Reports for how low-, medium-, and high-performing teams answer the same questions.
So, strictly from the standpoint of the DORA metrics, if you get the same information in the same way that the survey was conducted, accuracy doesn't matter.
But if you want a deeper understanding, then the opposite is true.
DORA metrics in the real world
When you're looking at the DORA metrics, you want to look at what your numbers are for deployment frequency, change lead time, MTTR, and change failure rate. It's nice to know if these are going up or down.
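To make those four numbers concrete, here is a minimal Python sketch of how they could be computed from a list of deploy records. The `Deploy` record, its field names, and the sample data are assumptions made up for illustration. This is not how Sleuth calculates its metrics.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deploy:
    # Hypothetical record shape; field names are illustrative only.
    first_commit_at: datetime               # earliest commit in the deploy
    deployed_at: datetime                   # when the code reached production
    failed: bool = False                    # did the deploy need remediation?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_summary(deploys: list[Deploy], window_days: int) -> dict:
    """Summarize the four DORA metrics over a window of deploys."""
    lead_times = [d.deployed_at - d.first_commit_at for d in deploys]
    failures = [d for d in deploys if d.failed]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deploys_per_day": len(deploys) / window_days,
        "median_lead_time": median(lead_times) if lead_times else None,
        "change_failure_rate": len(failures) / len(deploys) if deploys else None,
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }


# Made-up sample data for one week.
t0 = datetime(2024, 1, 1, 9, 0)
sample = [
    Deploy(t0, t0 + timedelta(hours=4)),
    Deploy(t0, t0 + timedelta(hours=10), failed=True,
           restored_at=t0 + timedelta(hours=11)),
    Deploy(t0 + timedelta(days=1), t0 + timedelta(days=1, hours=6)),
    Deploy(t0 + timedelta(days=2), t0 + timedelta(days=2, hours=2)),
]
summary = dora_summary(sample, window_days=7)
```

Running the same summary over consecutive windows is one simple way to see whether each number is going up or down.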
Here at Sleuth, we find the DORA metrics are useful in two key ways:
- We have alerts set up to tell us when things are not going right. And that's useful to know when something is wrong and needs attention.
- When we already suspect something is wrong, we can look at the data behind it. Sometimes people think measuring metrics will solve all their problems. But metrics don't give you the answer; they help you know what questions to ask to get a useful answer.
And so when we look at our DORA metrics dashboard, we will hopefully come up with questions, not answers.
Let’s get a real-world sense of what it looks like to track DORA metrics by looking at our own Sleuth team. You’ll see our data isn’t perfect. (You can view this data live for yourself anytime at https://app.sleuth.io/sleuth/sleuth/metrics.)
When looking at the DORA metrics, the types of questions we want to have answered are:
- How is my team doing at scale?
- Are we still efficient and getting things out quickly? Are we not?
- Are we breaking things more or less often?
This is a good place to ask whether accuracy matters, because we might assume we know answers to these questions, when in fact assumptions are often wrong. Anyone who has ever done performance optimization on a large system or even a small piece of code knows that your assumptions going in are almost always wrong. So, accuracy does matter because you can’t operate off of assumptions. You need actual information to really assess how things are going over time.
Take the deployment frequency DORA metric as an example. When we measure deployment frequency, we're measuring code that actually hits production. We're not measuring when a pull request is merged and assuming it went to production. We want to know if our team is batching things into a few big deploys or keeping deployments small, because, according to the State of DevOps reports, smaller deployments correspond to better DORA metrics.
So, it’s useful to not just guess at when a deployment happens, but rather to know precisely when it happens, what's in it, and the size of it. That can all help us make better decisions.
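As a rough illustration of the gap between guessing and measuring, the Python sketch below compares a merge-based proxy with a count of actual production deploys. The pull request numbers, dates, and data layout are all hypothetical.

```python
from datetime import date

# Hypothetical data: merged pull requests as (PR number, merge date), and
# production deploys as (deploy date, PR numbers that shipped in that deploy).
merged_prs = [
    (101, date(2024, 3, 4)),
    (102, date(2024, 3, 4)),
    (103, date(2024, 3, 5)),
    (104, date(2024, 3, 6)),
    (105, date(2024, 3, 7)),
]
deploys = [
    (date(2024, 3, 5), [101, 102, 103]),
    (date(2024, 3, 8), [104, 105]),
]

# Proxy: assume every merged pull request went straight to production.
assumed_deploys = len(merged_prs)

# Measured: count what actually reached production and how big each batch was.
actual_deploys = len(deploys)
batch_sizes = [len(prs) for _, prs in deploys]
avg_batch_size = sum(batch_sizes) / len(batch_sizes)

print(f"assumed from merges: {assumed_deploys} deploys")
print(f"measured: {actual_deploys} deploys, avg batch size {avg_batch_size}")
```

In this made-up week, the merge-based proxy suggests five deploys, while the measured data shows two deploys that each batched several changes.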
Change lead time and accuracy
Another DORA metric where accuracy matters is change lead time — how long it takes to go from a change all the way out to production. This can be especially insightful when adding people to a team, like Sleuth is. It indicates if you’re getting faster or slower as a growing team, and whether things get stuck in review more often.
For example, if you saw your change lead time increase during a period when you onboarded new developers, you’d want to investigate that. Your assumption might be the increase was because of new team members, but maybe it’s something else.
Sleuth sorts deploys by change lead time, which allows you to see which ones took the longest. Upon further investigation, you might find that a change was coded by someone who normally doesn’t do much dev work on your team. In that case, accuracy would help you determine that one long review time was a one-off. Then, you can go back to the data and see that across everyone else, the review time isn't bad — and the increase wasn’t because of your new developers.
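The kind of investigation described above can be sketched in a few lines of Python. The deploy IDs, author names, and lead times here are invented for illustration, and comparing the mean against the median is just one simple way to spot a one-off.

```python
from statistics import mean, median

# Hypothetical deploys: (deploy id, author, change lead time in hours).
deploys = [
    ("d1", "alice", 6.0),
    ("d2", "bob", 8.0),
    ("d3", "new-dev-1", 7.0),
    ("d4", "new-dev-2", 9.0),
    ("d5", "occasional-contributor", 70.0),
]

# Sort longest first to see which deploys drive the increase.
by_lead_time = sorted(deploys, key=lambda d: d[2], reverse=True)
slowest = by_lead_time[0]

hours = [h for _, _, h in deploys]
overall_mean = mean(hours)      # pulled up by the single slow deploy
overall_median = median(hours)  # barely affected by it

# Re-check the average with the slowest deploy set aside.
mean_without_slowest = mean([h for _, _, h in by_lead_time[1:]])

print(slowest, overall_mean, overall_median, mean_without_slowest)
```

With these numbers, one slow change lifts the average to 20 hours while the median stays at 8, and the new developers' deploys sit in the same range as everyone else's.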
Accurate DORA metrics allow you to dig down, ask better questions, and keep digging until you can find answers. Sometimes the answer is it was a one-off, and sometimes it's something more systemic — but you don't know if you don't have the accuracy.
If you want to get those deeper insights to actually change things, then the more accurate they are, the more effective you will be in that pursuit.
A newcomer’s take on DORA metrics
To get another perspective on DORA metrics and accuracy from someone who hadn’t worked with them before joining Sleuth, our CTO and co-founder, Don Brown, talked with Kate Bierbaum, Sleuth’s new head of engineering.
Don Brown: As someone who's been involved in DORA metrics for about a month, what's your takeaway so far?
Kate Bierbaum: So far, the two things that have been at the top of mind since I joined Sleuth — because they reflect what's happening on the engineering team — are Change Lead Time and Deployment Frequency. The whole engineering team is growing, so it’s natural for lead time to increase and deployment frequency to decrease because of new people joining. It's pretty intuitive that when you onboard to a new code base, there's a ton of discovery and experimentation that happens to figure out where to make your changes to get the intended results of your code.
When a new team is created, people are figuring each other out and what processes work best. As this happens, it takes a little longer to get the context of the work at hand. And sometimes it's initially slow, but with the DORA metrics, it’s been reassuring to have my intuition around those things checked by data. DORA metrics have provided that opportunity to ask deeper questions and get to a process that will help the team mature and move quickly with the right context.
Don: You've been on the management side for a long time at other companies, so it’s interesting to hear what you find interesting about DORA metrics. I come from the dev side, so I look at them more from the angle of what tool I can build for this, whereas you're looking at the team and how to get the team productive.
Accurate DORA metrics save time and provide context
Don: When you look at adopting DORA metrics and wanting to improve a team, what do you think is the lowest hanging fruit?
Kate: When you join as a new manager, you don't make sweeping changes immediately. You ask a lot of questions to figure out what reality is. And interestingly, Sleuth as a tool has short-cut a lot of the answers to some very basic questions, which is great because it allows me to dive into the more in-depth conversations quicker.
For example, some of it's surface level and simplistic, like what metrics are important to the team. I can go into Sleuth and see what's being measured from our impact sources. All of those things usually would take a fair amount of digging and conversations with multiple people. But now I can start to figure out what cultural and automation things need to change in order to improve as a group.
Our team did see increased Change Lead Time, and seeing that led to some conversations with the new people on the team to help them learn the context and bring that lead time down. That led to doing a kickoff every week to align on the work and make sure it's understood, because that was valuable information for new engineers. A lot of people have been at the company for a while, so they didn't really need it, but the new people did. So, we were able to change the process, and hopefully we'll start to see that number come down.
Don: I like that the conversations weren’t about the metric itself. They were about making changes to help the people, and the metric helped inform that process. That's something that often people get backward. They think it's all about hitting certain numbers so that they make their OKR, get a raise and move on. But really, the goal is to improve the team and make it more efficient.
Accurate DORA metrics help you become more informed so that you're spending your time and corrective measures on the right things. It's easy to do corrective measures, but if you're not doing a corrective measure on something that's gonna make a difference, then you're wasting people's time.
So, prior to coming to Sleuth and working with DORA metrics, if you were looking at a team that you thought was underperforming, how did you know?
Kate: I used issue tracker data, commit stats, complexity of commits, lines of code changed, stories completed, and bugs tied to previous releases, and tried to piece that into something that would let me ask intelligent questions of individuals. Once upon a time I did manually track review lag time, company OKRs, and developer happiness via surveys.
Don: I remember this one director of engineering I worked with had a spreadsheet that had something like 150 different metrics that he tracked across all the different teams, and only about 10% to 20% of it was automated. He just made it his mission to update it frequently. It was very impressive, but a lot of work.
Kate: You're making my hands sweat just thinking about it. It is a lens into the data, and it can be helpful, but not as helpful as I’ve found using Sleuth to be.
People first, then metrics tracking tools
Don: With the wealth of experience you have in the management field, but without DORA metrics, can you talk about improving engineering efficiency and taking a team from point A to point B? What was your approach? What did you learn?
Kate: I don't think I've had a team that has ever just been unproductive without having a cultural problem. Those have always been linked, in my experience. There’s never been a case of, for example, if we had this automation or this other process, we'd have higher output.
But one thing that comes to mind was a group of people who really lacked empowerment and didn't have a say over how to do their work. Things were fed in and they were told to add data in a certain format and then show it in a specific place. The process was very prescriptive.
Also, the team didn't have context about how what they were doing was important to the business or the impact it was going to have on customers. At the heart of things, engineering is a creative job and when people feel stifled, they're not going to be happy and they're not going to be productive.
What was important here was giving this team the space to own what they were doing, how they were doing it, and pushing back on a lot of the processes that led to that state where they lacked empowerment. Eventually, the team was able to come up with new ideas about how things could be automated, how to remove drudgery from their day to day. And because of that, productivity started to turn around. It was all linked.
Don: Totally. If you have a dysfunctional thing, the answer is never to start with tools. The answer is always to start with people and culture, and then once you have them going in the right direction, then find a tool that can help remove impediments and get a more automated process. It's so much more effective that way.
That's too often lost. A team might want to move to CD, and then they think they need to adopt tools, metrics, new technology — but none of that changes anything. It's more about fundamental culture. I'd love to sell a tool that solves culture, but you just can't because people are complicated.
An example that stands out in my mind is a team of superstars we built at Atlassian. We pulled the best of the best from different teams in the company. We got a lot of stuff done, but it took a little while to build that culture because you can't just take one good person and another great person and throw them in and expect great things.
That's not how teams work. You need to create relationships. You need to have a coffee together. You need to ask people questions about their life so they feel comfortable, they feel energized, get them aligned, and then things change.
What makes a productive team?
Don: What would you say is a key ingredient of a team that was very productive, going in the same direction, and delivering on their goals?
Kate: It's when people have the business context and autonomy over their work and feel empowered. That's a lot of it. It’s also a lot of what is being driven from those key principles. For example, you understand what's happening in production, you see your work through. Also, part of the secret sauce is the combination of team gelling along with technical mastery.
I also love a team that has a mix of different levels. There's a really symbiotic relationship that happens when you get junior-, mid- and senior-level developers together because everyone learns and grows together. Those have been my favorite teams.
Don: Yeah. That's a great point. You know, we started the conversation talking about metrics and getting all this hard data. And then as we keep talking about it, it all comes down to people. So many things just come down to people at the end of the day. We can tend to undervalue the value of relationships, but social connections are so important.
Kate: And I’ll say it's been interesting joining a couple companies in this remote work space and realizing the amount of time it takes to form relationships with people over the internet, and how to do that in a way that doesn’t impact people's work. It's a tricky balance.
Takeaways on accuracy in DORA metrics
Don: To summarize what we talked about, we talked about the DORA metrics, we talked about accuracy, how to dig into the metrics, and how to change. When you are in a leadership position, you are responsible for how that team executes and how efficiently they execute. Accurate metrics are a tool to dig into the data, but they’re just one tool. Relationships are really important, too.
A key takeaway is to find the tool that works for you with what you need. If metrics are it and you need to dig in and actually look at review time and find ways to improve that, then find a tool that's accurate. If your challenge is cultural, don't start with the tool. Start with helping people. What’s your takeaway, Kate?
Kate: I'd say the most important thing is whatever tool or methodology you pick, always have an eye out for improvement and building meaning with the people around you.