Why Do 50% of Developers Hate DORA Metrics?


I spent the better part of the last four weeks wading through the mud being thrown in online discussions around DORA metrics, from blog comments to Hacker News threads to full-blown heavyweight throwdowns on Reddit.

I saw a huge divide, and some interesting insights, between people who think the metrics are a game changer and people who think they’re nothing but glorified common sense that doesn’t deserve the pedestal it’s been put on.

Well, I’m your chef today, and my job is to cook up a beautiful DORA dish that lets you taste both the sweet and the savory of this heavy debate.

Now, you can probably already guess which side we’re on, right?

I’ll take you through the main points raised by DORA haters out there and also touch on how to mitigate them.

The Illusion of Precision

On the surface, the appeal of DORA is clear: hard data to measure the performance of something as notoriously difficult to quantify as software delivery. 

“This reductionist approach obscures the nuances and often unpredictable dynamics of the software delivery pipeline”

Great point. But here’s our question: why assume DORA metrics are meant to encompass everything? They don’t claim to, right?

Below, I’ll highlight a few of the points being made.

Success Is Not Just Release Counts

Yes, frequent deployments can be a good sign, but what if those constant changes are filled with bugs or frustrating user experience regressions? User satisfaction and the actual impact of new features are absent from standard DORA calculations.

DORA Metrics Ignore Complexity

DORA doesn't differentiate between a minor CSS fix and a complex refactor of core architecture. This makes it difficult to compare teams, and can create incentives to avoid those necessary-but-risky projects for the sake of short-term metrics.

Are Metrics the Goal?

A common point across multiple Reddit discussions I found is “It’s easy to game these metrics, especially in a large, distributed org.”

Focusing just on hitting DORA targets can lead to devs taking shortcuts. Devs may split tasks into artificially small pieces to pad deployment counts, or skip thorough testing to reduce lead times. This is gaming the system, and it ultimately sacrifices software quality for the illusion of progress.

Solution

  1. DORA metrics are not everything, so relying on the four metrics alone doesn’t cut it. Teams are made up of human beings, and humans are dynamic; you need to expand your horizons.
  2. Rohit Khatana, VP of Engineering at Qoala, spoke at our recent event, The Middle Ground, and shared that he splits the DORA metrics into two categories for his software delivery team:
    • Quality/Reliability indicators: CFR & MTTR
    • Speed/Velocity indicators: Lead Time & Deployment Frequency

As you can guess, optimizing for just one of these buckets would mean disaster sooner or later. For example: if you optimize only for CFR & MTTR, your team won’t be able to meet demand on time, losing user trust and brand value.
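To make the split concrete, here is a minimal sketch in Python of how the four metrics fall into those two buckets. The Deployment and Incident records are hypothetical stand-ins for whatever your CI/CD and incident tooling actually exports.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean

# Hypothetical records; in practice these would come from your CI/CD and incident tooling.
@dataclass
class Deployment:
    committed_at: datetime   # first commit in the change
    deployed_at: datetime    # when it reached production
    caused_failure: bool     # did it trigger an incident or rollback?

@dataclass
class Incident:
    started_at: datetime
    resolved_at: datetime

def dora_snapshot(deployments: list[Deployment], incidents: list[Incident], days: int = 30) -> dict:
    """Compute the four DORA metrics over a window, split into speed and quality buckets."""
    speed = {
        "deployment_frequency_per_day": len(deployments) / days,
        "lead_time_hours": mean(
            (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deployments
        ),
    }
    quality = {
        "change_failure_rate": sum(d.caused_failure for d in deployments) / len(deployments),
        "mttr_hours": mean(
            (i.resolved_at - i.started_at).total_seconds() / 3600 for i in incidents
        ) if incidents else 0.0,
    }
    return {"speed": speed, "quality": quality}
```

If the speed numbers look great while the quality numbers drift, or vice versa, that’s exactly the imbalance Rohit warns about.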

The Missing Human Element in DORA

Software development isn't just about lines of code. We both agree on this, right?

“DORA provides zero visibility into the factors that truly make or break teams.” Well, I won’t say these factors alone “truly” make or break teams, but yes, they are extremely important.

Engineers are a breed that likes to solve problems while staying as far from bureaucracy as possible.
So as engineering leaders, we need to leverage frameworks like DORA alongside other signals to make sure our developers get what they want and need to do their best work and solve problems.

Here are a few points and possible ways to think about them.

Team Morale

A software delivery team overwhelmed by burnout can artificially inflate its DORA metrics through back-to-back all-nighters.

Yet low morale is a huge threat to sustained software delivery success. It’s essential to recognize that DORA metrics alone may not signal this impending crisis.

As a leader you must understand your team’s motivation and morale, and of course remember they are humans who need rest!

Toxic Culture

A blame-heavy toxic culture where mistakes are hidden instead of addressed can look amazing by DORA standards. 

That is, until everything catches fire.

DORA won’t make you an engineering leader who facilitates great culture; that part is subjective, and you must take ownership of it.

Baseline & Benchmark Metrics

DORA benchmarks developed for a Silicon Valley startup experimenting with bleeding-edge tech are irrelevant to a team maintaining a legacy financial system under strict regulatory compliance. 

Nuance and understanding of context is crucial to leverage DORA properly.

Blanket advice/metrics don’t work well for anyone. 

This is where it’s sometimes essential to ignore the benchmark reports and instead focus on your own baseline, improving your process on top of it.

Finding the Middle Ground

DORA isn't inherently evil, but it's a tool with limitations, easily misused, easily misunderstood.

Here are a few ideas that can help you strike a healthy balance.

Root Cause Analysis is Key

Another consistent theme I saw while going through tons of comments: “these metrics make it easy to throw blame when the devs might not be the ones to blame at all.”

Ideally, sudden changes in your DORA metric trends should ignite an investigation, not automatic blame.

For example, increased lead times might signal a knowledge gap that can be solved with training.

Or an uptick in your CFR might highlight a testing deficiency. The key is to compare the metrics to your own baseline; then, based on what changed, you can investigate the anomalies one by one and trace them back to a root cause.
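To make “compare to your baseline” concrete, here is a minimal sketch. The metric names, the sample numbers, and the 25% threshold are illustrative assumptions, not anything prescribed by DORA.

```python
# A minimal sketch of baseline-based anomaly flagging; thresholds and metric
# names are illustrative assumptions, not part of the DORA framework itself.

def flag_deviations(baseline: dict[str, float], current: dict[str, float],
                    threshold: float = 0.25) -> list[str]:
    """Return metrics that moved more than `threshold` (25% by default) from baseline."""
    flags = []
    for metric, base_value in baseline.items():
        if base_value == 0:
            continue  # no meaningful baseline to compare against
        change = (current[metric] - base_value) / base_value
        if abs(change) > threshold:
            flags.append(f"{metric}: {change:+.0%} vs baseline -> investigate, don't blame")
    return flags

baseline = {"lead_time_hours": 24.0, "change_failure_rate": 0.10}
current = {"lead_time_hours": 38.0, "change_failure_rate": 0.11}
print(flag_deviations(baseline, current))
# ["lead_time_hours: +58% vs baseline -> investigate, don't blame"]
```

The output is a prompt to go investigate, not a verdict on who is to blame.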

Track DORA Alongside Other Metrics

I’ve probably said this at least 5 times already in this article, but I’ll say it again. 

DORA metrics provide valuable signals but should never be your sole source of truth for engineering team health.

You can correlate DORA trends with customer satisfaction metrics such as NPS, CSAT, feature adoption rates, or conversion metrics. 

For example, did that rapid deployment pace actually make users happier, or cause confusing UX regressions?

Ideally, you’d also track code coverage alongside DORA. If deployment frequency rises but test coverage goes down, you’re accumulating technical debt that will harm future velocity.
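As a rough illustration of that pairing, here is a small sketch with made-up weekly numbers:

```python
# A minimal sketch of pairing a DORA trend with a quality signal;
# the weekly numbers below are made up for illustration.

weeks = ["W1", "W2", "W3", "W4"]
deploys_per_week = [8, 11, 14, 17]       # deployment frequency trending up
test_coverage_pct = [82, 79, 74, 70]     # coverage trending down

for week, deploys, coverage in zip(weeks, deploys_per_week, test_coverage_pct):
    print(f"{week}: {deploys} deploys, {coverage}% coverage")

# Rising deployment frequency with falling coverage is a warning sign:
# speed is being bought with untested changes, i.e. future technical debt.
if deploys_per_week[-1] > deploys_per_week[0] and test_coverage_pct[-1] < test_coverage_pct[0]:
    print("Warning: shipping faster while test coverage erodes")
```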

Leadership Should Set the Tone

A common pushback I see from developers around developer productivity is “I don’t want everything I do getting tracked by my manager” or “I don’t want to get micromanaged.”

Well, culture and expectations are set by leadership. For example, DORA doesn’t really tell you much about individual devs; it tells you about team health.

So the call might be to first invest in training and education within the team, so everyone is on the same page and aligned on moving the product and business forward.

Also, DORA's success hinges on leadership understanding its strengths and limitations.

As the engineering leader your job is to communicate clearly that DORA is a diagnostic tool to spark conversations, not a tournament leaderboard to spark competition between teams or people.

One example would be to publicly recognize efforts to improve metrics, not just the highest absolute numbers.

You can recognize engineers who identify workflow flaws, showing that the health of the system trumps temporary metric boosts.

Encourage sustainable practices over unsustainable efforts.

Keep things transparent by sharing DORA data openly, addressing fluctuations, and prioritizing system improvement in retrospectives even when things might not be going your way.

The DORA Hate is Good

The ever-so-passionate arguments around DORA highlight one thing: measuring software engineering productivity is a tough and complex task.

It's a sign that we want to make data driven decisions as engineering leaders but we yet don’t have a broader engineering management manual in place that takes everything into account!

DORA metrics are a very important piece of the puzzle, but they’re not the whole picture.

What do you think?
