Measure Engineering Velocity Without Demoralizing

Scrums.com Editorial Team
March 5, 2026
7 mins

Most teams that start measuring engineering velocity end up with worse data than before they started. Not because velocity is a bad metric, but because the standard implementation triggers the exact problem it was meant to solve: a measurement that becomes a target stops being a good measure. This is Goodhart's Law, and engineering velocity is one of its most reliable victims.

This guide covers what engineering velocity actually measures, where standard approaches break down, how to measure engineering velocity at the team level without triggering gaming or disengagement, and what to pair it with to build a complete delivery picture.

What Engineering Velocity Measures

Engineering velocity is the amount of work a software engineering team completes in a given sprint or time period. In Scrum teams, velocity is measured in story points: the sum of points assigned to user stories completed within the sprint. In non-Scrum contexts, it may be expressed as tickets closed, pull requests merged, or features shipped per period.

Velocity is a throughput measure. It tells you how much the team typically gets done in a sprint, which makes it useful for release forecasting: if a team consistently delivers 40 story points per sprint and the backlog holds 200 points of work, you have roughly five sprints of work remaining. That is the original purpose of the metric, and it is a reasonable one.
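The forecasting arithmetic above can be sketched in a few lines. This is an illustrative snippet, not code from any particular tool; the function name and sample numbers are assumptions.

```python
# Sketch: release forecasting from team-level velocity.
def sprints_remaining(backlog_points: float, recent_velocities: list[float]) -> float:
    """Estimate sprints left at the team's average recent delivery pace."""
    avg_velocity = sum(recent_velocities) / len(recent_velocities)
    return backlog_points / avg_velocity

# A team averaging 40 points/sprint with 200 points in the backlog:
print(sprints_remaining(200, [38, 42, 40, 40]))  # → 5.0
```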

The problem starts when velocity gets used for something it was never designed to do: comparing teams, evaluating individual engineers, or judging whether a team is working hard enough.

Three Ways Velocity Measurement Backfires

1. Tracking velocity per engineer

Individual velocity tracking is the most common mistake and the most damaging. When engineers know their individual story point output is being measured, two things happen reliably: they inflate estimates to protect their numbers, and they stop helping teammates because time spent reviewing a colleague's PR does not appear in their own velocity.

The metric degrades from both directions. You get higher velocity numbers that bear no relationship to actual throughput, and you get a team that has stopped collaborating. DORA research consistently finds that high-performing teams share information freely and maintain strong collaboration practices. Individual velocity tracking works against both.

2. Comparing velocity across teams

Story points are not a standardised unit. A point at Team A means something different from a point at Team B, because each team calibrates its own scale based on its own reference tasks and complexity judgments. Comparing raw velocity numbers across teams is like comparing prices in different currencies without a conversion rate.

The Scrum Alliance is explicit on this: velocity is an internal planning tool for a single team, not a cross-team performance metric. Using it to rank teams, set competitive benchmarks, or justify resourcing decisions misuses the metric.

3. Goodhart's Law: when the metric becomes the target

Goodhart's Law, named for economist Charles Goodhart's 1975 observation and popularly paraphrased as "when a measure becomes a target, it ceases to be a good measure", manifests predictably in engineering velocity. Estimates inflate to hit velocity targets. Work gets broken into smaller tickets to boost the count of completed items. Engineers deprioritise complex but high-value tasks in favour of work that closes quickly.

The team is now optimising for the metric rather than for delivery. Velocity rises. Actual output may fall. And you have lost the accurate baseline you needed for sprint forecasting.

The teams that go through this cycle rarely stop measuring. They end up with numbers they know are wrong but keep reporting anyway, because acknowledging the gaming problem means admitting the measurement approach failed.

Velocity vs. Flow Metrics vs. DORA Metrics

Understanding where velocity fits in the broader measurement landscape clarifies what to keep and what to add alongside it.

| Metric | What It Measures | Gameable? | Primary Use |
| --- | --- | --- | --- |
| Velocity | Work completed per sprint (story points) | Yes: estimate inflation, task fragmentation | Sprint forecasting |
| Cycle time | Time from work start to production delivery | Hard: requires actually shipping faster | Flow efficiency and bottleneck identification |
| Deployment frequency | Successful production releases per period | Hard: deployments are recorded events | Delivery cadence and CI/CD health |
| Lead time for changes | Commit-to-production elapsed time | Hard: timestamp-based | End-to-end delivery speed |
| Change failure rate | Percentage of deployments causing incidents | Hard: production failures are recorded | Quality and stability signal |

The key difference: velocity requires estimation, which creates the gaming opportunity. Cycle time and DORA metrics are recorded events. A deployment either happened or it did not. Code either reached production in six hours or it did not. There is no estimation layer to inflate.
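The "recorded events" point is concrete: cycle time is pure timestamp arithmetic, with no estimation layer in between. A minimal sketch, assuming ISO-style timestamps pulled from your tracker and deployment logs:

```python
# Sketch: cycle time computed from recorded events, not estimates.
from datetime import datetime

def cycle_time_hours(started: str, deployed: str) -> float:
    """Elapsed hours between work start and production delivery."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    delta = datetime.strptime(deployed, fmt) - datetime.strptime(started, fmt)
    return delta.total_seconds() / 3600

# Work started at 09:00, reached production at 15:00 the same day:
print(cycle_time_hours("2026-03-02T09:00:00", "2026-03-02T15:00:00"))  # → 6.0
```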

How to Measure Engineering Velocity Without the Backfire

Track at the team level only

Never break velocity down by individual engineer. Track the team's total sprint velocity as a single number. This preserves the metric's forecasting value while removing the incentive for individual gaming and the competitive dynamics that erode collaboration.

Track trend direction, not absolute numbers

A team's velocity will fluctuate: sprint goals vary, team membership changes, unexpected complexity surfaces. The signal is in the trend across eight to twelve sprints. Is velocity stable, improving, or declining? A consistent downward trend warrants investigation. A single low sprint is noise.

Comparing a team's current velocity to its own historical average is meaningful. Comparing it to another team's velocity is not.
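Trend classification over a window of sprints can be as simple as a least-squares slope. This is a sketch with an illustrative tolerance threshold, not an established benchmark:

```python
# Sketch: classify a team's velocity trend over its last 8–12 sprints.
def velocity_trend(velocities: list[float], tolerance: float = 0.5) -> str:
    """Least-squares slope of velocity against sprint index, mapped to a label."""
    n = len(velocities)
    mean_x = (n - 1) / 2
    mean_y = sum(velocities) / n
    slope = sum((x - mean_x) * (y - mean_y)
                for x, y in enumerate(velocities)) \
        / sum((x - mean_x) ** 2 for x in range(n))
    if slope > tolerance:
        return "improving"
    if slope < -tolerance:
        return "declining"
    return "stable"

# Normal sprint-to-sprint noise around 40 points reads as stable:
print(velocity_trend([40, 42, 38, 41, 39, 40, 43, 40]))  # → stable
```

Note that a single low sprint barely moves the slope, which matches the point above: one bad sprint is noise, a sustained slope is signal.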

Let the team own the metric

Teams that participate in defining and calibrating their own measurement approach are more likely to trust the output and less likely to game it. Involve the team in how velocity gets tracked, what counts, and how it feeds planning. The conversation itself is valuable: it surfaces disagreements about estimation before those disagreements show up as planning failures.

The positioning matters practically: if the team understands that velocity data feeds sprint forecasting rather than performance reviews, gaming the metric primarily hurts the team's own planning. That reframing changes the incentive structure.

Separate delivery metrics from performance assessment

Delivery metrics should feed planning, retrospectives, and process improvement. They should not feed individual performance reviews, compensation decisions, or headcount planning.

DORA research found that psychological safety predicts software delivery performance. Teams where engineers feel safe to raise concerns, report problems, and experiment without punishment consistently outperform teams where metrics are used as evaluation tools. Using delivery metrics in performance reviews directly undermines that safety.

If senior stakeholders require velocity data for reporting, share team-level aggregates only, and frame them with trend direction rather than absolute comparisons. "The team's sprint velocity has been stable for six quarters" is a useful statement. "Team A's velocity is 40% lower than Team B's" is not.

A Practical Rollout Approach

If you are implementing velocity measurement for the first time, or re-implementing after a poor rollout:

  • Start with context, not a dashboard: Before collecting data, be explicit about what the metric is for (sprint forecasting and planning) and what it is not for (comparing engineers or ranking teams). This conversation shapes how the metric lands.
  • Baseline first, improve second: Run three to four sprints without any targets to establish an honest baseline. A baseline collected under target pressure is not a baseline.
  • Review in retrospectives, not management reports: Velocity data belongs in sprint retrospectives where the team can discuss what affected it. When it appears in management reporting before the team has seen it, trust breaks down quickly.
  • Add a quality signal: Pair velocity with change failure rate or defect rate. A team shipping faster but producing more rework is not actually faster. The quality signal keeps velocity honest.

For teams that want DORA metrics alongside velocity, see our deployment frequency benchmarks guide for tier thresholds and measurement guidance, and our full DORA metrics guide for how all four metrics fit together. For common DORA measurement errors that apply to velocity tracking too, see why DORA metrics mislead teams. For building an executive-facing dashboard with velocity context, see the engineering metrics dashboard guide.

Frequently Asked Questions

What is engineering velocity and how is it measured?

Engineering velocity is the amount of work a software team completes in a sprint, typically measured in story points. Teams sum the story points of completed user stories at the end of each sprint. The result is used for release forecasting: how many sprints of work remain in the backlog at the current delivery pace. Velocity is a planning tool, not a performance metric.

Why does velocity measurement often backfire?

Velocity backfires when it becomes a target rather than a planning tool. Goodhart's Law applies directly: when teams are evaluated on velocity, they adjust behaviour to protect the number. This shows up as estimate inflation, task fragmentation to boost ticket counts, and deprioritisation of complex work. The metric improves while actual delivery quality and throughput decline.

Should you compare engineering velocity across teams?

No. Story points are not a standardised unit. Each team calibrates their own scale, so a point at one team is not equivalent to a point at another. The Scrum Alliance is explicit that velocity is an internal planning tool for a single team, not a cross-team comparison metric.

What should you measure instead of velocity?

Cycle time and the four DORA metrics (deployment frequency, lead time for changes, change failure rate, mean time to recovery) complement velocity and are harder to game. Cycle time measures objective elapsed time from work start to production delivery. DORA metrics measure outcomes at the team level and have a ten-year evidence base linking them to organisational performance.

How does velocity measurement affect team morale?

Poorly implemented velocity tracking undermines psychological safety and collaboration. When engineers believe individual velocity is monitored, they shift toward protecting their numbers rather than collaborating or taking on difficult problems. DORA research found that psychological safety predicts software delivery performance: teams where engineers feel safe to raise concerns consistently outperform teams where metrics are used as surveillance tools.

Good delivery metrics measure systems, not people. Teams where that distinction is clear produce more accurate data and consistently better delivery outcomes. The goal is measurement that makes work more visible, not measurement that makes engineers more cautious.

Want to track engineering velocity, cycle time, and DORA metrics at the team level? The Scrums.com engineering intelligence platform gives engineering leaders delivery visibility without individual surveillance: team-level aggregates, trend analysis, and DORA tier tracking in one place. Or start with our engineering operations guide to see how delivery metrics fit into the broader picture.
