As much as programming at its core revolves around 1s and 0s, quantifying the performance of development teams is a far more complicated story than one number can tell. Measuring and tracking development efficiency has been an ongoing topic of debate and one of the most difficult parts of any engineering manager’s job. The long-held belief is that development processes involve too many moving parts and rarely show a clear link between input and output, which makes development more or less a black box.
But in a world where more and more companies need to think and act like software companies to compete, this view of software development is no longer sustainable. For modern engineering leaders, it has become clear that the only way to align developer efforts with business goals is to adopt a reliable set of KPIs. According to a PwC survey, data-driven organizations are 3x more likely to report improvements in their decision-making.
By adopting software development metrics, organizations can measure development performance against business objectives, monitor progress, and make more informed decisions. Metrics encourage your development team to work smarter, not harder, and foster a culture of continuous improvement. In this article, we’ve compiled a list of 15 software development metrics every data-driven team needs. Let’s dive in.
Development velocity indicates the amount of work your team can complete in a given time (usually a sprint), based on how quickly they completed similar work in the past. Most teams calculate velocity using story points, which express the overall effort required to fully implement an item from the backlog or another piece of work. By totaling the story points your team completes each sprint, you can get a sense of how realistic your development timelines are.
Suppose during three consecutive sprints, your team completed 120, 100, and 140 story points respectively. Now we know that on average your team can complete 120 story points per iteration. So you can reasonably assume that it will take another five sprints, give or take, to complete an additional 600 story points from the backlog.
You should track velocity for at least a few sprints before drawing any conclusions. Also, keep in mind that these are averages, so the raw numbers will not reveal much. It’s the trends that will help you foresee your performance on other recurring tasks.
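The averaging above can be sketched in a few lines of Python; the sprint history and backlog size are the hypothetical figures from the example:

```python
import math

# Hypothetical sprint history: story points completed in the last three sprints.
sprint_history = [120, 100, 140]

def average_velocity(history):
    """Average story points completed per sprint."""
    return sum(history) / len(history)

def sprints_remaining(backlog_points, history):
    """Rough estimate of how many sprints the remaining backlog will take."""
    return math.ceil(backlog_points / average_velocity(history))

print(average_velocity(sprint_history))        # 120.0
print(sprints_remaining(600, sprint_history))  # 5
```

Rounding up with `math.ceil` errs on the side of caution; as noted above, treat the result as a trend, not a promise.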
Velocity is great for planning, but it doesn’t give you an actual breakdown of what has and hasn’t been done. Unlike velocity’s reliance on averages, the scope completion ratio compares the number of tickets completed in a sprint against the number originally planned. Keep an eye on this software development metric to ensure your engineering teams are properly staffed and working towards achievable goals.
A low scope completion ratio can signal issues in the development process or in how resources are allocated.
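As a minimal sketch (the ticket counts are made up), the ratio is simply completed work over planned work:

```python
def scope_completion_ratio(completed_tickets, planned_tickets):
    """Percentage of the tickets planned for a sprint that were finished."""
    return round(completed_tickets / planned_tickets * 100, 1)

# e.g. 24 of 30 planned tickets closed by the end of the sprint:
print(scope_completion_ratio(24, 30))  # 80.0
```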
For the most part, changes are inevitable when you build software. But if scope changes happen too often or are not adequately planned, they can seriously impact your project. Scope added is a critical metric that shows how much new work (either tickets or story points) was added after the sprint started. A high scope-added rate means your sprint planning didn’t account for all the work ahead.
If you want to improve this metric, focus on engaging stakeholders and clarifying requirements as early as possible, as well as implementing stricter change control procedures.
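One way to quantify it, sketched with hypothetical story-point numbers:

```python
def scope_added_rate(points_added, points_committed):
    """Story points pulled into the sprint after it started, as a share
    of the original commitment."""
    return round(points_added / points_committed * 100, 1)

# Sprint planned at 100 points, with 20 more added mid-sprint:
print(scope_added_rate(20, 100))  # 20.0
```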
It’s no mystery why understanding how tasks move within your team is important. You simply cannot track engineering progress and plan software delivery without first knowing where and how most of your development time is spent.
Cumulative flow is one of the key software development metrics that shows you at a glance the number of tasks that are approved, in progress, or in the backlog. Cumulative flow typically maps your tasks in a color-coded chart. If a task spends too much time in any of these stages, you should take a closer look at your delivery pipeline. Are “In progress” tickets really in progress or are they just stalled? What is blocking more tasks from being approved?
Flow efficiency drills down even deeper into how well your workflow is operating. This metric measures the period of time tickets are in active development versus the time they are blocked or waiting in queue for review. To calculate flow efficiency, you can use the following formula:
Flow efficiency = (Active Development Time/Total Time) x 100
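Translated directly into code, with illustrative numbers (18 active hours out of a 60-hour cycle):

```python
def flow_efficiency(active_time, total_time):
    """Flow efficiency = (active development time / total time) x 100."""
    return round(active_time / total_time * 100, 1)

# A ticket actively worked on for 18 hours out of a 60-hour cycle:
print(flow_efficiency(18, 60))  # 30.0
```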
Cycle time measures the time spent from the moment development work started until it’s ready to be shipped. Simply put, cycle time measures how long it takes for your development team to complete a specific task. Despite its simplicity, cycle time paints a clear picture of your speed of delivery. It’s one of the most useful metrics in software development because it gives you a baseline to assess the efficiency of all the stages the task goes through.
In comparison, lead time is measured as the time between when a change request is submitted and when that change is up and running in production. As one of the key DORA (DevOps Research and Assessment) metrics, lead time for changes is a useful indicator of the health of your deployment processes and how quickly you can deliver improvements and new features. It is calculated as X - Y, where X is the deployment timestamp and Y is the commit timestamp.
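In code, that X - Y is just a timestamp subtraction; the commit and deployment times below are invented for illustration:

```python
from datetime import datetime

def lead_time_for_change(commit_time, deploy_time):
    """Lead time for changes: deployment timestamp minus commit timestamp."""
    return deploy_time - commit_time

commit = datetime(2024, 3, 1, 9, 30)   # commit lands
deploy = datetime(2024, 3, 3, 14, 0)   # change reaches production
print(lead_time_for_change(commit, deploy))  # 2 days, 4:30:00
```

In practice you would average this delta over all changes in a reporting window rather than look at a single commit.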
Deploying early and often is one of the best strategies for a productive development team. Deployment frequency is another DORA metric you should track, especially if you are already struggling with putting out fires from large releases.
Even if something goes wrong, small deployments mean you don’t have to go through millions of lines of code to find the culprit. And since you can better control the changes you introduce at a low level, deployments become less scary and lower risk as the frequency increases. This metric is usually measured in releases per day, week, or month.
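A simple way to measure it is to bucket your deployment log by calendar week; the dates below are hypothetical:

```python
from collections import Counter
from datetime import date

def deploys_per_week(deploy_dates):
    """Count deployments per ISO calendar week, keyed by (year, week)."""
    return Counter(d.isocalendar()[:2] for d in deploy_dates)

deploys = [date(2024, 3, 4), date(2024, 3, 5), date(2024, 3, 7),
           date(2024, 3, 12), date(2024, 3, 14)]
weekly = deploys_per_week(deploys)
print(weekly[(2024, 10)], weekly[(2024, 11)])  # 3 2
```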
Alongside deployment frequency, change failure rate (CFR) is one of the most widely tracked engineering metrics, and for good reason. As much as software development teams employ dozens of tools and best practices to maintain code quality, some code changes will result in unintended consequences like downtime, errors, or a negative impact on users. Luckily, there is a way to stay on top of these incidents: carefully monitoring the percentage of deployments that cause a failure, known as the change failure rate.
Suppose your organization made 100 changes to its codebase over the course of one month and 5 resulted in system disruptions. This puts your change failure rate for the given month at 5%. A low failure rate, usually between 5-10%, points to a higher quality of your source code, while a high CFR means your code needs further testing and debugging.
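The arithmetic from the example, as a tiny helper:

```python
def change_failure_rate(failed_deploys, total_deploys):
    """Percentage of deployments that led to a failure in production."""
    return round(failed_deploys / total_deploys * 100, 1)

# 5 of 100 changes caused disruptions over the month:
print(change_failure_rate(5, 100))  # 5.0
```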
Where CFR looks at how often systems break, MTTR tells you how long it takes to get things back on track after a failure in production. In other words, Mean Time to Repair calculates the average time it takes your team to get services up and running again after an unexpected incident.
Mean Time to Repair is an essential metric in our modern world. Just consider new technologies like driverless cars where any hiccup in availability or reliability can result in serious injury. Even in less high-stakes situations, like in e-commerce or SaaS, failing to deliver services can equate with loss of credibility and revenue. With this said, regularly monitoring MTTR is the first step to understanding where problems come from and putting in place an incident response plan.
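Given a log of repair durations (the incident times here are hypothetical), MTTR is a plain average:

```python
from datetime import timedelta

def mttr(repair_durations):
    """Mean Time to Repair: average time from failure to recovery."""
    return sum(repair_durations, timedelta()) / len(repair_durations)

# Three hypothetical incidents: 45 min, 2 h, and 15 min to recover.
incidents = [timedelta(minutes=45), timedelta(hours=2), timedelta(minutes=15)]
print(mttr(incidents))  # 1:00:00
```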
Code coverage is yet another essential KPI for software development. This metric determines how much of your source code is actually exercised during automated tests. It’s very useful if you want to assess the quality of your test suite and understand which areas of your code are being neglected and might introduce bugs into your system. Of course, the more lines of code are executed, the more reliable your tests and code become. That said, 80% coverage is generally accepted as a good benchmark to aim for.
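Coverage tools report this number for you, but the underlying calculation is simple; the per-file line counts below are invented:

```python
def coverage_percent(files):
    """Overall line coverage from per-file (executed, total) line counts."""
    executed = sum(e for e, _ in files.values())
    total = sum(t for _, t in files.values())
    return round(executed / total * 100, 1)

# Hypothetical report: (lines executed by tests, total lines) per file.
report = {"api.py": (900, 1000), "models.py": (1700, 2000), "utils.py": (1500, 2000)}
print(coverage_percent(report))  # 82.0
```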
Defect escape refers to the number of issues that evaded the QA process and ended up in production, in front of users. Experienced Agile teams track this metric closely to identify areas for improvement in their development and testing processes, for example by comparing the percentage of defects found before release with the number found post-release.
Pull request size might be a surprising metric to see on this list. How does it relate to efficiency or performance? In a nutshell, it reflects the amount of code changes introduced by a single pull request. Smaller pull requests are easier to review, allowing developers to properly comb through each line of code, give more specific feedback, and catch issues faster without blocking other developers. While there isn’t a general consensus on how many lines of code a pull request should be limited to, small pull requests leave less room for bugs and provide a clearer history of changes.
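If your Git host exposes per-PR line counts, flagging oversized pull requests is straightforward; the 400-line limit and PR ids here are illustrative conventions, not a standard:

```python
def flag_large_prs(pr_sizes, max_lines=400):
    """Return ids of PRs whose changed lines (additions + deletions)
    exceed a team-chosen soft limit."""
    return [pr_id for pr_id, lines in pr_sizes.items() if lines > max_lines]

# Hypothetical PRs with their total changed line counts:
prs = {"PR-101": 120, "PR-102": 950, "PR-103": 80}
print(flag_large_prs(prs))  # ['PR-102']
```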
Developed by researchers from GitHub and Microsoft, the SPACE metrics look at developer productivity from a holistic perspective, factoring in the human side of building software. How satisfied are developers with the work they are putting into the project? How happy are they with the review process or the quality of documentation? If developers report a high level of dissatisfaction, rework, and inefficiency, this provides important clues about areas of improvement and how you can redefine standard working practices.
Compared to the SPACE metrics, which dig deep into the different factors that impact developer productivity and satisfaction, eNPS assesses employee experience at a glance. It’s a widely used metric that can help you quickly understand whether your development team would endorse you as an employer and encourage others to join your organization.
The eNPS survey usually consists of a simple but powerful question: “On a scale from 0 to 10, how likely are you to recommend your workplace?”. Respondents answering 0 to 6 are considered detractors, 7 to 8 are passives, and 9 to 10 are promoters, with the final score calculated as eNPS = %Promoters - %Detractors.
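Scoring the survey is a one-liner once responses are bucketed; the ten answers below are made up:

```python
def enps(scores):
    """eNPS = %promoters (9-10) - %detractors (0-6); 7-8 count as passives."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return round((promoters - detractors) / len(scores) * 100)

# 4 promoters, 4 passives, 2 detractors out of 10 responses:
print(enps([9, 10, 9, 10, 7, 8, 7, 8, 4, 6]))  # 20
```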
Satisfied customers are the best advocates. Measuring customer experience metrics can provide critical insights into how they feel about your product and how much they are willing to invest in your software over time, whether it’s renewing subscriptions, upgrading to premium versions, or simply spreading the word. Here are two of the most popular metrics:
Net promoter score (NPS): Net promoter score quantifies user loyalty by asking how likely users are to recommend your software product on a scale of 0 to 10. Ratings of 9 or 10 mark promoters and usually reflect an excellent user experience.
Customer satisfaction score (CSAT): CSAT evaluates customers’ overall satisfaction with your app, calculated as a percentage of the most satisfied users out of the total number of respondents.
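Both can be computed much like eNPS, just over customer responses; the ratings below are hypothetical, and the CSAT “top box” convention (counting 4s and 5s on a 1-5 scale) is one common choice rather than a fixed rule:

```python
def nps(scores):
    """NPS: %promoters (9-10) minus %detractors (0-6) on a 0-10 scale."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return round((promoters - detractors) / len(scores) * 100)

def csat(ratings):
    """CSAT: share of respondents choosing 4 or 5 on a 1-5 scale."""
    satisfied = sum(1 for r in ratings if r >= 4)
    return round(satisfied / len(ratings) * 100)

print(nps([10, 9, 9, 8, 7, 6, 3]))  # 14
print(csat([5, 4, 4, 3, 2]))        # 60
```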
To paint a more accurate picture of how well your product is received and optimize your development approach, it’s important to also look at customer satisfaction from a usability perspective. The System Usability Scale assesses how easy it is for users to interact with your product by asking them to agree or disagree with a series of statements, for example “I think most people can learn this app quickly” or “I feel confident while using the app”. With this data in hand, you can plan improvements based on real user feedback and foster a culture of rapid iteration.
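SUS has a standard scoring scheme worth writing down: each of the ten responses is on a 1-5 scale, odd-numbered (positively worded) items contribute their score minus 1, even-numbered (negatively worded) items contribute 5 minus their score, and the sum is multiplied by 2.5 to land on a 0-100 scale. A minimal sketch:

```python
def sus_score(responses):
    """System Usability Scale score from ten 1-5 Likert responses."""
    assert len(responses) == 10, "SUS uses exactly ten statements"
    total = sum((s - 1) if i % 2 == 0 else (5 - s)  # items 1,3,5... vs 2,4,6...
                for i, s in enumerate(responses))
    return total * 2.5

# All-neutral answers (3s) land exactly at the midpoint:
print(sus_score([3] * 10))  # 50.0
```

The commonly cited average SUS score is around 68, so a 50 would suggest below-average usability.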
Software development has become the foundation for modern, customer-centric businesses. Whether you are outsourcing or overseeing development in-house, these KPIs will provide you with the hard data behind your development team’s performance, allowing you to align with business goals and achieve higher levels of efficiency, quality, and customer satisfaction.