Software Metrics (cat flame > /dev/null)

Started by Don Y July 11, 2011
Hi,

The subject of "software metrics" has limped across my desk.

<frown>

I don't dislike metrics for what they may show/fail-to-show.
But, rather, because they usually don't have a stated "purpose".

I.e., What are you trying to measure?  Why are you trying to
measure it?  What do you plan to do with the result (besides
put it in a 3-ring notebook)?

I can argue many points *against* the (seemingly arbitrary)
point of tracking metrics...  but, I'd rather approach this
from the *other* side of the fence:  defining *realistic*
goals to which metrics can serve as VALUABLE insights, the
appropriate metric(s) to use to measure progress towards/away
that goal, and the ACKNOWLEDGED shortcomings inherent in a
particular measurement strategy.

E.g., you can use metrics to measure productivity, complexity,
reliability, maintainability, cost, completion, etc.  But, often
you only get a snapshot of *one* of these -- at the expense of
all the *others*.

So, rather than arguing on N "fronts" (and appearing "obstructionist"),
what guidance (firsthand experience!) can folks offer to bend the
debate into one that will produce meaningful results (instead of
just "pages of numbers")?

Thx,
--don
On 07/11/2011 02:30 PM, Don Y wrote:
> Hi,
>
> The subject of "software metrics" has limped across my desk.
>
> <frown>
>
> I don't dislike metrics for what they may show/fail-to-show.
> But, rather, because they usually don't have a stated "purpose".
>
> I.e., What are you trying to measure?  Why are you trying to
> measure it?  What do you plan to do with the result (besides
> put it in a 3-ring notebook)?
>
> I can argue many points *against* the (seemingly arbitrary)
> point of tracking metrics...  but, I'd rather approach this
> from the *other* side of the fence:  defining *realistic*
> goals to which metrics can serve as VALUABLE insights, the
> appropriate metric(s) to use to measure progress towards/away
> that goal, and the ACKNOWLEDGED shortcomings inherent in a
> particular measurement strategy.
>
> E.g., you can use metrics to measure productivity, complexity,
> reliability, maintainability, cost, completion, etc.  But, often
> you only get a snapshot of *one* of these -- at the expense of
> all the *others*.
>
> So, rather than arguing on N "fronts" (and appearing "obstructionist"),
> what guidance (firsthand experience!) can folks offer to bend the
> debate into one that will produce meaningful results (instead of
> just "pages of numbers")?
When, in the past, I've been a software lead, the only two metrics that
I found of any fundamental value were "it makes Tim happy" and "it makes
the boss happy".  I'll admit when I have a coworker who's way better at
software than I am, I'll surreptitiously use "makes Nancy (or Bill)*
happy" to heavily weigh "makes Tim happy".

I think the real problem is that you're trying to make a metric that
somehow maps to "makes the project complete on time and under budget" or
"makes the company money" or "saves the company the embarrassment of
appearing on the 6:00 news hour next to a picture of a smoking hole in
the ground".  But those metrics are really difficult to map, so people
end up using metrics that are -- however well intentioned -- smoke and
mirrors.

* These are actual names, and if one of you is reading this you know who
you are, and thanks for making me a better software engineer!

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html
Hi Tim,

On 7/11/2011 3:31 PM, Tim Wescott wrote:
> On 07/11/2011 02:30 PM, Don Y wrote:
>> The subject of "software metrics" has limped across my desk.
>>
>> I.e., What are you trying to measure?  Why are you trying to
>> measure it?  What do you plan to do with the result (besides
>> put it in a 3-ring notebook)?
>>
>> I can argue many points *against* the (seemingly arbitrary)
>> point of tracking metrics...  but, I'd rather approach this
>> from the *other* side of the fence:  defining *realistic*
>> goals to which metrics can serve as VALUABLE insights, the
>> appropriate metric(s) to use to measure progress towards/away
>> that goal, and the ACKNOWLEDGED shortcomings inherent in a
>> particular measurement strategy.
>>
>> So, rather than arguing on N "fronts" (and appearing "obstructionist"),
>> what guidance (firsthand experience!) can folks offer to bend the
>> debate into one that will produce meaningful results (instead of
>> just "pages of numbers")?
>
> When, in the past, I've been a software lead, the only two metrics that
> I found of any fundamental value were "it makes Tim happy" and "it makes
> the boss happy".  I'll admit when I have a coworker who's way better at
> software than I am, I'll surreptitiously use "makes Nancy (or Bill)*
> happy" to heavily weigh "makes Tim happy".
>
> I think the real problem is that you're trying to make a metric that
> somehow maps to "makes the project complete on time and under budget" or
> "makes the company money" or "saves the company the embarrassment of
> appearing on the 6:00 news hour next to a picture of a smoking hole in
> the ground".  But those metrics are really difficult to map, so people
> end up using metrics that are -- however well intentioned -- smoke and
> mirrors.
This shows both sides of the issue, IMO.

On the one hand, the "numbers" one comes up with are often only relevant
in that particular Industry and/or organization.  I.e., even if you
settle on a *particular* ("standardized") metric, comparing observations
of that metric in a desktop application vs. an embedded vs. a real-time,
etc. application is essentially meaningless.  While the metric might be
"standardized", the problem to which it is *applied* renders it
apples-and-oranges.

On the other hand, without *some* numerical "score", there is no way for
an organization to evaluate its own progress.  How do you know if
quality is improving?  Or productivity?  etc. (depending on what your
actual goal happens to be).

It's hard to draw a parallel to any other aspect of a business.  E.g.,
imagine if *your* accounting department tracked everything in terms of
dollars... and another accounting department tracked everything in terms
of loopholos.  I.e., comparing between departments is meaningless (since
loopholos and dollars are orthogonal measurement units) -- yet, comparing
current to previous within a department *does* have value!

IMO, the actual metric(s) chosen have to be *easy* to measure
unambiguously (automated), not easily "subverted" *and* only used for
relative comparisons within an organization/individual (i.e., a person
writing device drivers would exhibit different metrics than a person
working on GUI's -- even within the same Industry/Organization).
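As a rough sketch of what I mean by "easy to measure, automated, and
only compared against its own history" (the file patterns, the particular
counts, and the CSV log below are placeholders, not a recommendation):

    #!/usr/bin/env python
    # Rough sketch: snapshot a few trivially-measurable numbers per C source
    # file and append them to a CSV log, so that trends *within* this code
    # base can be compared release to release.  The paths and the choice of
    # counts are purely illustrative.
    import csv, os, sys, time

    def count_lines(path):
        total = code = comment = 0
        in_block = False
        with open(path, errors="replace") as f:
            for line in f:
                total += 1
                s = line.strip()
                if not s:
                    continue                     # blank line
                if in_block:
                    comment += 1
                    if "*/" in s:
                        in_block = False
                elif s.startswith("//"):
                    comment += 1
                elif s.startswith("/*"):
                    comment += 1
                    in_block = "*/" not in s
                else:
                    code += 1
        return total, code, comment

    def snapshot(src_root, log_path):
        stamp = time.strftime("%Y-%m-%d")
        with open(log_path, "a", newline="") as out:
            w = csv.writer(out)
            for dirpath, _, names in os.walk(src_root):
                for name in names:
                    if name.endswith((".c", ".h")):
                        p = os.path.join(dirpath, name)
                        w.writerow([stamp, p] + list(count_lines(p)))

    if __name__ == "__main__":
        snapshot(sys.argv[1], "metrics_log.csv")   # e.g.  snapshot.py src/

Nothing clever there.  The point is only that the numbers are cheap to
collect, repeatable, and compared against the same code base's own
history rather than against some other team's numbers.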
On Mon, 11 Jul 2011 23:30:43 +0200, Don Y <nowhere@here.com> wrote:
> The subject of "software metrics" has limped across my desk.
>
> <frown>
>
> I don't dislike metrics for what they may show/fail-to-show.
> But, rather, because they usually don't have a stated "purpose".
Do statistics have a purpose? Does telemetry have an inherent purpose?
> I.e., What are you trying to measure?
Clarity, maintainability and testability, usually.
> Why are you trying to measure it?
If the metrics are bad, then the code is likely to be bad. Bad code is a liability.
> What do you plan to do with the result (besides
> put it in a 3-ring notebook)?
Use it to discipline developers. Either by showing the bad numbers or if they don't listen, by whacking them with the 3-ring notebook.
> I can argue many points *against* the (seemingly arbitrary)
> point of tracking metrics...  but, I'd rather approach this
> from the *other* side of the fence:  defining *realistic*
> goals to which metrics can serve as VALUABLE insights, the
> appropriate metric(s) to use to measure progress towards/away
> that goal, and the ACKNOWLEDGED shortcomings inherent in a
> particular measurement strategy.
>
> E.g., you can use metrics to measure productivity, complexity,
> reliability, maintainability, cost, completion, etc.  But, often
> you only get a snapshot of *one* of these -- at the expense of
> all the *others*.
I don't understand that. How can a measurement of one metrics group affect the results of other metrics groups?
> So, rather than arguing on N "fronts" (and appearing "obstructionist"),
> what guidance (firsthand experience!) can folks offer to bend the
> debate into one that will produce meaningful results (instead of
> just "pages of numbers")?
Metrics provide objective insight, that can be debated and reasoned,
unlike subjective insight.  It allows prioritization of coding and
testing effort.

But in the end, all decisions are subjective.  They come down to an
incomplete view of the product development effort that will make some
random human feel good enough about a certain decision.

--
Made with Opera's revolutionary e-mail program: http://www.opera.com/mail/
(Remove the obvious prefix to reply.)
Hi Boudewijn,

On 7/12/2011 1:48 AM, Boudewijn Dijkstra wrote:

>> I don't dislike metrics for what they may show/fail-to-show.
>> But, rather, because they usually don't have a stated "purpose".
>
> Do statistics have a purpose?  Does telemetry have an inherent purpose?
Sorry, I meant "folks *pushing* for the metrics don't have a clear
understanding of what they want to get *from* those metrics".  I.e.,
*what* specifically are you trying to measure and what do you want
to do with the data?
>> I.e., What are you trying to measure?
>
> Clarity, maintainability and testability, usually.
Or, productivity, complexity, etc.
>> Why are you trying to measure it?
>
> If the metrics are bad, then the code is likely to be bad.  Bad code is a
> liability.
But what constitutes a bad metric? Too few LoC/man-day? Too many function points? etc. This dovetails with the above question...
>> What do you plan to do with the result (besides
>> put it in a 3-ring notebook)?
>
> Use it to discipline developers.  Either by showing the bad numbers or if
> they don't listen, by whacking them with the 3-ring notebook.
Ha! <grin> I actually don't think they know *what* they want the numbers for (which is why they don't know *which* numbers they want!) -- but, somehow, think having numbers is better than *not* having numbers... <frown>
>>> I can argue many points *against* the (seemingly arbitrary)
>>> point of tracking metrics...  but, I'd rather approach this
>>> from the *other* side of the fence:  defining *realistic*
>>> goals to which metrics can serve as VALUABLE insights, the
>>> appropriate metric(s) to use to measure progress towards/away
>>> that goal, and the ACKNOWLEDGED shortcomings inherent in a
>>> particular measurement strategy.
>>>
>>> E.g., you can use metrics to measure productivity, complexity,
>>> reliability, maintainability, cost, completion, etc.  But, often
>>> you only get a snapshot of *one* of these -- at the expense of
>>> all the *others*.
>>
>> I don't understand that.  How can a measurement of one metrics group
>> affect the results of other metrics groups?
If you are trying to measure productivity, people will learn to "game" the numbers to keep their totals "up": LoC/man-day. Does this come at the expense of *quality*? I.e., is a buggy-LoC the same as a nonbuggy-LoC? How do you differentiate? Is the developer keeping his productivity numbers up but at the expense of long-term maintenance issues? etc.
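A toy illustration (the numbers are completely invented) of how the
"winner" flips depending on which ratio you stare at:

    # Invented numbers, purely to show how rankings flip between metrics.
    devs = {
        "A": {"loc_per_week": 500, "defects_found_later": 10},
        "B": {"loc_per_week": 200, "defects_found_later": 1},
    }

    for name, d in devs.items():
        defects_per_kloc = 1000.0 * d["defects_found_later"] / d["loc_per_week"]
        print("%s: %3d LoC/week, %5.1f defects/KLoC"
              % (name, d["loc_per_week"], defects_per_kloc))

    # Prints:
    #   A: 500 LoC/week,  20.0 defects/KLoC   <- "best" by raw output
    #   B: 200 LoC/week,   5.0 defects/KLoC   <- "best" by field quality

Reward the first column and you get one behavior; reward the second and
you get another.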
>> So, rather than arguing on N "fronts" (and appearing "obstructionist"),
>> what guidance (firsthand experience!) can folks offer to bend the
>> debate into one that will produce meaningful results (instead of
>> just "pages of numbers")?
>
> Metrics provide objective insight, that can be debated and reasoned,
> unlike subjective insight.  It allows prioritization of coding and
> testing effort.
Yes -- assuming you have a stated goal/objective.
> But in the end, all decisions are subjective.  They come down to an
> incomplete view of the product development effort that will make some
> random human feel good enough about a certain decision.
<frown>  That reflects the bean-counter aspect of metrics and why
they are so often disparaged.

I believe there is a place for (various) metrics in the development
process.  But, doing so without adequate forethought and planning
(and buy-in) makes an arbitrary imposition just a thorn to be
worked-around.

I suspect everyone moaned about being "graded" on papers, exams, etc.
in school -- but, no one (realistically) would claim those grades
were without merit (?)
On Tue, 12 Jul 2011 17:39:26 -0700, Don Y wrote:

> If you are trying to measure productivity, people will learn to "game"
> the numbers to keep their totals "up": LoC/man-day.  Does this come at
> the expense of *quality*?  I.e., is a buggy-LoC the same as a
> nonbuggy-LoC?  How do you differentiate?
I don't think that you'll find many people suggesting that LoC/day or
even function points is a useful metric, these days.

I think that the shape of the number-of-open-issues vs time graph is
probably a fairly reasonable handle on project trajectory.  Often goes
something like a bell-curve.  When issues are being closed faster than
they're being opened you're probably more than half way there, and as it
tails asymptotically towards zero you're nearly there...

Of course you need to be (diligently) using an issue tracking system to
be able to easily use that metric.

A well maintained regression/unit test suite run automatically on
check-in or overnight should keep a lid on the "buggy" issue.

Cheers,

--
Andrew
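To put a rough sketch behind the open-issues-versus-time idea (the dates
below are invented, and the opened/closed pairs are assumed to come from
whatever your issue tracker can export):

    # Rough sketch: turn (opened, closed) dates exported from an issue
    # tracker into an "open issues vs. time" series.  Dates are invented.
    from datetime import date, timedelta

    issues = [
        (date(2011, 6, 1), date(2011, 6, 20)),
        (date(2011, 6, 5), None),               # still open
        (date(2011, 6, 10), date(2011, 7, 1)),
        (date(2011, 6, 25), date(2011, 7, 5)),
    ]

    day, end = date(2011, 6, 1), date(2011, 7, 10)
    while day <= end:
        open_count = sum(1 for opened, closed in issues
                         if opened <= day and (closed is None or closed > day))
        print(day, "#" * open_count)            # crude text-mode trend plot
        day += timedelta(days=1)

When the rows start getting shorter faster than they get longer, you are
on the far side of that bell curve.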
On Wed, 13 Jul 2011 02:39:26 +0200, Don Y <nowhere@here.com> wrote:
> On 7/12/2011 1:48 AM, Boudewijn Dijkstra wrote:
>
>>> I don't dislike metrics for what they may show/fail-to-show.
>>> But, rather, because they usually don't have a stated "purpose".
>>
>> Do statistics have a purpose?  Does telemetry have an inherent purpose?
>
> Sorry, I meant "folks *pushing* for the metrics don't have a clear
> understanding of what they want to get *from* those metrics.  I.e.,
> *what* specifically are you trying to measure and what do you want
> to do with the data?
Indeed, they are just numbers and not a goal in themselves.  As usual.
>>> Why are you trying to measure it?
>>
>> If the metrics are bad, then the code is likely to be bad.  Bad code is a
>> liability.
>
> But what constitutes a bad metric?  Too few LoC/man-day?
> Too many function points?  etc.
Negative LoC change can be good if it reduces dead or duplicate code.
Positive LoC change while not changing functionality can be good if it
increases clarity.  Metrics should always be combined to produce a
useful indicator of an aspect of code quality, like clarity,
maintainability and testability, or of productivity.
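As a rough sketch of "combined to produce a useful indicator" (the
inputs, the scaling and the weights here are completely arbitrary; every
organization would have to argue out its own):

    # Rough sketch only: fold several raw measurements into one relative
    # "maintainability" indicator.  The scales and weights are arbitrary.
    def maintainability(avg_fn_length, max_nesting, comment_ratio):
        # Normalize each raw measurement to roughly 0..1 (1 = better).
        length_score  = max(0.0, 1.0 - avg_fn_length / 100.0)  # shorter functions
        nesting_score = max(0.0, 1.0 - max_nesting / 8.0)       # flatter control flow
        comment_score = min(1.0, comment_ratio / 0.25)          # some commenting

        # Arbitrary weights; tune (and argue about) these per organization.
        return 0.4 * length_score + 0.4 * nesting_score + 0.2 * comment_score

    print(round(maintainability(avg_fn_length=30,  max_nesting=3, comment_ratio=0.15), 2))  # 0.65
    print(round(maintainability(avg_fn_length=120, max_nesting=6, comment_ratio=0.05), 2))  # 0.14

The absolute value means nothing; only the comparison between modules of
the same project, or of the same module over time, does.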
>>> What do you plan to do with the result (besides
>>> put it in a 3-ring notebook)?
>>
>> Use it to discipline developers.  Either by showing the bad numbers or if
>> they don't listen, by whacking them with the 3-ring notebook.
>
> Ha!  <grin>  I actually don't think they know *what* they want
> the numbers for
The developers themselves may not have any interest in the numbers, but the higher-ups certainly do.
> (which is why they don't know *which* numbers
> they want!) -- but, somehow, think having numbers is better
> than *not* having numbers...
>
> <frown>
>
>>> I can argue many points *against* the (seemingly arbitrary)
>>> point of tracking metrics...  but, I'd rather approach this
>>> from the *other* side of the fence:  defining *realistic*
>>> goals to which metrics can serve as VALUABLE insights, the
>>> appropriate metric(s) to use to measure progress towards/away
>>> that goal, and the ACKNOWLEDGED shortcomings inherent in a
>>> particular measurement strategy.
>>>
>>> E.g., you can use metrics to measure productivity, complexity,
>>> reliability, maintainability, cost, completion, etc.  But, often
>>> you only get a snapshot of *one* of these -- at the expense of
>>> all the *others*.
>>
>> I don't understand that.  How can a measurement of one metrics group
>> affect the results of other metrics groups?
>
> If you are trying to measure productivity, people will learn to
> "game" the numbers to keep their totals "up": LoC/man-day.  Does
> this come at the expense of *quality*?  I.e., is a buggy-LoC the
> same as a nonbuggy-LoC?  How do you differentiate?
As Andrew said, LoC/man-day is just one indicator of productivity. As you are hinting at, more factors should be taken into account.
> Is the developer keeping his productivity numbers up but at the
> expense of long-term maintenance issues?  etc.
If maintainability is not less important than productivity, then why is the developer focussing so much on productivity?
>>> So, rather than arguing on N "fronts" (and appearing "obstructionist"),
>>> what guidance (firsthand experience!) can folks offer to bend the
>>> debate into one that will produce meaningful results (instead of
>>> just "pages of numbers")?
>>
>> Metrics provide objective insight, that can be debated and reasoned,
>> unlike subjective insight.  It allows prioritization of coding and
>> testing effort.
>
> Yes -- assuming you have a stated goal/objective.
Not necessarily. Will the testers start testing any code, or will they start with the highest 'testability' number and give the developers some time to improve on the lower end? So, the stated goals/objectives (as derived from the high-level business objectives) get crystallized with experience. It's a feedback loop.
>> But in the end, all decisions are subjective.  They come down to an
>> incomplete view of the product development effort that will make some
>> random human feel good enough about a certain decision.
>
> <frown>  That reflects the bean-counter aspect of metrics and why
> they are so often disparaged.
How can you claim (or even aspire to) progression without indicators of progress?
> I believe there is a place for
> (various) metrics in the development process.  But, doing so
> without adequate forethought and planning (and buy-in) makes
> an arbitrary imposition just a thorn to be worked-around.
As almost any arbitrary imposition would.
> I suspect everyone moaned about being "graded" on papers, exams, etc.
> in school -- but, no one (realistically) would claim those grades
> were without merit (?)
--
Made with Opera's revolutionary e-mail program: http://www.opera.com/mail/
(Remove the obvious prefix to reply.)
The book "Making Software" does a great job
(if a bit voluminous and hard to read) of
debunking pretty much all published
metric-based "productivity" papers, and
explaining why metrics aren't generally
useful.

At least, that was my interpretation ;-)

Might help disabuse anyone tempted to believe
too much in metrics...

Hope this helps,
Best Regards, Dave

http://www.amazon.com/Making-Software-Really-Works-Believe/dp/0596808321/ref=sr_1_1?s=books&ie=UTF8&qid=1310563561&sr=1-1
We have some pretty good metrics for measuring productivity. They are
accurate and good info for future proposals.

The problem is management comes to the conclusion that we have learned
from our mistakes and can therefore decrease the hours by 40% or so.
It's usually displayed in the form that we should challenge ourselves
and drive out inefficiencies with some song and dance meeting. So they
underbid projects, basically throwing out the metrics. In reality, the
original numbers turn out to be correct for new programs, as new
problems always arise that are different or unexpected compared to the
previous project.
Hi Andrew,

On 7/12/2011 10:44 PM, Andrew Reilly wrote:
> On Tue, 12 Jul 2011 17:39:26 -0700, Don Y wrote:
>
>> If you are trying to measure productivity, people will learn to "game"
>> the numbers to keep their totals "up": LoC/man-day.  Does this come at
>> the expense of *quality*?  I.e., is a buggy-LoC the same as a
>> nonbuggy-LoC?  How do you differentiate?
>
> I don't think that you'll find many people suggesting that LoC/day or
> even function points is a useful metric, these days.
Sorry, I wasn't trying to imply that it was.  Rather, I was trying to
address Boudewijn's comment re: "How can a measurement of one metrics
group affect the results of other metrics groups?"  This seemed like
the most obvious way of making that point  :>
> I think that the shape of the number-of-open-issues vs time graph is
> probably a fairly reasonable handle on project trajectory.  Often goes
> something like a bell-curve.  When issues are being closed faster than
> they're being opened you're probably more than half way there, and as it
> tails asymptotically towards zero you're nearly there...
>
> Of course you need to be (diligently) using an issue tracking system to
> be able to easily use that metric.
It also doesn't help if your goal was, for example, to estimate resources required to *undertake* a particular project (before you've even gone to the trouble of preparing specifications, etc.).
> A well maintained regression/unit test suite run automatically on
> check-in or overnight should keep a lid on the "buggy" issue.
My point is that you need to understand *why* you want the metrics and what you expect to get *from* them before you can even address *which* metrics are worth gathering and how to gather them! :-/