
Instrumentation question.

Started by Les Cargill September 7, 2016
I've always suspected there was something clever that I could use for a 
thing like this, but ... I was too afraid to ask about it.

Suppose I am on an extremely deterministic platform.

Suppose I can calculate the number of microseconds it takes to call 
something[1].

Suppose I can check and calculate things about this count every <n> 
seconds.

Suppose also that there are two cases for "something" :

- There is nothing to do.
- There is something to do. "Something" is constrained to "one thing" by 
design.

Let's also assume I can (cheaply) calculate the minimum and maximum 
number of microseconds to do either of "nothing" or "something".

Obviously a histogram is in order. But wait!

It seems to me that I should be able to estimate a histogram based on:
- minimum time
- maximum time
- average time.

What is this sort of ... inquiry called, and how do I stop being quite
so dull about it?

[1] I suspect that one answer is to extend the number of buckets
into more finely grained measurements of sub-operations of "something".
Example: a check for "serial port has data" vs. "relieve one unit from 
serial port" vs the overhead from all the dance surrounding those two.

--
Les Cargill
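For concreteness, here is one way the cheap per-path bookkeeping asked
about above (min, max, and running average per "something") might be
sketched in a few words of RAM. This is a hypothetical illustration;
the struct and function names are invented, not anyone's actual code.

```c
#include <stdint.h>

/* Hypothetical per-path timing bucket: tracks min, max, and a running
 * sum so the average can be derived, using only a few words of RAM. */
struct time_bucket {
    uint32_t min_us;   /* smallest observed duration, microseconds */
    uint32_t max_us;   /* largest observed duration */
    uint64_t sum_us;   /* running sum, for computing the mean */
    uint32_t count;    /* number of samples folded in */
};

static void bucket_init(struct time_bucket *b)
{
    b->min_us = UINT32_MAX;
    b->max_us = 0;
    b->sum_us = 0;
    b->count  = 0;
}

static void bucket_add(struct time_bucket *b, uint32_t us)
{
    if (us < b->min_us) b->min_us = us;
    if (us > b->max_us) b->max_us = us;
    b->sum_us += us;
    b->count++;
}

static uint32_t bucket_mean(const struct time_bucket *b)
{
    return b->count ? (uint32_t)(b->sum_us / b->count) : 0;
}
```

A "nothing to do" bucket and a "something to do" bucket would each get
one of these, updated from whatever microsecond timestamp source the
platform provides.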


Les Cargill <lcargill99@comcast.com> writes:
> Obviously a histogram is in order. But wait!
>
> It seems to me that I should be able to estimate a histogram based on:
> - minimum time
> - maximum time
> - average time.
>
> What is this sort of ... inquiry called, and how do I stop being quite
> so dull about it?
You want a probability distribution of the time it takes to do an
operation?

You might look at the docs for Haskell's Criterion benchmarking library:

http://www.serpentine.com/criterion/tutorial.html

You presumably won't be able to use it directly (unless you're coding in
Haskell), and lots of the doc is very Haskell-specific, but it can give
you an idea of how a fancy package of this sort goes about things.
On 07/09/16 06:50, Les Cargill wrote:
> I've always suspected there was something clever that I could use for a thing
> like this, but ... I was too afraid to ask about it.
>
> [snip]
>
> Obviously a histogram is in order. But wait!
>
> It seems to me that I should be able to estimate a histogram based on:
> - minimum time
> - maximum time
> - average time.
>
> What is this sort of ... inquiry called, and how do I stop being quite
> so dull about it?
>
> [snip]
Latencies are usually displayed in the form of a cumulative distribution
function (CDF), where the x-axis is time and the y-axis ranges from 0 to
1, indicating the probability that the latency is less than the time on
the x-axis. Effectively the CDF is the integral of the PDF.

Either the PDF or the CDF will show multimodal distributions and extreme
values.

For hard real-time systems either the absolute min or max is necessary.
For soft real-time systems the 5th or 95th percentile is more useful (or
the 1st/99th percentile, etc.).

With the CDF it is trivial to read off the various percentiles.
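Reading a percentile off the CDF, as described above, amounts to a
cumulative sum over the histogram counts. A minimal sketch (bucket
count and names are illustrative, not from any particular library):

```c
#include <stdint.h>

#define NBUCKETS 64  /* illustrative time resolution */

/* Given a latency histogram (the PDF, as raw counts per time bucket),
 * return the smallest bucket index whose cumulative count reaches the
 * requested percentile, i.e. read a percentile off the CDF. */
static int cdf_percentile(const uint32_t pdf[NBUCKETS], int percentile)
{
    uint64_t total = 0, running = 0;
    int i;

    for (i = 0; i < NBUCKETS; i++)
        total += pdf[i];
    if (total == 0)
        return -1;                        /* no samples yet */

    /* round up so e.g. the 95th percentile of 100 samples needs 95 */
    uint64_t threshold = (total * (uint64_t)percentile + 99) / 100;

    for (i = 0; i < NBUCKETS; i++) {
        running += pdf[i];                /* running sum = the CDF */
        if (running >= threshold)
            return i;                     /* bucket holding the percentile */
    }
    return NBUCKETS - 1;
}
```

Multiplying the returned bucket index by the bucket width gives the
latency value at that percentile.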
Paul Rubin wrote:
> Les Cargill <lcargill99@comcast.com> writes:
>> Obviously a histogram is in order. But wait!
>> It seems to me that I should be able to estimate a histogram based on:
>> - minimum time
>> - maximum time
>> - average time.
>> What is this sort of ... inquiry called, and how do I stop being quite
>> so dull about it?
>
> You want a probability distribution of the time it takes to do an
> operation?
>
> You might look at the docs for Haskell's Criterion benchmarking library:
>
> http://www.serpentine.com/criterion/tutorial.html
>
> You presumably won't be able to use it directly (unless you're coding in
> Haskell), and lots of the doc is very Haskell-specific, but it can give
> you an idea of how a fancy package of this sort goes about things.
The Haskell reference is quite fruitful. Thanks for that.

The core mechanism of "bench" is a KDE estimator. I'd forgotten about
those. I don't think a min, max and mean are sufficient to create a KDE
estimator off board unless you know precisely the number of modes.

There is a C++ library called libkde. It won't fit, and I don't have
enough bandwidth to use it off-target.

In this case, the better part of valor is to add more and more finely
grained buckets until the buckets become more constant, that is, until
min, max and average are closer together. That yields a mildly
complicated model for each path through the code.

Since more buckets mean mo' problems, I've split the benchmarking into
two types controlled by compiler options. That seems to have done the
trick. Since the thing is essentially a protocol converter, the four
buckets[1] are enough to make design decisions. I also now have a plan
for estimating dropped "packets".

[1] read from A, read from B, write to A, write to B.

--
Les Cargill
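Compiler-option-gated benchmarking of the kind Les describes is often
done with a macro that compiles away in the coarse build. A hypothetical
sketch (the four bucket names follow the footnote; the macro and
function names are invented for illustration):

```c
#include <stdint.h>

/* The four paths named in the footnote above. */
enum { BUCKET_READ_A, BUCKET_READ_B, BUCKET_WRITE_A, BUCKET_WRITE_B,
       BUCKET_COUNT };

static uint32_t bucket_min[BUCKET_COUNT];  /* 0 means "no sample yet" */
static uint32_t bucket_max[BUCKET_COUNT];

static void bench_record(int bucket, uint32_t us)
{
    if (bucket_min[bucket] == 0 || us < bucket_min[bucket])
        bucket_min[bucket] = us;
    if (us > bucket_max[bucket])
        bucket_max[bucket] = us;
}

#ifdef BENCH_FINE
/* Fine-grained build: record every sub-operation's duration. */
#define BENCH_SAMPLE(bucket, us) bench_record((bucket), (us))
#else
/* Coarse build: the instrumentation compiles away entirely. */
#define BENCH_SAMPLE(bucket, us) ((void)0)
#endif
```

Call sites use `BENCH_SAMPLE(BUCKET_READ_A, elapsed_us)` unconditionally;
the compiler option decides whether that costs anything.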
Tom Gardner wrote:
> On 07/09/16 06:50, Les Cargill wrote:
>> [snip]
>
> Latencies are usually displayed in the form of a
> cumulative distribution function, where the x-axis
> is time and the y-axis range is 0 to 1 indicating
> the probability that the latency is less than the
> time on the x-axis. Effectively the CDF is the
> integral of the PDF.
>
> Either the PDF or CDF will show multimodal
> distributions and extreme values.
>
> For hard real-time systems either the absolute min
> or max is necessary. For soft real-time systems the
> 5th or 95th percentile is more useful (or the
> 1st/99th percentile, etc.).
>
> With the CDF it is trivial to read off the various
> percentiles.
That may be something I try, but for now, even a simple CDF requires a
lot of RAM for a target with 8K main memory :)

But I'll continue playing with it.

--
Les Cargill
On 07/09/16 19:10, Les Cargill wrote:
> Tom Gardner wrote:
>> [snip]
>>
>> With the CDF it is trivial to read off the various
>> percentiles.
>
> That may be something I try, but for now, even a simple CDF
> requires a lot of RAM for a target with 8K main memory :)
>
> But I'll continue playing with it.
You store the PDF, and compute the CDF when needed.

The range and resolution translate directly into the number of
"buckets", and the number of bits in each bucket is determined by the
number of samples. No surprises there.

Sometimes pulsing an output line high during the interval, and averaging
with an RC filter, is sufficient.
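For a sense of scale on an 8K part: a stored PDF with saturating 16-bit
counters fits easily, and the CDF falls out on demand. A sketch with
illustrative (invented) bucket count and resolution:

```c
#include <stdint.h>

/* 128 buckets x 16-bit counts = 256 bytes of RAM for the whole PDF,
 * comfortably within an 8K target. Counts saturate rather than wrap,
 * so long runs can't corrupt the shape of the distribution. */
#define PDF_BUCKETS     128
#define PDF_US_PER_BIN  4        /* resolution: 4 us per bucket */

static uint16_t pdf[PDF_BUCKETS];

static void pdf_record(uint32_t latency_us)
{
    uint32_t bin = latency_us / PDF_US_PER_BIN;
    if (bin >= PDF_BUCKETS)
        bin = PDF_BUCKETS - 1;   /* clamp outliers into the last bucket */
    if (pdf[bin] != UINT16_MAX)
        pdf[bin]++;              /* saturating increment */
}
```

The range (512 us here) and the 4 us resolution fix the bucket count, as
described above; widening either means more buckets or coarser bins.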