
Stack analysis tools that really work?

Started by pozz July 29, 2021
Niklas Holsti <niklas.holsti@tidorum.invalid> writes:
> And, AIUI, at present that probability cannot be computed, and
> certainly depends on the test suite being measured. For example, on
> whether those tests lead to mispredictions in chains of conditional
> branches.
Maybe architectural simulation of the target CPU can help, if the architecture is known (i.e., the exact workings of the pipelines, branch predictors, etc.). And maybe forthcoming RISC-V CPUs will be more open about this than the current ARM stuff is.
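(As a toy illustration of why the test suite matters for those mispredictions: the sketch below simulates a 2-bit saturating-counter predictor -- a textbook scheme, not any real core's design -- and counts how many mispredictions two different test inputs provoke on the *same* branch. The "measured" timing of identical code would differ accordingly.)

/* Toy 2-bit saturating-counter branch predictor (a textbook scheme,
   not any real core's design).  The misprediction count -- and so any
   measured timing -- depends on the branch pattern the test exercises. */
#include <stdio.h>

static unsigned state = 2;                 /* 0..3; start "weakly taken" */

static int predict(void) { return state >= 2; }

static void update(int taken)
{
    if (taken && state < 3)  state++;
    if (!taken && state > 0) state--;
}

static unsigned mispredicts(const int *pattern, int n)
{
    unsigned miss = 0;
    state = 2;
    for (int i = 0; i < n; i++) {
        if (predict() != pattern[i]) miss++;
        update(pattern[i]);
    }
    return miss;
}

int main(void)
{
    int steady[8]      = {1,1,1,1,1,1,1,1};  /* friendly test input     */
    int alternating[8] = {1,0,1,0,1,0,1,0};  /* pathological test input */
    printf("steady: %u  alternating: %u\n",  /* prints 0 and 4 here     */
           mispredicts(steady, 8), mispredicts(alternating, 8));
    return 0;
}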
On 8/9/2021 12:50 AM, Niklas Holsti wrote:
> If you are not satisfied with Don's approach (extreme over-provision
> of processor power)
It's not necessarily "extreme". I only grossly underestimate the processor's ability to do the "must do" aspects of the design. This ensures that they actually *will* get done. And, what's *left* will likely exceed the needs of what I'd *like* to (also) get done -- but, as that is (by definition) not a "must do" aspect of the design, there are no criteria as to HOW MUCH must get done (including "none" and "all").

In my world, I can defer execution of some tasks, *move* tasks from an overburdened processor to a less-burdened one, *add* physical processors to the mix (on demand), etc. So, trying to match (at design time) the load to the capabilities is neither necessary nor desirable/economical.

There's no (a priori) "hard limit" on the number/types of applications that you can run on your PC, is there? Instead, you dynamically review the performance you are experiencing and adjust your expectations accordingly. Maybe kill off some application that isn't *as* useful to you (at the moment) as some other. Or, defer invoking an application until some *other* has exceeded its utility.

You could, of course, just buy a bigger/faster PC! Yet, you don't (up to a point), because that changes the economics of the solution -- and because you can "make do" with your current investment just by more intelligently scheduling its use!

By minimizing the "must do" aspect of the problem, you give yourself the most flexibility in how you address that/those requirements. The rest is "free", figuratively speaking.

For example, one of my applications handles telephony. It screens incoming callers, does speaker identification ("who is calling"), etc. To train the recognizer, I have it "listen in" on every call, given that it now KNOWS who I am talking with and can extract/update speech parameters from the additional "training material" that is present, FOR FREE, in the ongoing conversation (instead of explicitly requiring callers to "train" the recognizer in a "training session").

But, there's no reason that this has to happen:
- in real time
- while the conversation is in progress
- anytime "soon"

So, if you *record* the conversation (relatively inexpensive, as you are already moving the audio through the processor), you can pick some *later* time to run the model-update task. Maybe in the middle of the night when there are fewer demands (i.e., no one telephoning you at 3AM!). And, if something happens to "come up" that is more important than this activity, you can checkpoint the operation, kill off the process and let the more important activity have access to those resources.

Yes, it would be a much simpler design if you could just say: "I *need* the following resources to be able to update the recognizer models AS THE CONVERSATION IS HAPPENING". But, when the conversation was *over*, you'd have all those extra re$ource$ sitting idle.

Turn "hard" requirements into *soft* ones -- and then shift them, in time, to periods of lower resource utilization.

[The "hard real-time is hard but soft real-time is HARDER" idiom]
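A minimal sketch of that "defer the soft work" idea (every name and threshold here is invented for illustration, not taken from any actual system): the model update runs only while the node is quiet, and checkpoints after each bounded chunk so it can be killed -- and resumed -- at any moment.

/* Illustrative only: a soft task that runs during quiet periods and
   checkpoints so it can be killed/resumed at will.  load_average(),
   process_chunk() and save_checkpoint() are invented stand-ins.      */
#include <stdbool.h>
#include <stdio.h>

#define QUIET_LOAD 0.10            /* "3AM quiet" threshold, arbitrary */

static int chunks_left = 3;        /* pretend model update = 3 chunks  */

static double load_average(void) { return 0.05; }  /* stub: always quiet */
static bool process_chunk(void)  { return --chunks_left > 0; }
static void save_checkpoint(void)
{
    printf("checkpoint: %d chunk(s) left\n", chunks_left);
}

static void deferred_model_update(void)
{
    while (load_average() <= QUIET_LOAD) {  /* no MUST DO needs the CPU */
        bool more = process_chunk();        /* one bounded unit of work */
        save_checkpoint();                  /* killable at any point    */
        if (!more)
            return;                         /* models fully updated     */
    }
    /* Load rose: leave quietly; the checkpoint lets us resume at 3AM.  */
}

int main(void) { deferred_model_update(); return 0; }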
> you could try the hybrid WCET-estimation tools (RapiTime or
> TimeWeaver) which do not need to model the processors, but need to
> measure fine-grained execution times (on the basic-block level). The
> problem with such tools is that they cannot guarantee to produce an
> upper bound on the WCET, only a bound that holds with high
> probability. And, AIUI, at present that probability cannot be
> computed, and certainly depends on the test suite being measured. For
> example, on whether those tests lead to mispredictions in chains of
> conditional branches.
The equivalent mechanism in my world is monitoring *actual* load. As it increases, you have a mismatch between needs and capabilities. So, adjust one, the other, or both: shed load (remove processes from the current node) and/or add capacity (*move* processes to another node, including bringing that other node "on-line" to handle the load!).

If you keep in mind the fact that you only have to deal with the MUST DO tasks, this is a lot easier to wrap your head around. E.g., eventually, there *will* be a situation where what you *want* to do exceeds the capabilities that you have available -- period. So, you have to remember that only some of those are MUST DOs.

Yeah, it would be nice to have 3000 applications loaded and ready at a mouse-click... but that wouldn't be worth the cost of the system required to support it!
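A sketch of that monitor-and-shed policy (the task list, per-task "cost" and thresholds are all invented for illustration): soft tasks are shed in order of least current utility; MUST DO tasks are never candidates, and once only those remain, the answer is to add capacity instead.

/* Sketch of "monitor actual load, shed soft tasks first".  Task list,
   thresholds and the 10%-per-task cost are invented for illustration;
   MUST DO tasks are never candidates for shedding.                    */
#include <stdbool.h>
#include <stdio.h>

struct task { const char *name; bool must_do; int utility; bool running; };

static struct task tasks[] = {
    { "motor_control",  true,  100, true },  /* MUST DO: never shed */
    { "call_screening", true,   90, true },  /* MUST DO: never shed */
    { "model_update",   false,  20, true },  /* soft: shed if needed */
    { "ui_eye_candy",   false,   5, true },  /* soft: shed first     */
};

/* Pick the running soft task with the least current utility. */
static struct task *shed_candidate(void)
{
    struct task *victim = NULL;
    for (size_t i = 0; i < sizeof tasks / sizeof tasks[0]; i++)
        if (tasks[i].running && !tasks[i].must_do &&
            (!victim || tasks[i].utility < victim->utility))
            victim = &tasks[i];
    return victim;
}

static void rebalance(double load, double limit)
{
    while (load > limit) {
        struct task *t = shed_candidate();
        if (!t) break;        /* only MUST DOs left: add capacity instead
                                 (migrate, or bring another node on-line) */
        t->running = false;   /* or *move* it to a lighter-loaded node    */
        printf("shed %s\n", t->name);
        load -= 0.1;          /* pretend each task cost ~10% of the node  */
    }
}

int main(void) { rebalance(0.95, 0.80); return 0; }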
On 2021-08-09 21:26, Paul Rubin wrote:
> Niklas Holsti <niklas.holsti@tidorum.invalid> writes:
>> And, AIUI, at present that probability cannot be computed, and
>> certainly depends on the test suite being measured. For example, on
>> whether those tests lead to mispredictions in chains of conditional
>> branches.
>
> Maybe architectural simulation of the target CPU can help, if the
> architecture is known (i.e., the exact workings of the pipelines,
> branch predictors, etc.).
For timing, one needs *micro*-architectural simulation (as the term is commonly used, for example in comp.arch). But I think you mean that, so this is a terminological quibble.
> And maybe forthcoming RISC-V CPUs will be more open about this than
> the current ARM stuff is.
I suspect that the microarchitecture will be where the various RISC-V implementors will compete (that, and peripherals and packaging), so I'm not optimistic that they will be very open about their microarchitectures. However, I don't know how far into the micro-level the RISC-V standardization and open-source licensing extends.
