EmbeddedRelated.com
Forums
Memfault Beyond the Launch

Delayed "printf"

Started by Tim Wescott April 24, 2017
On 25/04/17 07:19, Robert Wessel wrote:
> On Mon, 24 Apr 2017 19:42:30 -0500, Tim Wescott > <seemywebsite@myfooter.really> wrote: > >> Sometimes it's necessary to float a notion just so that someone can >> affirm your suspicion that yes, you really are an idiot. >> >> On the off chance that I'm a genius, though, here goes: >> >> I don't like putting "printf" in fast tasks for obvious reasons -- even >> with infinite spooling, it can take time to process a format string. In >> my usual context, this can result in audible sounds from motors when I >> print some bit of debugging information from a high-priority task. The >> problem can't be alleviated by spooling the output of the printf, because >> it's the execution time of the printf itself that's the problem. >> >> Just how dumb is it to have "printf" queue up the format string AND the >> parameters into a buffer, to be printed out in some low-priority task? >> For a system with sufficient memory, this seems like an astonishingly >> sensible thing to me. >> >> Clearly you'd have to wrap it up somehow so that if the buffer was full >> no printf at all would happen -- but I'm kinda liking the notion. >> >> So -- whatcha think? > > > A problem is that finding all the input parameters, and copying them > someplace, may well be a considerable amount of work. *Output* > parameters (%n, the return value), have already been mentioned, and > pose significant additional problems. > > While copying something like an integer or float is going to be fairly > quick (assuming you have some place to put them already allocated), > and likely rather faster than than actually formatting them, you're > still going to have to do a non-trivial amount of parsing to actually > *find* those parameters. And remember you have not just the > parameters, but the precision and min size can be specified as a > parameter ("*"). With strings, you have a worse allocation problem, > and copying the strings to a temporary area is going to be as much > work as actually processing them. > > So at best, you're likely to only save a modest amount of CPU time in > the issuing thread formatting the output string, and quite possibly > that CPU usage in the main thread may be a non-trivial amount worse. > Overall CPU usage will certainly be worsened considerably. > > OTOH, your real goal is to prevent the printf from blocking because it > lacks output space. > > A simple way to do that is to perform an snprintf to a pre-allocated, > and fairly fixed size, buffer, and then pass *that* buffer to a > separate thread for output. A better option may be to issue an > asynchronous I/O request (although you'll have to step outside of C > streams at that point). If the available buffer is too small, you > have the choice of ignoring the output, or truncating it, and you have > several options for dealing with a new output request from the main > thread that happenes before the old buffer is disposed of (ignore, > block, have multiple available output buffers). > > An advantage of doing the snprintf in the main thread, is that you > largely leave the semantics of printf unchanged.
The problem with this approach is that snprintf is still doing most of the work that printf does - it is still a slow function.
> > You can sometimes also check (OS dependent) if there is space in the > stream output buffer (__fbufsize and __fpending in *nix, for example), > and use that to judge whether you're going to have room for the output > (you could do a trial with snprintf and a zero length buffer to get > the length required, for example). You then need to compensate for > things like expansion of newlines (OS dependent, of course, not an > issue in *nix*). >
On 25/04/17 02:42, Tim Wescott wrote:
> Sometimes it's necessary to float a notion just so that someone can > affirm your suspicion that yes, you really are an idiot. > > On the off chance that I'm a genius, though, here goes: > > I don't like putting "printf" in fast tasks for obvious reasons -- even > with infinite spooling, it can take time to process a format string. In > my usual context, this can result in audible sounds from motors when I > print some bit of debugging information from a high-priority task. The > problem can't be alleviated by spooling the output of the printf, because > it's the execution time of the printf itself that's the problem. > > Just how dumb is it to have "printf" queue up the format string AND the > parameters into a buffer, to be printed out in some low-priority task? > For a system with sufficient memory, this seems like an astonishingly > sensible thing to me. > > Clearly you'd have to wrap it up somehow so that if the buffer was full > no printf at all would happen -- but I'm kinda liking the notion. > > So -- whatcha think? >
printf is a horrible function for such purposes - with its variable arguments, etc. Consider having delayed_print_string(), delayed_print_int(), delayed_print_hex(), etc., so that you can have a queue with of structs with a fixed format (a tag for the type to be printed, a union of the data types, etc.). It is a lot easier to keep control of a fixed set of types of a specific number, than a general variable argument set. If you have a C11 compiler and can grok _Generic, you can wrap the whole thing in variable argument macros and get a programming interface such as: delayed_print("Current pos: ", position, " register: 0x", hex(reg, 4), endline); But if you do go for a printf-style function, don't forget gcc's "format" __attribute__ on your function to check that the formatting and the arguments match up - it will save much wailing and gnashing of teeth.
On Tue, 25 Apr 2017 10:44:38 +0200, David Brown
<david.brown@hesbynett.no> wrote:

>On 25/04/17 07:19, Robert Wessel wrote: >> On Mon, 24 Apr 2017 19:42:30 -0500, Tim Wescott >> <seemywebsite@myfooter.really> wrote: >> >>> Sometimes it's necessary to float a notion just so that someone can >>> affirm your suspicion that yes, you really are an idiot. >>> >>> On the off chance that I'm a genius, though, here goes: >>> >>> I don't like putting "printf" in fast tasks for obvious reasons -- even >>> with infinite spooling, it can take time to process a format string. In >>> my usual context, this can result in audible sounds from motors when I >>> print some bit of debugging information from a high-priority task. The >>> problem can't be alleviated by spooling the output of the printf, because >>> it's the execution time of the printf itself that's the problem. >>> >>> Just how dumb is it to have "printf" queue up the format string AND the >>> parameters into a buffer, to be printed out in some low-priority task? >>> For a system with sufficient memory, this seems like an astonishingly >>> sensible thing to me. >>> >>> Clearly you'd have to wrap it up somehow so that if the buffer was full >>> no printf at all would happen -- but I'm kinda liking the notion. >>> >>> So -- whatcha think? >> >> >> A problem is that finding all the input parameters, and copying them >> someplace, may well be a considerable amount of work. *Output* >> parameters (%n, the return value), have already been mentioned, and >> pose significant additional problems. >> >> While copying something like an integer or float is going to be fairly >> quick (assuming you have some place to put them already allocated), >> and likely rather faster than than actually formatting them, you're >> still going to have to do a non-trivial amount of parsing to actually >> *find* those parameters. And remember you have not just the >> parameters, but the precision and min size can be specified as a >> parameter ("*"). With strings, you have a worse allocation problem, >> and copying the strings to a temporary area is going to be as much >> work as actually processing them. >> >> So at best, you're likely to only save a modest amount of CPU time in >> the issuing thread formatting the output string, and quite possibly >> that CPU usage in the main thread may be a non-trivial amount worse. >> Overall CPU usage will certainly be worsened considerably. >> >> OTOH, your real goal is to prevent the printf from blocking because it >> lacks output space. >> >> A simple way to do that is to perform an snprintf to a pre-allocated, >> and fairly fixed size, buffer, and then pass *that* buffer to a >> separate thread for output. A better option may be to issue an >> asynchronous I/O request (although you'll have to step outside of C >> streams at that point). If the available buffer is too small, you >> have the choice of ignoring the output, or truncating it, and you have >> several options for dealing with a new output request from the main >> thread that happenes before the old buffer is disposed of (ignore, >> block, have multiple available output buffers). >> >> An advantage of doing the snprintf in the main thread, is that you >> largely leave the semantics of printf unchanged. > >The problem with this approach is that snprintf is still doing most of >the work that printf does - it is still a slow function.
My point was that the work of actually saving the parameters someplace for another thread to use was also a substantial amount of work, and, depending on how general you were being (for example, actually doing that based on the variable input to a printf-like function), you might well be doing *more* work on the issuing thread on many possible printf calls. While a front-end to printf might well be faster when parsing and saving something like "xprintf("%f-%f", f1, f2)" compared to actually executing the snprintf, it would almost certainly be *slower* than the actual snprintf handing "xprintf("%s-%s", s1, s2)". And that's ignoring all of the synchronization issues, and any work involved in those. Now if you had non-variable functions, where you didn't have to parse the format string to find the parameters, and you then applied the format only in the worker thread, that should help, but it's a much less general solution, and will help only modestly in many cases. In any event I thing I was reading the OPs message as being more concerned with the possibility of the printf call blocking.
On 25/04/17 16:22, Robert Wessel wrote:
> On Tue, 25 Apr 2017 10:44:38 +0200, David Brown > <david.brown@hesbynett.no> wrote: > >> On 25/04/17 07:19, Robert Wessel wrote: >>> On Mon, 24 Apr 2017 19:42:30 -0500, Tim Wescott >>> <seemywebsite@myfooter.really> wrote: >>> >>>> Sometimes it's necessary to float a notion just so that someone can >>>> affirm your suspicion that yes, you really are an idiot. >>>> >>>> On the off chance that I'm a genius, though, here goes: >>>> >>>> I don't like putting "printf" in fast tasks for obvious reasons -- even >>>> with infinite spooling, it can take time to process a format string. In >>>> my usual context, this can result in audible sounds from motors when I >>>> print some bit of debugging information from a high-priority task. The >>>> problem can't be alleviated by spooling the output of the printf, because >>>> it's the execution time of the printf itself that's the problem. >>>> >>>> Just how dumb is it to have "printf" queue up the format string AND the >>>> parameters into a buffer, to be printed out in some low-priority task? >>>> For a system with sufficient memory, this seems like an astonishingly >>>> sensible thing to me. >>>> >>>> Clearly you'd have to wrap it up somehow so that if the buffer was full >>>> no printf at all would happen -- but I'm kinda liking the notion. >>>> >>>> So -- whatcha think? >>> >>> >>> A problem is that finding all the input parameters, and copying them >>> someplace, may well be a considerable amount of work. *Output* >>> parameters (%n, the return value), have already been mentioned, and >>> pose significant additional problems. >>> >>> While copying something like an integer or float is going to be fairly >>> quick (assuming you have some place to put them already allocated), >>> and likely rather faster than than actually formatting them, you're >>> still going to have to do a non-trivial amount of parsing to actually >>> *find* those parameters. And remember you have not just the >>> parameters, but the precision and min size can be specified as a >>> parameter ("*"). With strings, you have a worse allocation problem, >>> and copying the strings to a temporary area is going to be as much >>> work as actually processing them. >>> >>> So at best, you're likely to only save a modest amount of CPU time in >>> the issuing thread formatting the output string, and quite possibly >>> that CPU usage in the main thread may be a non-trivial amount worse. >>> Overall CPU usage will certainly be worsened considerably. >>> >>> OTOH, your real goal is to prevent the printf from blocking because it >>> lacks output space. >>> >>> A simple way to do that is to perform an snprintf to a pre-allocated, >>> and fairly fixed size, buffer, and then pass *that* buffer to a >>> separate thread for output. A better option may be to issue an >>> asynchronous I/O request (although you'll have to step outside of C >>> streams at that point). If the available buffer is too small, you >>> have the choice of ignoring the output, or truncating it, and you have >>> several options for dealing with a new output request from the main >>> thread that happenes before the old buffer is disposed of (ignore, >>> block, have multiple available output buffers). >>> >>> An advantage of doing the snprintf in the main thread, is that you >>> largely leave the semantics of printf unchanged. >> >> The problem with this approach is that snprintf is still doing most of >> the work that printf does - it is still a slow function. > > > My point was that the work of actually saving the parameters someplace > for another thread to use was also a substantial amount of work, and, > depending on how general you were being (for example, actually doing > that based on the variable input to a printf-like function), you might > well be doing *more* work on the issuing thread on many possible > printf calls.
Absolutely true. It depends on many factors, currently only know to the OP.
> > While a front-end to printf might well be faster when parsing and > saving something like "xprintf("%f-%f", f1, f2)" compared to actually > executing the snprintf, it would almost certainly be *slower* than the > actual snprintf handing "xprintf("%s-%s", s1, s2)". > > And that's ignoring all of the synchronization issues, and any work > involved in those. > > Now if you had non-variable functions, where you didn't have to parse > the format string to find the parameters, and you then applied the > format only in the worker thread, that should help, but it's a much > less general solution, and will help only modestly in many cases.
It would perhaps be less general, but it could well be good enough for the job. And it does not /have/ to be less general - you could build something akin to C++'s output stream system. In fact, you could make your queue simply hold a pair with a pointer to a function, and a data member (which would be interpreted as either a value directly, a string pointer, or a pointer to a struct with more data). This would be /more/ general than a printf, while keeping the shared queue structure simple. Of course, you still have issues with synchronisation, interleaving of messages printed from different threads or contexts, and the challenge of making sure your debug data stays alive between the queuing operation and the dequeuing operation.
> > In any event I thing I was reading the OPs message as being more > concerned with the possibility of the printf call blocking. >
That is one possible interpretation, but not the only one. In my own code, I have made printf functions that use snprintf and then pump the output directly into a DMA buffer for a UART, skipping the layers of library calls for file I/O. In such cases, it is the formatting of the data that takes time, rather than the output of the data. But I have also seen printf's that boil down to polled blocking UART transmission after layers of _write() functions and other internal library code - in which case it is the output part that is slow.
On Tue, 25 Apr 2017 00:19:24 -0500, Robert Wessel wrote:

> On Mon, 24 Apr 2017 19:42:30 -0500, Tim Wescott > <seemywebsite@myfooter.really> wrote: > >>Sometimes it's necessary to float a notion just so that someone can >>affirm your suspicion that yes, you really are an idiot. >> >>On the off chance that I'm a genius, though, here goes: >> >>I don't like putting "printf" in fast tasks for obvious reasons -- even >>with infinite spooling, it can take time to process a format string. In >>my usual context, this can result in audible sounds from motors when I >>print some bit of debugging information from a high-priority task. The >>problem can't be alleviated by spooling the output of the printf, >>because it's the execution time of the printf itself that's the problem. >> >>Just how dumb is it to have "printf" queue up the format string AND the >>parameters into a buffer, to be printed out in some low-priority task? >>For a system with sufficient memory, this seems like an astonishingly >>sensible thing to me. >> >>Clearly you'd have to wrap it up somehow so that if the buffer was full >>no printf at all would happen -- but I'm kinda liking the notion. >> >>So -- whatcha think? > > > A problem is that finding all the input parameters, and copying them > someplace, may well be a considerable amount of work. *Output* > parameters (%n, the return value), have already been mentioned, and pose > significant additional problems. > > While copying something like an integer or float is going to be fairly > quick (assuming you have some place to put them already allocated), and > likely rather faster than than actually formatting them, you're still > going to have to do a non-trivial amount of parsing to actually *find* > those parameters. And remember you have not just the parameters, but > the precision and min size can be specified as a parameter ("*"). With > strings, you have a worse allocation problem, and copying the strings to > a temporary area is going to be as much work as actually processing > them. > > So at best, you're likely to only save a modest amount of CPU time in > the issuing thread formatting the output string, and quite possibly that > CPU usage in the main thread may be a non-trivial amount worse. Overall > CPU usage will certainly be worsened considerably. > > OTOH, your real goal is to prevent the printf from blocking because it > lacks output space.
You assume. My real goal is to prevent printf from blocking because it is slooooooooooooooooow. Lacking output space is just an anticipated failure mode that I wish to avoid. -- Tim Wescott Control systems, embedded software and circuit design I'm looking for work! See my website if you're interested http://www.wescottdesign.com
Am 25.04.2017 um 02:42 schrieb Tim Wescott:
> I don't like putting "printf" in fast tasks for obvious reasons -- even > with infinite spooling, it can take time to process a format string. In > my usual context, this can result in audible sounds from motors when I > print some bit of debugging information from a high-priority task. The > problem can't be alleviated by spooling the output of the printf, because > it's the execution time of the printf itself that's the problem. > > Just how dumb is it to have "printf" queue up the format string AND the > parameters into a buffer, to be printed out in some low-priority task?
The first question to ask is: do you really need a full-blown 'printf'? Or do you just need something that expands '%d' and/or '%x', '%s', and that's it? Such a function can be implemented in a screenful of code, using a 'putchar' function as the back-end, which in turn writes into an asynchronous buffer. Such a function is fast so you might get away with it. Compared to 'printf', it lacks locking and synchronisation with the rest of stdio, and it also doesn't need the bajillions of ifs which 'printf' uses to implement left/right-pad, pad-with-zero, float, thousands-separators, etc. If you wish to keep using a 'printf'-style interface and just stash away the parameters, you'd still need to pre-parse the format string and I'd guess that doesn't save much compared to actually formatting. I worked with such a 'printf' workalike function in the past and had no problems using it from things like audio DMA interrupts. Another question to ask is whether you want to log essays like "foo is not recognized as an internal or external command, operable program or batch file. Try reinstalling the application if the program persists", or just brief messages like "foo: file not found". :-) Stefan
Clifford Heath <no.spam@please.net> wrote:
> We did this with an error message system in an enterprise system > written mostly in C++. I had produced a type-safe ArgList class, > which allowed you to << any supported data type to append to the > list. Supporting printf-style formats with positional argument > numbers allowed us to fetch the format text translated into > foreign languages. Having an allocated copy of the typed argument > list made printf type-safe, but also meant that we could ship the > format and the arguments across a network interface, to be > rendered using an international format string in the language of > the client's locale. > > It was very nice, if rather a lot of work (mostly by me)
My idea, for a system which is supposed to run code written by arbitrary users who expect printf, but is limited in message size that you don't want to send the text, is to abuse the tools for handling localisation. In other words the user writes: printf("Hello world foo=%d, bar=%s, baz=%f blah blah blah\n", foo, bar, baz); and the locale tools transform that into a text file containing strings like: STR42: "Hello world foo=%d, bar=%s, baz=%f blah blah blah\n" and code like: _locale_print("STR42", foo, bar, baz); a function which packages STR42 and binary representations of STR42 and the parameters into a packet and sends it on its way. (You might need to annotate the _locale_print with the types of the parameters somehow) At the other end, the PC uses the text file so that when it receives STR42 in a packet it knows how to interpret the packet and prints it out (or logs it to a file for printing later in some GUI tool). We haven't actually implemented it yet. Theo
Am 25.04.2017 um 02:42 schrieb Tim Wescott:

> Just how dumb is it to have "printf" queue up the format string AND the > parameters into a buffer, to be printed out in some low-priority task?
Exactly as dumb as using printf() at all while being concerned about CPU cycles at all. I.e. the first rule of adding actual printf()s in the hot path is: DON'T. Using the real printf for this effectively impossible, if only because there's essentially no way to know, in the general case, what needs to be stored in that printf job queue to be able to execute it later, without parsing the entire format string first. That parse alone will likely already take longer than you can afford. You need to use something else; if it's to resemble printf(), it'll have to be _considerably_ pared down to forbid all the impossible-to-implement features: no * widths, no * precisions, no %s format, no %n format, best no int128_t, long doubles, etc. A much saner function whose calls can be queued up for later playback would look like this: void deb_out(t_typeToPrint type, /* enum of allowed types */ size_t size, /* refuse if size > some limit */ void *data /* copied at call time! */ ); Based on that one might wish to make convenience (inline?) wrappers like: void deb_out_u8(uint8_t value) { deb_out(E_TYPE_TO_PRINT_U8, 1, &value); } void deb_out_u16(uint16_t vlaue) ... and maybe, if you have near-perfect control over who gets to use this, even some rather risky ones: void deb_out_f_float(const char *fmt, float value) { deb_out(E_TYPE_TO_PRINT_FORMATSTRING, strlen(fmt), fmt); deb_out(E_TYPE_TO_PRINT_FLOAT, sizeof(float), &value); } The worker thread would synthesize a call of snprintf() or similar from those two entries. Entries without a preceding E_TYPE_TO_PRINT_FORMATSTRING one are printed with a default format.
On Mon, 24 Apr 2017 19:42:30 -0500, Tim Wescott wrote:

> Sometimes it's necessary to float a notion just so that someone can > affirm your suspicion that yes, you really are an idiot. > > On the off chance that I'm a genius, though, here goes: > > I don't like putting "printf" in fast tasks for obvious reasons -- even > with infinite spooling, it can take time to process a format string. In > my usual context, this can result in audible sounds from motors when I > print some bit of debugging information from a high-priority task. The > problem can't be alleviated by spooling the output of the printf, > because it's the execution time of the printf itself that's the problem. > > Just how dumb is it to have "printf" queue up the format string AND the > parameters into a buffer, to be printed out in some low-priority task? > For a system with sufficient memory, this seems like an astonishingly > sensible thing to me. > > Clearly you'd have to wrap it up somehow so that if the buffer was full > no printf at all would happen -- but I'm kinda liking the notion. > > So -- whatcha think?
The responses so far are really making me think. I guess the better way to frame my desire is thus: I want some way to print free-form debug data to a serial port in such a way that anything slow (formatting and waiting on the UART) is performed in a low-priority task. Some printf-like thing may not be the best way to do it -- but I still want to do it!! -- Tim Wescott Wescott Design Services http://www.wescottdesign.com I'm looking for work -- see my website!
I recently had a similar problem. What I did was:
- Create a virtual base class with a printf pure virtual function
- derived classes add required parameters and appropriate printf
- fast/time-sensitive task creates+enqueues derived-class objects
- slow path dequeues and prints
This is with FreeRTOS where elapsed time to create/enqueue is
very fast (wrt my application, your mileage...).

Hope that helps!
Best Regards, Dave

PS: C code would do the same with enum for type, union parameter sets, etc.
Not as easy and neat as C++...

Memfault Beyond the Launch