Reply by Dan N February 27, 2007
On Tue, 27 Feb 2007 16:31:29 +0000, Paul Burke wrote:


> Sod knows what was going on, but I hope gcc isn't often like
> this as i'm going to have to live with it for a while.

I used gcc on a Renesas processor. I've had hard bugs that I couldn't figure out, and in those cases you start to wonder if there is something wrong with the compiler. But in the end I've always found that it's something that I've done. Gcc is very good.

Dan
Reply by CBFalconer February 27, 2007
Paul Burke wrote:
> I'm converting an application from Windows console to Linux, and
> the changeover has gone remarkably easily (considering that I know
> very little about Linux), until now.
>
> No problem installing GCC, KDevelop, FTDI USB drivers, remarkably
> few changes to recompile the code... but printf fails after about
> half-a-dozen calls. A float value prints as "nan" - not a number, I
> assume, rather than what I eat with an Indian takeout. This value
> is computed from two int values (actually a weight and a tare
> reading) and multiplied by a scale factor (1.0 for the tests).
>
> The funny thing is that I can't see anything different about the
> weight or the tare value between instances that print and those
> that fail. There is the expected one-or-two bits wobble in the
> weight reading, but the values only oscillate between plus and
> minus one relative to the tare. Once it fails, it seems to be
> sticky - it doesn't recover even when the readings are identical
> to before the nan.
>
> So, please you Linux/GCC experienced people - what absolutely
> basic item of knowledge am I lacking?
Since you didn't publish any code, we have no idea. The only thing I can point out is that any operation on a NAN yields another NAN. You don't need to know anything about Linux if you stick to ISO standard C.

Cut the program down to a minimum that demonstrates the problem, and doesn't use non-standard C (you can fake inputs by using files) and publish that if the process hasn't made you solve the actual problem.

-- 
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>
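To illustrate the point about NaN propagation, here is a minimal sketch in ISO C. The function names (`net_weight`, `report_if_nan`) and the shape of the calculation are assumptions made for illustration; the original code was never posted. It shows that once any operand is NaN the result is NaN, and how a small check can help locate the first step at which the value goes bad.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>

/* Hypothetical stand-in for the calculation described in the thread:
   (reading - tare) multiplied by a scale factor. Names are illustrative. */
double net_weight(long reading, long tare, double scale)
{
    return (double)(reading - tare) * scale;
}

/* Returns 1 and names the step on stderr if value is NaN, else returns 0. */
int report_if_nan(const char *step, double value)
{
    assert(step != NULL);
    if (isnan(value)) {
        fprintf(stderr, "NaN first seen at: %s\n", step);
        return 1;
    }
    return 0;
}
```

Calling `report_if_nan` after each intermediate step (after the subtraction, after the scaling, just before the printf) narrows down where the NaN first appears, which is usually more informative than inspecting only the final printed value.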
Reply by tbro...@hifn.com February 27, 2007
> My guess is that some other part of your program is corrupting memory,
> and that after 7 or 8 iterations, the corruption has reached the code
> or variables used in this function. There's nothing wrong with the
> code you posted.
...and this fits with your finding that the problem "went away" when you moved the code into a different scope. The buffers / variables would have moved, causing the overrun to smush something else instead.

    - Tim.
Reply by Arlet February 27, 2007
On Feb 27, 4:40 pm, Paul Burke <t...@scazon.com> wrote:

> Boudewijn Dijkstra wrote:
> > Looks like an optimization problem. The compiler is seeing a constant
> > where it shouldn't, and/or has modified the type of a variable to
> > double. Without seeing the code that deals with TareWeight, I'd guess
> > this is it. Try making it volatile.
> >
> > Another possibility is a bug that modifies ScaleFactor, which BTW isn't
> > declared const.
>
> I've found that it doesn't matter what the ADC reading is, and changing
> ScaleFactor and TareWeight to constants doesn't change the behaviour. If
> I run it as originally shown, it does 7 conversions before GDB shows
> fweight as a nan (value 0x8000000000000). If I split out the
> calculation, so that:
>
> fweight = (Weight - xTareWeight);
> fweight *= ScaleFactor;
>
> it still does 7 printfs, but if I comment out the Scalefactor line, it
> does 8 before nan shows up! It also behaves like this if I replace
> variables ScaleFactor, Weight and TareWeight with explicit constant values.
Depending on the CPU you're using, the compiler may implement constant doubles by creating a 'constant pool' in memory, and loading the values from there. My guess is that some other part of your program is corrupting memory, and that after 7 or 8 iterations, the corruption has reached the code or variables used in this function. There's nothing wrong with the code you posted.
Reply by tbro...@hifn.com February 27, 2007
Paul, I noticed that the buffer you're sprintf'ing into is immediately
adjacent to fweight on the locals list, and hence, presumably, on the
stack.

Is somebody overrunning the end of the buffer?

    - Tim.
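One way to rule an overrun in or out is to replace `sprintf` with `snprintf`, which never writes more than the given size. The sketch below is only an illustration: the function name `format_weight` and the format string are assumptions, since the original sprintf call was not posted.

```c
#include <assert.h>
#include <stdio.h>

/* Formats a weight into buf without writing past buf[size - 1].
   Returns 1 if the whole string fitted, 0 if it was truncated or failed.
   The format string is a made-up example, not the original poster's. */
int format_weight(char *buf, size_t size, double weight)
{
    assert(buf != NULL && size > 0);
    /* snprintf returns the length it would have needed, excluding the
       terminating '\0', or a negative value on an encoding error. */
    int needed = snprintf(buf, size, "Weight: %8.2f kg", weight);
    return needed >= 0 && (size_t)needed < size;
}
```

If the function ever returns 0 for a buffer that was previously filled with plain `sprintf`, the old code was writing past the end of that buffer and into whatever sat next to it on the stack.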

Reply by Paul Burke February 27, 2007
Paul Burke wrote:
> AAARGH! The latest is that if I cut the SAME CODE and put it in the same
> file as main(), with the same header files, I don't get the fp problem!
And now I've deleted the original file, created a new blank one with the same name, cut-and-pasted all the stuff back again, and well bugger me! It behaves. Sod knows what was going on, but I hope gcc isn't often like this, as I'm going to have to live with it for a while.

Paul Burke
Reply by Paul Burke February 27, 2007
Paul Burke wrote:

> TareWeight is set (in this rather attenuated version of the program) by
> the first read of the ADC on startup. Scale factor is a constant (i.e.
> I've taken out the bit that changes it) while I'm getting the basics
> going with the changed OS.
AAARGH! The latest is that if I cut the SAME CODE and put it in the same file as main(), with the same header files, I don't get the fp problem!

It's a single-threaded command-line app, and there's nothing going on in the background that would affect any of the variables. I'm getting a headache with this. Time for a pot of tea.

Paul Burke
Reply by Paul Burke February 27, 2007
Boudewijn Dijkstra wrote:

> Looks like an optimization problem. The compiler is seeing a constant
> where it shouldn't, and/or has modified the type of a variable to
> double. Without seeing the code that deals with TareWeight, I'd guess
> this is it. Try making it volatile.
>
> Another possibility is a bug that modifies ScaleFactor, which BTW isn't
> declared const.
I've found that it doesn't matter what the ADC reading is, and changing ScaleFactor and TareWeight to constants doesn't change the behaviour. If I run it as originally shown, it does 7 conversions before GDB shows fweight as a nan (value 0x8000000000000). If I split out the calculation, so that:

    fweight = (Weight - xTareWeight);
    fweight *= ScaleFactor;

it still does 7 printfs, but if I comment out the Scalefactor line, it does 8 before nan shows up! It also behaves like this if I replace variables ScaleFactor, Weight and TareWeight with explicit constant values. If I change the types from double to float, it behaves the same, except that the nan value now becomes 0x400000.

Well, I've been doing C since 1983, and I've had trouble before (like forgetting #includes or screwing up the formatting commands), but never one such as this! It all seems so basic - just multiply long 1 by float 1.0, and get an error after doing it OK several times.

For now, it's got me bet.

Paul Burke
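A note on the two hex values: they look like the mantissa (fraction) field of a default quiet NaN, with only the top mantissa bit set, rather than arbitrary garbage. The sketch below extracts that field so the values can be compared. It assumes IEEE 754 binary64/binary32 formats and a platform, such as GCC on x86, where the `NAN` macro yields a quiet NaN with only that bit set; the helper names are made up for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Mantissa (fraction) field of an IEEE 754 double: the low 52 bits. */
uint64_t double_mantissa(double d)
{
    uint64_t bits;
    assert(sizeof bits == sizeof d);
    memcpy(&bits, &d, sizeof bits);   /* well-defined way to read the bits */
    return bits & UINT64_C(0xFFFFFFFFFFFFF);
}

/* Mantissa (fraction) field of an IEEE 754 float: the low 23 bits. */
uint32_t float_mantissa(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);
    return bits & UINT32_C(0x7FFFFF);
}
```

On such a platform, `double_mantissa((double)NAN)` gives 0x8000000000000 and `float_mantissa(NAN)` gives 0x400000, matching the values GDB reported. That would suggest the NaN was produced as a NaN by some operation, not assembled from random overwritten bytes, though it does not by itself say which operation.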
Reply by Paul Gotch February 27, 2007
Vladimir Vassilevsky <antispam_bogus@hotmail.com> wrote:
> If it is a multithreaded application, this can be the issue with
> printf() reentrancy. Many of the stdio.h functions are not reentrant by
> default, unless you are linking the appropriate libraries.
...yes, you need to pass "-pthread" to GCC, which has the effect of defining _REENTRANT and linking against libpthread. This is documented as required on some platforms but, strangely, not for x86. I don't know whether this is a long-standing oversight or whether it really isn't needed.

-p
-- 
"Unix is user friendly, it's just picky about who its friends are." - Anonymous