EmbeddedRelated.com

Greatest Hits from Tech Support!

Started by Dave Nadler February 17, 2017
You guys might appreciate this, received almost 2 months after posting a bug:
https://community.nxp.com/message/878673?commentID=878673

Yikes.

Customer is suggesting we shouldn't use Freescale (NXP?) in future;
Kinetis parts may go away in product rationalization...

See ya, Dave
On 2/17/2017 8:57 AM, Dave Nadler wrote:
> You guys might appreciate this, received almost 2 months after posting a bug:
> https://community.nxp.com/message/878673?commentID=878673
>
> Yikes.
(sigh) Clearly folks who don't understand the question(s) being asked/issues being presented -- ditto the referenced SO post :<. (/caveat emptor/)

"In no case is returning a pointer outside the heap acceptable"
+42

"NXP examples should set sensible default heap size"
Meh... heap size should be appropriate for the *example*, nothing more.

"a trap on out-of-memory would be better"
Not possible with all platforms. I'm not fond of having to rewrite code because it relies on specific "undocumented" behavior.

My _DEBUG versions of malloc/free aggressively examine all of their inputs and outputs. I rely on dynamic allocation extensively (e.g., support multiple arenas, allocation strategies, etc. concurrently).

So:
    ptr = malloc(...)
    free(ptr)
is ALWAYS safe -- even if ptr == NULL.

But, attempting:
    ptr = malloc(...)
    free(ptr)
    free(ptr)
will trigger an invariant and stop the program in its tracks (because the memory free'd in the second call isn't allocated at the time of the call -- an extra condition I impose on free() that the standard library doesn't).

Likewise:
    ptr = malloc(HEAP_SIZE+1)
is guaranteed to return NULL (and tickle another invariant in _DEBUG) and:
    ptr = malloc(0)
provides an effective way of knowing that the heap *is* exhausted.

[This is subtle as it relies on the caller knowing the memory available for their use, for a non-NULL return value, is constrained to [ptr,ptr+size) ]

The standard allows no other way of "returning an error", so if you don't like the arguments, you have to return NULL.
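A minimal sketch of the checks described above, for readers who want something concrete. This is not Don's implementation (the thread never shows it); it is an illustration under stated assumptions. The names `dbg_malloc`, `dbg_free`, `MAX_BLOCKS` and `invariant_failures` are hypothetical, the arena is a fixed array that is never reclaimed, and a tripped invariant only bumps a counter instead of stopping the program.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_SIZE  1024u   /* arena size; value chosen only for illustration */
#define MAX_BLOCKS 16u     /* hypothetical cap on simultaneously live blocks */

static _Alignas(max_align_t) unsigned char heap[HEAP_SIZE];
static size_t heap_used;           /* bump offset; this sketch never reuses memory */
static void  *live[MAX_BLOCKS];    /* pointers that are currently allocated */
unsigned invariant_failures;       /* stand-in for "stop the program in its tracks" */

static void invariant_failed(void) { invariant_failures++; }

void *dbg_malloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);

    /* A request larger than the whole arena can never succeed: flag it. */
    if (size > HEAP_SIZE) { invariant_failed(); return NULL; }

    /* Round up so every returned pointer stays suitably aligned;
       a zero-byte request still consumes one minimal unit here. */
    size_t need = (size + align - 1) / align * align;
    if (need == 0) need = align;

    assert(heap_used <= HEAP_SIZE);
    if (need > HEAP_SIZE - heap_used) return NULL;   /* ordinary exhaustion */

    for (size_t i = 0; i < MAX_BLOCKS; i++) {
        if (live[i] == NULL) {
            live[i] = heap + heap_used;
            heap_used += need;
            return live[i];
        }
    }
    return NULL;                                     /* tracking table full */
}

void dbg_free(void *ptr)
{
    if (ptr == NULL) return;                         /* free(NULL) is always safe */
    for (size_t i = 0; i < MAX_BLOCKS; i++) {
        if (live[i] == ptr) { live[i] = NULL; return; }
    }
    invariant_failed();   /* not currently allocated: double free or bogus pointer */
}
```

With this sketch, freeing the same pointer twice increments `invariant_failures` on the second call, and `dbg_malloc(HEAP_SIZE + 1)` returns NULL while also tripping the invariant. A real build would presumably route `invariant_failed()` to a debugger trap or exception handler.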
> Customer is suggesting we shouldn't use Freescale (NXP?) in future;
Why? Because they are wary of the quality of the support available? Farming your tech support out to users is always a dubious business strategy!
> Kinetis parts may go away in product rationalization...
Because the expected support will be better?
> See ya, Dave
On Friday, February 17, 2017 at 12:54:16 PM UTC-5, Don Y wrote:
> On 2/17/2017 8:57 AM, Dave Nadler wrote:
> > You guys might appreciate this, received almost 2 months after posting a bug:
> > https://community.nxp.com/message/878673?commentID=878673
> >
> > Yikes.
>
> (sigh) Clearly folks who don't understand the question(s) being asked/issues
> being presented -- ditto the referenced SO post :<. (/caveat emptor/)
>
> "In no case is returning a pointer outside the heap acceptable"
> +42
>
> "NXP examples should set sensible default heap size"
> Meh... heap size should be appropriate for the *example*, nothing more.
>
> "a trap on out-of-memory would be better"
> Not possible with all platforms. I'm not fond of having to rewrite
> code because it relies on specific "undocumented" behavior.
Don, sorry if the context wasn't entirely clear. Freescale/NXP provide many dozens of examples. Typically a good place to start when using a new tool chain! I pared down one of their examples to show that they've misconfigured memory management and heap size for their examples... In the context of their examples, typically run under a debugger, traps would be very helpful to sort out configuration issues.

I'm quite astonished at the answer though:

"It's really OK that malloc returns a pointer outside the heap, because in my blinky example it didn't cause a problem..."

It gets worse though. Their examples for FreeRTOS are misconfigured so that:
- they will blow the heap,
- the FreeRTOS-aware debugger crashes,
- newlib isn't set up compatibly with FreeRTOS memory management,
- etc.

See ya, Dave
On 2/17/2017 12:29 PM, Dave Nadler wrote:
> On Friday, February 17, 2017 at 12:54:16 PM UTC-5, Don Y wrote:
>> On 2/17/2017 8:57 AM, Dave Nadler wrote:
>>> You guys might appreciate this, received almost 2 months after posting a bug:
>>> https://community.nxp.com/message/878673?commentID=878673
>>>
>>> Yikes.
>>
>> (sigh) Clearly folks who don't understand the question(s) being asked/issues
>> being presented -- ditto the referenced SO post :<. (/caveat emptor/)
>>
>> "In no case is returning a pointer outside the heap acceptable"
>> +42
>>
>> "NXP examples should set sensible default heap size"
>> Meh... heap size should be appropriate for the *example*, nothing more.
>>
>> "a trap on out-of-memory would be better"
>> Not possible with all platforms. I'm not fond of having to rewrite
>> code because it relies on specific "undocumented" behavior.
>
> Don, sorry if the context wasn't entirely clear.
> Freescale/NXP provide many dozens of examples.
Yes. I'd expect part of the example to HIGHLIGHT the differences in the execution environment. E.g., "need big heap to support the allocations in this example (other example doesn't need ANY heap, etc.)"
> Typically a good place to start when using a new tool chain!
I usually start with crt0.s and the "helper routines". This removes a lot of uncertainty from "what I'm seeing" (no need to wonder what lies UNDER the examples) as well as gives me a general idea as to what the vendor THINKS that I will be doing. And, as I'll inevitably have to tweak these things for the OS that will be supporting the executables, I can nail down those customizations sooner rather than later (I like to have an OS up and running from day one).
> I pared down one of their examples show that they've misconfigured
> memory management and heap size for their examples...
I think you are being generous saying "misconfigured". While I can't see how all these parameters are defined/established (code elided), I can't imagine any way that the result (0x20005c0 IIRC) can rationally come from the start/end/request of your example. Were you able to track down the source of the error?
> In the context of their examples, typically run under a debugger,
> traps would be very helpful to sort out configuration issues.
The invariants give me a (portable) hook to a debugger and/or run-time exception handler. Redefine the macros/ftns and I can "throw" the error in a variety of different ways as well as take a variety of different remedial actions. (kill the application, suspend the offending task, log to a blackbox, etc.)
> I'm quite astonished at the answer though:
>
> "It's really OK that malloc returns a pointer outside the heap,
> because in my blinky example it didn't cause a problem..."
    fsub(subtrahend, minuend) {
        result = subtrahend * 3 + minuend / pi
        result = 2
        return( result )
    }

"Works fine -- cuz subtrahend is always two less than minuend in my examples!"

I found the MISRA "excuse" even more alarming -- I didn't see any indication that it even APPLIED (i.e., as a constraint imposed by a client/employer/industry).
> It gets worse though.
> Their examples for FreeRTOS are misconfigured so that:
> - they will blow the heap,
because FreeRTOS makes demands that can't be met (and doesn't verify that they HAVE been met)? Or, because of bugs like the above (that FreeRTOS can't do anything about)?
> - the FreeRTOS-aware debugger crashes,
> - newlib isn't set up compatibly with FreeRTOS memory management,
> - etc.
Sounds like the folks who pieced together the support didn't do a very thorough job.

IME, the folks who write the app notes (hardware/software) and these sorts of "examples" aren't typically a "cherished resource". Perhaps summer interns, fresh hires, etc. Company is focused on selling silicon, not software. Only pays grudging lip service to the software in order to get the hardware sold! [Nowadays, even worse as so many chip designs are pieced together]

I had a client show me his first pass of a hardware design to give me an idea of where his project/product was headed. It was hard to politely say, "This will NEVER work!" (it wasn't like it could be tweaked A LITTLE to be viable).

After gently pressing for details of portions of the circuitry, ("So, what are you trying to do, here?") he eventually folded and said he'd pieced it together from app notes. (OK, maybe he transcribed things inappropriately or made unfounded assumptions at the boundaries of each sub-design).

When I later examined the "source app notes", the flaws were present there, as well. I.e., no one had apparently ever BUILT the circuit that was published! AND, no one had ever proofread the document with even a rudimentary understanding of the material presented (lest they would have found the errors before making it into print).

I learned that when I document things, I have to cut and paste the ACTUAL source materials into the final/formal document -- to ensure no transcription errors AND that the design/circuit/code is EXACTLY as I'd implemented.

I try to "own" all of the IP in my designs so I *know* how they all work as well as how they are likely to interact. E.g., my standard libraries were crafted before the "reentrant" versions of the functions that have internal state (e.g., strtok()) came to be. So, they magically work when hosted by any of my OS's -- because my OS's are integrated with their implementations (and automagically handle the thread-specific state).
The bottom line is that I can have a real application running on bare metal very quickly -- I just need to get the interface to the target up and running. Fewer unknowns, fewer surprises. [They're TOOLS. They're supposed to FACILITATE, not HINDER!] Returning to my final queries: is the reason the client wants to move to Kinetis/FS parts because of the "support quality" on the NXP parts?
On Friday, February 17, 2017 at 16:57:10 UTC+1, Dave Nadler wrote:
> You guys might appreciate this, received almost 2 months after posting a bug:
> https://community.nxp.com/message/878673?commentID=878673
>
> Yikes.
The community is a forum for users; answers from Freesc...ehm NXP staff are not guaranteed. If you want a fast and detailed answer you need to send an email to support. The community is not very good; answers from NXP (usually from Alice_Yang) are usually not very useful.

2-3 years ago I found a very nasty bug in the Processor Expert [1] I2C driver: if you shorted the SDA line to ground, the driver got stuck in an infinite loop and a reset was needed. I spoke (through the community) with a bunch of people, and at a certain point also with a guy from Freescale (supposedly the guy who wrote the driver): for him it was OK because the short to GND could never happen... I don't know if they fixed the issue in the next update.

[1]: yes, I know, but I had to put together something very fast, so...
> Customer is suggesting we shouldn't use Freescale (NXP?) in future;
> Kinetis parts may go away in product rationalization...
Mmhh not sure, some are in the longevity program (the EA family for automotive as an example). And in any case the Kinetis are far better for motor control than the LPCs. Also the IDE seems to be in active development so who knows.

Bye Jack
On Friday, February 17, 2017 at 8:40:03 PM UTC-5, Don Y wrote:
> ... I'd expect part of the example to HIGHLIGHT the differences
> in the execution environment. E.g., "need big heap to support
> the allocations in this example (other example doesn't need ANY heap, etc.)"
And you will be sadly disappointed. I would love to see this in all examples, but even partial notes like this are pretty rare.
> I usually start with crt0.s and the "helper routines". This removes
> a lot of uncertainty from "what I'm seeing" (no need to wonder what lies
> UNDER the examples) as well as gives me a general idea as to what the
> vendor THINKS that I will be doing.
The problem in this instance (and similar), is that there are many other parts of the puzzle...
> I think you are being generous saying "misconfigured". While I can't
> see how all these parameters are defined/established (code elided),
> I can't imagine any way that the result (0x20005c0 IIRC) can rationally
> come from the start/end/request of your example.
There is no excuse for this kind of error. "Misconfigured" because this error is a combination of errors in ALL of:
- linker file
- library configuration (newlib)
- external hooks provided as required to support library
- overrides in project files for things like heap and stack size
- unwitting use of library routines that use free storage
- etc.
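To make the "external hooks" item concrete: newlib's malloc obtains memory through an `_sbrk` stub that the platform code must provide, and a stub that never checks the heap limit will happily hand back addresses past the end of the heap. The thread does not show NXP's stub, so what follows is only a hedged sketch of a bounds-checked version. The name `bounded_sbrk` and the `sim_heap` array are hypothetical; on a real target the base and limit would normally come from linker-script symbols rather than a static array.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Host-side stand-in for the region the linker script would reserve. */
#define SIM_HEAP_BYTES 256u
static unsigned char sim_heap[SIM_HEAP_BYTES];

static unsigned char *const heap_base  = sim_heap;
static unsigned char *const heap_limit = sim_heap + SIM_HEAP_BYTES;
static unsigned char *heap_break       = sim_heap;   /* current end of heap */

/* Grow (or shrink) the heap by incr bytes. Returns the previous break on
   success, or (void *)-1 with errno set to ENOMEM if the request would
   move the break outside [heap_base, heap_limit]. */
void *bounded_sbrk(ptrdiff_t incr)
{
    assert(heap_break >= heap_base && heap_break <= heap_limit);

    ptrdiff_t room_up   = heap_limit - heap_break;   /* always >= 0 */
    ptrdiff_t room_down = heap_base - heap_break;    /* always <= 0 */

    if (incr > room_up || incr < room_down) {
        errno = ENOMEM;
        return (void *)-1;   /* refuse rather than return an address outside the heap */
    }

    unsigned char *previous = heap_break;
    heap_break += incr;
    return previous;
}
```

The point of the sketch is only the range check: once the stub refuses out-of-range requests, malloc has a failure to report instead of a pointer beyond the heap.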
> Were you able to track down the source of the error?
Of course, I had to get something working to the customer months ago! I will try to put together a web page explaining the components, and providing working examples for:
- bare
- FreeRTOS
> Sounds like the folks who pieced together the support didn't do > a very thorough job.
Now you are being extremely charitable ;-)
> IME, the folks who write the app notes (hardware/software) and these
> sorts of "examples" aren't typically a "cherished resource". Perhaps
> summer interns, fresh hires, etc. Company is focused on selling silicon,
> not software. Only pays grudging lip service to the software in order
> to get the hardware sold!
In this case the support operation appears to be Chinese. Race to the lowest cost, quality be damned...
> I had a client show me his first pass of a hardware design to give me
> an idea of where his project/product was headed. It was hard to
> politely say, "This will NEVER work!" (it wasn't like it could be
> tweaked A LITTLE to be viable).
>
> After gently pressing for details of portions of the circuitry, ("So,
> what are you trying to do, here?") he eventually folded and said he'd
> pieced it together from app notes. (OK, maybe he transcribed things
> inappropriately or made unfounded assumptions at the boundaries of
> each sub-design).
>
> When I later examined the "source app notes", the flaws were present
> there, as well. I.e., no one had apparently ever BUILT the circuit that
> was published! AND, no one had ever proofread the document with even
> a rudimentary understanding of the material presented (lest they would
> have found the errors before making it into print).
I'll raise you one: I've seen all of the above in a shipping product! Best app note blunder I've seen: serializer-deserializer note said the parallel lines had to have equal length traces! Think about that one for a bit. Witless non-engineer designed a board using low-end free CAD trying to do this...
> I learned that when I document things, I have to cut and paste the
> ACTUAL source materials into the final/formal document -- to ensure
> no transcription errors AND that the design/circuit/code is EXACTLY
> as I'd implemented.
And deliver the PDF component spec sheets with annotations appropriate to the part usage in the design...
> I try to "own" all of the IP in my designs so I *know* how they all work
> as well as how they are likely to interact. E.g., my standard libraries
> were crafted before the "reentrant" versions of the functions that
> have internal state (e.g., strtok()) came to be. So, they magically
> work when hosted by any of my OS's -- because my OS's are integrated
> with their implementations (and automagically handle the thread-specific
> state).
newlib also contains thread-aware malloc family for this purpose. In the Freescale/NXP case here, they failed to properly set these up.
> Returning to my final queries: is the reason the client wants to move
> to Kinetis/FS parts because of the "support quality" on the NXP parts?
A combination of factors: tools and documentation not updated for 2+ years, bugs everywhere, extremely poor examples, miserable support, and worries about the future of the Kinetis product line after years of Freescale austerity, the NXP merger, and the upcoming potential Qualcomm merger.

Utter disarray with the tools: Freescale started with their "Processor Expert" component system, discontinued it for a new SDK, released an updated SDK with numerous undocumented incompatibilities, the examples and documentation do not use the current tool set, etc.

The hardware is extremely capable, but challenging to use with the above issues... NXP is threatening to release support based on their Expresso Eclipse (replacing the Kinetis Development System Eclipse), and rationalizing their tool support this spring. We'll see how that goes. I did a couple products using NXP Cortex M0 a few years ago and had good experience with the NXP tools (and support when there were tool issues)...

See ya, Dave
On Monday, February 20, 2017 at 3:03:22 AM UTC-5, Jack wrote:
> The community is a forum for users; answers from
> Freesc...ehm NXP staff are not guaranteed.
> If you want a fast and detailed answer you need to send an email to
> support.
NXP "encourages" posting to the user group first. I've had tech support tickets responded to equally slowly.
> The community is not very good,
One of the worst I've seen...
> answers from NXP (usually from Alice_Yang) are usually not very useful.
Alice is truly special. Usually fails to understand the question, and often just says "re-install all the software and maybe it will work".
> > Customer is suggesting we shouldn't use Freescale (NXP?) in future;
> > Kinetis parts may go away in product rationalization...
>
> Mmhh not sure, some are in the longevity program (the EA family for
> automotive as an example). And in any case the Kinetis are far better
> for motor control than the LPCs.
We're not doing motor-control.
> Also the IDE seems to be in active development so who knows.
KDS will be replaced with NXP's Expresso shortly...
On 2/20/2017 7:38 AM, Dave Nadler wrote:
> On Friday, February 17, 2017 at 8:40:03 PM UTC-5, Don Y wrote:
>> ... I'd expect part of the example to HIGHLIGHT the differences
>> in the execution environment. E.g., "need big heap to support
>> the allocations in this example (other example doesn't need ANY heap, etc.)"
>
> And you will be sadly disappointed.
I wouldn't doubt it. Rather, I'm indicating how examples can be used *effectively*. Every (virtual or otherwise) "call to tech support" bears a cost -- for the provider AND consumer! And, while consumers ultimately pay all of the costs they "impose" on the provider, the reverse is also true: providers eventually end up paying the costs that they've forced/coerced on their consumers (in terms of product satisfaction, loyalty, etc.)

It's not hard to create good examples -- it just requires the attitude that there is intrinsic VALUE in the examples.
> I would love to see this in all examples,
> but even partial notes like this are pretty rare.
>
>> I usually start with crt0.s and the "helper routines". This removes
>> a lot of uncertainty from "what I'm seeing" (no need to wonder what lies
>> UNDER the examples) as well as gives me a general idea as to what the
>> vendor THINKS that I will be doing.
>
> The problem in this instance (and similar), is that there
> are many other parts of the puzzle...
I'm a hardware person, by nature. So, I immediately look at the lowest layers of the software system to see what I can/should "expose" of the hardware and what I should deliberately obscure. (e.g., in some cases, exposing the underlying page size of the MMU is a win; in others, a useless detail).
>> I think you are being generous saying "misconfigured". While I can't
>> see how all these parameters are defined/established (code elided),
>> I can't imagine any way that the result (0x20005c0 IIRC) can rationally
>> come from the start/end/request of your example.
>
> There is no excuse for this kind of error.
> "Misconfigured" because this error is a combination of errors in ALL of:
> - linker file
> - library configuration (newlib)
> - external hooks provided as required to support library
> - overrides in project files for things like heap and stack size
> - unwitting use of library routines that use free storage
> - etc.
Undoubtedly "misconfigured". But, the code could have SCREAMED of this instead of letting you "trip over" it.

    // success
    ASSERT( allocation != NULL )
    ASSERT( allocation >= heap_start )
    ASSERT( allocation+allocated_size-1 <= heap_start+heap_size-1 )
    return( allocation )

This reinforces the module's contract with the caller (in ways that are a lot more precise and explicit than a verbose textual "SYNOPSIS"). *And*, acts as a pair of watchful eyes during development ("How the hell did *this* happen??")
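The ASSERT lines above can be sketched as compilable C, roughly as follows. This is an illustration, not code from any vendor library: `block_within_heap`, `checked_alloc` and the `alloc_fn` type are hypothetical names, and the heap bounds are passed in explicitly because the thread does not say how the real allocator learns them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True only if [block, block + block_size) lies entirely inside
   [heap_start, heap_start + heap_size). Comparing offsets rather than
   end addresses avoids overflow when the sizes are large. */
bool block_within_heap(const void *block, size_t block_size,
                       const void *heap_start, size_t heap_size)
{
    uintptr_t block_addr = (uintptr_t)block;
    uintptr_t heap_addr  = (uintptr_t)heap_start;

    if (block == NULL || block_addr < heap_addr) return false;

    size_t offset = (size_t)(block_addr - heap_addr);
    return offset <= heap_size && block_size <= heap_size - offset;
}

typedef void *(*alloc_fn)(size_t);

/* Wrap any allocator and enforce the post-condition on every
   successful return. NULL passes through as plain out-of-memory. */
void *checked_alloc(alloc_fn raw_alloc, size_t size,
                    const void *heap_start, size_t heap_size)
{
    void *block = raw_alloc(size);
    if (block != NULL) {
        assert(block_within_heap(block, size, heap_start, heap_size));
    }
    return block;
}
```

In a debug build, a pointer outside the heap would then stop at the `assert` in `checked_alloc` the first time it is returned, rather than surfacing later as unrelated memory corruption.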
>> Were you able to track down the source of the error?
>
> Of course, I had to get something working to the customer
> months ago! I will try to put together a web page explaining
> the components, and providing working examples for:
> - bare
> - FreeRTOS
The above sort of defensive coding would probably have rendered the problem moot. But, if you're not the author/owner of the module... One advantage to "owning the IP" is that you can impose this sort of consistency on *everything*. So, if (when!) you reuse an algorithm, you don't worry (as much) about these sorts of subtleties biting you. [How would you document the requirements that malloc imposes on the linker, support library, etc.? And, in a way that a new developer would be sure to notice and implement? I.e., my solution is to let the code do it for me lest some detail be overlooked (or unmaintained). Esp valuable as you can "specify" lots of constraints when you have a programming language available to you! :> ]
>> Sounds like the folks who pieced together the support didn't do
>> a very thorough job.
>
> Now you are being extremely charitable ;-)
Who is there to act as a "check" on them? The fab line, no doubt, has inspectors ensuring the quality of the components they are producing. Where is the equivalent for their documentation/support? (Ans: user community. And, is anyone/corporate actively watching the community's grumblings as a crude indication of the quality of the support THEY are providing?)
>> IME, the folks who write the app notes (hardware/software) and these
>> sorts of "examples" aren't typically a "cherished resource". Perhaps
>> summer interns, fresh hires, etc. Company is focused on selling silicon,
>> not software. Only pays grudging lip service to the software in order
>> to get the hardware sold!
>
> In this case the support operation appears to be Chinese.
> Race to the lowest cost, quality be damned...
There can also be an impedance mismatch -- folks focused on building silicon may not have the skillsets to understand *using* it!

We were car shopping, recently. The quality of the "tech" in modern vehicles is abominable. You have to wonder how a "professional" could produce such crap -- regardless of the hardware/cost constraints!

Obviously, the car manufacturers don't have that expertise and farm it out to a third (fourth?) party. OK, that's understandable and likely makes good business sense. *But*, because they don't have the skillsets to understand what is possible, they can't adequately evaluate the quality of the resulting product! And, the chosen vendor (assuming he is NOT being unscrupulous) can rationalize away any behaviors as LOGICAL consequences of the problem addressed.

E.g., I recently "misplaced" the local JCPenney store; couldn't recall on which of several parallel streets it was located. "Ah! Let the car's GPS give me the address -- no need for 'directions' once I know which street it's on!"

Type in JCPENNY [sic] and select from the menu presented... but, the street names are all wrong! "What the hell did I do wrong??"

Long story short (_Clue_: "Too late!"), the locations found were located in OH (I'm in AZ). There's a difference between JCPENNeY and JCPENNY and the search algorithm isn't smart enough to think: "What are the chances that he really is looking for a destination 1500 miles away?"

Had this been demoed to an auto executive, the implementation would look *perfect* -- it FOUND the "JCPENNY" that the user specified! Would the auto executive have had the initiative to suggest this would likely NOT have been what the user sought? Would he have thought of imposing some sort of distance/travel time criteria on the result to check for sanity?

"No results (JCPENNY) found within 100 miles. Expand search: Yes/No?"

[Or, better yet, a SOUNDEX implementation?]
>> I had a client show me his first pass of a hardware design to give me
>> an idea of where his project/product was headed. It was hard to
>> politely say, "This will NEVER work!" (it wasn't like it could be
>> tweaked A LITTLE to be viable).
>>
>> After gently pressing for details of portions of the circuitry, ("So,
>> what are you trying to do, here?") he eventually folded and said he'd
>> pieced it together from app notes. (OK, maybe he transcribed things
>> inappropriately or made unfounded assumptions at the boundaries of
>> each sub-design).
>>
>> When I later examined the "source app notes", the flaws were present
>> there, as well. I.e., no one had apparently ever BUILT the circuit that
>> was published! AND, no one had ever proofread the document with even
>> a rudimentary understanding of the material presented (lest they would
>> have found the errors before making it into print).
>
> I'll raise you one: I've seen all of the above in a shipping product!
> Best app note blunder I've seen: serializer-deserializer note said the
> parallel lines had to have equal length traces!
> Think about that one for a bit. Witless non-engineer designed a board
> using low-end free CAD trying to do this...
I watched an engineer design a ~8x10 multilayer board covered with DIP switches to configure his product (which was another board) -- in a MICROPROCESSOR CONTROLLED DEVICE (didn't it occur to him that the processor could do this stuff *for* him and improve the UX?) "Impedance mismatch"
>> I learned that when I document things, I have to cut and paste the
>> ACTUAL source materials into the final/formal document -- to ensure
>> no transcription errors AND that the design/circuit/code is EXACTLY
>> as I'd implemented.
>
> And deliver the PDF component spec sheets with annotations appropriate
> to the part usage in the design...
I approach that by creating a "circuit description" document. This lets me speak less formally -- but more specifically -- about what I did and why I made specific design choices. Often, The Next Guy may not have the focus to think-ahead to all of the potential issues that he (now "his" product) may encounter. Having a forum where I can speak to him avoids his faulty "optimization" of my design (i.e., removing things whose purpose he doesn't immediately grasp) :> Of course, the same is true of software: "Here, there be dragons" *should* ratchet-up the reader's attention!
>> Returning to my final queries: is the reason the client wants to move
>> to Kinetis/FS parts because of the "support quality" on the NXP parts?
>
> A combination of factors: tools and documentation not updated for 2+
> years, bugs everywhere, extremely poor examples, miserable support,
> and worries about the future of Kinetis product line after years of
> Freescale austerity, NXP merger, and upcoming potential Qualcomm merger.
> Utter disarray with the tools: Freescale started with their "Processor
> Expert" component system, discontinued it for a new SDK, released an
> updated SDK with numerous undocumented incompatibilities, the examples
> and documentation do not use the current tool set, etc.
So, you're (they're) unhappy with the *provider*, not the *product*. But, mainly from the support side of the picture (i.e., it's not like you're having a hard time getting parts or finding defects in the parts themselves). I.e., why buy from Walmart when you can buy from Costco?

Sad as they *could* fix this by throwing some resources at the problem. But, until they see a cost to NOT doing that, it is unlikely that they ever will! :-(
> The hardware is extremely capable, but challenging to use with the
> above issues... NXP is threatening to release support based on their
> Expresso Eclipse (replacing the Kinetis Development System Eclipse),
> and rationalizing their tool support this spring. We'll see how that
> goes. I did a couple products using NXP Cortex M0 a few years ago
> and had good experience with the NXP tools (and support when there were
> tool issues)...
I'm pretty much "stuck" having to roll my own tools as my execution environment isn't typical (distributed multiprocessors, task migration, etc.). So, I lose a lot of "pretty" (that would be available in fleshier toolsets) but gain a lot of "essential" (that would have been absent, for my environment, in those verysame toolsets)! [Eventually, I'll have to make this stuff "ready for primetime" but that's low on the priority list... only so many hours in a day!]
