PXA255 Good Emulator Wanted

Started by Jose Luis Marchetti August 20, 2004
Hi,

I am using a JTAG-based emulator, but as far as I can see it cannot
give us what we need when we are hunting very complex bugs: it has a
limited trace buffer, we cannot set watchpoints, etc.

The conclusion is that we need a better tool, but it is proving
difficult to find one, which amazes me because the PXA is so widely
used.
I thought of using a logic analyzer, but our cache is ON, so how can
I tell what is being executed inside the cache?

The tool I am looking for should:

1) Work with the XScale PXA255 processor.
2) Be able to save a large trace buffer, and let me qualify what
goes into the buffer.
3) Work like a logic analyzer, so I can have complex triggers and can
also set watchpoints.
4) Not change the software timing. No changes to the software must
be needed.
5) Work with the cache turned on and show what is being executed
inside the cache.

I appreciate any answer. Thanks for your attention.

Dreams are wonderful things, aren't they.

I've been in the embedded business for about 25 years, working for one 
of the major emulator and logic analyzer vendors, for 23 of them. 
There are many reasons that such a tool is not available, and probably 
never will be...

See below.

Jose Luis Marchetti wrote:
> Hi,
>
> I am using a JTAG-based emulator, but as far as I can see it cannot
> give us what we need when we are hunting very complex bugs: it has a
> limited trace buffer, we cannot set watchpoints, etc.
>
> The conclusion is that we need a better tool, but it is proving
> difficult to find one, which amazes me because the PXA is so widely
> used.
> I thought of using a logic analyzer, but our cache is ON, so how can
> I tell what is being executed inside the cache?

Towards the end of the days of full-function emulators this was one of
the biggest problems. The cache is inside the processor for the sake
of speed. There is no known way to get full-speed visibility into the
cache. For a while, about 15-20 years ago, some manufacturers would
build special 'bondout' chips that gave visibility into chip
internals, but those were for much slower chips, such as the original
8051, running at a few MHz.

> The tool I am looking for should:
>
> 1) Work with the XScale PXA255 processor.
> 2) Be able to save a large trace buffer, and let me qualify what
> goes into the buffer.

This is what LAs do very well.

> 3) Work like a logic analyzer, so I can have complex triggers and can
> also set watchpoints.

Again, an LA would do very well.

> 4) Not change the software timing. No changes to the software must
> be needed.

In a processor with enough visibility, this can work.

> 5) Work with the cache turned on and show what is being executed
> inside the cache.

I don't know where you're from, but they must have some great
pharmaceuticals there! See above.

> I appreciate any answer. Thanks for your attention.

Other things to think about:

--Even if you could turn the cache off without penalty, there would
be problems setting 'watchpoints'. The memory bus can connect to ROM,
SRAM, and SDRAM. The cycles for each of them are very different. If
you wanted to set a watchpoint on an address in SDRAM, the 'emulator'
would have to be able to break that address into row and column
addresses, and know the timing of the SDRAM, so that it could capture
just the accesses to a specific address or set of addresses. To do
this it would need visibility into the internal registers where the
SDRAM parameters are stored.

--There are, generally, no new emulators being made. When my company
got out of the emulator business, it was costing about $1.5M to $2.5M
to develop each new emulator. Emulator sales are tied to design
starts, not total chip volume. If we could sell 100, we would have to
add $15,000 to $25,000 to the price of each one just to cover
development costs. What happened was that the number of design starts
per processor continued to drop, so sales dropped. Each new processor
requires a new design start, even within a family. There are new
buses to support, new speed grades, changes in timing, more (or fewer)
chip selects. There was some design reuse, but the costs were still
astronomical. Once the speeds got above 100MHz, it became impossible
to build emulators at all.

--Some members of the ARM family have built-in 'real-time' debug
ports. If I remember correctly, it's called the Embedded Trace
Macrocell (ETM). I don't believe that Intel has chosen to implement
this in any of its ARM-based processors. ETM provides code tracing and
some limited watchpoint capability. Properly done, it removes the need
for an emulator.

Alan
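
To make the row/column point concrete, here is a minimal sketch in C of how a linear address might be split for an SDRAM watchpoint. The geometry (32-bit bus, 9 column bits, 2 bank bits, 13 row bits, row:bank:column ordering) and all names are assumptions for illustration only; the real mapping depends on the parts fitted and on the memory controller settings.

```c
#include <stdint.h>

/* Hypothetical SDRAM geometry. Real values come from the memory
   controller's configuration registers and differ from board to board. */
#define BUS_WIDTH_BYTES 4u   /* 32-bit data bus */
#define COL_BITS        9u
#define BANK_BITS       2u
#define ROW_BITS        13u

typedef struct {
    uint32_t row;
    uint32_t bank;
    uint32_t col;
} sdram_addr_t;

/* Split a byte offset within the SDRAM region into row, bank and column,
   assuming the bits are ordered row:bank:column from high to low. */
static sdram_addr_t sdram_decode(uint32_t byte_offset)
{
    sdram_addr_t a;
    uint32_t word = byte_offset / BUS_WIDTH_BYTES;

    a.col  = word & ((1u << COL_BITS) - 1u);
    word >>= COL_BITS;
    a.bank = word & ((1u << BANK_BITS) - 1u);
    word >>= BANK_BITS;
    a.row  = word & ((1u << ROW_BITS) - 1u);
    return a;
}

/* On the bus the row arrives with an ACTIVE command and the column with a
   later READ or WRITE command. A watchpoint therefore has to remember
   which row is open in each bank and compare at the column command. */
static int sdram_watch_hit(const sdram_addr_t *watch,
                           uint32_t open_row, uint32_t bank, uint32_t col)
{
    return watch->bank == bank && watch->row == open_row && watch->col == col;
}
```

The second function is why a plain address comparator is not enough: the external tool has to track state across bus cycles.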
Al Gosselin <algst@cox.net> wrote in message news:<1aIVc.159$M67.29@fed1read01>...
Alan,

Thank you very much for your explanation; it was very useful.

We have started to evaluate a product called the Sophia Full ICE
Emulator, and I will let everyone know whether it really is a full ICE
for the PXA255.

What I do not understand is this: the PXA is perhaps the most popular
processor for PDAs today. How do people debug really hard bugs?
Another senior engineer and I have been trying to find a bug for more
than 2 weeks. We have to insert log code or tracking code to try to
catch it, but the OS has tasks, an MMU, etc., and most of the time the
JTAG trace buffer cannot show anything useful.

Another thing: this processor has a lot of registers, which makes it
fast, but when debugging you cannot tell what was happening a few
instructions earlier in the trace buffer. This amazes me.

Thanks again

> The PXA is perhaps the most popular processor for PDAs today.
> How do people debug really hard bugs? Another senior engineer and I

Using the same techniques as one uses for debugging "really hard" bugs in other black-box systems like PCs. ICEs are practically unheard of for fast 32-bit microcontrollers.
larwe@larwe.com (Lewin A.R.W. Edwards) wrote in message news:<608b6569.0408240949.72e82fae@posting.google.com>...
> > The PXA is perhaps the most popular processor for PDAs today.
> > How do people debug really hard bugs? Another senior engineer and I
>
> Using the same techniques as one uses for debugging "really hard" bugs
> in other black-box systems like PCs. ICEs are practically unheard of
> for fast 32-bit microcontrollers.

What are those techniques? Is there a good source for them?

We have just found the bug, so I can check whether those techniques
would have helped us.

The bug was this:

At one point the OS called a function that flushed the cache and then
invalidated it. The problem occurred when an interrupt arrived between
the two actions (flush and invalidate). When the interrupt arrives
exactly there (I am talking about a window of a few assembler
instructions), the handler starts using the cache again, but on return
the cache gets invalidated, so whatever the handler left in the cache
is lost.
The behaviour after that was totally unpredictable.
Simply disabling interrupts in this function solved the problem.

The bug was found just by reading the source code.

I wonder whether this bug could have been found even with a full ICE.

Thanks,
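
The fix described above can be sketched as follows. This is only an illustration of the pattern: `irq_save`, `irq_restore`, `cache_flush` and `cache_invalidate` are hypothetical names, and they are stubbed here so the sketch compiles and runs on a host. On the real target they would mask IRQs in the CPSR and issue the coprocessor cache operations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-side stand-ins for the platform hooks (all names hypothetical). */
static bool irq_enabled = true;
static int  ops_with_irq_enabled = 0;  /* counts cache ops that were exposed */

/* Disable interrupts and return the previous state. */
static uint32_t irq_save(void)
{
    uint32_t was_enabled = irq_enabled ? 1u : 0u;
    irq_enabled = false;
    return was_enabled;
}

/* Put interrupts back the way irq_save() found them. */
static void irq_restore(uint32_t was_enabled)
{
    irq_enabled = (was_enabled != 0u);
}

/* "Flush" in the thread's sense: write dirty lines back to memory. */
static void cache_flush(void)      { if (irq_enabled) ops_with_irq_enabled++; }
static void cache_invalidate(void) { if (irq_enabled) ops_with_irq_enabled++; }

/* Flush, then invalidate, with no window in between in which an interrupt
   handler could put new dirty data into the cache. */
static void cache_flush_and_invalidate(void)
{
    uint32_t flags = irq_save();

    cache_flush();
    cache_invalidate();

    irq_restore(flags);
}
```

Saving and restoring the previous interrupt state, instead of unconditionally re-enabling, keeps the function safe to call from code that already runs with interrupts off.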
Jose Luis Marchetti wrote:

A full ICE, if available, will give you more visibility. You still
have to know what to look for.

There are a variety of tricks that people use in debugging. Most of
them involve leaving 'breadcrumbs' behind. This means writing values
out on a bus, or into memory, at appropriate points in the code, then
looking at the log and determining what happened just before the
failure.

Because any ICE will probably require you to turn off the cache, I
doubt that it would have helped.

Alan
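
One possible shape for the in-memory variety of breadcrumbs is a small ring buffer, sketched below. The names (`crumb`, `crumb_recent`, `CRUMB_COUNT`) and the record layout are assumptions for illustration, not something taken from the thread.

```c
#include <stdint.h>

#define CRUMB_COUNT 64u   /* power of two, so the index wraps with a mask */

typedef struct {
    uint16_t id;     /* which point in the code was reached */
    uint32_t value;  /* one word of context, e.g. a task id or a pointer */
} crumb_t;

static volatile crumb_t  crumb_log[CRUMB_COUNT];
static volatile uint32_t crumb_next;   /* total number of crumbs written */

/* Drop a breadcrumb. Cheap enough to leave in timing-sensitive paths. */
static void crumb(uint16_t id, uint32_t value)
{
    uint32_t slot = crumb_next & (CRUMB_COUNT - 1u);

    crumb_log[slot].id    = id;
    crumb_log[slot].value = value;
    crumb_next = crumb_next + 1u;
}

/* Fetch the n-th most recent crumb (0 = newest). Returns 1 on success,
   0 if that crumb was never written or has already been overwritten. */
static int crumb_recent(uint32_t n, crumb_t *out)
{
    uint32_t slot;

    if (n >= crumb_next || n >= CRUMB_COUNT)
        return 0;

    slot = (crumb_next - 1u - n) & (CRUMB_COUNT - 1u);
    out->id    = crumb_log[slot].id;
    out->value = crumb_log[slot].value;
    return 1;
}
```

After a crash the buffer can be read back over JTAG or dumped by a fault handler. As written, `crumb` is not safe against an interrupt arriving between the slot write and the index update, so crumbs dropped from both task and interrupt context would need a short critical section around the call.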
On 25 Aug 2004 12:08:03 -0700, joseluismarchetti@yahoo.com.br (Jose
Luis Marchetti) wrote:

>larwe@larwe.com (Lewin A.R.W. Edwards) wrote in message news:<608b6569.0408240949.72e82fae@posting.google.com>...
>> > The PXA is perhaps the most popular processor for PDAs today.
>> > How do people debug really hard bugs? Another senior engineer and I
>>
>> Using the same techniques as one uses for debugging "really hard" bugs
>> in other black-box systems like PCs. ICEs are practically unheard of
>> for fast 32-bit microcontrollers.
>
>What are those techniques? Is there a good source for them?

Experience?

Actually, you gave the key information in your first post: CPU cache,
and very, very hard to find. From my experience, I would have examined
the code design for spots where being interrupted could cause
problems. If you flush the cache, this obviously is the case. I know,
once you know the cause of a bug it always looks obvious and easy to
spot, but still.

Generally, I have found these two kinds of problems the most difficult
to track down:

a) memory corruption - with freeing an area twice at the top of the
hit list :-)
b) issues caused by code being interrupted when it should not be.

a) is easy to catch by replacing malloc and free with your own
debugging library that keeps track of what got allocated and freed
where. It is fairly hard to debug, though, if you don't have this
option. Even if I have to use other people's libraries that can't be
patched with regard to malloc and free, I still use my own versions if
I run into such problems, if only to be sure the corruption did not
originate in my code. Obviously this kind of bug only happens on
systems that use malloc and free for memory management. That is mostly
true only of "fatter" embedded designs - and of course of PC software.

b) The only good way I have found so far to fight this kind of error
is code review, focusing on which functions change things that are
also changed in IRQ routines (data, in the widest sense, shared by
both kinds of code). If you do this consistently, such bugs are soon
found. If not, you can start adding marks to shared objects to get a
clue about what was last changed where when looking at a crash dump
etc. I sometimes use "allocation" and "deallocation" counters or
tickers, or even store a string within such an object showing which
module "allocated" or "deallocated" it (in the widest sense, of
course).

After having to chase down a bug like you just did, I recommend that
you think about whether there are other spots in your code where being
interrupted could cause problems, and review them to see if they are
protected against interrupts. In other words, when a bug is found,
it's always good practice to generalize the bug, then try to find the
spots that match the same pattern and check whether a similar bug
could be there.

Just my 2¢ of course

Markus
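
A minimal sketch of the kind of malloc/free tracking described under a) might look like this. The names (`dbg_malloc`, `dbg_free`, `DBG_MALLOC`, `DBG_FREE`), the fixed-size table and the linear search are all assumptions made to keep the example short; it is not anyone's actual debugging library.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DBG_MAX_BLOCKS 256u

typedef struct {
    uintptr_t   addr;        /* 0 means the slot has never been used */
    int         live;        /* 1 while allocated, 0 once freed */
    const char *alloc_file;  /* where the block was allocated */
    int         alloc_line;
    const char *free_file;   /* where it was last freed */
    int         free_line;
} dbg_block_t;

static dbg_block_t dbg_blocks[DBG_MAX_BLOCKS];
static unsigned    dbg_errors;   /* double frees and frees of unknown pointers */

/* Find the record for an address; dbg_find(0) returns an unused slot. */
static dbg_block_t *dbg_find(uintptr_t addr)
{
    for (unsigned i = 0; i < DBG_MAX_BLOCKS; i++)
        if (dbg_blocks[i].addr == addr)
            return &dbg_blocks[i];
    return NULL;
}

static void *dbg_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);
    dbg_block_t *b;

    if (p == NULL)
        return NULL;

    b = dbg_find((uintptr_t)p);      /* address recycled by the heap? */
    if (b == NULL)
        b = dbg_find(0);             /* otherwise take an unused slot */
    if (b != NULL) {
        b->addr = (uintptr_t)p;
        b->live = 1;
        b->alloc_file = file;
        b->alloc_line = line;
    }
    return p;                        /* if the table is full, p is untracked */
}

/* Returns 1 if the block was freed, 0 if the call was rejected. */
static int dbg_free(void *p, const char *file, int line)
{
    dbg_block_t *b;

    if (p == NULL)
        return 1;                    /* free(NULL) is a legal no-op */

    b = dbg_find((uintptr_t)p);
    if (b == NULL || !b->live) {
        dbg_errors++;                /* unknown pointer or double free */
        return 0;                    /* leave the heap untouched */
    }
    b->live = 0;
    b->free_file = file;
    b->free_line = line;
    free(p);
    return 1;
}

#define DBG_MALLOC(n) dbg_malloc((n), __FILE__, __LINE__)
#define DBG_FREE(p)   dbg_free((p), __FILE__, __LINE__)
```

When a second free is rejected, the record still holds the file and line of the first one, which is usually the information needed to find the bug. A block allocated after the table has filled up is not tracked, so freeing it is reported as an error; a real library would size the table for the application or store the record in a header in front of each block.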
Markus Zingg <m.zingg@nct.ch> wrote in message news:<mcd0j01q36p4giqdcq4gfo8r66deuh6lp2@4ax.com>...
Markus,

Thanks for your answer.

Yes, we use those techniques as well. The problem with this bug was
that nothing made sense. After the problem happened I looked at the
data structures in memory and they did not make any sense. But I could
also establish that it was not a wild pointer overwriting the area. We
moved the data area from one location to another, we padded it with
several zeros before and after, and the padding never got corrupted.
We placed checks in the stack. We were also spreading log code around,
but nothing made sense.

The OS where the problem is located is not our code, so we do not know
it 100%; this is another factor that made things difficult. For
example, you correctly asked me to look for similar problems that I
might still have in the code, but I have to answer that I do not have
a clue. I do not know the code (it is a very big code base), and
looking for them will demand a lot of time, but I think you are right.

But one Monday I said: I will build a flow chart of the part of the
software where I see the problem happening most often, and then place
more log code. While I was building it I saw the problem and, as you
said, it now looks easy to spot, but it took about 3 weeks of 2 senior
software engineers' time.

My original question still stands. I do not like to write log code; a
good emulator should do that for me. It simply does not make sense to
me.

Thanks,
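
The padding check mentioned above can be sketched as guard zones around the suspect data. The names, sizes and the non-zero fill pattern below are assumptions for illustration; the post used zeros, but a distinctive pattern makes a stray zero write visible as well.

```c
#include <stdint.h>
#include <string.h>

#define GUARD_WORDS   8u
#define GUARD_PATTERN 0xDEADBEEFu

typedef struct {
    uint32_t before[GUARD_WORDS];
    uint8_t  payload[64];          /* stands in for the suspect structure */
    uint32_t after[GUARD_WORDS];
} guarded_t;

/* Fill both guard zones with the pattern and clear the payload. */
static void guard_init(guarded_t *g)
{
    for (unsigned i = 0; i < GUARD_WORDS; i++) {
        g->before[i] = GUARD_PATTERN;
        g->after[i]  = GUARD_PATTERN;
    }
    memset(g->payload, 0, sizeof g->payload);
}

/* Returns 0 if both zones are intact, -1 if the zone before the payload
   was damaged, +1 if the zone after it was damaged. */
static int guard_check(const guarded_t *g)
{
    for (unsigned i = 0; i < GUARD_WORDS; i++)
        if (g->before[i] != GUARD_PATTERN)
            return -1;
    for (unsigned i = 0; i < GUARD_WORDS; i++)
        if (g->after[i] != GUARD_PATTERN)
            return 1;
    return 0;
}
```

Calling `guard_check` from a periodic task narrows down when an overrun happened. As in this thread, guards that stay intact while the payload turns to garbage are useful evidence too: they argue against a neighbouring buffer overrun and point towards something else, such as stale or lost cache contents.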