Reply by Dan Lyke June 16, 2009
On Tue, 16 Jun 2009 10:49:37 -0000
"stef33d" wrote:
> --- In A..., Dan Lyke wrote:
> >
> > Just a follow-up, the official suggestion for deterministic timing
> > was to run the critical code in SRAM.
>
> That was the unofficial suggestion in this thread as well if I read it
> correctly. ;-)

Yeah, Michael ("nutleycottage") suggested that I "Place the function in
non-cacheable memory.".

> Just another suggestion: It should be possible to program the Timer
> Counter for single pulse generation (I have never tried it, but seems
> possible from the datasheet). But this will require your output to be
> a TIOx pin.

When I'm not bound by various NDAs I'll write a tell-all, but I'm
pretty sure it's nothing that more experienced embedded developers
haven't heard before.

Anyway, yeah, I (one of the software guys) was supposed to have a single
signal pin to trigger the gate for a big mess of state signals: Set up
all the state bits, toggle that signal pin with the precise timing.
However, it turns out that by the time all the cable length and
capacitance issues fall out that the state signals have much squarer
waves at the end device than the gating signal.

Wonderful theories smashed hard by the realities of hardware design
spread across several continents and languages.

Dan

Reply by stef33d June 16, 2009
--- In A..., Dan Lyke wrote:
>
> Just a follow-up, the official suggestion for deterministic timing was
> to run the critical code in SRAM.

That was the unofficial suggestion in this thread as well if I read it correctly. ;-)

Sorry to just walk in like this ...

Just another suggestion: It should be possible to program the Timer Counter for single pulse generation (I have never tried it, but seems possible from the datasheet). But this will require your output to be a TIOx pin.
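
For the Timer Counter route, the compare value for the pulse is the pulse width multiplied by the timer clock. A minimal sketch of that arithmetic follows. The 50 MHz timer clock used in the note below is an assumption (the real figure depends on MCK and the prescaler selected), and both helper names are hypothetical, not part of any Atmel library.

```c
#include <stdint.h>

/* Convert a pulse width in nanoseconds to timer ticks, rounding to the
   nearest tick. timer_hz is whatever clock actually feeds the counter
   (an assumption to be checked against the real clock setup). */
static uint32_t tc_ticks_for_ns(uint32_t pulse_ns, uint32_t timer_hz)
{
    uint64_t scaled = (uint64_t)pulse_ns * timer_hz;
    return (uint32_t)((scaled + 500000000ULL) / 1000000000ULL);
}

/* Convert a tick count back to nanoseconds, rounding to nearest, so the
   quantization error can be compared with the allowed tolerance. */
static uint32_t tc_ns_for_ticks(uint32_t ticks, uint32_t timer_hz)
{
    uint64_t scaled = (uint64_t)ticks * 1000000000ULL;
    return (uint32_t)((scaled + timer_hz / 2) / timer_hz);
}
```

At an assumed 50 MHz each tick is 20 ns, so a 2 microsecond pulse works out to 100 ticks. A clock as slow as 32768 Hz rounds to zero ticks and cannot express a pulse this short.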

Regards,

Stef

Reply by Dan Lyke June 8, 2009
Just a follow-up, the official suggestion for deterministic timing was
to run the critical code in SRAM. Using the stock linker files that
come with the Softpack examples, that's:

__attribute__ ((section (".ramfunc")))
void MyFunctionThatAlwaysNeedsConsistentTiming()
{
}
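
Applied to the strobe routine from the original post, that might look like the sketch below. Assumptions: `FakePio` is a hypothetical stand-in for the PIO register block (only the CODR and SODR names come from the thread), the number of NOPs is a placeholder rather than a tuned value, and the section attribute is applied only on ARM builds so the same file also compiles on a host. It relies on the linker script mapping `.ramfunc` into SRAM.

```c
#include <stdint.h>

/* Assumption: only request the .ramfunc section when targeting ARM;
   on a host build the macro just prevents inlining. */
#if defined(__arm__)
#define RAMFUNC __attribute__((section(".ramfunc"), noinline))
#else
#define RAMFUNC __attribute__((noinline))
#endif

/* Hypothetical stand-in for the PIO controller registers. */
typedef struct {
    volatile uint32_t SODR; /* set output data */
    volatile uint32_t CODR; /* clear output data */
} FakePio;

RAMFUNC void StrobeFromRam(FakePio *pio, uint32_t bits)
{
    pio->CODR = bits;                /* lower the pins */
    __asm__ __volatile__("nop");     /* placeholder delay, not tuned */
    __asm__ __volatile__("nop");
    __asm__ __volatile__("nop");
    __asm__ __volatile__("nop");
    pio->SODR = bits;                /* raise the pins */
}
```

The `noinline` matters here: if the compiler inlines the function into a caller that lives in flash, the body no longer runs from SRAM.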
Reply by micr...@virginbroadband.com.au June 5, 2009
Hi Dan,

My guess is that the way the literal (7) is generated changes at the ASM level.
Values close to a power of 2 are easy to generate, whereas others might
require a bit of twiddling around with multiple instructions. Weirder literals
will come from a literal pool, which changes the instructions further, but I
don't think that's the case here.
I don't think GCC would unroll the loop when optimizations are off?
Can't you say "fook it", and have a bunch of NOPs inline which you then
jump into at some point, depending on the delay?
Just throwing some thoughts around...
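
One way to sketch the "jump into a run of NOPs" idea in C is a switch with deliberate fall-through, so that entering at case n executes n NOPs. This is an illustration only: `nop_sled` and its limit of 8 are hypothetical, and the switch dispatch adds its own cycles (the compiler may emit a jump table or a compare chain), so the real delay still has to be checked in the disassembly and on the scope.

```c
#define NOP() __asm__ __volatile__("nop")

/* Enter the run of NOPs at a point chosen by n (clamped to 0..8).
   Each case falls through to the next, so n NOPs execute.
   Returns the clamped count that was actually used. */
static unsigned nop_sled(unsigned n)
{
    if (n > 8) {
        n = 8;
    }
    switch (n) {
    case 8: NOP(); /* fall through */
    case 7: NOP(); /* fall through */
    case 6: NOP(); /* fall through */
    case 5: NOP(); /* fall through */
    case 4: NOP(); /* fall through */
    case 3: NOP(); /* fall through */
    case 2: NOP(); /* fall through */
    case 1: NOP(); /* fall through */
    default: break;
    }
    return n;
}
```

Writing the same thing in a standalone assembly file, with a computed branch into the NOP run, would remove the compiler from the picture entirely.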

Best regards,
Kris
On Wed, 3 Jun 2009 14:38:29 -0700, Dan Lyke wrote:
> On Wed, 3 Jun 2009 16:44:23 -0400
> Eric Haver wrote:
>> Hi Dan, First thought is that you have to turn off any interrupts,
> >> secondly, set the Assembler to save its output and look at the
>> assembly code.
>
> Yeah, interrupts are definitely off. No room for them in the hard real
> time stuff, and I spent half an hour this morning cleaning off my desk,
> mouse and half of my keyboard with denatured alcohol because of the kind
> of thing that happens if I go too long on that pulse, definitely not a
> customer experience we can allow...
>
> > And I haven't dug into the assembly yet because it's a single function
> that hasn't changed, and yet the timing changes, which makes me think
> there's something about alignment and linking that anyone who's worked
> with the ARM before probably knows immediately, but that I don't.
>
> Dan
>
Reply by "Frog Twissell, Blue Sky Solutions" June 4, 2009
Hi Dan,
I had in mind something along the lines of a monostable. The extra $1 on
the BOM is perhaps worthwhile vs spontaneous combustion.

Cheers,
Frog

Reply by Dan Lyke June 4, 2009
On Thu, 4 Jun 2009 16:53:15 +0200
"Eric Pasquier" wrote:
> What about writing these lines in an assembler file linked with your
> project ?

Now that I've got it in its own separate C file the changes on rebuild
seem to have settled down, but if it is, as several have suggested, a
linking alignment issue, then the assembler code wouldn't fix the
problem.

I've got a request for a little clarification on this in to Atmel,
hopefully I'll get all the caching and alignment issues ironed out soon.

Dan
Reply by Dan Lyke June 4, 2009
On Thu, 4 Jun 2009 09:56:33 +1200
"Frog Twissell, Blue Sky Solutions" wrote:
> Surely if the pulse timing is that critical you should use dedicated
> hardware?

That's what the 9XE is for: handling all the timing critical bits. It'd
be nice to have an ASIC to do all this stuff, but that's not in the
cards. Could probably have spec'd out an FPGA, but there's enough
general purpose compute and memory shuffling that needs to happen in
this subsystem that just throwing a processor at it seemed like the
right (ie: fastest and least expensive with quantity flexibility) thing
to do.

So Paul suggests that I need to learn a few things about alignment, and
probably delve into my linker files a bit to make sure that functions
are landing where I intend them to. And keep shuffling those no-ops
around and learn a little bit more about ARM assembly language. Sigh,
more bedtime reading...

Dan
Reply by Eric Pasquier June 4, 2009
Hi Dan,

What about writing these lines in an assembler file linked with your project?

Eric.
----- Original Message -----
From: nutleycottage
To: A...
Sent: Thursday, June 04, 2009 3:42 PM
Subject: [AT91SAM] Re: Super accurate short delays, SAM9XE and GCC?

Hi Dan,

Assuming you can't poll a hardware counter perhaps you could

1) Always compile the function with -O0, or write the function in assembly code.
2) Place the function in non-cacheable memory. Alternatively you may be able to lock the function into the icache but I've never tried this.

Regards
Michael
>
> I'm driving some hardware with the SAM9XE that has some hard real time
> limits: I need to lower some pins, wait for *two microseconds* then
> raise those pins.
>
> Any longer and I let the magic smoke out. Any shorter and the right
> thing fails to happen.
>
> I'm doing this on the AT91SAM9XE-EK with code that looks like:
>
> void Strobe( unsigned int bits )
> {
>     LowerPins(...); // an inline function, pin->pio->CODR = bits;
>     for (int i = 0; i < 7; ++i)
>     {
>         __asm__ __volatile__( "nop\n" );
>     }
>     __asm__ __volatile__( "nop\n" );
>     __asm__ __volatile__( "nop\n" );
>     __asm__ __volatile__( "nop\n" );
>     __asm__ __volatile__( "nop\n" );
>     RaisePins(...); // an inline function, pin->pio->SODR = bits;
> }
>
> This function is off in its own object module.
>
> Changing the "7" in there is distinctly non-linear, but I have the
> feeling that that's because the optimizer sometimes chooses to unroll
> the loop. However, it also seems like there's some sort of difference
> in how long this takes to run depending on linking: The delay seems to
> change every time I compile the project.
>
> Anyone have any idea what's going on? I'm way too acquainted with the
> 'scope recently.
>
> Dan
>
Reply by nutleycottage June 4, 2009
Hi Dan,

Assuming you can't poll a hardware counter perhaps you could

1) Always compile the function with -O0, or write the function in assembly code.
2) Place the function in non-cacheable memory. Alternatively you may be able to lock the function into the icache but I've never tried this.
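
For point 1, besides building the file with -O0, GCC also has a per-function `optimize` attribute. The sketch below assumes a GCC recent enough to support it (4.4 or later); GCC's documentation describes the attribute as a debugging aid, so treat this as an experiment rather than a guaranteed fix. `FIXED_CODEGEN` and `DelayNops` are hypothetical names, and the attribute is skipped on compilers that do not support it.

```c
/* Assumption: the optimize attribute is GCC-specific, so other compilers
   only get noinline here. */
#if defined(__GNUC__) && !defined(__clang__)
#define FIXED_CODEGEN __attribute__((optimize("O0"), noinline))
#else
#define FIXED_CODEGEN __attribute__((noinline))
#endif

/* Run `count` NOPs in a loop compiled without optimization, so the
   compiler does not unroll or otherwise reshape the loop.
   Returns the number of iterations executed. */
FIXED_CODEGEN unsigned DelayNops(unsigned count)
{
    unsigned i;
    for (i = 0; i < count; ++i) {
        __asm__ __volatile__("nop");
    }
    return i;
}
```

This only pins down the generated instructions. It does nothing about cache or alignment effects, which is what point 2 addresses.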

Regards
Michael
>
> I'm driving some hardware with the SAM9XE that has some hard real time
> limits: I need to lower some pins, wait for *two microseconds* then
> raise those pins.
>
> Any longer and I let the magic smoke out. Any shorter and the right
> thing fails to happen.
>
> I'm doing this on the AT91SAM9XE-EK with code that looks like:
>
> void Strobe( unsigned int bits )
> {
>     LowerPins(...); // an inline function, pin->pio->CODR = bits;
>     for (int i = 0; i < 7; ++i)
>     {
>         __asm__ __volatile__( "nop\n" );
>     }
>     __asm__ __volatile__( "nop\n" );
>     __asm__ __volatile__( "nop\n" );
>     __asm__ __volatile__( "nop\n" );
>     __asm__ __volatile__( "nop\n" );
>     RaisePins(...); // an inline function, pin->pio->SODR = bits;
> }
>
> This function is off in its own object module.
>
> Changing the "7" in there is distinctly non-linear, but I have the
> feeling that that's because the optimizer sometimes chooses to unroll
> the loop. However, it also seems like there's some sort of difference
> in how long this takes to run depending on linking: The delay seems to
> change every time I compile the project.
>
> Anyone have any idea what's going on? I'm way too acquainted with the
> 'scope recently.
>
> Dan
>

Reply by 42Bastian June 3, 2009
Dan Lyke wrote:

> I'm doing this on the AT91SAM9XE-EK with code that looks like:
>
> void Strobe( unsigned int bits )
> {
>     LowerPins(...); // an inline function, pin->pio->CODR = bits;
>     for (int i = 0; i < 7; ++i)
>     {
>         __asm__ __volatile__( "nop\n" );
>     }
>     __asm__ __volatile__( "nop\n" );
>     __asm__ __volatile__( "nop\n" );
>     __asm__ __volatile__( "nop\n" );
>     __asm__ __volatile__( "nop\n" );
>     RaisePins(...); // an inline function, pin->pio->SODR = bits;
> }
>
> This function is off in its own object module.
>
> Changing the "7" in there is distinctly non-linear, but I have the
> feeling that that's because the optimizer sometimes chooses to unroll
> the loop. However, it also seems like there's some sort of difference
> in how long this takes to run depending on linking: The delay seems to
> change every time I compile the project.
a) Don't write such code in C; use assembly (and not even inline assembly).
b) Does this part have I-TCM? If so, place the code in there.
Otherwise you get runtime differences due to the cache.
c) Lock interrupts.

--
42Bastian
------------------
Parts of this email are written with invisible ink.

Note: SPAM-only account, direct mail to bs42@...