Reply by Stef March 9, 2012
In comp.arch.embedded,
alb <alessandro.basili@cern.ch> wrote:
> On 3/9/2012 9:51 AM, Stef wrote:
>
>> In my opinion, based only on what I've read here (and I have not read
>> every post in detail), you should go through the entire FPGA code and
>> get it cleanly through the tools before you send it somewhere where
>> it can never be changed again.
>
> We are going through the FPGAs every day, trying to understand the logic
> behind and the potential flaws. Sometime it is so poorly segmented that
> is very hard to understand the original intention.
>
> Unfortunately everything is up in space now and we have to live with it.
That's what I initially understood from your posts. But later I somehow
understood that the gear was still on the ground and was going into
space. That made me wonder why you were working around the problem
instead of solving the actual cause. Understood now (again): FPGA is
fixed, no changes possible, ever.
> Given the level of the design I expect a lot of other big flaws
> somewhere else. We can only hope all these flaws can be circumvented by
> software, but it's a real pain.
Given the situation as it is, it is all you can do. Good luck!

--
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail)

All progress is based upon a universal innate desire of every organism
to live beyond its income.
		-- Samuel Butler, "Notebooks"
Reply by alb March 9, 2012
On 3/9/2012 9:51 AM, Stef wrote:
[...]
>>
>> The correct design should have resynchronized the input with the
>> internal clock (clk) and everything would have been fine, since from
>> that moment on everything is running at the pace of the internal clock.
>
> Something that should be done for _every_ input to the FPGA. Have you
> checked how other inputs are handled? If there is a problem like this on
> the UART data input, it is more likely that there is a problem with other
> inputs as well. Re-timing an input can in some cases also make it easier
> to meet timing requirements, maybe solving some of the timing errors you
> found when you tried to re-compile the FPGA?
>
Actually we found tons of warnings for 'inferred latch' and the like
when we tried to re-synthesize the VHDL. I was staring with squared eyes
when I looked at the logs. Unfortunately those guys did not have any
idea of what they were doing, and problems like the one in this thread
are nearly everywhere in the FPGAs (luckily there are only three in the
system!).
> I am not an expert on this, so if you want to go deeper into this,
> comp.arch.fpga may be a more suitable place.
>
thanks for the pointer, I am a very frequent reader of c.a.f.
> In my opinion, based only on what I've read here (and I have not read
> every post in detail), you should go through the entire FPGA code and
> get it cleanly through the tools before you send it somewhere where
> it can never be changed again.
We are going through the FPGAs every day, trying to understand the logic
behind them and the potential flaws. Sometimes it is so poorly segmented
that it is very hard to understand the original intention.

Unfortunately everything is up in space now and we have to live with it.
Given the level of the design I expect a lot of other big flaws
somewhere else. We can only hope all these flaws can be circumvented by
software, but it's a real pain.
Reply by Stef March 9, 2012
In comp.arch.embedded,
alb <alessandro.basili@cern.ch> wrote:
> transition. In the event where input is happening *just before* the clk
> rising edge, input_d can toggle, but cond will not be long enough to
> ensure correct setup time for the start bit to start:
>
>             __    __    __    __    __    __    __    __    __    __
>  clk     __|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|
>          ________                                 ______________________
>  input           |_______________________________|
>           _________                                _____________________
>  input_d           |_______________________________|
>                   _
>  cond    ________| |____________________________________________________
>
>  start   _______________________________________________________________
>
> The hold time for 'start' is guaranteed by the place&route, since the
> tool will guarantee that delay between the 'input_d' and 'start' ffs are
> within a clock cycle. But the tool cannot guarantee that input is
> removed before the necessary setup time.
>
> The correct design should have resynchronized the input with the
> internal clock (clk) and everything would have been fine, since from
> that moment on everything is running at the pace of the internal clock.
Something that should be done for _every_ input to the FPGA. Have you
checked how other inputs are handled? If there is a problem like this on
the UART data input, it is more likely that there is a problem with other
inputs as well. Re-timing an input can in some cases also make it easier
to meet timing requirements, maybe solving some of the timing errors you
found when you tried to re-compile the FPGA?

I am not an expert on this, so if you want to go deeper into this,
comp.arch.fpga may be a more suitable place.

In my opinion, based only on what I've read here (and I have not read
every post in detail), you should go through the entire FPGA code and
get it cleanly through the tools before you send it somewhere where
it can never be changed again.

--
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail)

A plethora of individuals with expertise in culinary techniques
contaminate the potable concoction produced by steeping certain edible
nutriments.
Reply by alb March 9, 2012
On 3/8/2012 8:24 PM, langwadt@fonz.dk wrote:
[...]
>> It depends on the size of the packet (*) that you send. For a 100 bytes
>> packet would be like ~0.5%, while with a 2KB packet is ~50%.
>>
>> The receiver does not resynchronize the input signal with its internal
>> clock and the condition to have a start bit is set when the negated
>> signal and the clocked signal are both 1.
>>
>> here is a simplified snippet in vhdl:
>>
>>> process (clk)
>>> begin
>>>   if rising_edge (clk) then
>>>     input_d <= input;
>>>   end if;
>>> end process;
>>>
>>> process (clk)
>>> begin
>>>   if rising_edge (clk) then
>>>     start_bit <= not input and input_d;
>>>   end if;
>>> end process;
>
> that snippet seems to imply that the UART is running at a clk that is
> equal to the baudrate, not some multiple of it (8 or 16 in most uarts)
The clk signal is 16 times the baudrate - my apologies if I did not
explicitly state that. The clk in the snippet has a period of ~3.5 us,
while the bit length is ~52 us.
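As a quick arithmetic aside (a sketch in plain Python, using only the numbers quoted in this thread): at 19.2 kbaud a bit lasts about 52.08 us, and a 16x oversampling clock then has a period of about 3.26 us, so the "~3.5 us" above is evidently a rounded figure.

```python
# Sanity check of the timing figures discussed in this thread:
# a 19.2 kbaud UART sampled with a 16x oversampling clock.
baud = 19200
oversample = 16

bit_time_us = 1e6 / baud                  # one UART bit, in microseconds
clk_period_us = bit_time_us / oversample  # oversampling clock period

print(f"bit length: {bit_time_us:.2f} us")    # ~52.08 us
print(f"clk period: {clk_period_us:.2f} us")  # ~3.26 us
```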
> If so I am surprised it works at all for any input
>
>> Since the 'not input' is not synchronized with the internal clock, the
>> start_bit ff may not have the hold time satisfied hence the miss of the
>> start bit.
I think I was wrong here, the problem is on setup rather than hold time.
> apart from the issues of metastability and such when sampling an async
> signal it shouldn't be a problem as long as the clock is some multiple
> of the baudrate, if you don't see the edge this clk cycle you will see
> it the next
>
The problem is all about timing violations. If you don't see the edge on
the correct clk cycle you will never see it, unless you get yet another
edge. I'll try to argue it in more detail to avoid - hopefully - further
confusion:

            __    __    __    __    __    __    __    __    __    __
 clk     __|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|
         ________                                 ______________________
 input           |_______________________________|
         ______________                            _____________________
 input_d               |__________________________|
                  _____
 cond    ________|     |________________________________________________
                        _____
 start   ______________|     |__________________________________________

where cond = not input and input_d. Since 'input' is not clocked with
'clk', you can see how the time that 'cond' stays high depends on the
phase between 'clk' and the input transition. In the event where the
input transition happens *just before* the clk rising edge, input_d can
still toggle, but cond will not stay high long enough to ensure correct
setup time for the start bit:

            __    __    __    __    __    __    __    __    __    __
 clk     __|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|
         ________                                 ______________________
 input           |_______________________________|
          _________                                _____________________
 input_d           |_______________________________|
                  _
 cond    ________| |____________________________________________________

 start   _______________________________________________________________

The hold time for 'start' is guaranteed by the place&route, since the
tool will guarantee that the delay between the 'input_d' and 'start' ffs
is within a clock cycle. But the tool cannot guarantee that input is
removed before the necessary setup time.

The correct design should have resynchronized the input with the
internal clock (clk); then everything would have been fine, since from
that moment on everything runs at the pace of the internal clock.

Hope that helps.
> -Lasse
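The failure mode described in this post can be sketched in a few lines of Python. This is an illustrative toy model, not a timing simulation: the two registers from the VHDL snippet (input_d and the start flip-flop) are updated together on each clock edge, and a setup violation on the start flip-flop is modeled pessimistically as the register keeping its old value. The function name and the setup_met flag are inventions for this sketch.

```python
# Toy model of the flawed start-bit detector discussed in this thread.
# 'cond' is the combinational term "not input and input_d"; a setup
# violation on the 'start' flip-flop is modeled pessimistically as the
# register keeping its previous value.

def detect_start(setup_met: bool) -> bool:
    """Return True if the detector registers the falling edge of 'input'."""
    input_sig = 0   # async input has just fallen (start bit begins)
    input_d = 1     # previously sampled value of input
    start = 0

    # clk edge n: both registers sample simultaneously.
    cond = (input_sig == 0) and (input_d == 1)   # True: edge visible now
    if setup_met:
        start = 1 if cond else 0                 # cond captured cleanly
    # else: edge fell inside the setup window -> 'start' keeps its old 0
    input_d = input_sig                          # input_d catches up (now 0)
    if start:
        return True

    # clk edge n+1 and later: input stays low for the whole start bit,
    # but input_d is already 0, so cond can never go high again.
    cond = (input_sig == 0) and (input_d == 1)   # False from here on
    return bool(cond)

print(detect_start(True))    # True  - edge arrived well before the clock
print(detect_start(False))   # False - edge missed, and missed for good
```

The point the model makes is the same as the waveforms: if the one-cycle-wide cond pulse is not captured cleanly, input_d has already caught up with input and that edge is gone for good.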
Reply by lang...@fonz.dk March 8, 2012
On 8 Mar., 16:39, alb <alessandro.bas...@cern.ch> wrote:
> On 3/8/2012 4:40 AM, langw...@fonz.dk wrote:
>
> >> Ok, a colleague of mine went through it and indeed the start-bit logic
> >> is faulty, since it is looking for a negative transition but without the
> >> signal being synchronized with the internal clock (don't ask me how that
> >> is possible!).
>
> >> Given this type of error the 0xFF byte will be lost completely, since
> >> there are no other start-bit to sync on within the byte, while in other
> >> cases it may resync with a '0' bit in within the byte.
>
> > I trying real hard to understand what it is you are saying
>
> > if the uart cannot find the edge of the start bit and then sample 8
> > bits correctly with out more edges to resync the baudrates would have
> > differ quite bit, like several %
>
> It depends on the size of the packet (*) that you send. For a 100 bytes
> packet would be like ~0.5%, while with a 2KB packet is ~50%.
>
> The receiver does not resynchronize the input signal with its internal
> clock and the condition to have a start bit is set when the negated
> signal and the clocked signal are both 1.
>
> here is a simplified snippet in vhdl:
>
> > process (clk)
> > begin
> >   if rising_edge (clk) then
> >     input_d <= input;
> >   end if;
> > end process;
>
> > process (clk)
> > begin
> >   if rising_edge (clk) then
> >     start_bit <= not input and input_d;
> >   end if;
> > end process;
>
that snippet seems to imply that the UART is running at a clk that is
equal to the baudrate, not some multiple of it (8 or 16 in most uarts)

If so I am surprised it works at all for any input
> Since the 'not input' is not synchronized with the internal clock, the
> start_bit ff may not have the hold time satisfied hence the miss of the
> start bit.
apart from the issues of metastability and such when sampling an async
signal it shouldn't be a problem as long as the clock is some multiple
of the baudrate, if you don't see the edge this clk cycle you will see
it the next

-Lasse
Reply by alb March 8, 2012
On 3/8/2012 2:43 AM, Charles Bryant wrote:
> In article <9rp8unF1i9U1@mid.individual.net>,
> alb <alessandro.basili@cern.ch> wrote:
> }On 3/6/2012 2:51 AM, Charles Bryant wrote:
> }[...]
> .. running rx at much higher clock ...
> }This approach is nice if and only if the transmitter does not introduce
> }extra gaps in between bytes, which may make your job to take time into
> }account a little bit more complex. And the problem is still not solved,
> }since the receiver is missing the start bit (transition high-low).
>
> Reading your solution, I think I may have been wrong in my assumption
> about how the receiver worked. I assumed that when it missed the start
> bit, if another zero bit arrived it would see that (i.e.
> level-triggered), rather than needing a 1 and subsequent 0, so indeed
> my solution would not work (nor would the suggestion of using two
> stop bits).
>
> }Timing the distance between interrupt may be tricky since at 19.2 Kbaud
> }a 52us interval is needed for each bit, hence you would need your timer
> }to run faster than that...
>
> The ADSP-21020 TCOUNT register runs at the processor clock speed, so
> at a typical speed of 20MHz it increments every 50ns. And it can be
> read in a single cycle.
>
My apologies, I thought you wanted to service a timer interrupt to do
something, instead of simply reading the clock.

[...]
> }The byte encoding for the character looks like the following:
> }                                             ____
> }____| st | rs | b0 | b1 | b2 | b3 | sh | cb |
> .. rest omitted ...
>
> That looks very good. I'm sure that theoretically it would be possible
> to design something with lower overhead, but if you can tolerate that
> amount of overhead, it is simple enough to see that it obviously
> works, while a more complex solution might have an obscure flaw.
The overhead could be reduced simply by dropping the possibility of
having control characters (which is also possible), hence being able to
transmit 5 bits per character instead of 4. This was also taken into
consideration, but given the benefits of having control characters it
was decided that we don't care about the overhead.
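For a rough feel of the bandwidth cost, here is a small back-of-the-envelope calculation (a sketch; it assumes each encoded character is sent as an ordinary 10-bit frame of start + 8 data bits + stop, which the thread does not state explicitly):

```python
# Overhead comparison for the encoding scheme discussed above, assuming
# each encoded character goes out as a standard 10-bit UART frame
# (start + 8 data bits + stop) carrying 4 or 5 payload bits.
WIRE_BITS_PER_CHAR = 10

def wire_bits(payload_bytes: int, payload_bits_per_char: int) -> int:
    """Bits on the wire needed to carry payload_bytes of data."""
    total_bits = payload_bytes * 8
    chars = -(-total_bits // payload_bits_per_char)  # ceiling division
    return chars * WIRE_BITS_PER_CHAR

plain = wire_bits(100, 8)  # plain 8N1 baseline: 8 payload bits per char
for bpc in (4, 5):
    bits = wire_bits(100, bpc)
    print(f"{bpc} payload bits/char: {bits} wire bits "
          f"({bits / plain:.1f}x plain 8N1)")
```

With 4 payload bits per character a 100-byte payload costs 2000 wire bits, twice what plain 8N1 would use; moving to 5 bits per character brings that down to 1600, which is the trade-off weighed in this post.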
Reply by alb March 8, 2012
On 3/7/2012 7:47 PM, Tim Wescott wrote:
[...]
>
> I know it sounds like an oxymoron, but that's a really elegant kludge.
> That you had to do it at all makes it a kludge -- but it looks like you
> did a good job with it within the confines of what you had to work with.
I agree it is a shame we had to introduce these additional layers, but
if you look at the command level everything looks the same, and all the
kludge is confined to the levels where the dirty work is being done.
After all, there's always somebody doing the dirty work; in this case we
may have found a solution that leaves the dirt down at the bottom.

So far we have not found pitfalls, but we will post here if that turns
out to be the case.

Al
Reply by alb March 8, 2012
On 3/8/2012 4:40 AM, langwadt@fonz.dk wrote:
>> Ok, a colleague of mine went through it and indeed the start-bit logic
>> is faulty, since it is looking for a negative transition but without the
>> signal being synchronized with the internal clock (don't ask me how that
>> is possible!).
>>
>> Given this type of error the 0xFF byte will be lost completely, since
>> there are no other start-bit to sync on within the byte, while in other
>> cases it may resync with a '0' bit in within the byte.
>>
>
> I trying real hard to understand what it is you are saying
>
> if the uart cannot find the edge of the start bit and then sample 8
> bits correctly with out more edges to resync the baudrates would have
> differ quite bit, like several %
>
It depends on the size of the packet (*) that you send. For a 100 byte
packet it would be like ~0.5%, while with a 2KB packet it is ~50%.

The receiver does not resynchronize the input signal with its internal
clock, and the condition for a start bit is set when the negated signal
and the clocked signal are both 1.

here is a simplified snippet in vhdl:
> process (clk)
> begin
>   if rising_edge (clk) then
>     input_d <= input;
>   end if;
> end process;
>
> process (clk)
> begin
>   if rising_edge (clk) then
>     start_bit <= not input and input_d;
>   end if;
> end process;
Since the 'not input' is not synchronized with the internal clock, the
start_bit ff may not have its hold time satisfied and hence miss the
start bit.
> -Lasse >
(*) a packet is a continuous stream of characters (**) (**) a character is what is between a start and a stop bit
Reply by lang...@fonz.dk March 7, 2012
On 5 Mar., 11:44, alb <alessandro.bas...@cern.ch> wrote:
> On 3/2/2012 8:38 PM, Tim Wescott wrote:
> > On Fri, 02 Mar 2012 14:03:07 +0100, alb wrote:
> >
> >> On 3/2/2012 12:52 PM, Stef wrote:
> >>> In comp.arch.embedded,
> >>> alb <alessandro.bas...@cern.ch> wrote:
> >>>> Hi everyone,
> >>>>
> >>>> in the system I am using there is an ADSP21020 connected to an FPGA
> >>>> which is receiving data from a serial port. The FPGA receives the
> >>>> serial bytes and sets an interrupt and a bit in a status register once
> >>>> the byte is ready in the output register (one 'start bit' and one
> >>>> 'stop bit'). The DSP can look at the registers simply reading from a
> >>>> mapped port and we can choose either polling the status register or
> >>>> using the interrupt.
> >>>>
> >>>> Unfortunately this is just on paper. The real world is much more
> >>>> different since the FPGA receiver is apparently 'losing' bits. When we
> >>>> send a "packet" (a sequence of bytes) what we can observe with the
> >>>> scope is that sometimes the interrupts are not equally spaced in time
> >>>> and there is one byte less w.r.t. what we send. So we suspect that the
> >>>> receiver has started on the wrong 'start bit', hence screwing up
> >>>> everything.
> >>>>
> >>>> The incidence of this error looks like dependent on the length of the
> >>>> packet we send, leading to think that due to some synchronization
> >>>> problem the uart looses the sync (maybe timing issues on the fpga).
> >>>>
> >>>> Given the fact that we cannot change the fpga, I came up with the idea
> >>>> to use some forward error correction (FEC) encoding to overcome this
> >>>> issue, but if my diagnosis is correct it looks like that the broken
> >>>> sequence of bytes is not only missing some bytes, it will certainly
> >>>> have the bit shifted (starting on wrong 'start bit') with some bits
> >>>> inserted ('start bit' and 'stop bit' will be part of the data) and I'm
> >>>> not sure if there exists some technique which may recover such a
> >>>> broken sequence.
> >>>>
> >>>> On top of it I don't have any feeling how much would cost (in terms of
> >>>> memory and cpu resources) any type of FEC decoding on the DSP.
> >>>>
> >>>> Any suggestions and/or ideas?
> >>>
> >>> Is this a continuous stream of bits, with no pauses between bytes?
> >>> Looks like the start bit detection does not re-adjust it's timing to
> >>> the actual edge of the next start bit. With small diffferences in
> >>> bitrate, this causes the receiver to fall out of sync as you found.
> >>
> >> in within a "packet" there should be no pause between bytes, I will
> >> check though. There might be a small difference in bitrate, maybe I
> >> would need to verify how much.
> >>
> >>> Obviously, the best solution is to fix the FPGA as it is 'broken'. Is
> >>> there no way to fix it or get it fixed?
> >>
> >> The FPGA is flying in space, together with the rest of the equipment.
> >> We cannot reprogram it, we can only replace the software in the DSP,
> >> with non-trivial effort.
> >>
> >>> Can you change the sender of the data? If so, you can set it to 2 stop
> >>> bits. This can allow the receiver to re-sync every byte. If possible, I
> >>> do try to set my transmitters to 2 stop bits and receivers to 1. This
> >>> can prevent trouble like this but costs a little bandwidth.
> >>
> >> We are currently investigating it, the transmitter is controlled by an
> >> 8051 and in principle we should have control over it. Your idea is to
> >> use the second stop bit to allow better synching and hopefully not lose
> >> the following start bit, correct?
> >>
> >>> Another option would be to tweak the bitrates. It seems your sender is
> >>> now a tiny bit on the fast side w.r.t. the receiver. Maybe you can
> >>> slow down the clock on your sender by 1 or 2 percent? Try to get an
> >>> accurate measurement of the bitrate on both sides before you do
> >>> anything.
> >>
> >> We can certainly measure the transmission rate. I am not sure we can
> >> tweak the bitrates to that level. The current software on the 8051
> >> supports several bitrates (19.2, 9.6, 4.8, 2.4 Kbaud) but I'm afraid
> >> those options are somehow hardcoded in the transmitter. Certainly it
> >> would be worth having a look.
> >
> > Go over the FPGA code with a fine-toothed comb -- whatever you're doing,
> > it won't help if the FPGA doesn't support it.
>
> Ok, a colleague of mine went through it and indeed the start-bit logic
> is faulty, since it is looking for a negative transition but without the
> signal being synchronized with the internal clock (don't ask me how that
> is possible!).
>
> Given this type of error the 0xFF byte will be lost completely, since
> there are no other start-bit to sync on within the byte, while in other
> cases it may resync with a '0' bit in within the byte.
>
I'm trying real hard to understand what it is you are saying.

If the uart cannot find the edge of the start bit and then sample 8
bits correctly without more edges to resync, the baudrates would have
to differ quite a bit, like several %

-Lasse
Reply by Charles Bryant March 7, 2012
In article <9rp8unF1i9U1@mid.individual.net>,
alb  <alessandro.basili@cern.ch> wrote:
}On 3/6/2012 2:51 AM, Charles Bryant wrote:
}[...]
.. running rx at much higher clock ...
}This approach is nice if and only if the transmitter does not introduce
}extra gaps in between bytes, which may make your job to take time into
}account a little bit more complex. And the problem is still not solved,
}since the receiver is missing the start bit (transition high-low).

Reading your solution, I think I may have been wrong in my assumption
about how the receiver worked. I assumed that when it missed the start
bit, if another zero bit arrived it would see that (i.e.
level-triggered), rather than needing a 1 and subsequent 0, so indeed
my solution would not work (nor would the suggestion of using two
stop bits).

}Timing the distance between interrupt may be tricky since at 19.2 Kbaud
}a 52us interval is needed for each bit, hence you would need your timer
}to run faster than that...

The ADSP-21020 TCOUNT register runs at the processor clock speed, so
at a typical speed of 20MHz it increments every 50ns. And it can be
read in a single cycle.

} Considering that the fastest interrupt
}service routine introduce ~3.5us of overhead it looks like you won't
}have much more time to spend for the rest of the application.

Unless you're running the processor at a very slow speed you should be
able to make an ISR take a lot less time than that.

}The byte encoding for the character looks like the following:
}                                             ____
}____| st | rs | b0 | b1 | b2 | b3 | sh | cb |
.. rest omitted ...

That looks very good. I'm sure that theoretically it would be possible
to design something with lower overhead, but if you can tolerate that
amount of overhead, it is simple enough to see that it obviously
works, while a more complex solution might have an obscure flaw.