Reply by Stef March 9, 2012
In comp.arch.embedded,
alb <alessandro.basili@cern.ch> wrote:
> On 3/9/2012 9:51 AM, Stef wrote:
>
>> In my opinion, based only on what I've read here (and I have not read
>> every post in detail), you should go through the entire FPGA code and
>> get it cleanly through the tools before you send it somewhere where
>> it can never be changed again.
>
> We are going through the FPGAs every day, trying to understand the logic
> behind and the potential flaws. Sometime it is so poorly segmented that
> is very hard to understand the original intention.
>
> Unfortunately everything is up in space now and we have to live with it.
That's what I initially understood from your posts. But later I somehow
understood that the gear was still on the ground and was going into
space. That made me wonder why you were working around the problem
instead of solving the actual cause. Understood now (again): FPGA is
fixed, no changes possible, ever.
> Given the level of the design I expect a lot of other big flaws
> somewhere else. We can only hope all these flaws can be circumvented by
> software, but it's a real pain.
Given the situation as it is, it is all you can do. Good luck!

--
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail)

All progress is based upon a universal innate desire of every organism
to live beyond its income.
		-- Samuel Butler, "Notebooks"
Reply by alb March 9, 2012
On 3/9/2012 9:51 AM, Stef wrote:
[...]
>>
>> The correct design should have resynchronized the input with the
>> internal clock (clk) and everything would have been fine, since from
>> that moment on everything is running at the pace of the internal clock.
>
> Something that should be done for _every_ input to the FPGA. Have you
> checked how other inputs are handled? If there is a problem like this on
> the UART data input, it is more likely that there is a problem with other
> inputs as well. Re-timing an input can in some cases also make it easier
> to meet timing requirements, maybe solving some of the timing errors you
> found when you tried to re-compile the FPGA?
>
Actually we found tons of warnings for 'inferred latch' and the like
when we tried to re-synthesize the VHDL. I was staring with squared eyes
when I looked at the logs. Unfortunately those guys did not have any
idea of what they were doing, and problems like the one in this thread
are nearly everywhere in the FPGAs (luckily there are only three in the
system!).
> I am not an expert on this, so if you want to go deeper into this,
> comp.arch.fpga may be a more suitable place.
>
thanks for the pointer, I am a very frequent reader of c.a.f.
> In my opinion, based only on what I've read here (and I have not read
> every post in detail), you should go through the entire FPGA code and
> get it cleanly through the tools before you send it somewhere where
> it can never be changed again.
We are going through the FPGAs every day, trying to understand the logic
behind them and the potential flaws. Sometimes it is so poorly segmented
that it is very hard to understand the original intention.

Unfortunately everything is up in space now and we have to live with it.
Given the level of the design I expect a lot of other big flaws
somewhere else. We can only hope all these flaws can be circumvented by
software, but it's a real pain.
Reply by Stef March 9, 2012
In comp.arch.embedded,
alb <alessandro.basili@cern.ch> wrote:
> transition. In the event where input is happening *just before* the clk
> rising edge, input_d can toggle, but cond will not be long enough to
> ensure correct setup time for the start bit to start:
>
>             __    __    __    __    __    __    __    __    __    __
>  clk     __|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|
>          ________                                 ______________________
>  input           |_______________________________|
>           _________                                _____________________
>  input_d           |_______________________________|
>                   _
>  cond    ________| |____________________________________________________
>
>  start   _______________________________________________________________
>
> The hold time for 'start' is guaranteed by the place&route, since the
> tool will guarantee that delay between the 'input_d' and 'start' ffs are
> within a clock cycle. But the tool cannot guarantee that input is
> removed before the necessary setup time.
>
> The correct design should have resynchronized the input with the
> internal clock (clk) and everything would have been fine, since from
> that moment on everything is running at the pace of the internal clock.
Something that should be done for _every_ input to the FPGA. Have you
checked how other inputs are handled? If there is a problem like this on
the UART data input, it is more likely that there is a problem with other
inputs as well. Re-timing an input can in some cases also make it easier
to meet timing requirements, maybe solving some of the timing errors you
found when you tried to re-compile the FPGA?

I am not an expert on this, so if you want to go deeper into this,
comp.arch.fpga may be a more suitable place.

In my opinion, based only on what I've read here (and I have not read
every post in detail), you should go through the entire FPGA code and
get it cleanly through the tools before you send it somewhere where
it can never be changed again.

--
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail)

A plethora of individuals with expertise in culinary techniques
contaminate the potable concoction produced by steeping certain edible
nutriments.
Reply by alb March 9, 2012
On 3/8/2012 8:24 PM, langwadt@fonz.dk wrote:
[...]
>> It depends on the size of the packet (*) that you send. For a 100 bytes
>> packet would be like ~0.5%, while with a 2KB packet is ~50%.
>>
>> The receiver does not resynchronize the input signal with its internal
>> clock and the condition to have a start bit is set when the negated
>> signal and the clocked signal are both 1.
>>
>> here is a simplified snippet in vhdl:
>>
>>> process (clk)
>>> begin
>>>   if rising_edge (clk) then
>>>     input_d <= input;
>>>   end if;
>>> end process;
>>>
>>> process (clk)
>>> begin
>>>   if rising_edge (clk) then
>>>     start_bit <= not input and input_d;
>>>   end if;
>>> end process;
>
> that snippet seems to imply that the UART is running at a clk that is
> equal to the baudrate, not some multiple of it (8 or 16 in most uarts)
The clk signal is 16 times the baudrate - my apologies if I did not
explicitly state that. The clk in the snippet has a period of ~3.5 us,
while the bit length is ~52 us.
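As a quick arithmetic aside (a sketch in plain Python, using only the numbers quoted in this thread): at 19.2 kbaud a bit lasts about 52.08 us, and a 16x oversampling clock then has a period of about 3.26 us, so the "~3.5 us" above is evidently a rounded figure.

```python
# Sanity check of the timing figures discussed in this thread:
# a 19.2 kbaud UART sampled with a 16x oversampling clock.
baud = 19200
oversample = 16

bit_time_us = 1e6 / baud                  # one UART bit, in microseconds
clk_period_us = bit_time_us / oversample  # oversampling clock period

print(f"bit length: {bit_time_us:.2f} us")    # ~52.08 us
print(f"clk period: {clk_period_us:.2f} us")  # ~3.26 us
```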
> If so I am surprised it works at all for any input
>
>> Since the 'not input' is not synchronized with the internal clock, the
>> start_bit ff may not have the hold time satisfied hence the miss of the
>> start bit.
I think I was wrong here, the problem is on setup rather than hold time.
> apart from the issues of metastability and such when sampling an async
> signal it shouldn't be a problem as long as the clock is some multiple
> of the baudrate, if you don't see the edge this clk cycle you will see
> it the next
>
The problem is all about timing violations. If you don't see the edge on
the correct clk cycle you will never see it, unless you get yet another
edge. I'll try to argue it in more detail to avoid - hopefully - further
confusion:

            __    __    __    __    __    __    __    __    __    __
 clk     __|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|
         ________                                 ______________________
 input           |_______________________________|
         ______________                            _____________________
 input_d               |__________________________|
                  _____
 cond    ________|     |________________________________________________
                        _____
 start   ______________|     |__________________________________________

where cond = not input and input_d. Since 'input' is not clocked with
'clk', you can see how the time that 'cond' stays high depends on the
phase between 'clk' and the input transition. In the event where the
input transition happens *just before* the clk rising edge, input_d can
still toggle, but cond will not stay high long enough to ensure correct
setup time for the start bit:

            __    __    __    __    __    __    __    __    __    __
 clk     __|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|  |__|
         ________                                 ______________________
 input           |_______________________________|
          _________                                _____________________
 input_d           |_______________________________|
                  _
 cond    ________| |____________________________________________________

 start   _______________________________________________________________

The hold time for 'start' is guaranteed by the place&route, since the
tool will guarantee that the delay between the 'input_d' and 'start' ffs
is within a clock cycle. But the tool cannot guarantee that input is
removed before the necessary setup time.

The correct design should have resynchronized the input with the
internal clock (clk); then everything would have been fine, since from
that moment on everything runs at the pace of the internal clock.

Hope that helps.
> -Lasse
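The failure mode described in this post can be sketched in a few lines of Python. This is an illustrative toy model, not a timing simulation: the two registers from the VHDL snippet (input_d and the start flip-flop) are updated together on each clock edge, and a setup violation on the start flip-flop is modeled pessimistically as the register keeping its old value. The function name and the setup_met flag are inventions for this sketch.

```python
# Toy model of the flawed start-bit detector discussed in this thread.
# 'cond' is the combinational term "not input and input_d"; a setup
# violation on the 'start' flip-flop is modeled pessimistically as the
# register keeping its previous value.

def detect_start(setup_met: bool) -> bool:
    """Return True if the detector registers the falling edge of 'input'."""
    input_sig = 0   # async input has just fallen (start bit begins)
    input_d = 1     # previously sampled value of input
    start = 0

    # clk edge n: both registers sample simultaneously.
    cond = (input_sig == 0) and (input_d == 1)   # True: edge visible now
    if setup_met:
        start = 1 if cond else 0                 # cond captured cleanly
    # else: edge fell inside the setup window -> 'start' keeps its old 0
    input_d = input_sig                          # input_d catches up (now 0)
    if start:
        return True

    # clk edge n+1 and later: input stays low for the whole start bit,
    # but input_d is already 0, so cond can never go high again.
    cond = (input_sig == 0) and (input_d == 1)   # False from here on
    return bool(cond)

print(detect_start(True))    # True  - edge arrived well before the clock
print(detect_start(False))   # False - edge missed, and missed for good
```

The point the model makes is the same as the waveforms: if the one-cycle-wide cond pulse is not captured cleanly, input_d has already caught up with input and that edge is gone for good.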
Reply by lang...@fonz.dk March 8, 2012
On 8 Mar., 16:39, alb <alessandro.bas...@cern.ch> wrote:
> On 3/8/2012 4:40 AM, langw...@fonz.dk wrote:
>
> >> Ok, a colleague of mine went through it and indeed the start-bit logic
> >> is faulty, since it is looking for a negative transition but without the
> >> signal being synchronized with the internal clock (don't ask me how that
> >> is possible!).
>
> >> Given this type of error the 0xFF byte will be lost completely, since
> >> there are no other start-bit to sync on within the byte, while in other
> >> cases it may resync with a '0' bit in within the byte.
>
> > I trying real hard to understand what it is you are saying
>
> > if the uart cannot find the edge of the start bit and then sample 8
> > bits correctly with out more edges to resync the baudrates would have
> > differ quite bit, like several %
>
> It depends on the size of the packet (*) that you send. For a 100 bytes
> packet would be like ~0.5%, while with a 2KB packet is ~50%.
>
> The receiver does not resynchronize the input signal with its internal
> clock and the condition to have a start bit is set when the negated
> signal and the clocked signal are both 1.
>
> here is a simplified snippet in vhdl:
>
> > process (clk)
> > begin
> >   if rising_edge (clk) then
> >     input_d <= input;
> >   end if;
> > end process;
>
> > process (clk)
> > begin
> >   if rising_edge (clk) then
> >     start_bit <= not input and input_d;
> >   end if;
> > end process;
>
that snippet seems to imply that the UART is running at a clk that is
equal to the baudrate, not some multiple of it (8 or 16 in most uarts)

If so I am surprised it works at all for any input
> Since the 'not input' is not synchronized with the internal clock, the
> start_bit ff may not have the hold time satisfied hence the miss of the
> start bit.
apart from the issues of metastability and such when sampling an async
signal it shouldn't be a problem as long as the clock is some multiple
of the baudrate, if you don't see the edge this clk cycle you will see
it the next

-Lasse
Reply by alb March 8, 2012
On 3/8/2012 2:43 AM, Charles Bryant wrote:
> In article <9rp8unF1i9U1@mid.individual.net>,
> alb <alessandro.basili@cern.ch> wrote:
> }On 3/6/2012 2:51 AM, Charles Bryant wrote:
> }[...]
> .. running rx at much higher clock ...
> }This approach is nice if and only if the transmitter does not introduce
> }extra gaps in between bytes, which may make your job to take time into
> }account a little bit more complex. And the problem is still not solved,
> }since the receiver is missing the start bit (transition high-low).
>
> Reading your solution, I think I may have been wrong in my assumption
> about how the receiver worked. I assumed that when it missed the start
> bit, if another zero bit arrived it would see that (i.e.
> level-triggered), rather than needing a 1 and subsequent 0, so indeed
> my solution would not work (nor would the suggestion of using two
> stop bits).
>
> }Timing the distance between interrupt may be tricky since at 19.2 Kbaud
> }a 52us interval is needed for each bit, hence you would need your timer
> }to run faster than that...
>
> The ADSP-21020 TCOUNT register runs at the processor clock speed, so
> at a typical speed of 20MHz it increments every 50ns. And it can be
> read in a single cycle.
>
My apologies, I thought you wanted to service a timer interrupt to do
something, instead of simply reading the clock.

[...]
> }The byte encoding for the character looks like the following:
> }                                             ____
> }____| st | rs | b0 | b1 | b2 | b3 | sh | cb |
> .. rest omitted ...
>
> That looks very good. I'm sure that theoretically it would be possible
> to design something with lower overhead, but if you can tolerate that
> amount of overhead, it is simple enough to see that it obviously
> works, while a more complex solution might have an obscure flaw.
The overhead could be reduced simply by dropping the possibility of
having control characters (which is also possible), hence being able to
transmit 5 bits per character instead of 4. This was also taken into
consideration, but given the benefits of having control characters it
was decided that we don't care about the overhead.
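For a rough feel of the bandwidth cost, here is a small back-of-the-envelope calculation (a sketch; it assumes each encoded character is sent as an ordinary 10-bit frame of start + 8 data bits + stop, which the thread does not state explicitly):

```python
# Overhead comparison for the encoding scheme discussed above, assuming
# each encoded character goes out as a standard 10-bit UART frame
# (start + 8 data bits + stop) carrying 4 or 5 payload bits.
WIRE_BITS_PER_CHAR = 10

def wire_bits(payload_bytes: int, payload_bits_per_char: int) -> int:
    """Bits on the wire needed to carry payload_bytes of data."""
    total_bits = payload_bytes * 8
    chars = -(-total_bits // payload_bits_per_char)  # ceiling division
    return chars * WIRE_BITS_PER_CHAR

plain = wire_bits(100, 8)  # plain 8N1 baseline: 8 payload bits per char
for bpc in (4, 5):
    bits = wire_bits(100, bpc)
    print(f"{bpc} payload bits/char: {bits} wire bits "
          f"({bits / plain:.1f}x plain 8N1)")
```

With 4 payload bits per character a 100-byte payload costs 2000 wire bits, twice what plain 8N1 would use; moving to 5 bits per character brings that down to 1600, which is the trade-off weighed in this post.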
Reply by alb March 8, 2012
On 3/7/2012 7:47 PM, Tim Wescott wrote:
[...]
>
> I know it sounds like an oxymoron, but that's a really elegant kludge.
> That you had to do it at all makes it a kludge -- but it looks like you
> did a good job with it within the confines of what you had to work with.
I agree it is a shame we had to introduce these additional layers, but
if you look at the command level everything looks the same, and all the
kludge is confined to the levels where the dirty work is being done.
After all, there's always somebody doing the dirty work; in this case we
may have found a solution that leaves the dirt down at the bottom.

So far we have not found pitfalls, but we will post here if that turns
out to be the case.

Al
Reply by alb March 8, 2012
On 3/8/2012 4:40 AM, langwadt@fonz.dk wrote:
>> Ok, a colleague of mine went through it and indeed the start-bit logic
>> is faulty, since it is looking for a negative transition but without the
>> signal being synchronized with the internal clock (don't ask me how that
>> is possible!).
>>
>> Given this type of error the 0xFF byte will be lost completely, since
>> there are no other start-bit to sync on within the byte, while in other
>> cases it may resync with a '0' bit in within the byte.
>>
>
> I trying real hard to understand what it is you are saying
>
> if the uart cannot find the edge of the start bit and then sample 8
> bits correctly with out more edges to resync the baudrates would have
> differ quite bit, like several %
>
It depends on the size of the packet (*) that you send. For a 100 byte
packet it would be like ~0.5%, while with a 2KB packet it is ~50%.

The receiver does not resynchronize the input signal with its internal
clock, and the condition for a start bit is set when the negated signal
and the clocked signal are both 1.

here is a simplified snippet in vhdl:
> process (clk)
> begin
>   if rising_edge (clk) then
>     input_d <= input;
>   end if;
> end process;
>
> process (clk)
> begin
>   if rising_edge (clk) then
>     start_bit <= not input and input_d;
>   end if;
> end process;
Since the 'not input' is not synchronized with the internal clock, the
start_bit ff may not have its hold time satisfied and hence miss the
start bit.
> -Lasse >
(*) a packet is a continuous stream of characters (**) (**) a character is what is between a start and a stop bit
Reply by lang...@fonz.dk March 7, 2012
On 5 Mar., 11:44, alb <alessandro.bas...@cern.ch> wrote:
> On 3/2/2012 8:38 PM, Tim Wescott wrote:
> > On Fri, 02 Mar 2012 14:03:07 +0100, alb wrote:
> >
> >> On 3/2/2012 12:52 PM, Stef wrote:
> >>> In comp.arch.embedded,
> >>> alb <alessandro.bas...@cern.ch> wrote:
> >>>> Hi everyone,
> >>>>
> >>>> in the system I am using there is an ADSP21020 connected to an FPGA
> >>>> which is receiving data from a serial port. The FPGA receives the
> >>>> serial bytes and sets an interrupt and a bit in a status register once
> >>>> the byte is ready in the output register (one 'start bit' and one
> >>>> 'stop bit'). The DSP can look at the registers simply reading from a
> >>>> mapped port and we can choose either polling the status register or
> >>>> using the interrupt.
> >>>>
> >>>> Unfortunately this is just on paper. The real world is much more
> >>>> different since the FPGA receiver is apparently 'losing' bits. When we
> >>>> send a "packet" (a sequence of bytes) what we can observe with the
> >>>> scope is that sometimes the interrupts are not equally spaced in time
> >>>> and there is one byte less w.r.t. what we send. So we suspect that the
> >>>> receiver has started on the wrong 'start bit', hence screwing up
> >>>> everything.
> >>>>
> >>>> The incidence of this error looks like dependent on the length of the
> >>>> packet we send, leading to think that due to some synchronization
> >>>> problem the uart looses the sync (maybe timing issues on the fpga).
> >>>>
> >>>> Given the fact that we cannot change the fpga, I came up with the idea
> >>>> to use some forward error correction (FEC) encoding to overcome this
> >>>> issue, but if my diagnosis is correct it looks like that the broken
> >>>> sequence of bytes is not only missing some bytes, it will certainly
> >>>> have the bit shifted (starting on wrong 'start bit') with some bits
> >>>> inserted ('start bit' and 'stop bit' will be part of the data) and I'm
> >>>> not sure if there exists some technique which may recover such a
> >>>> broken sequence.
> >>>>
> >>>> On top of it I don't have any feeling how much would cost (in terms of
> >>>> memory and cpu resources) any type of FEC decoding on the DSP.
> >>>>
> >>>> Any suggestions and/or ideas?
> >>>
> >>> Is this a continuous stream of bits, with no pauses between bytes?
> >>> Looks like the start bit detection does not re-adjust it's timing to
> >>> the actual edge of the next start bit. With small diffferences in
> >>> bitrate, this causes the receiver to fall out of sync as you found.
> >>
> >> in within a "packet" there should be no pause between bytes, I will
> >> check though. There might be a small difference in bitrate, maybe I
> >> would need to verify how much.
> >>
> >>> Obviously, the best solution is to fix the FPGA as it is 'broken'. Is
> >>> there no way to fix it or get it fixed?
> >>
> >> The FPGA is flying in space, together with the rest of the equipment.
> >> We cannot reprogram it, we can only replace the software in the DSP,
> >> with non-trivial effort.
> >>
> >>> Can you change the sender of the data? If so, you can set it to 2 stop
> >>> bits. This can allow the receiver to re-sync every byte. If possible, I
> >>> do try to set my transmitters to 2 stop bits and receivers to 1. This
> >>> can prevent trouble like this but costs a little bandwidth.
> >>
> >> We are currently investigating it, the transmitter is controlled by an
> >> 8051 and in principle we should have control over it. Your idea is to
> >> use the second stop bit to allow better synching and hopefully not lose
> >> the following start bit, correct?
> >>
> >>> Another option would be to tweak the bitrates. It seems your sender is
> >>> now a tiny bit on the fast side w.r.t. the receiver. Maybe you can
> >>> slow down the clock on your sender by 1 or 2 percent? Try to get an
> >>> accurate measurement of the bitrate on both sides before you do
> >>> anything.
> >>
> >> We can certainly measure the transmission rate. I am not sure we can
> >> tweak the bitrates to that level. The current software on the 8051
> >> supports several bitrates (19.2, 9.6, 4.8, 2.4 Kbaud) but I'm afraid
> >> those options are somehow hardcoded in the transmitter. Certainly it
> >> would be worth having a look.
> >
> > Go over the FPGA code with a fine-toothed comb -- whatever you're doing,
> > it won't help if the FPGA doesn't support it.
>
> Ok, a colleague of mine went through it and indeed the start-bit logic
> is faulty, since it is looking for a negative transition but without the
> signal being synchronized with the internal clock (don't ask me how that
> is possible!).
>
> Given this type of error the 0xFF byte will be lost completely, since
> there are no other start-bit to sync on within the byte, while in other
> cases it may resync with a '0' bit in within the byte.
>
I'm trying real hard to understand what it is you are saying.

If the uart cannot find the edge of the start bit and then sample 8
bits correctly without more edges to resync, the baudrates would have
to differ quite a bit, like several %

-Lasse
Reply by Charles Bryant March 7, 2012
In article <9rp8unF1i9U1@mid.individual.net>,
alb  <alessandro.basili@cern.ch> wrote:
}On 3/6/2012 2:51 AM, Charles Bryant wrote:
}[...]
.. running rx at much higher clock ...
}This approach is nice if and only if the transmitter does not introduce
}extra gaps in between bytes, which may make your job to take time into
}account a little bit more complex. And the problem is still not solved,
}since the receiver is missing the start bit (transition high-low).

Reading your solution, I think I may have been wrong in my assumption
about how the receiver worked. I assumed that when it missed the start
bit, if another zero bit arrived it would see that (i.e.
level-triggered), rather than needing a 1 and subsequent 0, so indeed
my solution would not work (nor would the suggestion of using two
stop bits).

}Timing the distance between interrupt may be tricky since at 19.2 Kbaud
}a 52us interval is needed for each bit, hence you would need your timer
}to run faster than that...

The ADSP-21020 TCOUNT register runs at the processor clock speed, so
at a typical speed of 20MHz it increments every 50ns. And it can be
read in a single cycle.

} Considering that the fastest interrupt
}service routine introduce ~3.5us of overhead it looks like you won't
}have much more time to spend for the rest of the application.

Unless you're running the processor at a very slow speed you should be
able to make an ISR take a lot less time than that.

}The byte encoding for the character looks like the following:
}                                             ____
}____| st | rs | b0 | b1 | b2 | b3 | sh | cb |
.. rest omitted ...

That looks very good. I'm sure that theoretically it would be possible
to design something with lower overhead, but if you can tolerate that
amount of overhead, it is simple enough to see that it obviously
works, while a more complex solution might have an obscure flaw.