Serial Bus Speed on PCs

Reply by Andrew Smallshaw ● December 5, 2022

On 2022-11-30, Rick C <gnuarm.deletethisbit@gmail.com> wrote:
> I am using laptops to control test fixtures via a USB serial port.  I'm looking at combining many test fixtures in one chassis, controlled over one serial port.  The problem I'm concerned about is not the speed of the bus, which can range up to 10 Mbps.  It's the interface to the serial port.  
>
> The messages are all short, around 15 characters.  The master PC addresses a slave and the slave promptly replies.  It seems this message level hand shake creates a bottle neck in every interface I've looked at. 
>
> FTDI has a high-speed USB cable that is likely limited by the 8 kHz polling rate.  So the message and response pair would be limited to 4 kHz.  Spread over 256 end points, that's only 16 message pairs a second to each target.  That might be workable if there were no other delays. 

Use some multidrop standard at the physical layer, such as RS-485.
At the data link layer, adopt a token-ring-style arbitration system.
The first device interprets the request from the host as both receiving
the token and a request for data - for consistency with the other units
you'd probably want to format that initial request as a dummy "Device
0" response.  Device N interprets the reply from N-1 as passing it
the token along with its request to transmit.  From the host's
perspective you send a single request and get back a byte stream with
the results from all devices.
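
A minimal Python sketch of this arbitration, for illustration only. The frame format (`<sender>:<payload>` ending in a newline) and the name `run_round` are assumptions made up for the example, not part of any real protocol. Each device transmits only after it hears a frame from device N-1, and the host request is formatted as a dummy "device 0" frame.

```python
from typing import Dict, List


def run_round(device_data: Dict[int, bytes]) -> List[bytes]:
    """Simulate one polling round on a shared multidrop bus.

    device_data maps a device number (1..N) to the payload it will send.
    Every frame is b"<sender>:<payload>\n" (a hypothetical format).
    """
    # The host request is formatted as a dummy reply from "device 0",
    # so it doubles as the token for device 1.
    bus: List[bytes] = [b"0:REQ\n"]
    for number in sorted(device_data):
        last_sender = int(bus[-1].split(b":", 1)[0])
        # Device N transmits only after hearing the frame from N-1.
        if last_sender == number - 1:
            bus.append(b"%d:%s\n" % (number, device_data[number]))
    return bus[1:]  # the byte stream the host reads back, frame by frame


if __name__ == "__main__":
    print(run_round({1: b"A", 2: b"B", 3: b"C"}))
    # With device 2 absent, device 3 never sees its token in this naive form.
    print(run_round({1: b"A", 3: b"C"}))
```

In this naive form a missing device stalls everything behind it, so a real implementation would need some timeout or skip rule.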

-- 
Andrew Smallshaw
andrews@sdf.org

Reply by Rick C ● December 5, 2022

On Monday, December 5, 2022 at 4:58:39 AM UTC-5, Andrew Smallshaw wrote:
> On 2022-11-30, Rick C <gnuarm.del...@gmail.com> wrote: 
> > I am using laptops to control test fixtures via a USB serial port. I'm looking at combining many test fixtures in one chassis, controlled over one serial port. The problem I'm concerned about is not the speed of the bus, which can range up to 10 Mbps. It's the interface to the serial port. 
> > 
> > The messages are all short, around 15 characters. The master PC addresses a slave and the slave promptly replies. It seems this message level hand shake creates a bottle neck in every interface I've looked at. 
> > 
> > FTDI has a high-speed USB cable that is likely limited by the 8 kHz polling rate. So the message and response pair would be limited to 4 kHz. Spread over 256 end points, that's only 16 message pairs a second to each target. That might be workable if there were no other delays.
> Use some multidrop standard at the physical layer such as RS485. 
> At the DLL adopt a token ring style arbitration system. The first 
> device interprets the request from the host as both receiving the 
> token and a request for data - for consistency with the other units 
> you'd probably want to format that initial request as a dummy "Device 
> 0" response. Device N interprets the reply from N-1 as sending it 
> the token and its request to transmit. From the host perspective 
> you send a single request and get back a byte stream with the 
> results from all devices. 

That scheme requires every end point to know where it is in the grand scheme, but more importantly, to know what other end points are in the system.  It also requires the master to address every end point in sequence.  How would you address one end point only, or cope with some number of missing slots?  This would require each end point to keep track of what commands have been sent, as well as who has replied.

I've mulled this over for the last few days, including a priority scheme where handshake lines would be used to pass the priority more mechanically.  This priority "token" could be passed through the entire chain of 16 boards and 8 end points on each board, but it can also be done by using a priority chain only within the 8 slaves on each test fixture board.  This will provide a burst of serial port operation for about 500 us at a 3 Mbps rate.  So if USB has a polling interval of 1 ms, we would get half bandwidth, which would be pretty good.  I feel better about blocking 8 commands for a given test fixture than blocking all 128 commands.
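
A rough Python sketch of the per-board priority chain, as an illustration rather than a design. It assumes the priority signal enters end point 0 and ripples down the chain, that an end point with nothing pending passes priority on immediately, and that each reply is 15 characters of 10 bits. The name `reply_schedule` and its parameters are made up for the example.

```python
from typing import Iterable, List, Tuple

ENDPOINTS_PER_BOARD = 8  # from the post: 8 end points on each board


def reply_schedule(pending: Iterable[int],
                   reply_chars: int = 15,
                   baud: int = 3_000_000,
                   bits_per_char: int = 10) -> List[Tuple[int, float]]:
    """Return (end point, reply start time in us) in priority order."""
    waiting = set(pending)
    reply_us = reply_chars * bits_per_char * 1_000_000 / baud
    start_us = 0.0
    schedule = []
    for endpoint in range(ENDPOINTS_PER_BOARD):
        if endpoint in waiting:
            # This end point holds priority for the length of one reply.
            schedule.append((endpoint, start_us))
            start_us += reply_us
        # Otherwise priority passes straight through (assumed zero delay).
    return schedule


if __name__ == "__main__":
    print(reply_schedule(range(8)))  # all eight end points addressed
    print(reply_schedule({2, 5}))    # only two end points addressed
```

Under these assumptions missing slots cost nothing, and no end point needs to know which others are present.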

Someone had suggested padding the transmitted data to set the timing of the replies.  That would work as well, and order would no longer be significant at all.  But I'm not comfortable with sending garbage data to control timing.  It can make debugging more difficult.  Too bad there's no way to send a data byte without a start bit!  lol

-- 

Rick C.

+-- Get 1,000 miles of free Supercharging
+-- Tesla referral code - https://ts.la/richard11209

Reply by Rick C ● December 5, 2022

On Monday, December 5, 2022 at 2:57:46 AM UTC-5, David Brown wrote:
> On 04/12/2022 17:54, Rick C wrote: 
> > On Sunday, December 4, 2022 at 7:21:56 AM UTC-5, David Brown wrote: 
> >> On 03/12/2022 21:42, Rick C wrote: 
> >>> On Wednesday, November 30, 2022 at 12:14:18 PM UTC-5, David 
> >>> Brown wrote: 
> >> 
> >>>> A communication hierarchy is likely the best way to handle 
> >>>> this. 
> >>>> 
> >>>> Alternatively, the messages from the PC can be large and 
> >>>> broadcast, rather than divided up. You could even make an 
> >>>> EtherCAT-style serial protocol (using the hybrid RS-422 bus 
> >>>> you suggested earlier). The PC could send a single massive 
> >>>> serial telegram consisting of multiple small ones: 
> >>>> 
> >>>> <header><padding><tele1><padding><tele2><padding>...<pause> 
> >>>> 
> >>>> Each slave would reply after hearing its own telegram, fast 
> >>>> enough to be complete in good time before the next slave 
> >>>> starts. (Adjust padding as necessary to give this timing.) 
> >>>> 
> >>>> Then from the PC side, you have one big telegram out, and one 
> >>>> big telegram in - using 3 MBaud if you like. 
> >>> 
> >>> I've been giving this some thought and it might work, but it's 
> >>> not guaranteed. This will prevent the slaves from talking over 
> >>> one another. But I don't know if the replies will be seen as a 
> >>> unit for shipping over Ethernet or USB by the adapter. I've been 
> >>> told that the messages will see delays in the adapters, but no 
> >>> one has indicated how they block the data. In the case of the 
> >>> FTDI adapter, the issue is the polling rate. 
> >>> 
> >>> This is the format I'm currently thinking of:
> >>>   01 23 45 C\r\n    - 11 chars
> >>>   01 23 45 C 67\r\n - 14 chars
> >>> 
> >>> The transmitted message would add 15 char of padding for a total 
> >>> of 26 chars per end point. At 3 Mbps a message takes 87 us to 
> >>> transmit on the serial bus for 11,500 messages a second, or 90 
> >>> messages per second per end point. That certainly would do the 
> >>> job, if I've done the math right. Even assuming other factors cut 
> >>> this rate in half, and it's still around 45 messages per end 
> >>> point each second. 
> >>> 
> >> Just to be clear - the slaves should not send any kind of dummy 
> >> characters. When they have read their part of the incoming stream, 
> >> they turn on their driver, send their reply, then turn off the 
> >> driver. 
> >> 
> >> The master side might need dummy characters for padding if the 
> >> slave replies (including any handling delay - the slaves might be 
> >> fast, but they still take some time) can be longer than the master 
> >> side telegrams. 
> >> 
> >> Each subtelegram in the master's telegram chain must be 
> >> self-contained - a start character, an ending CRC or simple 
> >> checksum, and so on. Replies from slaves must also be 
> >> self-contained. 
> >> 
> >> It doesn't matter how the USB-to-serial or Ethernet-to-serial 
> >> adaptors break up the messages - applications read the data as 
> >> serial streams, not synchronous timed data. The only timing you 
> >> have is a pause between master telegrams, which can be many 
> >> milliseconds long, used to ensure that if something has gone wrong 
> >> or lost synchronisation, their receiving state machine is reset and 
> >> ready for the next round. 
> > 
> > It absolutely does matter how the messages get broken up. That's 
> > where the delays come in. If the slave replies are sent over the 
> > network/USB bus one at a time, it's not significantly better than the 
> > original approach. 
> >
> I mean it doesn't matter how the messages are broken up from the 
> application code's viewpoint, as long as you handle it correctly as a 
> stream and don't incorrectly assume you always read whole telegrams at a 
> time. 

Of course the application doesn't care.  No one is worried about the application.  The concern is the timing of the messages on the various buses.  A message broken up too much may be sent in multiple small pieces resulting in more delays. 
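
For completeness, the application-side stream handling mentioned in the quoted text might look like the Python sketch below. It assumes telegrams end in CR/LF, as in the format quoted earlier. The class name `TelegramFramer` is made up for the example. It only shows that chunk boundaries don't matter to the application, and does nothing about the bus latency that is the real concern here.

```python
from typing import List


class TelegramFramer:
    """Reassemble CR/LF-terminated telegrams from arbitrarily sized chunks."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add received bytes; return every telegram completed so far."""
        self._buffer.extend(chunk)
        telegrams = []
        while True:
            end = self._buffer.find(b"\r\n")
            if end < 0:
                break  # partial telegram stays buffered for the next chunk
            telegrams.append(bytes(self._buffer[:end]))
            del self._buffer[:end + 2]  # drop the telegram and its CR/LF
        return telegrams


if __name__ == "__main__":
    framer = TelegramFramer()
    # The adapter may split the stream anywhere, even mid-telegram.
    for chunk in (b"01 23 45 C 6", b"7\r\n01 23", b" 45 C 67\r\n"):
        print(framer.feed(chunk))
```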


> You can expect the converter to buffer up the incoming data and send it 
> in large lumps up the USB or Ethernet bus. That's how it can work at 
> high baud rates and throughputs. You lose the precise timing 
> information, however, and have extra latency and jitter - so be sure 
> to treat the incoming data as a stream and then that does not matter.

I don't "expect" anything of the adapter.  They have delays that are largely unexplained, at least in any detail.  That's why this is hard to deal with. 

Right now I'm looking at using a priority enable across the 8 end points within a test fixture board.  That will allow a 400 us message block at 3 Mbps, with 350 us of overlap between the commands and the replies, so 450 us total.  That would work well with either a 1 ms polling rate or a 0.5 ms polling rate, if available, and provide 50 us of breathing room for the adapter.  
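
The timing arithmetic above can be checked with a few lines of Python. This is only a sketch under stated assumptions: 15-character messages, 10 bits per character on the wire, commands and replies of equal length, and the first reply starting as soon as the first command has been received. The name `block_timing` is made up for the example.

```python
from typing import Tuple


def block_timing(endpoints: int = 8,
                 chars: int = 15,
                 baud: int = 3_000_000,
                 bits_per_char: int = 10) -> Tuple[float, float, float]:
    """Return (block_us, total_us, overlap_us) for one board's exchange."""
    message_us = chars * bits_per_char * 1_000_000 / baud
    block_us = endpoints * message_us    # all commands sent back to back
    total_us = message_us + block_us     # replies lag commands by one message
    overlap_us = block_us - message_us   # time both directions are busy
    return block_us, total_us, overlap_us


if __name__ == "__main__":
    block, total, overlap = block_timing()
    print(block, total, overlap)
    # Margin left inside a 0.5 ms polling interval:
    print(500.0 - total)
```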

This is a lot like making gears for a mechanical clock, with a calendar and an appointment reminder.  LOL 

-- 

Rick C.

Reply by ● December 5, 2022

Rick C <gnuarm.deletethisbit@gmail.com> wrote:
> On Sunday, December 4, 2022 at 10:33:28 PM UTC-5, anti...@math.uni.wroc.pl wrote:
> > Rick C <gnuarm.del...@gmail.com> wrote: 
> > > On Sunday, December 4, 2022 at 4:30:35 PM UTC-5, anti...@math.uni.wroc.pl wrote: 
> > > > Rick C <gnuarm.del...@gmail.com> wrote: 
> > > > > On Wednesday, November 30, 2022 at 9:08:25 PM UTC-4, anti...@math.uni.wroc.pl wrote: 
> > > > > > Rick C <gnuarm.del...@gmail.com> wrote: 
> 
> > > > > > With relatively cheap convertors 
> > > > > > on Linux to handle 10000 roundtrips for 15 bytes messages I need 
> > > > > > the following times: 
> > > > > > 
> > > > > > CH340 2Mb/s, waiting, 6.890s 
> > > > > 
> > > > > That's 11.3 per target, per second. (128 targets) 
> > > > > 
> > > > > > CH340 2Mb/s, overlapped 1.058s 
> > > > > 
> > > > > That's pretty close to 74 per target, per second. 
> > > > > 
> > > > > I used to use the CH340 devices, but we had intermittent lockups of the serial port when testing all day long. I switched to FTDI and that went away. I think you told me you have no such problems. Maybe it's the CH340 Windows serial drivers. 
> > > > Well, my use is rather light. Most is for debugging at, say, 9600 or
> > > > 115200. And when plugged in, the convertor mostly sits idle. I previously
> > > > wrote that the CH340 did not work at 921600. More testing showed that
> > > > it actually worked, but the speed was significantly different: I had to
> > > > set my MCU to 847000 to communicate. This could be a bug in the Linux driver
> > > > (there is a rather funky formula connecting speed to parameters
> > > > and it looks easy to get it wrong). Similarly, when the CH340 was set to 576800
> > > > I had to set the MCU to 541300. Even after matching speed, at nominal
> > > > 576800, 921600 and 1152000 the test time was much (more than 10 times)
> > > > higher than for other rates (I only tested 1 character messages at those
> > > > rates, did not want to wait for the full test). Also, 500000 was significantly
> > > > slower than 460800 (but "merely" 2 times slower for 1 character messages
> > > > and catching up with longer messages). Still, ATM the CH340 looks
> > > > reasonably good.
> > > 
> > > Yes, it's reasonably good for situations where it does not need to work reliably. I was surprised when the finger was pointed to the CH340 adapter. But someone (probably here) had warned me they are not dependable, and now I know. The cost of a name brand adapter is not so much that it's worth saving the difference, only to have to throw it out and go with FTDI anyway, when you have real work to do.
> > Well, I am telling you what I observed. People say various things on the
> > net. I was interested in whether the net knows something about my trouble with
> > the CP2104, so I googled for "CP2104 lockup". And I got a bunch of
> > complaints about FTDI devices, solved by using a CP2104. So, there
> > is a lot of noise and ATM I prefer to stay with what I see.
> 
> What sort of complaints about FTDI?  Did you contact them about it? 

Things like the computer locking up (IIUC fixed by a newer driver).  Or
"communication did not work" (no real info).  ATM I have enough
converters.  If I need more/better I will look at FTDI products
and possibly ask them questions.

> > > > Remark: I bought all my convertors from Chinese sellers. IIUC
> > > > the FTDI chip is faked a lot, but others are too. Still, I think they
> > > > show what is possible and illustrate some difficulties.
> > > 
> > > FTDI fakes no longer work with the FTDI drivers. Maybe they play a cat and mouse game, with each side one upping the other, but it's not worth the bother to try it out. FTDI sells cables. It's easier to just buy them from FTDI.
> > AFAIK the Linux driver does not discriminate against non-FTDI devices.
> > So the fact that a convertor works with the Linux driver tells you nothing
> > about its origin. And for the record, I bought mine several years
> > ago.
> 
> I'm not using Linux.  I don't have any FTDI fakes.  I have some Prolific fakes somewhere, if I could find them.  I never had one bricked, but I think it was Prolific that did that some years ago.  Or, I may have them confused with FTDI.  I remember the bricking driver was released with a Windows update and MS was pretty pissed off when the bricking hit the news. 
 
It was FTDI who bricked fakes; that was widely discussed.  I have not
heard of Prolific doing anything like that.

> > > > > > CP2104 2Mb/s, waiting, 2.514s 
> > > > > > CP2104 2Mb/s, overlapped 1.214s 
> > > > > 
> > > > > I don't know what the CP2104 is. 
> > > > It is a chip by Silicon Laboratories. Datasheet gives contact address 
> > > > in Austin, TX. 
> > > > > I'm not certain what "overlapped" means in this test. Did you just continue to send 15 byte messages with no delays 10,000 times? 
> > > > No. My slave simply returns back each received character. There is 
> > > > some software delay but it should be less than 2us. So even waiting 
> > > > test has some overlap at character level. To get more overlap above 
> > > > I cheated: my test program was sending 1 more character than it should. 
> > > > So sent message was 16 bytes, read was 15. After reading 15 another 
> > > > batch of 16 was sent and so on. In total there were 10000 more 
> > > > characters sent than received. My hope was that OS would read 
> > > > and buffer excess characters, but it seems that at least for 
> > > > CP2104 they cause trouble. My current guess is that OS is 
> > > > reading only when requested, but I did not investigate deeper... 
> > > > > Since you are in the mood for testing, what happens if you run overlapped, with 128 messages of 15 characters and wait for the replies before sending the next batch? Also, if you don't mind, can you try 20 character messages? 
> > > > OK, I tried a modified version of my test program. It first sends
> > > > k messages without reading anything, then goes to the main loop where
> > > > after sending each message it reads one. At the end there is a tail loop
> > > > which reads the last k messages without sending anything. So, there
> > > > are k + 1 messages in transit: after sending message k + i the program
> > > > waits for the answer to message i. In total there are 10000 messages.
> > > > Results are: 
> > > > 
> > > > CH340,    15 char message   20 char message
> > > > k = 0     6.869s            7.163s
> > > > k = 1     4.682s            1.320s
> > > > k = 2     0.992s            1.320s
> > > > k = 3     0.991s            1.319s
> > > > k = 4     0.991s            1.320s
> > > > k = 5     0.990s            1.319s
> > > > k = 8     0.992s            1.320s
> > > > k = 12    0.990s            1.320s
> > > > k = 20    0.992s            1.319s
> > > > k = 36    0.991s            1.321s
> > > > k = 128   0.991s            1.319s
> > > > 
> > > > CP2104,   15 char message   20 char message
> > > > k = 0     2.508s            3.756s
> > > > k = 1     1.897s            1.993s
> > > > k = 2     1.668s            2.087s
> > > > k = 3     1.486s            1.887s
> > > > k = 4     1.457s            1.917s
> > > > k = 5     1.559s            1.877s
> > > > k = 8     1.455s            1.803s
> > > > k = 12    1.337s            1.501s
> > > > k = 20    1.123s            1.499s
> > > > k = 36    1.125s            1.502s
> > > > 
> > > > k = 128 reliably stalled, there were random stalls in other cases 
> > > > 
> > > > FTDI232R,
> > > > 2 Mbit/s  15 char message   20 char message
> > > > k = 0     5.478s            3.755s
> > > > k = 1     4.929s            3.030s
> > > > k = 2     2.506s            3.339s
> > > > k = 3     2.459s            2.020s
> > > > k = 4     1.708s            1.061s
> > > > k = 5     1.671s            1.032s
> > > > k = 8     0.764s            1.021s
> > > > k = 12    0.772s            1.014s
> > > > k = 20    0.763s            1.009s
> > > > k = 36    0.758s            1.007s
> > > > k = 128   0.757s            1.008s
> > > > 
> > > > FTDI232R,
> > > > 3 Mbit/s  15 char message   20 char message
> > > > k = 0     8.216s            10.007s
> > > > k = 1     5.006s            4.344s
> > > > k = 2     3.338s            1.602s
> > > > k = 3     2.406s            1.444s
> > > > k = 4     1.766s            1.316s
> > > > k = 5     1.599s            1.673s
> > > > k = 8     1.040s            1.327s
> > > > k = 12    1.071s            1.312s
> > > > 
> > > > With k = 20, k = 36 and k = 128 communication stalled. 
> > > 
> > > Some of the results seem odd, hard to understand, like why the message rate improves so much as k is increased, but so dramatically at 3 Mbps. They all seem to approach ~1.3 second as k increases. At k=0 they are around 1 ms per message, which is the polling rate... if you adjust it. I think the default for FTDI was 8 ms.
> > Let me first comment 2Mbit/s results. FTDI transfers data in 64-byte 
> > blocks (they say that actual payload is 62-bytes and there are 2-bytes 
> > of protocol info). With 15 character messages 0.764s really means
> > 98% use of the serial bandwidth, so essentially as good as possible.
> 
> Yeah, I'm not following that at all.  At k=8, the 2 Mbps FTDI transferred in 1.021s.  What is 0.764s???  

It seems that your news agent messed up the formatting of the tables.  I gave
the results in two columns, one column for 15 character messages, the second
for 20 character messages.  0.764s is for 15 character messages,
1.021s is for 20 character messages.
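
The 98% figure can be checked with a short Python calculation. This is a sketch assuming 10 bits on the wire per character (start bit, 8 data bits, stop bit) and 10000 messages per run, as in the tests. The name `serial_utilisation` is made up for the example.

```python
def serial_utilisation(elapsed_s: float,
                       messages: int = 10_000,
                       chars: int = 15,
                       baud: int = 2_000_000,
                       bits_per_char: int = 10) -> float:
    """Fraction of the raw serial bandwidth used in one direction."""
    # Time the messages would take back to back at the full line rate.
    ideal_s = messages * chars * bits_per_char / baud
    return ideal_s / elapsed_s


if __name__ == "__main__":
    # FTDI232R at 2 Mbit/s, k = 8, both table columns.
    print(round(serial_utilisation(0.764, chars=15), 3))
    print(round(serial_utilisation(1.021, chars=20), 3))
```

Both columns come out at about 98%, which is why more data in transit cannot help further.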

> > The corresponding k = 8 really means 9 messages in transit, so 135
> > characters, which is slightly more than 2 buffers. More data in
> > transit does not help, but also does not make things worse.
> > With 20 character messages the main improvement is at k = 4, which
> > means 100 characters, which is smaller than 2 buffers, with extra
> > improvements for more data in transit. With the CH340 and 15 char
> > messages we see the main improvement for k = 2, which corresponds
> > to 45 characters in transit. With 20 char messages we get
> > the improvement for k = 1, which is 40 characters in transit.
> > The CH340 uses 32 character transfer buffers, so the improvement corresponds
> > to somewhat more than 1 buffer in transit. Now, if transfers
> > between converter and PC were at optimal times, then one buffer
> > + one character would be enough to get full serial speed. But
> > USB transfers cannot be started at arbitrary times; IIUC there
> > are discrete time slots when a transfer can occur. When a transfer
> > cannot be done in a given slot it must wait for the next slot.
> > So, depending on the locations of possible slots, more buffering
> > and more data in transit may be needed for optimal performance.
> > OTOH 2-3 buffers should be enough to allow the PC to get full
> > bandwidth and this is in good agreement with the FTDI results.
> > In the case of the CH340 there is an extra factor: the CH340 also uses 8 byte
> > transfers. I do not know what function they have, but a
> > reasonably likely guess is that those 8 bytes pack the transfer control
> > info that FTDI bundles with normal data. Anyway, those
> > are "interrupt" transfers in the USB sense, so they have higher priority
> > than data transfers. A reasonable guess is that they steal some
> > USB bandwidth from data transfers. Also, a smaller than maximal
> > data block size limits efficiency, so it is possible that the
> > CH340 is limited by USB bandwidth (lack of enough slots).
> > 
> > Now, concerning 3 Mbits/s: due to the different serial speed, the
> > optimal times for transfers are different than in the 2 Mbits/s
> > case. It is possible that there is a worse fit between desired
> > and possible transfer times. Buffering allows this to be at least
> > partially cured, hence the initial improvement. But clearly,
> > there is some extra bottleneck. Now some speculation:
> > with a 1/8 ms USB 2.0 cycle, there are 1500 FS clocks per
> > cycle. I would have to look at the spec to be sure, but this
> > is close to a 150 byte worst case FS transfer. Besides data
> > there is some USB protocol overhead and (speculatively) it
> > is possible that the low level USB driver may refuse to schedule
> > two 64-byte transfers in a single cycle. In such a case the effective
> > bandwidth for serial data would be 4096000 bits/s, which
> > corresponds to 5120000 serial bits/s (serial sends start and stop
> > bits which are not needed for USB). This is less than
> > full duplex 3 Mbits/s (both directions add up to 6 Mbits/s and
> > must go through the same USB). With a larger amount of data in
> > transit this could give wild oscillations in the amount of
> > buffered data, leading to slowdown when buffers get empty
> > and giving a stall when the receive buffer overflows.
> > 
> > Of course there is another speculation: the convertor may be fake.
> > Supposedly fakes use MCUs with a special program. Software
> > could create delays which limit the transfer rate at 3 Mbits/s
> > and lead to data loss/stall with more data in transit.

In the last part I was partially wrong.  The USB 2.0 spec says that
transmission between the PC and a high speed hub is always high speed.
For full speed devices the hub is supposed to buffer messages and
transmit them to the device at its own speed.  In effect the PC needs two
high speed messages per full speed message.  My tests above were run with
the converter connected via a high speed hub.  There was also an ST-Link
dongle plugged into the same hub.  To remove the effect of the hub I tried
plugging the converter directly into a USB 1.1 port on a separate USB
controller.  That led to significantly longer times.  I also tried
moving the ST-Link to a separate port so that the converter was the only
thing connected to the hub.  I ran a few cases several times at
3 Mbits/s; for short messages and low k the results vary significantly.
For 120 characters and k = 0 I got times from 6.375s to 6.598s.
At 2 Mbits/s in 25 runs I got one outlier at 6.667s; the rest
were between 6.029s and 6.049s.

Anyway, USB seems to have a significant impact on achievable speed:
with a full speed convertor and full duplex transmission, 2 Mbits/s
seems to give better throughput than 3 Mbits/s.  Maybe a better USB
hub could help (I do not know how to find out the size of the buffers
in my hub, but by the spec a hub may have buffers for just 2
bulk transfers or for much more).  Given the above I would expect
a convertor connected via high speed USB to perform better at 3 Mbits/s.
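
For reference, the back-of-envelope numbers from the quoted speculation (one 64-byte transfer per 1/8 ms cycle) can be reproduced in Python as below. This only repeats the arithmetic. It is not a model of real USB scheduling, and the correction above about hub buffering still applies. The function and constant names are made up for the example.

```python
MICROFRAMES_PER_SECOND = 8_000  # 1/8 ms USB 2.0 cycle, as in the quoted text
BLOCK_BYTES = 64                # speculated limit: one transfer per cycle


def usb_limited_serial_bps(bits_per_char: int = 10) -> int:
    """Serial bit rate (both directions summed) that this USB budget carries."""
    usb_payload_bps = BLOCK_BYTES * 8 * MICROFRAMES_PER_SECOND
    # Each 8-bit byte on USB corresponds to 10 bits on the serial line
    # (start and stop bits are not carried over USB).
    return usb_payload_bps * bits_per_char // 8


def full_duplex_demand_bps(baud: int) -> int:
    """Both directions share the same USB link, so their rates add."""
    return 2 * baud


if __name__ == "__main__":
    limit = usb_limited_serial_bps()
    for baud in (2_000_000, 3_000_000):
        print(baud, full_duplex_demand_bps(baud) <= limit)
```

Under this assumption full duplex 2 Mbits/s fits in the budget and 3 Mbits/s does not, which at least matches the direction of the measurements.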

> Reading your tests has made me realize, that while combining the messages for every target into one batch can be a bit unwieldy, I could limit the combinations to the end points on a single card.  The responses have to be combined for the one driver anyway.  Between the 8 end points on a single board I could easily combine those commands, and then stagger the replies without any extra signals between the boards and no special characters in the command stream.  
> 
> Again, thinking out loud, at 3 Mbps, 8 * 150 bits per command is 1,200 bits or 400 us.  That would greatly reduce the wasted time, even with a 1 ms polling period.  It would allow an exchange of 8 commands and 8 replies every 2 ms, or 4,000 per second.  That would be almost 32 per end point, which would great!  Actually, it could be faster than this, since the staggering of the replies, doesn't require the first reply to wait for the last command.  So replies will start at the end of the first command.  The beauty of full-duplex! 
> 
> Any chance you could run your test on the FTDI cable at 3 Mbps with a 1,200 bit block of data (120 characters)?  I imagine the RS-232 waveform is getting a bit triangular a that speed.

See above.  Note that my test slave started its reply after receiving the
first character.  ATM it seems that with enough overlap at 2 Mbits/s I am
getting repeatably almost optimal speed (even with the 1.1 port).  But with
less overlap there are random-looking variations, which probably means
high sensitivity to the precise timing of messages.  And at 3 Mbits/s
the variation seems to be much worse.
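
The structure of the pipelined test described in the quoted text (k messages sent ahead, then send-one/read-one, then a tail loop) can be sketched in Python as below. This is not the actual test program. `pipelined_exchange` and `EchoLoopback` are names made up for the example, and the loopback class is an in-memory stand-in for the echoing slave so the sketch runs without hardware.

```python
from collections import deque
from typing import Callable, List


def pipelined_exchange(send: Callable[[bytes], object],
                       receive: Callable[[int], bytes],
                       messages: List[bytes], k: int) -> List[bytes]:
    """Keep k + 1 messages in transit while exchanging all messages."""
    k = min(k, len(messages))
    replies = []
    # Prologue: send the first k messages without reading anything.
    for msg in messages[:k]:
        send(msg)
    # Main loop: after sending message k + i, wait for the answer to message i.
    for i, msg in enumerate(messages[k:]):
        send(msg)
        replies.append(receive(len(messages[i])))
    # Tail loop: read the last k answers without sending anything.
    for msg in messages[len(messages) - k:]:
        replies.append(receive(len(msg)))
    return replies


class EchoLoopback:
    """Stand-in for the echoing slave: every byte sent comes straight back."""

    def __init__(self) -> None:
        self._fifo = deque()

    def send(self, data: bytes) -> None:
        self._fifo.extend(data)

    def receive(self, count: int) -> bytes:
        return bytes(self._fifo.popleft() for _ in range(count))


if __name__ == "__main__":
    link = EchoLoopback()
    msgs = [b"%015d" % n for n in range(10)]  # ten 15-character messages
    print(pipelined_exchange(link.send, link.receive, msgs, k=3) == msgs)
```

With a real port, `send` and `receive` would be the port's blocking write and read calls, and the timing of the whole loop is what the tables report.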

-- 
                              Waldek Hebisch

Reply by Rick C ● December 6, 2022

On Monday, December 5, 2022 at 9:30:24 PM UTC-5, anti...@math.uni.wroc.pl wrote:
> Rick C <gnuarm.del...@gmail.com> wrote: 
> > On Sunday, December 4, 2022 at 10:33:28 PM UTC-5, anti...@math.uni.wroc.pl wrote: 
> > > Rick C <gnuarm.del...@gmail.com> wrote: 
> > > > On Sunday, December 4, 2022 at 4:30:35 PM UTC-5, anti...@math.uni.wroc.pl wrote: 
> > > > > Rick C <gnuarm.del...@gmail.com> wrote: 
> > > > > > On Wednesday, November 30, 2022 at 9:08:25 PM UTC-4, anti...@math.uni.wroc.pl wrote: 
> > > > > > > Rick C <gnuarm.del...@gmail.com> wrote: 
> > 
> > > > > > > With relatively cheap convertors 
> > > > > > > on Linux to handle 10000 roundtrips for 15 bytes messages I need 
> > > > > > > the following times: 
> > > > > > > 
> > > > > > > CH340 2Mb/s, waiting, 6.890s 
> > > > > > 
> > > > > > That's 11.3 per target, per second. (128 targets) 
> > > > > > 
> > > > > > > CH340 2Mb/s, overlapped 1.058s 
> > > > > > 
> > > > > > That's pretty close to 74 per target, per second. 
> > > > > > 
> > > > > > I used to use the CH340 devices, but we had intermittent lockups of the serial port when testing all day long. I switched to FTDI and that went away. I think you told me you have no such problems. Maybe it's the CH340 Windows serial drivers. 
> > > > > Well, my use is rather light. Most is for debugging at, say, 9600 or
> > > > > 115200. And when plugged in, the convertor mostly sits idle. I previously
> > > > > wrote that the CH340 did not work at 921600. More testing showed that
> > > > > it actually worked, but the speed was significantly different: I had to
> > > > > set my MCU to 847000 to communicate. This could be a bug in the Linux driver
> > > > > (there is a rather funky formula connecting speed to parameters
> > > > > and it looks easy to get it wrong). Similarly, when the CH340 was set to 576800
> > > > > I had to set the MCU to 541300. Even after matching speed, at nominal
> > > > > 576800, 921600 and 1152000 the test time was much (more than 10 times)
> > > > > higher than for other rates (I only tested 1 character messages at those
> > > > > rates, did not want to wait for the full test). Also, 500000 was significantly
> > > > > slower than 460800 (but "merely" 2 times slower for 1 character messages
> > > > > and catching up with longer messages). Still, ATM the CH340 looks
> > > > > reasonably good.
> > > > 
> > > > Yes, it's reasonably good for situations where it does not need to work reliably. I was surprised when the finger was pointed to the CH340 adapter. But someone (probably here) had warned me they are not dependable, and now I know. The cost of a name brand adapter is not so much that it's worth saving the difference, only to have to throw it out and go with FTDI anyway, when you have real work to do. 
> > > Well, I am telling you what I observed. People say various things on the
> > > net. I was interested in whether the net knows something about my trouble with
> > > the CP2104, so I googled for "CP2104 lockup". And I got a bunch of
> > > complaints about FTDI devices, solved by using a CP2104. So, there
> > > is a lot of noise and ATM I prefer to stay with what I see.
> > 
> > What sort of complaints about FTDI? Did you contact them about it?
> Things like the computer locking up (IIUC fixed by a newer driver). Or
> "communication did not work" (no real info). ATM I have enough
> converters. If I need more/better I will look at FTDI products
> and possibly ask them questions.
> > > > > Remark: I bought all my convertors from Chinese sellers. IIUC
> > > > > the FTDI chip is faked a lot, but others are too. Still, I think they
> > > > > show what is possible and illustrate some difficulties.
> > > > 
> > > > FTDI fakes no longer work with the FTDI drivers. Maybe they play a cat and mouse game, with each side one upping the other, but it's not worth the bother to try it out. FTDI sells cables. It's easier to just buy them from FTDI. 
> > > AFAIK the Linux driver does not discriminate against non-FTDI devices.
> > > So the fact that a convertor works with the Linux driver tells you nothing
> > > about its origin. And for the record, I bought mine several years
> > > ago.
> > 
> > I'm not using Linux. I don't have any FTDI fakes. I have some Prolific fakes somewhere, if I could find them. I never had one bricked, but I think it was Prolific that did that some years ago. Or, I may have them confused with FTDI. I remember the bricking driver was released with a Windows update and MS was pretty pissed off when the bricking hit the news.
> It was FTDI who bricked fakes, that was widely discussed. I did not 
> hear about Prolific doing something like that.
> > > > > > > CP2104 2Mb/s, waiting, 2.514s 
> > > > > > > CP2104 2Mb/s, overlapped 1.214s 
> > > > > > 
> > > > > > I don't know what the CP2104 is. 
> > > > > It is a chip by Silicon Laboratories. Datasheet gives contact address 
> > > > > in Austin, TX. 
> > > > > > I'm not certain what "overlapped" means in this test. Did you just continue to send 15 byte messages with no delays 10,000 times? 
> > > > > No. My slave simply returns back each received character. There is 
> > > > > some software delay but it should be less than 2us. So even waiting 
> > > > > test has some overlap at character level. To get more overlap above 
> > > > > I cheated: my test program was sending 1 more character than it should. 
> > > > > So sent message was 16 bytes, read was 15. After reading 15 another 
> > > > > batch of 16 was sent and so on. In total there were 10000 more 
> > > > > characters sent than received. My hope was that OS would read 
> > > > > and buffer excess characters, but it seems that at least for 
> > > > > CP2104 they cause trouble. My current guess is that OS is 
> > > > > reading only when requested, but I did not investigate deeper... 
> > > > > > Since you are in the mood for testing, what happens if you run overlapped, with 128 messages of 15 characters and wait for the replies before sending the next batch? Also, if you don't mind, can you try 20 character messages? 
> > > > > OK, I tried a modified version of my test program. It first sends
> > > > > k messages without reading anything, then goes to the main loop where
> > > > > after sending each message it reads one. At the end there is a tail loop
> > > > > which reads the last k messages without sending anything. So, there
> > > > > are k + 1 messages in transit: after sending message k + i the program
> > > > > waits for the answer to message i. In total there are 10000 messages.
> > > > > Results are: 
> > > > > 
> > > > > CH340,    15 char message   20 char message
> > > > > k = 0     6.869s            7.163s
> > > > > k = 1     4.682s            1.320s
> > > > > k = 2     0.992s            1.320s
> > > > > k = 3     0.991s            1.319s
> > > > > k = 4     0.991s            1.320s
> > > > > k = 5     0.990s            1.319s
> > > > > k = 8     0.992s            1.320s
> > > > > k = 12    0.990s            1.320s
> > > > > k = 20    0.992s            1.319s
> > > > > k = 36    0.991s            1.321s
> > > > > k = 128   0.991s            1.319s
> > > > > 
> > > > > > CP2104     15 char message   20 char message
> > > > > > k = 0      2.508s            3.756s
> > > > > > k = 1      1.897s            1.993s
> > > > > > k = 2      1.668s            2.087s
> > > > > > k = 3      1.486s            1.887s
> > > > > > k = 4      1.457s            1.917s
> > > > > > k = 5      1.559s            1.877s
> > > > > > k = 8      1.455s            1.803s
> > > > > > k = 12     1.337s            1.501s
> > > > > > k = 20     1.123s            1.499s
> > > > > > k = 36     1.125s            1.502s
> > > > > > 
> > > > > > k = 128 reliably stalled, there were random stalls in other cases 
> > > > > 
> > > > > > FTDI232R,
> > > > > > 2 Mbit/s   15 char message   20 char message
> > > > > > k = 0      5.478s            3.755s
> > > > > > k = 1      4.929s            3.030s
> > > > > > k = 2      2.506s            3.339s
> > > > > > k = 3      2.459s            2.020s
> > > > > > k = 4      1.708s            1.061s
> > > > > > k = 5      1.671s            1.032s
> > > > > > k = 8      0.764s            1.021s
> > > > > > k = 12     0.772s            1.014s
> > > > > > k = 20     0.763s            1.009s
> > > > > > k = 36     0.758s            1.007s
> > > > > > k = 128    0.757s            1.008s
> > > > > 
> > > > > > FTDI232R,
> > > > > > 3 Mbit/s   15 char message   20 char message
> > > > > > k = 0      8.216s            10.007s
> > > > > > k = 1      5.006s            4.344s
> > > > > > k = 2      3.338s            1.602s
> > > > > > k = 3      2.406s            1.444s
> > > > > > k = 4      1.766s            1.316s
> > > > > > k = 5      1.599s            1.673s
> > > > > > k = 8      1.040s            1.327s
> > > > > > k = 12     1.071s            1.312s
> > > > > > 
> > > > > > With k = 20, k = 36 and k = 128 communication stalled. 
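
The pipelined test described above (send k messages up front, then alternate send and read, then drain the last k replies) can be sketched as follows. This is only an illustration of the loop structure, not the original test program, whose source and language were not posted. `EchoPort` and `run_pipelined` are hypothetical names; `EchoPort` is an in-memory stand-in for a slave that echoes every character, so the sketch says nothing about real timing.

```python
class EchoPort:
    """In-memory stand-in for a serial link whose far end echoes every byte.

    Hypothetical helper for illustration only: real hardware adds latency
    and finite buffers, which is what the measurements above probe.
    """

    def __init__(self):
        self._pending = bytearray()

    def write(self, data):
        # The "slave" echoes immediately, so written bytes become readable.
        self._pending.extend(data)
        return len(data)

    def read(self, size):
        if len(self._pending) < size:
            raise RuntimeError("stall: fewer bytes available than requested")
        chunk = bytes(self._pending[:size])
        del self._pending[:size]
        return chunk


def run_pipelined(port, k, total=10000, msg_len=15):
    """Exchange `total` messages while keeping k + 1 of them in transit."""
    if not 0 <= k <= total:
        raise ValueError("k must be between 0 and total")
    msg = bytes(65 + (i % 26) for i in range(msg_len))  # arbitrary payload
    replies = 0

    # Prologue: send k messages without reading anything.
    for _ in range(k):
        port.write(msg)

    # Main loop: after sending message k + i, wait for the answer to message i.
    for _ in range(total - k):
        port.write(msg)
        if port.read(msg_len) != msg:
            raise RuntimeError("echo mismatch")
        replies += 1

    # Tail loop: read the last k answers without sending anything.
    for _ in range(k):
        if port.read(msg_len) != msg:
            raise RuntimeError("echo mismatch")
        replies += 1

    return replies
```

Any object with `write(data)` and `read(size)` methods could be passed as `port`; against a real converter the call would be wrapped in a timer to reproduce the kind of figures in the tables.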
> > > > 
> > > > Some of the results seem odd, hard to understand, like why the message rate improves so much as k is increased, but so dramatically at 3 Mbps. They all seem to approach ~1.3 second as k increases. At k=0 they are around 1 ms per message, which is the polling rate... if you adjust it. I think the default for FTDI was 8 ms. 
> > > Let me first comment on the 2 Mbit/s results. FTDI transfers data in 64-byte 
> > > blocks (they say that the actual payload is 62 bytes and there are 2 bytes 
> > > of protocol info). With 15 character messages 0.764s really means 
> > > 98% use of the serial bandwidth, so essentially as good as possible. 
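
The 98% figure can be checked with a few lines of arithmetic. The sketch below assumes 10 bits on the wire per character (start bit, 8 data bits, stop bit), which matches the 150 bits per 15 character command used elsewhere in the thread; the function names are illustrative only.

```python
BITS_PER_CHAR = 10  # assumption: 1 start + 8 data + 1 stop bit


def line_time_s(messages, chars_per_message, baud):
    """Minimum time the serial line needs to carry the traffic one way."""
    return messages * chars_per_message * BITS_PER_CHAR / baud


def utilization(measured_s, messages, chars_per_message, baud):
    """Fraction of the serial bandwidth actually used during the test."""
    return line_time_s(messages, chars_per_message, baud) / measured_s


# FTDI232R at 2 Mbit/s, k = 8, 10000 messages (figures from the table)
u15 = utilization(0.764, 10000, 15, 2_000_000)
u20 = utilization(1.021, 10000, 20, 2_000_000)
print(f"15 char: {u15:.1%}")  # 15 char: 98.2%
print(f"20 char: {u20:.1%}")  # 20 char: 97.9%
```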
> > 
> > Yeah, I'm not following that at all. At k=8, the 2 Mbps FTDI transferred in 1.021s. What is 0.764s???
> It seems that your news agent messed up the formatting of the tables. I gave 
> results in two columns, one column for 15 character messages, the second 
> for 20 character messages. 0.764s is for 15 character messages, 
> 1.021s is for 20 character messages.

Yes, I see that now.  Google Groups removes excess spaces.  Not a good idea and for no apparent reason.  If they want to conserve bytes, maybe they should delete the message contents.  That would greatly reduce the noise and only reduce the signal slightly in many cases. 


> > > The corresponding k = 8 really means 9 messages in transit, so 135 
> > > characters, which is slightly more than 2 buffers. More data in 
> > > transit does not help, but also does not make things worse. 
> > > With 20 character messages the main improvement is at k = 4, which 
> > > means 100 characters, which is smaller than 2 buffers, with extra 
> > > improvements for more data in transit. With CH340 and 15 char 
> > > messages we see the main improvement for k = 2, which corresponds 
> > > to 45 characters in transit. With 20 char messages we get the 
> > > improvement for k = 1, which is 40 characters in transit. 
> > > CH340 uses 32 character transfer buffers, so the improvement corresponds 
> > > to somewhat more than 1 buffer in transit. Now, if transfers 
> > > between converter and PC were at optimal times, then one buffer 
> > > + one character would be enough to get full serial speed. But 
> > > USB transfers can not be started at arbitrary times, IIUC there 
> > > are discrete time slots when a transfer can occur. When a transfer 
> > > can not be done in a given slot it must wait for the next slot. 
> > > So, depending on the locations of possible slots more buffering 
> > > and more data in transit may be needed for optimal performance. 
> > > OTOH 2-3 buffers should be enough to allow the PC to get full 
> > > bandwidth and this is in good agreement with the FTDI results. 
> > > In the case of CH340 there is an extra factor: CH340 also uses 8 byte 
> > > transfers. I do not know what function they have, but a 
> > > reasonably likely guess is that those 8 byte transfers pack the control 
> > > info that FTDI bundles with normal data. Anyway, those 
> > > are "interrupt" transfers in the USB sense, so have higher priority 
> > > than data transfers. A reasonable guess is that they steal some 
> > > USB bandwidth from data transfers. Also, a smaller than maximal 
> > > data block size limits efficiency, so it is possible that 
> > > CH340 is limited by USB bandwidth (lack of enough slots). 
> > > 
> > > Now, concerning 3 Mbits/s, due to the different serial speed the 
> > > optimal times for transfers are different than in the 2 Mbits/s 
> > > case. It is possible that there is a worse fit of desired 
> > > and possible transfer times. Buffering allows this to be at least 
> > > partially cured, hence the initial improvement. But clearly, 
> > > there is some extra bottleneck. Now some speculation: 
> > > with a 1/8 ms USB-2.0 cycle, there are 1500 FS clocks per 
> > > cycle. I would have to look at the spec to be sure, but this 
> > > is close to a 150 byte worst case FS transfer. Besides data 
> > > there is some USB protocol overhead and (speculatively) it 
> > > is possible that the low level USB driver may refuse to schedule 
> > > two 64-byte transfers in a single cycle. In such a case the effective 
> > > bandwidth for serial data would be 4096000 bits/s, which 
> > > corresponds to 5120000 serial bits/s (serial sends start and stop 
> > > bits which are not needed for USB). This is less than 
> > > full duplex 3 Mbits/s (both directions add up to 6 Mbits/s and 
> > > must go through the same USB). With a larger amount of data in 
> > > transit this could give wild oscillations in the amount of 
> > > buffered data, leading to slowdown when buffers get empty 
> > > and giving a stall when the receive buffer overflows. 
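
The numbers in this speculation follow from simple arithmetic, reproduced below as a sketch. It takes the post's premise at face value (at most one 64-byte transfer per 1/8 ms cycle, shared by both directions) and assumes 10 serial bits per 8-bit character; whether a real host controller schedules transfers this way is not established here.

```python
CYCLES_PER_S = 8000           # one USB 2.0 cycle every 1/8 ms
TRANSFER_BYTES = 64           # premise: one 64-byte transfer per cycle
FULL_SPEED_BIT_RATE = 12_000_000
SERIAL_BITS_PER_BYTE = 10     # assumption: start + 8 data + stop
BAUD = 3_000_000

# Full speed bit times that fit in one cycle
fs_clocks_per_cycle = FULL_SPEED_BIT_RATE // CYCLES_PER_S

# Payload bits per second the USB side could carry under the premise
usb_payload_bits_per_s = CYCLES_PER_S * TRANSFER_BYTES * 8

# Same payload expressed as bits on the serial line
serial_equiv_bits_per_s = usb_payload_bits_per_s * SERIAL_BITS_PER_BYTE // 8

# Full duplex traffic: both directions share the same USB link
needed_bits_per_s = 2 * BAUD

print(fs_clocks_per_cycle)                          # 1500
print(usb_payload_bits_per_s)                       # 4096000
print(serial_equiv_bits_per_s)                      # 5120000
print(serial_equiv_bits_per_s < needed_bits_per_s)  # True
```

At 2 Mbit/s the full duplex demand is 4,000,000 serial bits/s, which fits inside the same 5,120,000 budget, so the premise would be consistent with 2 Mbit/s running cleanly while 3 Mbit/s stalls.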
> > > 
> > > Of course there is another speculation: the converter may be fake. 
> > > Supposedly fakes use MCUs with a special program. Software 
> > > could create delays which limit the transfer rate at 3 Mbits/s 
> > > and lead to data loss/stall with more data in transit.
> In the last part I was partially wrong. The USB-2.0 spec says that transmission 
> between the PC and a high speed hub is always high speed. For full speed 
> devices the hub is supposed to buffer messages and transmit to the device 
> at its speed. In effect the PC needs two high speed messages per low 
> speed message. My tests above were with the converter connected via 
> a high speed hub. There was also an Stlink dongle plugged into the 
> same hub. To remove the effect of the hub I tried plugging the converter 
> directly into a USB-1.1 port on a separate USB controller. That 
> led to significantly longer times. I also tried connecting the Stlink 
> to a separate port so that the converter was the only thing connected 
> to the hub. I ran a few cases several times at 3 Mbits/s; for short 
> messages and low k the results vary significantly, 
> for 120 characters and k = 0 I got times from 6.375s to 6.598s. 
> At 2 Mbits/s in 25 runs I got one outlier at 6.667s, the rest 
> were between 6.029s and 6.049s. 
> 
> Anyway, USB seems to have a significant impact on possible speed, 
> with a full speed converter and full duplex transmission 2 Mbits/s 
> seems to give better speed than 3 Mbits/s. Maybe a better USB 
> hub could help (I do not know how to find out the size of the buffers 
> in my hub, but by the spec a hub may have buffers for just 2 
> bulk transfers or much more). 

I would not be using a hub at all.  This PC would be dedicated to testing and only a mouse would use a USB in addition to the serial dongle.  Oh, I think they use a bar code scanner too, so three USB ports. 


> Given the above I would expect a 
> converter connected via high speed USB to perform better at 3 Mbits/s.
> > Reading your tests has made me realize, that while combining the messages for every target into one batch can be a bit unwieldy, I could limit the combinations to the end points on a single card. The responses have to be combined for the one driver anyway. Between the 8 end points on a single board I could easily combine those commands, and then stagger the replies without any extra signals between the boards and no special characters in the command stream. 
> > 
> > Again, thinking out loud, at 3 Mbps, 8 * 150 bits per command is 1,200 bits or 400 us. That would greatly reduce the wasted time, even with a 1 ms polling period. It would allow an exchange of 8 commands and 8 replies every 2 ms, or 4,000 per second. That would be almost 32 per end point, which would be great! Actually, it could be faster than this, since the staggering of the replies doesn't require the first reply to wait for the last command. So replies will start at the end of the first command. The beauty of full-duplex! 
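
The batching estimate above works out as follows. This is a sketch of the arithmetic only; the 128 end points used for the last figure are an inference (4,000 / 128 = 31.25 matches "almost 32"), not a number stated in this post.

```python
BITS_PER_CHAR = 10           # start + 8 data + stop, i.e. 150 bits per command
CHARS_PER_COMMAND = 15
COMMANDS_PER_BATCH = 8       # one command per end point on a card
BAUD = 3_000_000
EXCHANGE_PERIOD_US = 2000    # one batch of commands and replies every 2 ms
END_POINTS = 128             # assumption, inferred from "almost 32 per end point"

batch_bits = COMMANDS_PER_BATCH * CHARS_PER_COMMAND * BITS_PER_CHAR
batch_time_us = batch_bits * 1_000_000 // BAUD
exchanges_per_s = 1_000_000 // EXCHANGE_PERIOD_US
commands_per_s = exchanges_per_s * COMMANDS_PER_BATCH
per_end_point_per_s = commands_per_s / END_POINTS

print(batch_bits)            # 1200
print(batch_time_us)         # 400
print(commands_per_s)        # 4000
print(per_end_point_per_s)   # 31.25
```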
> > 
> > Any chance you could run your test on the FTDI cable at 3 Mbps with a 1,200 bit block of data (120 characters)? I imagine the RS-232 waveform is getting a bit triangular at that speed.
> See above. Note that my test slave started its reply after receiving the first 
> character. ATM it seems that with enough overlap at 2 Mbits/s I am getting 
> repeatably almost optimal speed (even with the 1.1 port). But with less 
> overlap there are random-looking variations, which probably means 
> high sensitivity to the precise timing of messages. And at 3 Mbits/s 
> the variation seems to be much worse. 

The priority protocol I described above would overlap after one message.  So not a lot of difference.  

Thanks for the info.  It was very useful. 

-- 

Rick C.

+-- Get 1,000 miles of free Supercharging
+-- Tesla referral code - https://ts.la/richard11209

Reply by David Brown ●December 6, 2022

On 06/12/2022 08:41, Rick C wrote:

> Yes, I see that now.  Google Groups removes excess spaces.  Not a
> good idea and for no apparent reason.  If they want to conserve
> bytes, maybe they should delete the message contents.  That would
> greatly reduce the noise and only reduce the signal slightly in many
> cases.
> 

You do realise it is up to /you/, the person making a post, to snip 
excess content?  For some reason, google posters do this extremely badly 
- either they never snip, or they cut too much (including attributions).

Google groups ruins the format of Usenet posts - including removing 
leading spaces and screwing up line endings.  It's one of the reasons 
why so many Usenet users dislike it.

(Yes, I know you have some particular personal reasons for using GG.)

Reply by Rick C ●December 6, 2022

On Tuesday, December 6, 2022 at 8:32:28 AM UTC-4, David Brown wrote:
> On 06/12/2022 08:41, Rick C wrote: 
> 
> > Yes, I see that now. Google Groups removes excess spaces. Not a 
> > good idea and for no apparent reason. If they want to conserve 
> > bytes, maybe they should delete the message contents. That would 
> > greatly reduce the noise and only reduce the signal slightly in many 
> > cases. 
> >
> You do realise it is up to /you/, the person making a post, to snip 
> excess content? For some reason, google posters do this extremely badly 
> - either they never snip, or they cut too much (including attributions). 
> 
> Google groups ruins the format of Usenet posts - including removing 
> leading spaces and screwing up line endings. It's one of the reasons 
> why so many Usenet users dislike it. 
> 
> (Yes, I know you have some particular personal reasons for using GG.)

I guess I should have used a smiley.  I was trying to say that much of what is posted here would be better not posted at all... as a joke. 

-- 

Rick C.

+-+ Get 1,000 miles of free Supercharging
+-+ Tesla referral code - https://ts.la/richard11209

Reply by David Brown ●December 7, 2022

On 07/12/2022 03:03, Rick C wrote:
> On Tuesday, December 6, 2022 at 8:32:28 AM UTC-4, David Brown wrote:
>> On 06/12/2022 08:41, Rick C wrote:
>>
>>> Yes, I see that now. Google Groups removes excess spaces. Not a
>>> good idea and for no apparent reason. If they want to conserve
>>> bytes, maybe they should delete the message contents. That would
>>> greatly reduce the noise and only reduce the signal slightly in many
>>> cases.
>>>
>> You do realise it is up to /you/, the person making a post, to snip
>> excess content? For some reason, google posters do this extremely badly
>> - either they never snip, or they cut too much (including attributions).
>>
>> Google groups ruins the format of Usenet posts - including removing
>> leading spaces and screwing up line endings. It's one of the reasons
>> why so many Usenet users dislike it.
>>
>> (Yes, I know you have some particular personal reasons for using GG.)
> 
> I guess I should have used a smiley.  I was trying to say that much of what is posted here would be better not posted at all... as a joke.
> 

I think there's been a lot of interesting stuff posted in this thread. 
Maybe not all of it has been useful to /you/, but you're not paying us 
for the job.  So we chatter - sometimes people learn something new or 
get some new ideas.

Reply by Rick C ●December 7, 2022

On Wednesday, December 7, 2022 at 3:08:22 AM UTC-4, David Brown wrote:
> On 07/12/2022 03:03, Rick C wrote: 
> > On Tuesday, December 6, 2022 at 8:32:28 AM UTC-4, David Brown wrote: 
> >> On 06/12/2022 08:41, Rick C wrote: 
> >> 
> >>> Yes, I see that now. Google Groups removes excess spaces. Not a 
> >>> good idea and for no apparent reason. If they want to conserve 
> >>> bytes, maybe they should delete the message contents. That would 
> >>> greatly reduce the noise and only reduce the signal slightly in many 
> >>> cases. 
> >>> 
> >> You do realise it is up to /you/, the person making a post, to snip 
> >> excess content? For some reason, google posters do this extremely badly 
> >> - either they never snip, or they cut too much (including attributions). 
> >> 
> >> Google groups ruins the format of Usenet posts - including removing 
> >> leading spaces and screwing up line endings. It's one of the reasons 
> >> why so many Usenet users dislike it. 
> >> 
> >> (Yes, I know you have some particular personal reasons for using GG.) 
> > 
> > I guess I should have used a smiley. I was trying to say that much of what is posted here would be better not posted at all... as a joke. 
> >
> I think there's been a lot of interesting stuff posted in this thread. 
> Maybe not all of it has been useful to /you/, but you're not paying us 
> for the job. So we chatter - sometimes people learn something new or 
> get some new ideas.

Again, I was not being clear enough.  By "here", I did not mean this thread.  I was referring to newsgroups as a whole. 

IT WAS JUST A JOKE!
