
Shared Communications Bus - RS-422 or RS-485

Started by Rick C November 2, 2022
On 2022-11-07 Rick C wrote in comp.arch.embedded:
> On Monday, November 7, 2022 at 7:07:43 AM UTC-4, Stef wrote: >> On 2022-11-07 Rick C wrote in comp.arch.embedded: >> > On Monday, November 7, 2022 at 5:26:06 AM UTC-5, Stef wrote: >> >> On 2022-11-07 Rick C wrote in comp.arch.embedded: >> > >> > I care. Don't you? >> No, I don't. We do use FTDI chips in our designs to interface a serial >> port to USB. And we also use ready made FTDI cables. We use these chips >> and cables based on their specifications in datasheets and user guides >> etc. I have never felt the need to invesitigate how the UART/USB >> functionality was actually implemented inside the chip. What would I do >> with this knowledge? In a design I must rely on the behaviour as >> specified in the datasheet. > > It's hard to imagine an engineer with no curiosity.
Yes, that's hard. But imagining an engineer who does not care about the internal structure of every single chip he uses is a lot easier (for me). I tend to focus my curiosity on things that matter to me, don't you?

-- Stef
One difference between a man and a machine is that a machine is quiet when well oiled.
On 2022-11-07 Rick C wrote in comp.arch.embedded:
> On Monday, November 7, 2022 at 6:55:27 AM UTC-4, Stef wrote: >> On 2022-11-07 Rick C wrote in comp.arch.embedded: >> > On Sunday, November 6, 2022 at 6:34:59 PM UTC-5, Richard Damon wrote: >> >> On 11/6/22 8:56 AM, Rick C wrote: >> >> > There's no point to inter-message delays. If there is an error that causes a loss of framing, the devices will see that and ignore the message. As I've said, the real issue is that the message will not be responded to, and the software will fail. At that point the user will exit the software on the PC and start over. That gives a nice long delay for resyncing. >> >> If the only way to handle a missed message is to abort the whole >> >> software system, that seems to be a pretty bad system. >> > >> > You would certainly think that if your error rate was more than once a hundred years. I expect to be long dead before an RS-422 bus only 10 feet long burps a bit error. >> I would not dare to implement a serial protocol without any form of >> error checking, on any length of cable. >> >> You mention ESD somewhere. This can be a serious disturbance that can >> easily corrupt a few bits. > > Yes, I mentioned ESD somewhere. This is testing newly constructed circuit boards, so is used in an ESD controlled environment. >
You wrote: "I could probably get away with TTL level signals, but I'd like to have the ESD protection these RS-422 chips give. That additional noise immunity means there is an extremely small chance of bit errors. If we have problems, the error handling can be added." This led me to believe you were expecting actual ESD discharges that could disturb your messages.

ESD protection is just that: protection against device damage. I do not believe ESD protection does anything to improve noise immunity. It just increases the ESD level at which the device will be damaged. And if you have an ESD controlled environment, that is not actually needed.
>> Reminds me of a product where we got windows blue screens during ESD >> testing on a device connected via an FTDI USB to serial adapter. Cable >> length less than 6 feet. > > I assume you mean some other device was being ESD tested? This is not being used in an ESD testing lab. Was the FTDI serial cable RS-232 by any chance? Being single ended, that is much less tolerant of noise.
No, a device with an FTDI chip on it was tested. The USB cable was <= 6 feet and the serial ports were only a few centimeters of TTL level PCB traces. This was reproducible with an evaluation kit with only USB connected.
> >> >> Note, if the master sends out a message, and waits for a response, with >> >> a retry if the message is not replied to, that naturally puts a pause in >> >> the communication bus for inter-message synchronization. >> > >> > The pause is already there by virtue of the protocol. Commands and replies are on different busses. >> > >> > >> >> Based on your description, I can't imagine the master starting a message >> >> for another slave until after the first one answers, or you will >> >> interfere with the arbitration control of the reply bus. >> > >> > Exactly! Now you are starting to catch on. >> So you do wait for a reply, and a reply is only expected on a valid >> message? What if there is no reply, do you retry? If so, you already have >> implemented some basic error checking. For more robustness you could (I >> would) add some kind of CRC. > > There should not be any messages other than "valid" messages. I don't recall specifically what the slave does on messages with bit errors, but I'm pretty sure it simply doesn't know they have bit errors. The message has no checksum or other bit error control. The format has one character to indicate the "command" type. If that character is corrupted, the command is not used, unless it is changed to another valid character (3 of 256 chance).
Okay, the slaves are already implemented? Missed that. So there is some very basic error detection: the command must be valid. And if it is not and the slave does not reply, what does the master do?
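For readers following along, the wait-for-reply-then-retry behaviour being asked about can be sketched as below. This is only an illustration, not the code running on the test fixture: `send` and `receive` are hypothetical callables standing in for the command bus and the reply bus, and the retry count and timeout are made-up values.

```python
class ReplyTimeout(Exception):
    """Raised when no reply arrives after all attempts."""


def send_with_retry(send, receive, command, retries=3, timeout_s=0.1):
    """Send `command`, wait for a reply, and resend if none arrives.

    `send(command)` and `receive(timeout_s)` are hypothetical wrappers
    around the command bus and the reply bus. `receive` is assumed to
    return None when the timeout expires without a reply.
    """
    for attempt in range(1, retries + 1):
        send(command)
        reply = receive(timeout_s)
        if reply is not None:
            return reply, attempt
    raise ReplyTimeout(f"no reply to {command!r} after {retries} attempts")


class FlakySlave:
    """Fake slave that silently ignores the first command it sees."""

    def __init__(self):
        self.seen = 0
        self.pending = None

    def send(self, command):
        self.seen += 1
        # Simulate a corrupted first frame: the slave does not answer it.
        self.pending = b"OK\r\n" if self.seen > 1 else None

    def receive(self, timeout_s):
        reply, self.pending = self.pending, None
        return reply


slave = FlakySlave()
reply, attempts = send_with_retry(slave.send, slave.receive, b"01 23 X\r\n")
print(reply, attempts)
```

Because a slave only answers a command it accepted, the timeout doubles as the inter-message pause and as a crude form of error detection.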
> Again, there's no reason to "detect" errors since I've implemented no error protocol. That is many times more complex than simply ignoring the errors, which works because errors don't happen often enough to have an impact on testing.
A test rig that ignores errors. I don't know the requirements of this test and how bad it would be to have an invalid pass/fail result.
> On the Apollo moon missions, they took no precautions against damage from micrometeoroids, because the effort required was not commensurate with the likelihood of the event.
I am not sure what they could have done, but adding effective shields would probably have prohibitive weight consequences, if at all possible. But if you can believe the movie Apollo 13, there is a real danger from micrometeorites.

-- Stef
<KnaraKat> Bite me. * TheOne gets some salt, then proceeds to nibble on KnaraKat a little bit....
On Monday, November 7, 2022 at 12:57:27 PM UTC-4, Stef wrote:
> On 2022-11-07 Rick C wrote in comp.arch.embedded: > > On Monday, November 7, 2022 at 7:07:43 AM UTC-4, Stef wrote: > >> On 2022-11-07 Rick C wrote in comp.arch.embedded: > >> > On Monday, November 7, 2022 at 5:26:06 AM UTC-5, Stef wrote: > >> >> On 2022-11-07 Rick C wrote in comp.arch.embedded: > >> > > >> > I care. Don't you? > >> No, I don't. We do use FTDI chips in our designs to interface a serial > >> port to USB. And we also use ready made FTDI cables. We use these chips > >> and cables based on their specifications in datasheets and user guides > >> etc. I have never felt the need to invesitigate how the UART/USB > >> functionality was actually implemented inside the chip. What would I do > >> with this knowledge? In a design I must rely on the behaviour as > >> specified in the datasheet. > > > > It's hard to imagine an engineer with no curiosity. > Yes, that's hard. But imagining an engineer who does not care about the > internal structure of every single chip he uses is a lot easier (for > me). I tend to focus my curiiosity on things that matter to me, don't > you?
By definition curiosity is, "an eager desire to know or learn about something". That's not limited to things I *need* to know about. In fact, I don't limit my curiosity at all. It's a desire, not an act.

The knowledge can be very useful, if it opens new ideas for how to use these devices. In fact, I found that the majority of FTDI cables are full speed, which is much more limiting than the few Hi-speed USB cables they make. The Hi-speed cables seem to handle a lot more protocols. So now I'm back to wondering if they are implemented in a CPU based design.

-- Rick C.
+++- Get 1,000 miles of free Supercharging
+++- Tesla referral code - https://ts.la/richard11209
On Monday, November 7, 2022 at 1:20:33 PM UTC-4, Stef wrote:
> On 2022-11-07 Rick C wrote in comp.arch.embedded: > > On Monday, November 7, 2022 at 6:55:27 AM UTC-4, Stef wrote: > >> On 2022-11-07 Rick C wrote in comp.arch.embedded: > >> > On Sunday, November 6, 2022 at 6:34:59 PM UTC-5, Richard Damon wrote: > >> >> On 11/6/22 8:56 AM, Rick C wrote: > >> >> > There's no point to inter-message delays. If there is an error that causes a loss of framing, the devices will see that and ignore the message. As I've said, the real issue is that the message will not be responded to, and the software will fail. At that point the user will exit the software on the PC and start over. That gives a nice long delay for resyncing. > >> >> If the only way to handle a missed message is to abort the whole > >> >> software system, that seems to be a pretty bad system. > >> > > >> > You would certainly think that if your error rate was more than once a hundred years. I expect to be long dead before an RS-422 bus only 10 feet long burps a bit error. > >> I would not dare to implement a serial protocol without any form of > >> error checking, on any length of cable. > >> > >> You mention ESD somewhere. This can be a serious disturbance that can > >> easily corrupt a few bits. > > > > Yes, I mentioned ESD somewhere. This is testing newly constructed circuit boards, so is used in an ESD controlled environment. > > > You wrote: > "I could probably get away with TTL level signals, but I'd like to have > the ESD protection these RS-422 chips give. That additional noise > immunity means there is an extremely small chance of bit errors. If we > have problems, the error handling can be added." > This led me to believe you were expecting actual ESD discharges that > could disturb your messages. > > ESD protection is just that: protection against device damage > > I do not believe ESD protection does anything to improve noise immunity. > It just increases the ESD level at which the device will be damaged.
Yes, you are right. My language there is poor. I should have said I prefer the noise immunity the RS-422 devices have compared to TTL devices *in addition to* the ESD immunity.
> And if you have an ESD controlled environment, that is not actually > needed.
In theory, but I can't control how these will be used in the future. ESD immunity is something I want designed into any application that is connected by a cable.
> >> Reminds me of a product where we got windows blue screens during ESD > >> testing on a device connected via an FTDI USB to serial adapter. Cable > >> length less than 6 feet. > > > > I assume you mean some other device was being ESD tested? This is not being used in an ESD testing lab. Was the FTDI serial cable RS-232 by any chance? Being single ended, that is much less tolerant of noise. > No a device with an FTDI chip on it was tested. USB cable was <= 6 feet > and serial ports were only a few centimeters of TTL level PCB traces. > This was reproducable with an evaluation kit with only USB connected.
So you were shooting high voltages into a device and were surprised the PC it was connected to crashed? I'm not following this at all. I'm pretty sure the FTDI cable is not rated to provide isolation. That has nothing to do with ESD protection. As you say, ESD protection is about damage, not operation.
> >> >> Note, if the master sends out a message, and waits for a response, with > >> >> a retry if the message is not replied to, that naturally puts a pause in > >> >> the communication bus for inter-message synchronization. > >> > > >> > The pause is already there by virtue of the protocol. Commands and replies are on different busses. > >> > > >> > > >> >> Based on your description, I can't imagine the master starting a message > >> >> for another slave until after the first one answers, or you will > >> >> interfere with the arbitration control of the reply bus. > >> > > >> > Exactly! Now you are starting to catch on. > >> So you do wait for a reply, and a reply is only expected on a valid > >> message? What if there is no reply, do you retry? If so, you already have > >> implemented some basic error checking. For more robustness you could (I > >> would) add some kind of CRC. > > > > There should not be any messages other than "valid" messages. I don't recall specifically what the slave does on messages with bit errors, but I'm pretty sure it simply doesn't know they have bit errors. The message has no checksum or other bit error control. The format has one character to indicate the "command" type. If that character is corrupted, the command is not used, unless it is changed to another valid character (3 of 256 chance). > Okay, the slaves are already implemented? Missed that.
A test fixture is in use, with software on the PC. There's no reason to change the protocol in the new test fixture and software unless there is a need, a new requirement.
> So there is some very basic error detection: the command must be valid. > And if it is not and the slave does not reply, what does the master do?
The command being valid is based on a single character. The command is something like, "01 23 X<cr><lf>". I suppose the CR LF might also be required, but I don't recall. It might require one and ignore the other. The whole CR LF thing is such a PITA. The only character that is required for sure is the "X", which at the moment can be one of three from the possible characters (don't recall if they are 8 bit or 7). I also don't recall if parity checking is used.

I do know that I had a flaw in the initial setup that gave intermittent errors. I had the hardest time finding the problem because I was biased about where to look. I tried adding re-transmission, which helped, but it borked up the code pretty well. I guess my software skills are not so good. In the end, it was an Ariane problem where the UART in the FPGA was existing code that was reused. Thinking it was a previously validated module, it was not suspected... at all. Eventually I realized it did not include the input FF synchronization to resolve race conditions. That was left for the system designer to add, since there may be more than one device on the same input.

Since that was solved, we've tested thousands of UUTs with no interface bit errors. So I have no worries about this.
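As a rough illustration of what "valid" means for a frame like "01 23 X<cr><lf>", here is a minimal sketch of the check. Several things in it are assumptions, since the post does not pin them down: the two leading fields are treated as 2-digit hex values, the three command letters are invented as X, Y and Z, and CR LF, a lone CR, or a lone LF are all accepted as terminators.

```python
import re

# Hypothetical command letters; the post only says there are three of them.
VALID_COMMANDS = frozenset("XYZ")

# Assumed layout: two 2-digit hex fields, one command character, then
# CR LF, a lone CR, or a lone LF.
_FRAME = re.compile(
    rb"([0-9A-Fa-f]{2}) ([0-9A-Fa-f]{2}) (.)(?:\r\n|\r|\n)", re.DOTALL
)


def parse_command(frame):
    """Return (field1, field2, command) or None if the frame is rejected."""
    m = _FRAME.fullmatch(frame)
    if m is None:
        return None
    cmd = m.group(3).decode("latin-1")
    if cmd not in VALID_COMMANDS:
        return None
    return int(m.group(1), 16), int(m.group(2), 16), cmd


print(parse_command(b"01 23 X\r\n"))  # accepted
print(parse_command(b"01 23 Q\r\n"))  # rejected: unknown command letter
print(parse_command(b"01 23 Y\r\n"))  # accepted, although "X" may have been sent
```

The last call shows the gap: with these assumed letters, "X" (0x58) and "Y" (0x59) differ by a single bit, so that particular corruption passes the check and only shows up, if at all, in what the slave does next.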
> > Again, there's no reason to "detect" errors since I've implemented no error protocol. That is many times more complex than simply ignoring the errors, which works because errors don't happen often enough to have an impact on testing. > A test rig that ignores errors. I don't know the requirements of this > test and how bad it would be to have an invalid pass/fail result.
Since the test will be run overnight, every few seconds, with all UUT errors logged, the chance of the same bit error happening the same way, causing the same miss of a UUT failure some thousands of times (about 7,000), is on the order of that of a proton decaying. Well, maybe a bit more likely.
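The arithmetic behind that estimate can be sketched as follows. The per-run probability is a made-up placeholder, not a measured error rate, and the runs are assumed to be independent; only the figure of about 7,000 runs comes from the post. Working in log10 avoids floating-point underflow.

```python
import math


def log10_prob_all_runs_masked(p_single, runs):
    """log10 of the chance that every one of `runs` independent test runs
    is hit by an error that masks the same UUT failure."""
    return runs * math.log10(p_single)


# Hypothetical figure: a one-in-a-million chance per run.
exponent = log10_prob_all_runs_masked(1e-6, 7000)
print(f"probability is about 10^{exponent:.0f}")
```

Even with a far more pessimistic per-run figure the exponent stays hugely negative, which is the point being made: a single missed reply is plausible, the same miss on every run is not.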
> > On the Apollo moon missions, they took no precautions against damage from micrometeoroids, because the effort required was not commensurate with the likelihood of the event. > I am not sure what they could have done, but adding effective shields > would probably have prohibitive weight consequences, if at all possible. > But if you can believe the movie Apollo 13, thre is a real danger from > micrometeorites.
Real, even if very small danger. That's the point. In this case, the impact is small, the likelihood is small, and the work to mitigate the problem is far more effort than justifiable, no matter how emotional people may get about "Errors! OMG, there may be ERRORS!"

Maybe I need a heavy duty cabinet to protect against the very real possibility of meteors?

https://abc7chicago.com/meteor-california-destroys-home-shower/12425011/

-- Rick C.
++++ Get 1,000 miles of free Supercharging
++++ Tesla referral code - https://ts.la/richard11209
On 2022-11-07 Rick C wrote in comp.arch.embedded:
> On Monday, November 7, 2022 at 12:57:27 PM UTC-4, Stef wrote: >> On 2022-11-07 Rick C wrote in comp.arch.embedded: >> > On Monday, November 7, 2022 at 7:07:43 AM UTC-4, Stef wrote: >> >> On 2022-11-07 Rick C wrote in comp.arch.embedded: >> >> > On Monday, November 7, 2022 at 5:26:06 AM UTC-5, Stef wrote: >> >> >> On 2022-11-07 Rick C wrote in comp.arch.embedded: >> >> > >> >> > I care. Don't you? >> >> No, I don't. We do use FTDI chips in our designs to interface a serial >> >> port to USB. And we also use ready made FTDI cables. We use these chips >> >> and cables based on their specifications in datasheets and user guides >> >> etc. I have never felt the need to invesitigate how the UART/USB >> >> functionality was actually implemented inside the chip. What would I do >> >> with this knowledge? In a design I must rely on the behaviour as >> >> specified in the datasheet. >> > >> > It's hard to imagine an engineer with no curiosity. >> Yes, that's hard. But imagining an engineer who does not care about the >> internal structure of every single chip he uses is a lot easier (for >> me). I tend to focus my curiiosity on things that matter to me, don't >> you? > > By definition curiosity is, "an eager desire to know or learn about something". That's not limited to things I *need* to know about. In fact, I don't limit my curiosity at all. It's a desire, not an act. >
Learn about something != learn about everything
Matter to me != *need* to know about

My not caring about the innards of a particular chip seems to let you think I don't care about anything. But we are not discussing my interests here, but your bus.

-- Stef
Old age is the most unexpected of things that can happen to a man. -- Trotsky
On 2022-11-07 Rick C wrote in comp.arch.embedded:
> On Monday, November 7, 2022 at 1:20:33 PM UTC-4, Stef wrote: >> On 2022-11-07 Rick C wrote in comp.arch.embedded: >> > On Monday, November 7, 2022 at 6:55:27 AM UTC-4, Stef wrote: >> >> On 2022-11-07 Rick C wrote in comp.arch.embedded: >> >> > On Sunday, November 6, 2022 at 6:34:59 PM UTC-5, Richard Damon wrote: >> >> >> On 11/6/22 8:56 AM, Rick C wrote: >> >> >> > There's no point to inter-message delays. If there is an error that causes a loss of framing, the devices will see that and ignore the message. As I've said, the real issue is that the message will not be responded to, and the software will fail. At that point the user will exit the software on the PC and start over. That gives a nice long delay for resyncing. >> >> >> If the only way to handle a missed message is to abort the whole >> >> >> software system, that seems to be a pretty bad system. >> >> > >> >> > You would certainly think that if your error rate was more than once a hundred years. I expect to be long dead before an RS-422 bus only 10 feet long burps a bit error. >> >> I would not dare to implement a serial protocol without any form of >> >> error checking, on any length of cable. >> >> >> >> You mention ESD somewhere. This can be a serious disturbance that can >> >> easily corrupt a few bits. >> > >> > Yes, I mentioned ESD somewhere. This is testing newly constructed circuit boards, so is used in an ESD controlled environment. >> > >> You wrote: >> "I could probably get away with TTL level signals, but I'd like to have >> the ESD protection these RS-422 chips give. That additional noise >> immunity means there is an extremely small chance of bit errors. If we >> have problems, the error handling can be added." >> This led me to believe you were expecting actual ESD discharges that >> could disturb your messages. >> >> ESD protection is just that: protection against device damage >> >> I do not believe ESD protection does anything to improve noise immunity. 
>> It just increases the ESD level at which the device will be damaged. > > Yes, you are right. My language there is poor. I should have said I prefer the noise immunity the RS-422 devices have compared to TTL devices *in addition to* the ESD immunity. > > >> And if you have an ESD controlled environment, that is not actually >> needed. > > In theory, but I can't control how these will be used in the future. ESD immunity is something I want designed into any application that is connected by a cable. >
Yes, always protect accessible parts.
> >> >> Reminds me of a product where we got windows blue screens during ESD >> >> testing on a device connected via an FTDI USB to serial adapter. Cable >> >> length less than 6 feet. >> > >> > I assume you mean some other device was being ESD tested? This is not being used in an ESD testing lab. Was the FTDI serial cable RS-232 by any chance? Being single ended, that is much less tolerant of noise. >> No a device with an FTDI chip on it was tested. USB cable was <= 6 feet >> and serial ports were only a few centimeters of TTL level PCB traces. >> This was reproducable with an evaluation kit with only USB connected. > > So you were shooting high voltages into a device and were surprised the PC it was connected to crashed? I'm not following this at all. I'm pretty sure the FTDI cable is not rated to provide isolation. That has nothing to do with ESD protection. As you say, ESD protection is about damage, not operation.
Of course not into a device. But all over the enclosure, as is required to pass EMC testing. These discharges cause current spikes that can induce currents in parts of your circuits. Part of ESD testing also uses coupling planes, where you fire on a metal plate 'near' the device. That can also give a lot of noise. All these things may not cause device damage like direct ESD discharges, but they can disturb the device operation. Depending on the expected performance level, this may cause a fail. For medical devices you usually cannot get away with worse than "temporary loss of function and recovery without operator intervention".

ESD protection is indeed about damage prevention. But passing an ESD test usually requires more than just preventing damage.

How would you rate a phone that resets every time you pick it up when you have not properly discharged yourself from static electricity? It may just reboot and work fine after that, but it would still be a crappy phone.
>> >> >> Note, if the master sends out a message, and waits for a response, with >> >> >> a retry if the message is not replied to, that naturally puts a pause in >> >> >> the communication bus for inter-message synchronization. >> >> > >> >> > The pause is already there by virtue of the protocol. Commands and replies are on different busses. >> >> > >> >> > >> >> >> Based on your description, I can't imagine the master starting a message >> >> >> for another slave until after the first one answers, or you will >> >> >> interfere with the arbitration control of the reply bus. >> >> > >> >> > Exactly! Now you are starting to catch on. >> >> So you do wait for a reply, and a reply is only expected on a valid >> >> message? What if there is no reply, do you retry? If so, you already have >> >> implemented some basic error checking. For more robustness you could (I >> >> would) add some kind of CRC. >> > >> > There should not be any messages other than "valid" messages. I don't recall specifically what the slave does on messages with bit errors, but I'm pretty sure it simply doesn't know they have bit errors. The message has no checksum or other bit error control. The format has one character to indicate the "command" type. If that character is corrupted, the command is not used, unless it is changed to another valid character (3 of 256 chance). >> Okay, the slaves are already implemented? Missed that. > > A test fixture is in use, with software on the PC. There's no reason to change the protocol in the new test fixture and software unless there is a need, a new requirement.
Ah, existing stuff.
>> So there is some very basic error detection: the command must be valid. >> And if it is not and the slave does not reply, what does the master do? > > The command being valid is based on as single character. The command is something like, "01 23 X<cr><lf>". I suppose the CR LF might also be required, but I don't recall. It might require one and ignore the other. The whole CR LF thing is such a PITA. The only character that is required for sure, is the "X", which at the moment can be one of three from the possible characters (don't recall if they are 8 bit or 7). I also don't recall if parity checking is used.
Okay, more restrictions on valid messages, yet more error detection present already. ;-)
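For comparison with the implicit checks above, the "some kind of CRC" option mentioned earlier in the thread could look roughly like this. It is a sketch under stated assumptions, not a proposal for the existing fixture: the frame layout (payload, four uppercase hex digits of CRC, then CR LF) is invented for illustration, and the CRC is the 16-bit CCITT polynomial as computed by Python's `binascii.crc_hqx` with an assumed initial value of 0xFFFF.

```python
import binascii


def add_crc(payload):
    """Append a 16-bit CRC as four hex digits, then CR LF (assumed layout)."""
    crc = binascii.crc_hqx(payload, 0xFFFF)
    return payload + b"%04X" % crc + b"\r\n"


def check_crc(frame):
    """Return the payload if the CRC matches, otherwise None."""
    if len(frame) < 6 or not frame.endswith(b"\r\n"):
        return None
    payload, crc_text = frame[:-6], frame[-6:-2]
    expected = b"%04X" % binascii.crc_hqx(payload, 0xFFFF)
    return payload if crc_text == expected else None


frame = add_crc(b"01 23 X")
print(check_crc(frame))                       # payload comes back intact
print(check_crc(frame.replace(b"X", b"Y")))   # single-bit corruption is caught
```

The cost is that both ends have to change, which is the objection raised against altering a protocol that is already deployed in an existing fixture.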
> I do know that I had a flaw in the initial setup that gave intermittent errors. I had the hardest time finding the problem because of using bias in where to look. I tried adding re-transmission, which helped, but it borked up the code pretty well. I guess my software skills are not so good. In the end, it was an Ariane problem where the UART in the FPGA was existing code that was reused. Thinking it was a previously validated module, it was not suspected... at all. Eventually I realized it did not include the input FF synchronization to absolve race conditions. That was left for the system designer to add, since there may be more than one device on the same input. > > Since that was solved, we've tested thousands of UUTs with no interface bit errors. So I have no worries about this. > > >> > Again, there's no reason to "detect" errors since I've implemented no error protocol. That is many times more complex than simply ignoring the errors, which works because errors don't happen often enough to have an impact on testing. >> A test rig that ignores errors. I don't know the requirements of this >> test and how bad it would be to have an invalid pass/fail result. > > Since the test will be run, over night, every few seconds, with all UUT errors logged, the chances of the same bit error happening the same way, causing the same miss of a UUT failure some thousands of time (about 7,000), is on the order as a proton decaying. Well, maybe a bit more likely. >
Another layer of error detection. ;-)
> >> > On the Apollo moon missions, they took no precautions against damage from micrometeoroids, because the effort required was not commensurate with the likelihood of the event. >> I am not sure what they could have done, but adding effective shields >> would probably have prohibitive weight consequences, if at all possible. >> But if you can believe the movie Apollo 13, thre is a real danger from >> micrometeorites. > > Real, even if very small danger. That's the point. In this case, the impact is small, the likelihood is small, and the work to mitigate the problem is far more effort than justifiable, no matter how emotional people may get about "Errors! OMG, there may be ERRORS!" > > Maybe I need a heavy duty cabinet to protect against the very real possibility of meteors? > > https://abc7chicago.com/meteor-california-destroys-home-shower/12425011/ >
-- Stef
The only winner in the War of 1812 was Tchaikovsky. -- David Gerrold
On Monday, November 7, 2022 at 4:30:37 PM UTC-4, Stef wrote:
> On 2022-11-07 Rick C wrote in comp.arch.embedded: > > On Monday, November 7, 2022 at 12:57:27 PM UTC-4, Stef wrote: > >> On 2022-11-07 Rick C wrote in comp.arch.embedded: > >> > On Monday, November 7, 2022 at 7:07:43 AM UTC-4, Stef wrote: > >> >> On 2022-11-07 Rick C wrote in comp.arch.embedded: > >> >> > On Monday, November 7, 2022 at 5:26:06 AM UTC-5, Stef wrote: > >> >> >> On 2022-11-07 Rick C wrote in comp.arch.embedded: > >> >> > > >> >> > I care. Don't you? > >> >> No, I don't. We do use FTDI chips in our designs to interface a serial > >> >> port to USB. And we also use ready made FTDI cables. We use these chips > >> >> and cables based on their specifications in datasheets and user guides > >> >> etc. I have never felt the need to invesitigate how the UART/USB > >> >> functionality was actually implemented inside the chip. What would I do > >> >> with this knowledge? In a design I must rely on the behaviour as > >> >> specified in the datasheet. > >> > > >> > It's hard to imagine an engineer with no curiosity. > >> Yes, that's hard. But imagining an engineer who does not care about the > >> internal structure of every single chip he uses is a lot easier (for > >> me). I tend to focus my curiiosity on things that matter to me, don't > >> you? > > > > By definition curiosity is, "an eager desire to know or learn about something". That's not limited to things I *need* to know about. In fact, I don't limit my curiosity at all. It's a desire, not an act. > > > Learn about something != learn about everything > Matter to me != *need* to know about > > My not caring abbout the innards of a particular chip seems to let you > think I don't care about anything. But we are not discussing my > interests here, but your bus.
Seems to me you wanted to talk about my interests when you said, "Why are you discussing this?" and then continued discussing that issue for some half dozen more posts.

-- Rick C.
----- Get 1,000 miles of free Supercharging
----- Tesla referral code - https://ts.la/richard1120
On Monday, November 7, 2022 at 5:04:29 PM UTC-4, Stef wrote:
> On 2022-11-07 Rick C wrote in comp.arch.embedded: > > On Monday, November 7, 2022 at 1:20:33 PM UTC-4, Stef wrote: > >> On 2022-11-07 Rick C wrote in comp.arch.embedded: > >> > On Monday, November 7, 2022 at 6:55:27 AM UTC-4, Stef wrote: > >> >> On 2022-11-07 Rick C wrote in comp.arch.embedded: > >> >> > On Sunday, November 6, 2022 at 6:34:59 PM UTC-5, Richard Damon wrote: > >> >> >> On 11/6/22 8:56 AM, Rick C wrote: > >> >> >> > There's no point to inter-message delays. If there is an error that causes a loss of framing, the devices will see that and ignore the message. As I've said, the real issue is that the message will not be responded to, and the software will fail. At that point the user will exit the software on the PC and start over. That gives a nice long delay for resyncing. > >> >> >> If the only way to handle a missed message is to abort the whole > >> >> >> software system, that seems to be a pretty bad system. > >> >> > > >> >> > You would certainly think that if your error rate was more than once a hundred years. I expect to be long dead before an RS-422 bus only 10 feet long burps a bit error. > >> >> I would not dare to implement a serial protocol without any form of > >> >> error checking, on any length of cable. > >> >> > >> >> You mention ESD somewhere. This can be a serious disturbance that can > >> >> easily corrupt a few bits. > >> > > >> > Yes, I mentioned ESD somewhere. This is testing newly constructed circuit boards, so is used in an ESD controlled environment. > >> > > >> You wrote: > >> "I could probably get away with TTL level signals, but I'd like to have > >> the ESD protection these RS-422 chips give. That additional noise > >> immunity means there is an extremely small chance of bit errors. If we > >> have problems, the error handling can be added." > >> This led me to believe you were expecting actual ESD discharges that > >> could disturb your messages. 
> >> > >> ESD protection is just that: protection against device damage > >> > >> I do not believe ESD protection does anything to improve noise immunity. > >> It just increases the ESD level at which the device will be damaged. > > > > Yes, you are right. My language there is poor. I should have said I prefer the noise immunity the RS-422 devices have compared to TTL devices *in addition to* the ESD immunity. > > > > > >> And if you have an ESD controlled environment, that is not actually > >> needed. > > > > In theory, but I can't control how these will be used in the future. ESD immunity is something I want designed into any application that is connected by a cable. > > > Yes, alway protect accessible parts. > > > >> >> Reminds me of a product where we got windows blue screens during ESD > >> >> testing on a device connected via an FTDI USB to serial adapter. Cable > >> >> length less than 6 feet. > >> > > >> > I assume you mean some other device was being ESD tested? This is not being used in an ESD testing lab. Was the FTDI serial cable RS-232 by any chance? Being single ended, that is much less tolerant of noise. > >> No a device with an FTDI chip on it was tested. USB cable was <= 6 feet > >> and serial ports were only a few centimeters of TTL level PCB traces. > >> This was reproducable with an evaluation kit with only USB connected. > > > > So you were shooting high voltages into a device and were surprised the PC it was connected to crashed? I'm not following this at all. I'm pretty sure the FTDI cable is not rated to provide isolation. That has nothing to do with ESD protection. As you say, ESD protection is about damage, not operation. > Ofcourse not into a device. But all over the enclosure, as is required > to pass EMC testing. These discharges cause current spikes that can > induce currents in parts of your circuits. Part of ESD testing also uses > coupling planes, where you fire on a metal plate 'near' the device. 
That > can also give a lot of noise. All these things may not cause device > damage like direct ESD discharges, but they can disturb the device > operation. Depending on the expected performance level, this may cause a > fail. For medical devices you usually cannot get away with worse than > "temporary loss of function and recovery without operator intervention". > > ESD protection is indeed about damage prevention. But passing an ESD > test usually requires more than just preventing damage. > > How would you rate a phone that resets every time you pick it up when > you have not properly discharged yourself from static electricity? It > may just reboot and work fine after that, but it would still be a crappy > phone.
Lol! I probably would barely notice! Cell phones are among the most unreliable devices we use with any regularity. I recall a Dave Barry article that was talking about cell phones which he mocked by describing typical conversations as, "What? WHAT?" Now, it's more like, "Hello...? Hello...? <click>"
> >> >> >> Note, if the master sends out a message, and waits for a response, with > >> >> >> a retry if the message is not replied to, that naturally puts a pause in > >> >> >> the communication bus for inter-message synchronization. > >> >> > > >> >> > The pause is already there by virtue of the protocol. Commands and replies are on different busses. > >> >> > > >> >> > > >> >> >> Based on your description, I can't imagine the master starting a message > >> >> >> for another slave until after the first one answers, or you will > >> >> >> interfere with the arbitration control of the reply bus. > >> >> > > >> >> > Exactly! Now you are starting to catch on. > >> >> So you do wait for a reply, and a reply is only expected on a valid > >> >> message? What if there is no reply, do you retry? If so, you already have > >> >> implemented some basic error checking. For more robustness you could (I > >> >> would) add some kind of CRC. > >> > > >> > There should not be any messages other than "valid" messages. I don't recall specifically what the slave does on messages with bit errors, but I'm pretty sure it simply doesn't know they have bit errors. The message has no checksum or other bit error control. The format has one character to indicate the "command" type. If that character is corrupted, the command is not used, unless it is changed to another valid character (3 of 256 chance). > >> Okay, the slaves are already implemented? Missed that. > > > > A test fixture is in use, with software on the PC. There's no reason to change the protocol in the new test fixture and software unless there is a need, a new requirement. > Ah, existing stuff.
Yes, the very first sentence of the very first post was, "I have a test fixture that uses RS-232 to communicate with a PC."
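The send-a-command, wait-for-a-reply, retry-on-silence scheme raised in the quoted exchange can be sketched roughly as below. This is only an illustration under stated assumptions: `transact`, `send_with_retry`, `FlakySlave`, and the `OK` reply are hypothetical names and values, not anything from the actual fixture software, and the real timeout handling would live inside whatever serial layer is used.

```python
from typing import Callable, Optional


def send_with_retry(
    transact: Callable[[bytes], Optional[bytes]],
    command: bytes,
    max_attempts: int = 3,
) -> Optional[bytes]:
    """Send a command and wait for the reply, retrying on silence.

    `transact` is a hypothetical function that writes the command to the
    command bus and returns the reply from the reply bus, or None if the
    reply timed out.
    """
    for _ in range(max_attempts):
        reply = transact(command)
        if reply is not None:
            return reply
    # Every attempt timed out; the caller decides what to do next.
    return None


class FlakySlave:
    """Simulated slave that stays silent for the first `drops` commands."""

    def __init__(self, drops: int) -> None:
        self.drops = drops
        self.calls = 0

    def transact(self, command: bytes) -> Optional[bytes]:
        self.calls += 1
        if self.calls <= self.drops:
            return None  # simulated timeout
        return b"OK\r\n"  # made-up reply, for illustration only


slave = FlakySlave(drops=1)
reply = send_with_retry(slave.transact, b"01 23 X\r\n")
```

Because the master does not move on until a reply arrives or the attempts run out, only one slave is ever driving the reply bus at a time.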
> >> So there is some very basic error detection: the command must be valid. > >> And if it is not and the slave does not reply, what does the master do? > > > > The command being valid is based on a single character. The command is something like, "01 23 X<cr><lf>". I suppose the CR LF might also be required, but I don't recall. It might require one and ignore the other. The whole CR LF thing is such a PITA. The only character that is required for sure is the "X", which at the moment can be one of three from the possible characters (don't recall if they are 8 bit or 7). I also don't recall if parity checking is used. > Okay, more restrictions on valid messages, yet more error detection > present already. ;-)
No real detection since there's no awareness of the error. It's like saying your transmission has "error detection" because it can stop working because a gear tooth broke off and jammed the whole transmission breaking more gears.
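To make the difference between "the command happens to be invalid" and "the receiver knows the message was corrupted" concrete, here is a minimal sketch of the CRC idea suggested earlier in the thread. Everything specific in it is an assumption for illustration: the three command letters, the trailing two-hex-digit CRC field, and the function names are hypothetical and not part of the existing protocol. The CRC itself is the common CRC-8 with polynomial 0x07 and a zero initial value.

```python
from typing import Optional

# Hypothetical command letters; the thread only says there are three.
VALID_COMMANDS = {"X", "Y", "Z"}


def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8 (polynomial 0x07, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def build_frame(payload: str) -> bytes:
    """Append ' <CRC as 2 hex digits>' and CR LF to an ASCII payload."""
    body = payload.encode("ascii")
    return body + f" {crc8(body):02X}\r\n".encode("ascii")


def parse_frame(frame: bytes) -> Optional[str]:
    """Return the payload if the frame checks out, else None."""
    if not frame.endswith(b"\r\n"):
        return None
    body, sep, crc_text = frame[:-2].rpartition(b" ")
    if not sep or len(crc_text) != 2:
        return None
    try:
        received = int(crc_text, 16)
    except ValueError:
        return None
    if crc8(body) != received:
        return None  # corruption detected, not merely ignored
    payload = body.decode("ascii", errors="replace")
    if not payload or payload[-1] not in VALID_COMMANDS:
        return None
    return payload


frame = build_frame("01 23 X")
# Flip one bit in the command character: 'X' (0x58) becomes 'Y' (0x59).
flipped = frame[:6] + bytes([frame[6] ^ 0x01]) + frame[7:]
good = parse_frame(frame)
bad = parse_frame(flipped)
```

Without the CRC field, the flipped frame would carry another valid command letter and be acted on; with it, the single-bit change is rejected.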
> > I do know that I had a flaw in the initial setup that gave intermittent errors. I had the hardest time finding the problem because of bias in where I looked. I tried adding re-transmission, which helped, but it borked up the code pretty well. I guess my software skills are not so good. In the end, it was an Ariane problem where the UART in the FPGA was existing code that was reused. Thinking it was a previously validated module, it was not suspected... at all. Eventually I realized it did not include the input FF synchronization to resolve race conditions. That was left for the system designer to add, since there may be more than one device on the same input. > > > > Since that was solved, we've tested thousands of UUTs with no interface bit errors. So I have no worries about this. > > > > > >> > Again, there's no reason to "detect" errors since I've implemented no error protocol. That is many times more complex than simply ignoring the errors, which works because errors don't happen often enough to have an impact on testing. > >> A test rig that ignores errors. I don't know the requirements of this > >> test and how bad it would be to have an invalid pass/fail result. > > > > Since the test will be run, overnight, every few seconds, with all UUT errors logged, the chances of the same bit error happening the same way, causing the same miss of a UUT failure some thousands of times (about 7,000), is on the order of a proton decaying. Well, maybe a bit more likely. > > > Another layer of error detection. ;-)
Errors in the UUT. If there is an error in the comms link, we likely would not even know about it. This will be an interesting test for both the UUTs and the comms link. I'm not certain how many messages it currently takes to implement any given test, but it should be possible to run the tests in parallel minimizing wait times for the PC software. I would estimate the total test time for a chassis to be between 10 and 60 seconds, so between 1,200 and 7,200 tests in the 20 hour soak time. As I learn more about the FTDI device, I am more pessimistic about the throughput. I could shove the details of tests into the FPGAs, so the commands are more like, run test 1 on channel number 2. That would cut the number of tests significantly, but require much more work in updating the FPGA software. I think I'll start with a direct transfer of the existing protocol. -- Rick C. ----+ Get 1,000 miles of free Supercharging ----+ Tesla referral code - https://ts.la/richard11209
On 2022-11-07 Rick C wrote in comp.arch.embedded:
> On Monday, November 7, 2022 at 4:30:37 PM UTC-4, Stef wrote:
...
>> My not caring about the innards of a particular chip seems to let you >> think I don't care about anything. But we are not discussing my >> interests here, but your bus. > > Seems to me you wanted to talk about my interests when you said, "Why are you discussing this?" and then continued discussing that issue for some half dozen more posts.
That was not my intention. It seemed to me that you cared about the internal implementation of the FTDI chip in relation to your bus problem. I just wanted to point out that is of no concern for your bus operation. And then I just got dragged in. ;-) -- Stef He's the kind of guy, that, well, if you were ever in a jam he'd be there... with two slices of bread and some chunky peanut butter.
Rick C <gnuarm.deletethisbit@gmail.com> writes:
> I could shove the details of tests into the FPGAs, so the commands are > more like, run test 1 on channel number 2. That would cut the number > of tests significantly, but require much more work in updating the > FPGA software.
Are we circling back to the idea of putting a microprocessor on the test board? Ivan Sutherland famously called this a wheel of reincarnation: http://www.cap-lore.com/Hardware/Wheel.html
