On 25/05/2014 19:44, haiticare2011@gmail.com wrote:
> Hi all,
>
> I've been a SW developer, but one question I've never addressed is: What OS
> latencies and CPU delays are there in a compiled, running program? Is there any
> simple way to minimize them?
>
> I am thinking of a simple c code program that reads data off a pci card and then
> writes it to memory like a PCIe SSD drive. I understand there will be various
> hardware latencies and delays in the data input.
>
> But what if the assembler program is executing? Does the OS "butt in" and context
> switch/ multi-task during execution of a continuous compiled program? If so, how
> does one shut that off?
>
> I've read about this somewhere, but never paid attention to it.
>
> Thanks in advance
> jb

From this and your other posts I think you are trying to make a data acquisition system which will store up to 10 Mbyte/s on a PC hard drive.

You've got three ways (at least) to get the data into the PC: USB, Ethernet and PCI.

USB and Ethernet are relatively easy, work with any kind of PC and won't need fancy driver-level code, so they will probably work with any OS. Ethernet is the simplest from the PC software point of view. 10 Mbyte/s is close to wire speed for 100 Mb Ethernet (12.5 Mbyte/s before protocol overhead), so you'll struggle if you try to use a typical micro's on-chip MAC. You can get ARM-based micros with high-speed USB.

If I were doing this (and I have, many times) I'd use an FPGA to control the ADC, buffer the data and drive Ethernet via an off-chip Gigabit PHY.

You will need to buffer the data from the ADC, and unless you are very clever with the host computer you'll need a decent-sized buffer. How big depends on so many variables that it's very risky to guess; you'll need to check, but I would start with enough to store 500 ms worth of data (5 Mbytes in your case, so use a 32 Mbyte or so SDRAM).

In order to control the Ethernet interface you'll need to be quite confident with VHDL or Verilog, or use a soft micro on the FPGA and get into a different kind of mess.

If all your experience is with software you might do better with a micro with built-in high-speed USB, but you'll need one which supports external SDRAM at the same time, and your data throughput will be challenging.

PCI has all the problems of the USB and Ethernet interfaces and a lot of additional ones as well; don't go that way unless there is a really good reason for it.

Unless you need a lot of these I suggest you just buy something, and of course if you want a good design done you could always email me :-)

Michael Kellett
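The sizing figures in the post above can be checked with a few lines of arithmetic. This is only a sketch: it assumes "Mbyte" means 10^6 bytes and looks at raw link bit rate with no framing or protocol overhead; the function names are made up for illustration.

```python
def buffer_bytes(rate_bytes_per_s: int, hold_ms: int) -> int:
    """Bytes needed to hold `hold_ms` milliseconds of data at a constant rate."""
    return rate_bytes_per_s * hold_ms // 1000


def link_utilisation(rate_bytes_per_s: int, link_bits_per_s: int) -> float:
    """Fraction of the raw link bit rate used by the payload alone.

    Assumption: ignores Ethernet framing, IP/UDP/TCP headers and gaps,
    so real utilisation is higher than this figure.
    """
    return rate_bytes_per_s * 8 / link_bits_per_s


if __name__ == "__main__":
    rate = 10_000_000  # 10 Mbyte/s, the upper figure quoted in the post
    print("500 ms buffer:", buffer_bytes(rate, 500), "bytes")
    print("100 Mb Ethernet utilisation:", link_utilisation(rate, 100_000_000))
    print("Gigabit Ethernet utilisation:", link_utilisation(rate, 1_000_000_000))
```

With these assumptions the 500 ms buffer comes to 5,000,000 bytes, and the payload alone takes 80% of a 100 Mb link but only 8% of a Gigabit link, which is consistent with the suggestion to use a Gigabit PHY.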
Hidden latencies and delays for a running program?
Started by ●May 25, 2014
Reply by ●May 26, 2014
On Sunday, May 25, 2014 5:11:37 PM UTC-4, rickman wrote:
> On 5/25/2014 3:06 PM, Tauno Voipio wrote:
> > On 25.5.14 21:44, haiticare2011@gmail.com wrote:
> >> Hi all,
> >>
> >> I've been a SW developer, but one question I've never addressed is:
> >> What OS latencies and CPU delays are there in a compiled, running
> >> program? Is there any simple way to minimize them?
> >>
> >> I am thinking of a simple c code program that reads data off a pci
> >> card and then writes it to memory like a PCIe SSD drive. I understand
> >> there will be various hardware latencies and delays in the data input.
> >>
> >> But what if the assembler program is executing? Does the OS "butt in"
> >> and context switch/ multi-task during execution of a continuous
> >> compiled program? If so, how does one shut that off?
> >
> > Yes, it does, and you should not attempt to prevent it,
> > as you may make the whole system totally unresponsive.
>
> I worked on a real time PC in which we had installed a board. It ran NT
> with a real time extension. First pass of my board had a bug which hung
> the bus transfer and the *entire* machine hung. Wow! The only way out
> was a hardware reset.
>
> > There is little difference between a compiled C program
> > and an assembly program performing the same algorithm.
> >
> > The write to the SSD drive is far from simple, if you
> > have a file system on the card. Also, the SSD may have
> > an internal controller which needs time slots for its
> > own purposes. Examples are SD (camera) cards and USB sticks.
>
> JB seems to have a lot to learn about real time systems. The part I
> don't quite get is why the PC side has to be real time. If he uses a
> separate MCU board to capture the ADC data (the important real time part
> of the problem) it can then send the data to a PC, not in "real time",
> just with a throughput that exceeds the data rate. Adequate buffering
> on the MCU card will assure no loss of data. Then the PC can store the
> data on any media it wishes. Sounds simple enough to me but I don't get
> why he continues to flog this horse.
>
> --
>
> Rick

Rick,

If this is as trivial as you say, then there would be more examples of how to do it that work. But there aren't. There is little consensus on how to achieve good data throughput. Solutions range all over the place, and few work. For example, there is "Starter Ware," a low overhead OS for ARM from TI. But if you read the forums, much of the documentation is incorrect and unworkable.

Now, you recommend a "mcu board." Now we're getting somewhere. Do you have any actual examples of this working? Which mcu? How was the bus to the PC configured? Since you say I "have a lot to learn", teach me your concrete system example.

JB
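The buffering argument quoted above (the link only needs an average throughput above the data rate, provided the capture side has enough buffer to ride out PC stalls) can be explored with a toy simulation. This is a sketch under stated assumptions, not a model of any real board: time advances in 1 ms ticks, the producer is perfectly constant, and the stall windows, rates and the name `peak_backlog` are all hypothetical.

```python
def peak_backlog(produce_per_ms, drain_per_ms, stalls, duration_ms):
    """Simulate a FIFO in 1 ms ticks and return the peak backlog in bytes.

    produce_per_ms: bytes the ADC side adds every tick.
    drain_per_ms:   bytes the PC link can remove in a tick when not stalled.
    stalls:         list of (start_ms, length_ms) windows in which the PC
                    drains nothing (e.g. the OS is busy elsewhere).
    """
    stalled = set()
    for start, length in stalls:
        stalled.update(range(start, start + length))

    backlog = peak = 0
    for t in range(duration_ms):
        backlog += produce_per_ms
        if t not in stalled:
            backlog -= min(backlog, drain_per_ms)
        peak = max(peak, backlog)
    return peak


if __name__ == "__main__":
    # 10 Mbyte/s in, 12.5 Mbyte/s out, one hypothetical 200 ms PC stall.
    print(peak_backlog(10_000, 12_500, [(100, 200)], 2_000), "bytes peak")
```

In this example the backlog peaks at 2,000,000 bytes, so a buffer of a few Mbytes would absorb a 200 ms stall. Longer or more frequent stalls, or a link with less headroom, raise the figure, which is why the real stall behaviour of the PC has to be measured rather than guessed.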
Reply by ●May 26, 2014
On Monday, May 26, 2014 3:23:58 AM UTC-4, David Brown wrote:
> On 26/05/14 08:19, rickman wrote:
> > On 5/26/2014 1:51 AM, Tauno Voipio wrote:
> >>
> >> Maybe the PHB has ordered him to make the PC a real-time
> >> capturing system. Anyway, he'll have a stiff climb up the
> >> learning steps.
> >
> > PHB? Do you mean powers that be? He has been asking about embedded,
> > but seems to think he has to put the entire system on the embedded
> > device. I don't want to give the guy grief, but it sounds like he is
> > not familiar enough with embedded design to even know if his task can
> > use it effectively or not. He seems to reject a lot of suggestions
> > before he understands them. I'm also very unclear on what data rate he
> > really needs from the front end to the storage.
> >
>
> The OP is very unclear about the data rate he needs (he alternates over
> several orders of magnitude), and has no idea at all about the sample
> size. The worrying thing is that he does not seem to consider this a
> problem, and does not realise that this project needs a lot of thought
> and planning, then a lot of research and prototyping, before he can
> start looking at implementation and development.
>
> He also has virtually no idea about the technologies for implementing
> the system. He has some fixed pre-conceived ideas that he won't change
> no matter what people tell him - he believes USB latency will cause
> trouble, he believes SSD is the greatest invention since sliced bread,
> he believes assembly programming will be more "real time" than C
> programming.
>
> The guy may be a good SW developer for all I know, but he is clearly far
> out of his depth with this project. I don't know if this is his own
> fault, or that of a PHB, but he desperately needs help here (of a kind
> that we cannot give him) before he wastes lots of time and money.

Thanks for the compliments. :)

I'm convinced the rank-and-file developers out there don't have their ducks in a row on this one, either. Judging by the BBB developers' attempts, it's still the Wild West. :)
Reply by ●May 26, 2014
On Sun, 25 May 2014 11:44:25 -0700 (PDT), haiticare2011@gmail.com wrote:
>I've been a SW developer, but one question I've never addressed is: What OS
>latencies and CPU delays are there in a compiled, running program? Is there any
>simple way to minimize them?
>
>I am thinking of a simple c code program that reads data off a pci card and then
>writes it to memory like a PCIe SSD drive. I understand there will be various
>hardware latencies and delays in the data input.

If that is all you need, what do you need an OS for?

Just use an ISR (Interrupt Service Routine) for reading your input card (such as an ADC) and another ISR for writing the data to the SSD drive (write complete interrupt).

The main program then consists of initializing those two interrupt service routines, followed by an eternal loop around a (low power) wait-for-interrupt instruction.
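The two-ISR structure described above can be mocked up on a desktop to see how the pieces interact before touching hardware. The following is a simulation only, with assumptions labelled: real ISRs would be written in C or assembler against the actual interrupt controller, the block size is arbitrary, a Python list stands in for the SSD, and `Recorder`, `adc_isr`, `write_complete_isr` and `run` are hypothetical names.

```python
from collections import deque

BLOCK = 4  # samples per storage write (arbitrary, for illustration)


class Recorder:
    def __init__(self):
        self.fifo = deque()      # samples waiting to be written
        self.in_flight = []      # block currently being written
        self.write_busy = False  # True while a write is outstanding
        self.stored = []         # stands in for the SSD contents

    def adc_isr(self, sample):
        """Input interrupt: queue the sample, start a write if storage is idle."""
        self.fifo.append(sample)
        self._start_write_if_possible()

    def write_complete_isr(self):
        """Storage interrupt: commit the finished block, start the next one."""
        self.stored.extend(self.in_flight)
        self.in_flight = []
        self.write_busy = False
        self._start_write_if_possible()

    def _start_write_if_possible(self):
        if not self.write_busy and len(self.fifo) >= BLOCK:
            self.in_flight = [self.fifo.popleft() for _ in range(BLOCK)]
            self.write_busy = True


def run(events):
    """Main program: initialise, then service one interrupt per loop pass.

    Each loop iteration stands in for returning from a wait-for-interrupt
    instruction; `events` is a list of ("adc", value) or ("write_done", None).
    """
    rec = Recorder()
    for kind, value in events:
        if kind == "adc":
            rec.adc_isr(value)
        elif kind == "write_done":
            rec.write_complete_isr()
    return rec
```

Feeding it eight `("adc", n)` events with a `("write_done", None)` after every fourth leaves all eight samples in `stored` and an empty FIFO. Delaying the `write_done` events shows the FIFO growing, which is the quantity that has to be bounded on real hardware.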
Reply by ●May 26, 2014
On 5/26/2014 8:50 AM, haiticare2011@gmail.com wrote:
> Rick
> If this is as trivial as you say, then there would be more examples of how to
> do it that work. But there aren't. There is little consensus on how to achieve
> good data throughput. Solutions range all over the place, and few work. For
> example, there is "Starter Ware," a low overhead OS for ARM from TI. But if you
> read the forums, much of the documentation is incorrect and unworkable.
>
> Now, you recommend a "mcu board." Now we're getting somewhere. Do you have any
> actual examples of this working? Which mcu? How was the bus to the PC
> configured? Since you say I "have a lot to learn", teach me your concrete system
> example.

No, I have not built your system for you already. In the other thread I have given you lots of material for you to work with. On the other hand you have not given us a set of requirements to work from. When I get the requirements I will consider if I want to take on the job. :)

--

Rick
Reply by ●May 26, 2014
On 5/26/2014 5:50 AM, haiticare2011@gmail.com wrote:
> If this is as trivial as you say, then there would be more examples
> of how to do it that work. But there aren't.

Um, are you requiring an "example" to be of the form:
  "Application Note 1234: Using the C Language Under <OS> to Copy <arbitrary> Data from a PCI Card in a PC(?) to an SSD in the Same PC without any Constraints on Timeliness using Free Tools"
If that's the case, I can save you a lot of time...

> There is little
> consensus on how to achieve good data throughput. Solutions range all
> over the place, and few work.

Which *specifically* "don't work"? And, do they not work because of omissions on YOUR part? If not, please identify *why* they "don't work". E.g., the solution I provided *does* work as I have used it on dozens of projects. If you can't see how use on a non-PC applies, then I can cite my 9-track tape driver that runs on a PC... not a PCI card (ISA) and not an SSD (IDE) but if you can't work "in the abstract", you'll never work in the *specific*!

> For example, there is "Starter Ware," a
> low overhead OS for ARM from TI. But if you read the forums, much of
> the documentation is incorrect and unworkable.

Then "Starter Ware" requires more of you than you are able to provide. Fine. Pick something else.

You probably can't use Limbo, either -- due to your unspecified timing constraints, hardware interface, file formats, filesystem choice, etc.

Jaluna would intimidate you with its build environment.

RTEMS might not provide the (unspecified) user interface you need.

QNX costs money.

You probably can't write on bare iron...

etc.

Hey, maybe the Linux folks can entertain your queries! I'm sure there's a newsgroup/forum for that!

In all seriousness, until *you* know (meaning "can put in unambiguous quantifiable terms") what your complete set of criteria are, you're just going to be squeezing balloons -- always chasing, never achieving.

Good luck!
Reply by ●May 27, 2014
On Monday, May 26, 2014 6:15:08 PM UTC-4, Don Y wrote:
> On 5/26/2014 5:50 AM, haiticare2011@gmail.com wrote:
> > If this is as trivial as you say, then there would be more examples
> > of how to do it that work. But there aren't.
>
> Um, are you requiring an "example" to be of the form:
>   "Application Note 1234: Using the C Language Under <OS> to
>   Copy <arbitrary> Data from a PCI Card in a PC(?) to an SSD
>   in the Same PC without any Constraints on Timeliness using
>   Free Tools"
> If that's the case, I can save you a lot of time...
>
> > There is little
> > consensus on how to achieve good data throughput. Solutions range all
> > over the place, and few work.
>
> Which *specifically* "don't work"? And, do they not work because of
> omissions on YOUR part? If not, please identify *why* they "don't
> work". E.g., the solution I provided *does* work as I have used it
> on dozens of projects. If you can't see how use on a non-PC applies,
> then I can cite my 9-track tape driver that runs on a PC... not a
> PCI card (ISA) and not an SSD (IDE) but if you can't work "in the
> abstract", you'll never work in the *specific*!
>
> > For example, there is "Starter Ware," a
> > low overhead OS for ARM from TI. But if you read the forums, much of
> > the documentation is incorrect and unworkable.
>
> Then "Starter Ware" requires more of you than you are able to provide.
> Fine. Pick something else.
>
> You probably can't use Limbo, either -- due to your unspecified
> timing constraints, hardware interface, file formats, filesystem
> choice, etc.
>
> Jaluna would intimidate you with its build environment.
>
> RTEMS might not provide the (unspecified) user interface you
> need.
>
> QNX costs money.
>
> You probably can't write on bare iron...
>
> etc.
>
> Hey, maybe the Linux folks can entertain your queries! I'm
> sure there's a newsgroup/forum for that!
>
> In all seriousness, until *you* know (meaning "can put in unambiguous
> quantifiable terms") what your complete set of criteria are, you're
> just going to be squeezing balloons -- always chasing, never achieving.
>
> Good luck!

Actually, the failure of the ARM community to achieve any serious IO is embarrassingly apparent and does not require any bureaucratic structure to see it.
The GPIO data rate was coaxed into the MHz range, but with great difficulty. It is natively in the low kHz range.
General material is offered, which evaporates under scrutiny...
Reply by ●May 27, 2014
On 2014-05-27, haiticare2011@gmail.com <haiticare2011@gmail.com> wrote:
> On Monday, May 26, 2014 6:15:08 PM UTC-4, Don Y wrote:
>> On 5/26/2014 5:50 AM, haiticare2011@gmail.com wrote:
>> > There is little
>> > consensus on how to achieve good data throughput. Solutions range all
>> > over the place, and few work.
>>
>> Which *specifically* "don't work"? And, do they not work because of
>> omissions on YOUR part? If not, please identify *why* they "don't
>> work". E.g., the solution I provided *does* work as I have used it
>> on dozens of projects. If you can't see how use on a non-PC applies,
>> then I can cite my 9-track tape driver that runs on a PC... not a
>> PCI card (ISA) and not an SSD (IDE) but if you can't work "in the
>> abstract", you'll never work in the *specific*!
>>

800/1600 BPI or one of those new-fangled 6250 BPI tape drives ? :-)

(PS: note above smiley)

>> > For example, there is "Starter Ware," a
>> > low overhead OS for ARM from TI. But if you read the forums, much of
>> > the documentation is incorrect and unworkable.
>>

StarterWare is not an OS; it's a support library for bare metal programming.

>> Then "Starter Ware" requires more of you than you are able to provide.
>> Fine. Pick something else.
>>
>> You probably can't use Limbo, either -- due to your unspecified
>> timing constraints, hardware interface, file formats, filesystem
>> choice, etc.
>>
>> Jaluna would intimidate you with its build environment.
>>
>> RTEMS might not provide the (unspecified) user interface you
>> need.
>>
>> QNX costs money.
>>
>> You probably can't write on bare iron...
>>
>> etc.
>>
>> Hey, maybe the Linux folks can entertain your queries! I'm
>> sure there's a newsgroup/forum for that!
>>
>> In all seriousness, until *you* know (meaning "can put in unambiguous
>> quantifiable terms") what your complete set of criteria are, you're
>> just going to be squeezing balloons -- always chasing, never achieving.
>>
>> Good luck!
>
> Actually, the failure of the ARM community to achieve any serious IO is
> embarrassingly apparent and does not require any bureaucratic structure to
> see it.
> The GPIO data rate was coaxed into the MHz range, but with great difficulty.
> It is natively in the low kHz range.
> General material is offered, which evaporates under scrutiny...

Are you confusing the GPIO achievable rates under Linux with those achievable in a bare metal environment ?

IIRC, those enhanced Linux speeds involve writing the GPIO lines directly via memory mapped I/O rather than through a driver call for each I/O manipulation.

If it's the memory mapped option under Linux you are talking about, then I don't see that as "difficult".

I would like to say that while I am a programmer as part of my day job, my embedded work is purely a hobby. However, the questions others here have asked you are among the questions I would already have asked myself before posting here.

Doing embedded work requires a certain mindset and the ability to pull together data from various sources. The questions you have been asked are good questions and are designed to make you think about the problem and what hardware/timing constraints are required to solve the problem.

Doing research (and knowing how to do that research) is a part of any serious embedded project and it's not something you can avoid.

Simon.

PS: As a good natured comment, I wonder if I should start applying for embedded jobs. :-) Sometimes, I think that as a hobbyist I seem to know more about this world than those paid to do it for a living. :-)

PPS: The above PS doesn't apply to the OP. I get the feeling he's being forced into something by his boss that he's not really comfortable doing and has not been trained on. I hope he's begun to understand more about the issues involved as a result of the various feedback here and can educate his boss about the issues involved.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Reply by ●May 27, 2014
Hi Simon,

On 5/27/2014 5:11 AM, Simon Clubley wrote:
>>>> There is little
>>>> consensus on how to achieve good data throughput. Solutions range all
>>>> over the place, and few work.
>>>
>>> Which *specifically* "don't work"? And, do they not work because of
>>> omissions on YOUR part? If not, please identify *why* they "don't
>>> work". E.g., the solution I provided *does* work as I have used it
>>> on dozens of projects. If you can't see how use on a non-PC applies,
>>> then I can cite my 9-track tape driver that runs on a PC... not a
>>> PCI card (ISA) and not an SSD (IDE) but if you can't work "in the
>>> abstract", you'll never work in the *specific*!
>
> 800/1600 BPI or one of those new-fangled 6250 BPI tape drives ? :-)
>
> (PS: note above smiley)

My transport is 800/1600/3200 (oddball). The I/F card for which I wrote the driver is little more than a few latches and level translators; so, it's effectively "bit-banging" the interface (and, given it's ISA, the bus speed/cycle time makes it a real "challenge" to keep the interface satiated).

["data" transfers can benefit from DMA but most other controller actions -- tape positioning, etc. -- need a lot of hand-holding]

>>>> For example, there is "Starter Ware," a
>>>> low overhead OS for ARM from TI. But if you read the forums, much of
>>>> the documentation is incorrect and unworkable.
>
> StarterWare is not an OS; it's a support library for bare metal programming.

Then, presumably, it is pretty "thin"? Should be relatively easy to see what it *is* doing and figure out what it *should* be doing?

>>> Then "Starter Ware" requires more of you than you are able to provide.
>>> Fine. Pick something else.
>>>
>>> You probably can't use Limbo, either -- due to your unspecified
>>> timing constraints, hardware interface, file formats, filesystem
>>> choice, etc.
>>>
>>> Jaluna would intimidate you with its build environment.
>>>
>>> RTEMS might not provide the (unspecified) user interface you
>>> need.
>>>
>>> QNX costs money.
>>>
>>> You probably can't write on bare iron...
>>>
>>> etc.
>>>
>>> Hey, maybe the Linux folks can entertain your queries! I'm
>>> sure there's a newsgroup/forum for that!
>>>
>>> In all seriousness, until *you* know (meaning "can put in unambiguous
>>> quantifiable terms") what your complete set of criteria are, you're
>>> just going to be squeezing balloons -- always chasing, never achieving.
>>>
>>> Good luck!
>>
>> Actually, the failure of the ARM community to achieve any serious IO is
>> embarrassingly apparent and does not require any bureaucratic structure to
>> see it.

What's the "ARM community"? SA's could *easily* toggle I/O's at MHz rates. The FIRQ would even allow you to do it *outside* a "tight loop" (e.g., pseudo DMA -- but *without* DMA hardware!)

>> The GPIO data rate was coaxed into the MHz range, but with great difficulty.
>> It is natively in the low kHz range.

You're looking at something else. E.g., you can run a PC's *parallel* port (which has the ISA bus between it and the CPU -- low bandwidth) for PLIP and achieve data rates in excess of 75KB/s (which means you're toggling pins at ~200 kHz)

>> General material is offered, which evaporates under scrutiny...
>
> Are you confusing the GPIO achievable rates under Linux with those
> achievable in a bare metal environment ?

Agreed.

> IIRC, those enhanced Linux speeds involve writing the GPIO lines directly
> via memory mapped I/O rather than through a driver call for each I/O
> manipulation.

And, are they from user-land *through* an intermediary?

> If it's the memory mapped option under Linux you are talking about,
> then I don't see that as "difficult".
>
> I would like to say that while I am a programmer as part of my day job,
> my embedded work is purely a hobby. However, the questions others here
> have asked you are among the questions I would already have asked myself
> before posting here.
>
> Doing embedded work requires a certain mindset and the ability to pull
> together data from various sources. The questions you have been asked
> are good questions and are designed to make you think about the problem
> and what hardware/timing constraints are required to solve the problem.
>
> Doing research (and knowing how to do that research) is a part of any
> serious embedded project and it's not something you can avoid.

This is actually true of *any* engineering endeavor.

When I first looked at the NRL ruleset for text-to-phoneme conversion, I was tickled to find several "free" implementations of the original algorithm. This wasn't surprising -- the algorithm was well documented and the rules published.

What *was* surprising was that virtually every (C) implementation was technically flawed! Their authors had failed to understand how SNOBOL -- the language in which the original implementation was crafted -- applied operators. Instead, they adopted more "modern" rules and, silently, altered the algorithm's performance.

They *thought* they knew what the algorithm was doing without actually *understanding* the published description. And, given the complexity of the ruleset -- and a lack of a set of test cases -- I suspect if they got *anything* that "sounded" like natural speech out of the algorithm, they ASSUMED it was working! <frown>

And, don't get me started on the flaws in the available Klatt synthesizer implementations!

Do the research, *understand* what it means, *then* tackle the problem at hand!

> PS: As a good natured comment, I wonder if I should start applying for
> embedded jobs. :-) Sometimes, I think that as a hobbyist I seem to know
> more about this world than those paid to do it for a living. :-)

IME, folks who do embedded work are either hardware guys who started writing code to prove their hardware works -- and then got "drafted" into *doing* the code (I know of a Fortune 500 company that had a *technician* writing the code for a large embedded project "because he tinkered with software at home"; the PHB was a self-confident BASIC programmer so he was *sure* he understood these issues... <frown>) but, without a formal software education, don't really understand how to *design* the software (---> buggy code);

Or, they are software folks who know squat about hardware and, as a result, are ill-equipped to understand what *can* (and does) go wrong and, therefore, write buggy code.

> PPS: The above PS doesn't apply to the OP. I get the feeling he's being
> forced into something by his boss that he's not really comfortable
> doing and has not been trained on. I hope he's begun to understand more
> about the issues involved as a result of the various feedback here and
> can educate his boss about the issues involved.

I'm not sure he "gets it". E.g., even a naive exposure to a CPU/MCU datasheet should make it *painfully* clear that "kilohertz" toggle rates suggest "something else" is going on ("Gee, what?").

It seems like he is expecting the equivalent of "finding a qsort() algorithm, published" -- that, instead, addresses *his* particular problem. And, seems unable/unwilling to see that moving bytes off a magnetic tape head and into memory (which, obviously, can then be moved onto disk -- by just specifying the disk device as the target) is "the same problem" he is facing. Or, pulling bytes in/out of a UART, NIC, etc.

There really are very *few* "problems"... just lots of applications that map *onto* that problem set! ;-) (i.e., it is the apps that make engineering interesting -- not the *problems*!)

Off for my pro bono work...
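The PLIP figure in the post above (75KB/s implying pin toggling at roughly 200 kHz) can be sanity-checked with one line of arithmetic. The sketch below assumes 75KB/s means 75,000 bytes/s and that PLIP moves one 4-bit nibble per handshake; the nibble width is an assumption about the usual PLIP cabling and is not stated in the thread. The function name is made up.

```python
def transfers_per_second(bytes_per_s: int, bits_per_transfer: int) -> float:
    """Handshake-level transfers needed per second for a given data rate."""
    return bytes_per_s * 8 / bits_per_transfer


if __name__ == "__main__":
    # Assumed nibble-wide (4-bit) transfers, as described in the lead-in.
    print(transfers_per_second(75_000, 4), "nibble transfers per second")
```

That gives 150,000 nibble transfers per second before any handshake or protocol overhead, which is the same order of magnitude as the ~200 kHz quoted and well above the "low kHz" figure being disputed.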
Reply by ●May 27, 2014
On 2014-05-27, Don Y <this@is.not.me.com> wrote:
> Hi Simon,
>
> On 5/27/2014 5:11 AM, Simon Clubley wrote:
>>
>> StarterWare is not an OS; it's a support library for bare metal programming.
>
> Then, presumably, it is pretty "thin"? Should be relatively easy
> to see what it *is* doing and figure out what it *should* be doing?
>

With the TI datasheet in your hand, it's very easy to see what is going on.

This is the same example code we were talking about recently which TI had placed under export control and which I later found on GitHub (_after_ finding out the MMU answers the hard way. :-))

>> IIRC, those enhanced Linux speeds involve writing the GPIO lines directly
>> via memory mapped I/O rather than through a driver call for each I/O
>> manipulation.
>
> And, are they from user-land *through* an intermediary?
>

I'm not 100% sure because I don't use Linux to directly manipulate GPIO lines; if using Linux, I tend to use a dedicated frontend MCU to get the realtime guarantees.

However, AIUI under Linux you use mmap to map in the GPIO registers and then manipulate them directly.

>> PS: As a good natured comment, I wonder if I should start applying for
>> embedded jobs. :-) Sometimes, I think that as a hobbyist I seem to know
>> more about this world than those paid to do it for a living. :-)
>

Major oops here. That _should_ say "...more about this world than *some* *of* those paid to do it for a living."

I'm NOT trying to claim I know more about this stuff than the professional c.a.e regulars around here. :-)

> IME, folks who do embedded work are either hardware guys who
> started writing code to prove their hardware works -- and then
> got "drafted" into *doing* the code (I know of a Fortune 500
> company that had a *technician* writing the code for a large
> embedded project "because he tinkered with software at home";
> the PHB was a self-confident BASIC programmer so he was *sure*
> he understood these issues... <frown>) but, without a formal
> software education, don't really understand how to *design*
> the software (---> buggy code);
>
> Or, they are software folks who know squat about hardware and,
> as a result, ill-equipped to understand what *can* (and does)
> go wrong and, therefore, write buggy code.
>

I came to the embedded world as a software person, but I also design and build my own circuits (although they are veroboard based :-)) so I have developed some understanding of the hardware side of things.

I'm much stronger on the digital side of things than the analogue/analog side of things however.

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world







