EmbeddedRelated.com
Forums

Text on FSM

Started by jmariano March 9, 2023
We think a lot alike because we've both done this a long time.

On Friday, March 10, 2023 at 2:10:30 PM UTC-5, Don Y wrote:
> On 3/10/2023 11:10 AM, Ed Prochak wrote:
>>> The problem with all of these approaches is they add another "tool" to
>>> the development process -- and another opportunity for the developer
>>> (who only uses the tool for a *portion* of a project) to make mistakes
>>> in its application. The more capable and expressive the tool, the
>>> more knowledge is required of the developer to exploit its capabilities.
>>
>> While your last statement is true, I do not think it is a valid argument
>> against adding another tool to the development process. In a work (not
>> hobby) environment, I expect good management and teams to make considered
>> choices about what tools to apply. If a tool is feature rich and well
>> documented, then the knowledge should be available (either in your head
>> or in the manual) to make the job easier.
>
> You have to decide if the effort to learn (and remain "current")
> the tool offsets the advantages gained by using it. I see lots
> of developers who "almost" know how to use something. IMO, this
> is worse than *not* knowing how to use it (or, not *having* the
> tool) because it breeds a false sense of confidence in their
> efforts.
To your 1st sentence: AMEN. We don't really disagree here. To the rest: I've seen it too. Developers who write C code thinking they are programming in C++ are an extreme example.
>>> [E.g., if you had to construct an arbitrary regex, could you do so
>>> with the knowledge you have committed to memory? Would a reference
>>> answer all of the questions you *might* have about your particular
>>> pattern?]
>>
>> Yes.
>> When I have done a lot of regex work I kept a quick reference card handy.
>
> The qualifier on that last statement is the issue. What about
> when you HAVEN'T done work with a tool for some period of time?
> Will you admit to yourself that you likely need a "refresher"?
> Or, will you stumble along and *hope* it "comes back to you"?
> Error-free?
Personally, Yes, I admit to my own limitations.
> E.g., are multidimensional arrays stored in row or column major order?
> There are too many "little details" like this that only stay fresh
> in your mind with "frequent refreshing".
True, but that is a different issue than the selection of the tool in the first place.
> [I write a lot of formal documentation. Yet, I'll be damned if I can
> remember the shortcut for "non-breaking hyphenation" -- despite using
> it dozens of times on any given document! (tomorrow, I'll look it up,
> again!)]
You're preaching to the choir here, Don.
>>> Instead (IMO), you want something that lets a developer use a technology
>>> without reliance on a particular tool (that may not be well-supported
>>> or may have latent bugs that haven't yet been tickled).
>>
>> Being tool agnostic is an ideal goal. In practice, you must pick some
>> specific tools to get the work done within schedule, budget, and quality
>> constraints.
>>
>> If you want portability of design then that should be an explicit fourth
>> constraint. Most projects select tools with the additional LONG TERM
>> constraint of support throughout the life of the product or product line.
>
> I've made a lot of money addressing the needs of clients who banked on
> a set of tools, only to discover that they were "no longer supported"
> (e.g., doesn't run under new version of OS, requires hardware that PCs no
> longer include, etc.)
Yes, a lot of projects don't follow due diligence. Hence my earlier phrase about "good management and teams" was deliberate.
> You don't realize this is a legitimate design issue until you get
> bitten by it. And, at that time, the time and $$$ available are
> seldom what you'd need.
Yes. That's true. Computer engineering has been in constant flux for 70+ years. Those that fail to learn history are condemned to repeat it.
>>> As such, understanding the technique is more important than finding a
>>> tool that may (or may not) address your needs. ("I want to write in
>>> Eiffel. Does your tool output Eiffel source?" Next week it may be some
>>> other /langue du jour/)
>>
>> Exactly why an abstracting tool that is more focused on the design is
>> better than a specific tool.
>
> What better than a human brain?
I was focusing on documenting the design. I've tried to leave projects with enough clear design so that I can be replaced.
>> State machine design tools are a good example. With a graphic design
>> tool, it is much easier to spot missing states or incorrect transitions.
>> It can be clear enough that even end users can understand and point out
>> flaws or enhancements. I don't think the particular output language is
>> the point here.
>
> But you can use a pencil and paper (or drawing program) to make
> such a diagram and "see" the same missing states/incorrect transitions.
Well, a whiteboard is easier to "edit" and then save via camera (&/or printer).
> The tool just automates binding the design to a particular implementation.
Ideally it should do both: act as a design recording medium and output the implementation.
>> BTW, lots of your earlier comments match closely to what I would have
>> posted. This one just struck me as being a little too idealistic.
> Have you looked at Harel charts? For *complex*/layered machines?
> I suspect revisiting one that you managed to *coax* together at
> the start of a project a year or two later (maintenance) would
> leave you wondering what all of the cryptic notation means.
Haven't used Harel charts. That's part of UML and no place that I worked at has needed a tool set that complex. KISS. Taking a quick look [again!], I can see why it may not be a good choice.
> Would you admit ignorance and "play it safe" -- and expect to have
> time to refamiliarize yourself with it BEFORE trying to make
> changes? Or, would you proceed with a false sense of confidence
> and *hope* you don't break anything along the way?
If it is the design, then it better be clear enough to just read through and be understandable and complete. If it is the implementation, then I never assume I am familiar with the code (even if I wrote it).
> Because we all KNOW that we have more than adequate time to
> devote to "doing it right", right? :>
On some projects more than others. But I've managed to work on a couple projects where putting in the time up front in design paid off big in the integration and release. I'm especially proud of those projects.
> [This is why I push the "idealistic", because in practice it is far from it]
At one of the last places I worked, we coined the saying that we "violently agree". 8^)

You make great contributions here, Don. Keep it up.
Ed
On 3/10/2023 3:02 PM, Ed Prochak wrote:
> We think a lot alike because we've both have done this a long time.
Long lost brother? <raises eyebrows> Tell me, do you, too, have that extra thumb on your left hand?? :>
>>>> [E.g., if you had to construct an arbitrary regex, could you do so
>>>> with the knowledge you have committed to memory? Would a reference
>>>> answer all of the questions you *might* have about your particular
>>>> pattern?]
>>>
>>> Yes.
>>> When I have done a lot of regex work I kept a quick reference card handy.
>>
>> The qualifier on that last statement is the issue. What about
>> when you HAVEN'T done work with a tool for some period of time?
>> Will you admit to yourself that you likely need a "refresher"?
>> Or, will you stumble along and *hope* it "comes back to you"?
>> Error-free?
>
> Personally, Yes, I admit to my own limitations.
There are two issues potentially at play, here. One is "vanity/pride/overestimation of your abilities". The other is simply not *knowing* that you don't really know what you need to know (or, that you know "enough" that the "rest" is insignificant).

[Recall, we also have to address newcomers to the codebase who may be seeing these techniques for the first time and convince themselves that they *think* they know what it does]
>> E.g., are multidimensional arrays stored in row or column major order?
>> There are too many "little details" like this that only stay fresh
>> in your mind with "frequent refreshing".
>
> True, but that is a different issue than the selection of the tool in the
> first place.
I'm just commenting on "little details" that are too easy to "forget to remember" -- like the nonbreaking hyphen, below.

In my defense, I *know* that the nonbreaking hyphen exists, why/where it should be used AND HOW *EASILY* I CAN REFRESH MY MEMORY (by opening the "shortcuts" manual). So, the fact that I keep forgetting the shortcut doesn't bother me.

But, some of the FSM tools are overly cryptic in how they try to encode different "machine aspects" into the definition of the machine. How likely would a "stale" developer be to remember what all of those are and how they are used/invoked? Would he be humble/practical enough to admit his need for a refresher? Would he have the *time* to do so (perceived or otherwise)?
>>>> Instead (IMO), you want something that lets a developer use a technology
>>>> without reliance on a particular tool (that may not be well-supported
>>>> or may have latent bugs that haven't yet been tickled).
>>>
>>> Being tool agnostic is an ideal goal. In practice, you must pick some
>>> specific tools to get the work done within schedule, budget, and quality
>>> constraints.
>>>
>>> If you want portability of design then that should be an explicit fourth
>>> constraint. Most projects select tools with the additional LONG TERM
>>> constraint of support throughout the life of the product or product line.
>>
>> I've made a lot of money addressing the needs of clients who banked on
>> a set of tools, only to discover that they were "no longer supported"
>> (e.g., doesn't run under new version of OS, requires hardware that PCs no
>> longer include, etc.)
>
> Yes, a lot of projects don't follow due diligence. Hence my phrase
> earlier about "good management and teams" was used deliberately.
This is the crux of our difference. Ed, you (appear to be) optimistic about what you can expect from other developers and managers. I'm highly (HIGHLY!) jaded/cynical. I expect people to be lazy (why crack open a reference?) as well as over-estimate their abilities.

I've personally seen products released with KNOWN bugs, rationalizing that they aren't "that important" (isn't the customer/end user the one who is supposed to make that decision?). Or, that they'll be fixed in an upcoming release -- that never comes (because there are other issues that preempt it in the queue).

Even the simplest/easiest of tasks often get overlooked or ignored. I have been trying to get a copy of my medical record under HIPAA. This is a CIVIL RIGHT (!) I filled out the APPROVED form. HAND DELIVERED it to the office (so no need to rely on the mail for prompt delivery). "It can take up to 30 days" (because a third party has to fulfill the request)

Sit back and *patiently* wait. Don't pester them cuz they've already told you "up to 30 days". And, maybe a few more in case it got mailed on the 30th day!

Day 35, time to call and see what's going on. "Press 3 for Records and Referrals" 10 rings. Voicemail. OK, it's possible the person is away from their desk. Leave message, name, reason for calling, callback number.

Next day, still no callback. Try again. Same result. Leave ANOTHER message. Lather, rinse, repeat for a total of 5 days!

OK, they've obviously got a problem with their "Records and Referrals" person! Escalate to office manager. Don't accept their offer to "send me to her voicemail" as we've done that for 5 days, already, with the records flunky. Office manager will track down the records person (why isn't she at work? at 9:30AM? Office opens at 8!) and get back to me.

Don't pester later that day -- give them time, you're not the only thing that they need to deal with. Or, the next.
Two days later, call again and office manager avoids the call, relying on someone else to give me a message: A third party "records" firm is processing my request (Duh, we knew that when I dropped off the request ~6 weeks earlier!). "When can I expect it from THEM?" Dunno. "Who can I call for more information?" (cuz you folks have been really tedious to deal with). Dunno.

Eventually, get office manager on the phone. She's got their phone number and a "request ID" used to identify the request initiated on my behalf.

Call third party. Automated "Your request is being processed. Call back in 7 days." (we're well beyond that 30 day initial limit). Out of curiosity, call back and talk to a human. Who repeats the same 7 day message. "Why is it taking so long? I requested these 40+ days ago!" "We just got the request YESTERDAY."

Ah, so now I know that the "office" never filed my request. They just lost it (or it was still sitting in the Records person's inbox).

Long story. But, the point is, all the Records person had to do was pass the request on to the third party, initially. Insert in FAX machine (or document scanner) and press "SEND". Yet, this was beyond their abilities! And, thereafter (had they done their job originally), all they would have had to do to address my followups was to give me the phone number and identifier for me to contact the third party!

Instead, they try to hide the fact that they didn't do their job (office manager didn't say, "Gee, Don, we're really sorry but your request got lost, somehow. We only just recently submitted it to the third party (AFTER YOU PESTERED US). We'll try to EXPEDITE it for you (so YOU don't have to deal with *OUR* problem)"

[What would have happened had I been in need of a REFERRAL and the "Records and Referrals" person been just as negligent? Would I have been letting the clock run out on a potentially serious medical condition?]

Back to the topic at hand...
Ask developers why their code is buggy and they'll tell you it's because their *boss* doesn't give them time to do proper testing, or doesn't have good design practices in place, etc. AS IF *they* would do much better in the absence of those nasty, ignorant managers and corporate culture.

Yet, look at FOSS products -- no boss, no deadlines, put whatever design practices *you* want in place (after all, it's !your! name that will be on the "product"; you won't be some anonymous employee/developer) and you still see the same /ad hoc/ practices at play.

Or, *you* WITNESSED a particular bug while testing your code. Yet, you weren't able to (easily) reproduce it. So, you DISMISS it (as a fluke) -- despite KNOWING that it happened and you haven't done anything DELIBERATE to eliminate it. Really? What's your defense? You'll address it when some user encounters it? Or, you'll hope some other user finds and fixes it? <frown>

No, I don't expect developers to "do the right thing" any more than I expect managers to put in place the right practices. There's always a rationalization...

Jaded.
>>>> As such, understanding the technique is more important than finding a
>>>> tool that may (or may not) address your needs. ("I want to write in
>>>> Eiffel. Does your tool output Eiffel source?" Next week it may be
>>>> some other /langue du jour/)
>>>
>>> Exactly why an abstracting tool that is more focused on the design is
>>> better than a specific tool.
>>
>> What better than a human brain?
>
> I was focusing on documenting the design. I've tried to leave projects
> with enough clear design so that I can be replaced.
My arguments have been towards "tools that do this code generation" (pozz/roland's upthread comments). To do so, they have to have more information encoded in the representation.

A classic state diagram is conceptually simple. You don't have special symbols to reference timed transitions, history, nested machines, etc. Yet, even a small number of states can quickly have too many transitions to easily represent on a sheet of paper.

This is a small (hardware) FSM. Poor quality because it was *hand* drawn (straight edge, lead holder and lettering guide) on a *D*-size sheet of paper, reduced to B size and then scanned at 1200dpi (in an attempt to preserve as much detail as possible)!

<https://mega.nz/file/A3x03aAA#YNsJhdikiucjU6aKGWKL2eTu4D0i95sqjcLuzIhz7ys>

8 state variables (Moore type so they also represent the outputs). About 35 states. And, it is highly regular (as evidenced by the symmetry in the presentation). Imagine an embedded system (software) that has all sorts of "exceptions" to take into consideration (transition vectors crisscrossing willy-nilly)

All this machine does is act as mediator between an internal FIFO (controlled by another FSM) and a "swappable hardware interface". It allows that interface to present dots (foreground/background), lines (essentially "line feeds") and pages to a marking engine. It prioritizes multiple concurrent requests and acknowledges each (interlocking REQ-ACK handshake). The "work" it does is sorting out how to accumulate dots to deliver to the FIFO as packed bytes, pad out lines that have not specified enough dots to fill the width of the line, etc.

I.e., it is *trivial*. Yet, a visual mess.
>> The tool just automates binding the design to a particular implementation.
>
> Ideally it should do both: act as a design recording medium and
> output the implementation.
But, again, if what you're trying to codify (in the drawing) has too many subtleties, how likely will you be to draw it correctly? How often will you "test your knowledge" just by seeing if it (the code) *appears* to do what you want?

[Have a look at all of the subtlety in Harel state charts. They try to encode *everything* in the drawing -- which is what you'd need to do if you were going to generate code FROM that drawing. I contend that someone using them day in, day out, *could* likely be very effective with them. But, someone just using them for part of a project would perpetually be uncertain of the actions they're taking wrt them. Like trying to create an arbitrary regex after months of not using them]

Documents should *enhance* your understanding of a product (or its behavior). They shouldn't become challenges in and of themselves (because they try to be equivalences for the actual code!)

Each "node" in my current design operates in several "major states". It can be:
- powered down (by the system)
- powering up (at the behest of the system, running POST, acquiring OS image)
- running diagnostics (taken out of service due to a fault)
- powered up but "cold" (ready to receive application code, known to be in a reliable hardware state to deliver its potential functionality)
- powered up and actively providing services
- powered up but only serving compute resources (field is powered off)
- powering down (shedding load to other nodes at the behest of the system)
- faulted (failure detected so trying to recover)
etc.

It's really just a handful of different *major* states. Yet, the state diagram is a mess, just indicating how a node can move from one of these states to another (ignoring all of the "details" involved).
And, it ignores the many states *within* a major state (to manage complexity).

If you think about *applications*, in detail, you quickly discover that there are lots of states that need to be visible -- IF YOU WANT TO MODEL THE ENTIRE APPLICATION as a FSM.

Instead, use state diagrams to explain portions of the design in more intuitive/visual ways -- without requiring them to be "complete enough" to generate code (via some other tool). Let the developer look elsewhere for more detail -- expressed in a medium that is better suited to THAT task.
>>> BTW, lots of your earlier comments match closely to what I would have
>>> posted. This one just struck me as being a little too idealistic.
>>
>> Have you looked at Harel charts? For *complex*/layered machines?
>> I suspect revisiting one that you managed to *coax* together at
>> the start of a project a year or two later (maintenance) would
>> leave you wondering what all of the cryptic notation means.
>
> Haven't used Harel charts. That's part of UML and no place that
> I worked at has needed a tool set that complex. KISS
> Taking a quick look, [again!] I can see why it may not be a good choice.
But that shouldn't rule out using state diagrams *or* state machines. Harel just seems to think "if a little is good, a lot will be better!"

I've implemented (software) state machines in probably a few dozen *different* ways -- each an "experiment" to see if some tweak of an earlier approach can give me better "app coverage" (without jeopardizing the project's design). My conclusion: there is no *best* way. So, adopting *an* approach (dictated by a tool) seems like a straitjacket.

The important thing is to have a visible structure to the design that is reasonably easy to grasp -- without wading through pages of case/switch statements with interspersed code fragments.
>> Would you admit ignorance and "play it safe" -- and expect to have
>> time to refamiliarize yourself with it BEFORE trying to make
>> changes? Or, would you proceed with a false sense of confidence
>> and *hope* you don't break anything along the way?
>
> If it is the design, then it better be clear enough to just read through
> and be understandable and complete.
Does it really need to be complete for you to grok intent? Do I really care how the code detects a power failure to understand how it *reacts* to one? Use good abstractions and the next guy won't *have* to understand ALL of it.
> If it is the implementation, then I never assume I am familiar with
> the code (even if I wrote it).
Wise man! :> Also, a testament to writing what needs to be written, instead of trying to be clever!

I had to modify some of my baking Rxs due to egg shortages. Being reasonably good with arithmetic, I can scale recipes in my head. So, just jot down the actual quantities that I used (after scaling).

Now, going back to clean up the little scraps of paper that I used (is this a universal truth? I always see folks with Rxs on oddball slips of paper -- the back of cash register receipts, envelopes, etc. I think the only folks who use those nice printed index cards are folks who don't bake! :<), I find some quantities that obviously aren't in the correct scaled proportions evident in the *other* quantities.

As it's unlikely that I "figured wrong", there must have been a reason for my instituting that change... yet I've neglected to commit it to paper so I'm now at risk of having to make a similar change, in the future, and not realizing it until after the fact! :<
On 3/11/2023 6:01 AM, Don Y wrote:
> If you think about *applications*, in detail, you quickly discover that
> there are lots of states that need to be visible -- IF YOU WANT TO MODEL
> THE ENTIRE APPLICATION as a FSM.
>
> Instead, use state diagrams to explain portions of the design in
> more intuitive/visual ways -- without requiring them to be "complete
> enough" to generate code (via some other tool). Let the developer look
> elsewhere for more detail -- expressed in a medium that is better
> suited to THAT task.
Old-school analogy: you'd no longer write a (detailed) flowchart to document a piece of code (before or after writing it). But, you would likely sketch the "major flow" -- even if only on the back of a napkin -- to act as general guidance.
On 10/03/2023 16:20, Rick C wrote:
> On Friday, March 10, 2023 at 6:48:13 AM UTC-5, Robert Roland wrote:
>> On Fri, 10 Mar 2023 09:54:23 +0100, pozz <pozz...@gmail.com> wrote:
>>
>>> There will be a time when the programmer would simply draw one or more
>>> state-machines and click a button to generate the full working code in
>>> whatever language.
>>
>> When I went to school, 30 or so years ago, we did have such a program.
>> I am not able to remember its name, though.
>
> The problem these have, like many graphic oriented approaches, is
> continuing support and version control of the source. I've never worked
> with a state machine that was so complex it required anything other than
> a diagram drawn up in your favorite word processor drawing package.
> Typically, a FSM can be decomposed into multiple distinct FSMs that are
> easier to understand, and typically relate to the problem better.
>
> A FSM with 10 states is easy to code. A FSM with 100 states is probably
> several FSMs, mistakenly combined into one.
I think the complexity of a FSM is not only related to the number of states, but also to the transitions/inputs. It's much simpler to detect errors on a diagram than on a cryptic list of switch/case instructions.

The great advantage of a diagram is that it can be read by non-developers: customers, salesmen, project managers and so on. UML diagrams are there exactly for these reasons.

Now imagine a tool that takes as input the diagram and spits out a fsm.c without errors. I know most FSMs are simple, but most of the time you refuse to solve a problem with a FSM just because it could be too complex to convert to code. However, if you had this type of tool, you would consider FSMs for many more problems.

For example, a simple calculator can be modeled as a FSM: the transitions are keystrokes. However it isn't a simple FSM, because there are many subtle details that must be addressed. It is somewhat simple to make a diagram and solve some problems with arcs, arrows and rectangles, but it's much more complex to do the same things in C code.
> Now imagine a tool that takes as input the diagram
> and spits out a fsm.c
Indeed. Please check out the freeware QM modeling tool:

- https://www.state-machine.com/qm

Automatic Code Generation video:

- https://youtu.be/FHV5vZyECOA

QM is an example of a modern, *lightweight* modeling tool. Older, "high-ceremony" tools have been available since the '90s, but they couldn't pull their own weight and didn't catch on. However, things have changed since the '90s...
> For example, a simple calculator can be modeled as a FSM: the
> transitions are keystrokes. However it isn't a simple FSM...
Interesting that you mention the calculator problem. (I've used it in my "Practical Statecharts" book published back in 2002.) I've also used it in my recent video "State machines as 'spaghetti' reducers":

- https://youtu.be/fLXxNe4YeJ4

It turns out that even the "simple" 4-operation calculator is complex enough to be almost impossible to get right with traditional "improvised" state management. A state machine, on the other hand, is quite manageable.
On 3/13/2023 8:35 AM, pozz wrote:
> I think the complexity of a FSM is not only related to the number of
> states, but also to the transitions/inputs.
Of course. A 0..99 counter can have oodles of states... but the interactions between them are trivial to the point of being boring. Interconnectedness is the source of *all* complexity. E.g., the more modules your code interacts with, the more complex the code is likely to be!
> It's much simpler to detect errors on a diagram than on a cryptic list
> of switch/case instructions.
That depends on the problem and how expressive (and intuitive!) the drawing.
> The great advantage of a diagram is that it can be
> read by non-developers: customers, salesmen, project managers and so on.
See above.
> UML diagrams are there exactly for these reasons.
>
> Now imagine a tool that takes as input the diagram and spits out a fsm.c
> without errors.
For what *classes* of state machines?
> I know most FSMs are simple, but most of the time you refuse to solve a
> problem with a FSM just because it could be too complex to convert to
> code. However, if you had this type of tool, you would consider FSMs
> for many more problems.
You're already building an FSM when you solve such a problem. The issue is HOW you build it. If /ad hoc/ then it's more likely to contain errors and be harder for folks to understand -- without digging deep.
> For example, a simple calculator can be modeled as a FSM: the transitions
> are keystrokes. However it isn't a simple FSM, because there are many
> subtle details that must be addressed. It is somewhat simple to make a
> diagram and solve some problems with arcs, arrows and rectangles, but
> it's much more complex to do the same things in C code.
A calculator is a toy example. Imagine, instead, if that calculator (keys + display) also had to indicate when a key was *stuck*, when the batteries were low (without a dedicated "low battery" indicator), the current time of day (and date), implement an "egg timer" functionality as well as a traditional "alarm", etc. A calculator is *just* a calculator. Few embedded systems have such limited -- and "regular" -- functionality.

Imagine being in the middle of a calculation and the display is commandeered by the alarm function signalling the programmed time has been reached. The value you had been typing is overwritten AND your next keystroke is expected (but not guaranteed) to be an interaction with the "alarm clock" -- acknowledge the alarm, request an additional 10 minutes, leave it active (but not monopolizing the display) for later attention, etc.

This condition can persist indefinitely (unless the design times it out). So, what happens when the battery fails some time later? Does that message overlay the alarm message? How do you interact with THAT? And, when you've "cleared" it, does the alarm display reappear? Do you even *remember* the calculation that you were making when this started?

[And let's not get into the possibilities for races because you hadn't considered how one "process/interaction" could be interrupted/preempted by another!]
On 10.03.23 at 18:51, jmariano wrote:

> My first contact with FSM was many years ago when I was finishing my PhD.
> I had to implement a FSM on a FPGA and, of course, I was late, so I
> picked an example somewhere and just implemented it, without really
> knowing what I was doing....
>
> Now I would like to finally understand a little better how these things
> work.
>
> Also, the sort of things that I build usually have some type of
> "controller" in them, and usually talk to a PC via serial interface.
> This, it seems to me, could be well modeled by a FSM. I understand that
> this representation might bring additional issues, as was mentioned by
> Don Y, but my main goal here is to go one step further from the clumsy
> programming that I've been doing for the past few years into a bit more
> formal representation of the problems.
I learned state machines using the PLD compiler "CUPL" on a VAX11/750. That had a nice syntax for Mealy and Moore state machines and once I had understood that, I could also do it in PALASM or VHDL.

Another useful tool is yacc from Unix or bison on Linux. It reads a grammar and builds a state machine from it. The state machine is later a blob of C. It reads a series of symbols from its input, much like your serial device, and hops through the state machine if the string of symbols is legal given the grammar. If a legal grammar rule is recognized, you can trigger the execution of some C code.

Grammar:

  expression = number:                       { $$ = $1; }
  | expression = sym_variable:               { $$ = $1; }
  | expression = number '+' number:          { $$ = $1 + $3; }
  | expression = expression '*' expression:  { $$ = $1 * $3; }
  | expression = expression '/' expression:  { $$ = $1 / $3; }
  | expression = '(' expression and so on

  assignment = sym_variable ":=" expression: { $$ = $3; }

My syntax is wrong, but you get the principle. The stuff in curly braces is the executed C code when a rule is discovered; the $$ thingies are replaced by the relevant variables. Yacc tries to find the most general rules possible. That makes sure that y = a + b + c * sin(333); is recognized.

Usually you will insert a lexical scanner in the input, like lex, so that the state machine does not grow beyond a reasonable size. That would filter out things like "begin", "end", "integer", ":=" and so on. lex() would return integer constants like SY_BEGIN, VARIABLE_NAME, '(' and so on.

Yacc stands for "Yet Another Compiler Compiler", but its original job was to create Weinberger arrays, an LSI structure not unlike a PAL.

Cheers, Gerhard
On 3/13/2023 1:20 PM, Gerhard Hoffmann wrote:
> I learned state machines using the PLD compiler "CUPL" on > a VAX11/750. That had a nice syntax for Mealy and Moore > state machines and once I had understood that, I could also > do it in PALASM or VHDL.
CUPL, PALASM, PLDshell, ABEL, etc.
> Another useful tool is yacc from Unix or bison on Linux.
> It reads a grammar and builds a state machine from it.
> The state machine is later a blob of C. It reads a series of
> symbols from its input much like your serial device and hops
> through the state machine if the string of symbols is legal
> given the grammar. If a legal grammar rule is recognized,
> you can trigger the execution of some C code.
Yes, but tedious for things like event-driven code (where the symbols in the alphabet are events).
> Grammar:
>
> expression = number:                       { $$ = $1;      }
> | expression = sym_variable:               { $$ = $1;      }
> | expression = number '+' number:          { $$ = $1 + $3; }
> | expression = expression '*' expression:  { $$ = $1 * $3; }
> | expression = expression '/' expression:  { $$ = $1 / $3; }
> | expression = '(' expression and so on
>
> assignment = sym_variable ":=" expression: { $$ = $3; }
>
> My syntax is wrong, but you get the principle. The stuff
> in curly braces is the C code executed when a rule is
> discovered; the $$ thingies are replaced by the relevant
> variables. Yacc tries to find the most general rules possible.
> That makes sure that y = a + b + c * sin(333); is recognized.
>
> Usually you will insert a lexical scanner in the input, like
> lex, so that the state machine does not grow beyond a reasonable
> size. That would filter out things like "begin", "end",
> "integer", ":=" and so on. lex() would return integer constants
> like SY_BEGIN, VARIABLE_NAME, '(' and so on.
lex(1) is most often used to aggregate things like <digit>s into <number> or <character>s into <token>/<reserved_word> without unduly burdening the grammar with individual rules for each token. No real "actions" are done inside those tokens.

lex/yacc are more applicable to parsing strings of characters as you'd encounter in a (programming) language.

AFAICT, none of these tools knows how to optimize states by noting equivalences (?) [George would know for sure] OTOH, when dealing with hardware machines, it's a common step to reduce via implication tables, etc.
> Yacc stands for "Yet Another Compiler Compiler", but its
> original job was to create Weinberger arrays, an LSI structure
> not unlike a PAL.
>
> Cheers,
> Gerhard
On Mon, 13 Mar 2023 15:41:32 -0700, Don Y
<blockedofcourse@foo.invalid> wrote:

>lex/yacc are more applicable to parsing strings of characters >as you'd encounter in a (programming) language. > >AFAICT, none of these tools knows how to optimize states by >noting equivalences (?) [George would know for sure] >OTOH, when dealing with hardware machines, it's a common step >to reduce via implication tables, etc.
yacc and bison will remove unreachable PDA states. Unreachable states in the parser typically are not deliberate but rather result from ambiguities in the grammar. yacc, being LALR(1), does not tolerate much ambiguity - however bison can generate more powerful LR(1) and GLR parsers, and these are far more likely to have unreachable states accidentally generated.

Similarly, lex will remove unreachable DFA states. Again, these typically are not deliberate, but rather result from lex combining the various input regexes into a single DFA having multiple entry points and multiple accepting states.

Then there is flex. flex has some DFA optimizations available. First, flex can compress the data tables which encode its DFA states. Second, flex can discover "equivalence" classes: groups of characters which result in the same action. And finally, flex can [sometimes] discover "meta-equivalence" classes: commonly expected sequences of characters and/or other equivalence classes. All of these can result in a smaller and/or more efficient recognizer than is possible with lex.

yacc and lex are ancient and haven't been maintained for decades. They have limitations that will never be addressed, and bugs that will never be fixed. If you really need backward compatibility with them, it is available using bison and flex ... and if you don't need backward compatibility, bison and flex are much more powerful and modern tools for use with new projects. [There are other modern alternatives to yacc and lex also, but discussion of them is beyond the scope of this missive.]

George
On 3/13/2023 8:07 PM, George Neuner wrote:
> On Mon, 13 Mar 2023 15:41:32 -0700, Don Y
> <blockedofcourse@foo.invalid> wrote:
>
>> lex/yacc are more applicable to parsing strings of characters
>> as you'd encounter in a (programming) language.
>>
>> AFAICT, none of these tools knows how to optimize states by
>> noting equivalences (?) [George would know for sure]
>> OTOH, when dealing with hardware machines, it's a common step
>> to reduce via implication tables, etc.
>
> yacc and bison will remove unreachable PDA states. Unreachable states
> in the parser typically are not deliberate but rather result from
> ambiguities in the grammar. yacc, being LALR(1), does not tolerate
> much ambiguity - however bison can generate more powerful LR(1) and
> GLR parsers, and these are far more likely to have unreachable
> states accidentally generated.
>
> Similarly, lex will remove unreachable DFA states. Again, these
> typically are not deliberate, but rather result from lex combining the
> various input regexes into a single DFA having multiple entry points and
> multiple accepting states.
But, the FSM parallel would be an "orphan" state that would (*likely*) be readily apparent in a state diagram. "You can't get there from here". In hardware designs, it is not uncommon to have multiple equivalent states that aren't easily recognized as such (because you tend to ignore "don't cares" -- which are ripe for collapsing equivalents).
> Then there is flex. > > flex has some DFA optimizations available. First, flex can compress > the data tables which encode its DFA states. Second, flex can > discover "equivalence" classes: groups of characters which result in > the same action. And finally, flex can [sometimes] discover > "meta-equivalence" classes: commonly expected sequences of characters > and/or other equivalence classes.
But, those are mapping equivalent *inputs* together, not equivalent *states*. I.e., treating "BEGIN" and "begin" as equivalent. Would it recognize a common sequence of state transitions that occurs in two different places in the grammar? E.g., specifying the syntax for a numeric quantity in two different places only to realize that it's actually the same part of the grammar expressed as two different instances? <number> ::= <digit><digits> <value> ::= <digit><digits> <expr> ::= <number> <op> <number> | <value> <op> <value> (silly example; but the inputs in each case are the same)
> All of these can result in a smaller and/or more efficient recognizer
> than is possible with lex.
>
> yacc and lex are ancient and haven't been maintained for decades. They
> have limitations that will never be addressed, and bugs that will
> never be fixed. If you really need backward compatibility with them,
> it is available using bison and flex ... and if you don't need backward
> compatibility, bison and flex are much more powerful and modern tools
> for use with new projects.
> [There are other modern alternatives to yacc and lex also, but
> discussion of them is beyond the scope of this missive.]
Now, to the question at hand (or, a version thereof). It's relatively common to design an algorithm or a hardware machine as a "state transition diagram" and then reduce it to an implementation. (The question, posed here, is how much this can be automated and the benefits that might accrue.)

But, IME, DFAs for grammars are most often expressed in (E)BNF and *later* rendered into graphical form (e.g., railroad diagrams). The graphical form is a consequence of the specification, not the *driver* of the specification. And the graphical form often *omits* detail (i.e., how each rule in the grammar is acted upon).

Said another way: do you see folks designing grammars graphically and relying on tools to convert these "expressions" to more concrete forms for parser generators? (this being the analog of how graphical tools are being suggested for application to FSMs)