
Text on FSM

Started by jmariano March 9, 2023
On 13/03/2023 16:55, StateMachineCOM wrote:
>> Now imagine a tool that takes as input the diagram
>> and spits out a fsm.c
>
> Indeed. Please check out the freeware QM modeling tool:
> - https://www.state-machine.com/qm
>
> Automatic Code Generation video:
> - https://youtu.be/FHV5vZyECOA
>
> QM is an example of a modern, *lightweight* modeling tool. Older,
> "high-ceremony" tools have been available since the '90s, but they
> couldn't pull their own weight and didn't catch on. However, things
> have changed since the '90s...
Yes, I know this tool and I like its philosophy. However, as I understand it, the generated code can't be used in closed-source projects without a commercial license.
>> For example, a simple calculator can be modeled as a FSM: the
>> transitions are keystrokes. However it isn't a simple FSM...
>
> Interesting that you mention the calculator problem. (I've used it in
> my "Practical Statecharts" book published back in 2002.) I've also
> used it in my recent video "State machines as 'spaghetti' reducers":
>
> - https://youtu.be/fLXxNe4YeJ4
I read your book ;-)
> It turns out that even the "simple" 4-operation calculator is complex enough to be almost impossible to get right with traditional "improvised" state management. A state machine, on the other hand, is quite manageable.
A *hierarchical* state machine is fundamental here to reduce complexity.
On 13/03/2023 17:29, Don Y wrote:
> On 3/13/2023 8:35 AM, pozz wrote:
>> I think the complexity of a FSM is not only related to the number of
>> states, but also to the transitions/inputs.
>
> Of course.  A 0..99 counter can have oodles of states... but the
> interactions between them are trivial to the point of being boring.
>
> Interconnectedness is the source of *all* complexity.  E.g., the more
> modules your code interacts with, the more complex the code is likely
> to be!
>
>> It's much simpler to detect errors on a diagram instead of on a
>> cryptic list of switch/case instructions.
>
> That depends on the problem and how expressive (and intuitive!) the
> drawing.
A good diagram is always much more expressive than good code, both for developers and for non-developers.
>> The great advantage of a diagram is that it can be read by
>> non-developers: customers, salesmen, project managers and so on.
>
> See above.
>
>> UML diagrams are there exactly for these reasons.
>>
>> Now imagine a tool that takes as input the diagram and spits out a
>> fsm.c without errors.
>
> For what *classes* of state machines?
Hierarchical state machines (UML state machines), which are fully-fledged monsters.
>> I know most FSMs are simple, but most of the time you refuse to
>> solve a problem with a FSM just because it could be too complex to
>> convert into code. However, if you had this type of tool, you would
>> consider FSMs for many more problems.
>
> You're already building an FSM when you solve such a problem.  The
> issue is HOW you build it.  If /ad hoc/ then it's more likely to
> contain errors and be harder for folks to understand -- without
> digging deep.
This is the reason why an FSM generation tool could help. You don't need to build it with an /ad hoc/ approach.
>> For example, a simple calculator can be modeled as a FSM: the
>> transitions are keystrokes. However it isn't a simple FSM, because
>> there are many subtle details that must be addressed. It is somewhat
>> simple to make a diagram and solve some problems with arcs, arrows
>> and rectangles, but it's much more complex to do the same things in
>> C code.
>
> A calculator is a toy example.
>
> Imagine, instead, if that calculator (keys + display) also had to
> indicate when a key was *stuck*, when the batteries were low
> (without a dedicated "low battery" indicator), the current time
> of day (and date), implement an "egg timer" functionality as well
> as a traditional "alarm", etc.
>
> A calculator is *just* a calculator.  Few embedded systems have such
> limited -- and "regular" -- functionality.
>
> Imagine being in the middle of a calculation and the display is
> commandeered by the alarm function signalling the programmed time
> has been reached.  The value you had been typing is overwritten
> AND your next keystroke is expected (but not guaranteed) to be
> an interaction with the "alarm clock" -- acknowledge the alarm,
> request an additional 10 minutes, leave it active (but not
> monopolizing the display) for later attention, etc.
>
> This condition can persist indefinitely (unless the design
> times it out).  So, what happens when the battery fails
> some time later?  Does that message overlay the alarm
> message?  How do you interact with THAT?  And, when you've
> "cleared" it, does the alarm display reappear?  Do you
> even *remember* the calculation that you were making when
> this started?
>
> [And let's not get into the possibilities for races because
> you hadn't considered how one "process/interaction" could be
> interrupted/preempted by another!]
I didn't get the point of all this. There are simple applications and complex applications; I don't think you need to convince me.

Anyway, even a simple application such as a standard calculator can be too complex to implement as a flat state machine without any tool. However, the complexity can be managed and reduced on a diagram of a UML/hierarchical state machine. After that, click a build button and your error-free code is done.
On 3/14/2023 7:54 AM, pozz wrote:
> On 13/03/2023 17:29, Don Y wrote:
>> On 3/13/2023 8:35 AM, pozz wrote:
>>> I think the complexity of a FSM is not only related to the number of
>>> states, but also to the transitions/inputs.
>>
>> Of course.  A 0..99 counter can have oodles of states... but the
>> interactions between them are trivial to the point of being boring.
>>
>> Interconnectedness is the source of *all* complexity.  E.g., the more
>> modules your code interacts with, the more complex the code is likely
>> to be!
>>
>>> It's much simpler to detect errors on a diagram instead of on a
>>> cryptic list of switch/case instructions.
>>
>> That depends on the problem and how expressive (and intuitive!) the
>> drawing.
>
> A good diagram is always much more expressive than good code, for
> developers and for non-developers.
That depends on whether the diagram can express the issues that need to be expressed, *concisely*.

   next_state := current_state + 1

sure is a lot more descriptive than wading through 99 discrete states that all *seem* to say the same thing (but you must VERIFY to be sure!)
>>> The great advantage of a diagram is that it can be read by
>>> non-developers: customers, salesmen, project managers and so on.
>>
>> See above.
>>
>>> UML diagrams are there exactly for these reasons.
>>>
>>> Now imagine a tool that takes as input the diagram and spits out a
>>> fsm.c without errors.
>>
>> For what *classes* of state machines?
>
> Hierarchical state machines (UML state machines), which are
> fully-fledged monsters.
Your goal, with *documentation* (vs specification), is to educate quickly and accurately. If "I" have to understand nuances of a presentation before the *real* meaning is apparent, then "I" will likely miss some detail and likely not know it (for some group of "I").

E.g., in Limbo, there are two ways to make an assignment:

   foo := 2
   foo = 2

Is the latter a typographical error? (No, the former instantiates and types the variable in addition to making the assignment; the latter simply does the assignment.)
>>> I know most FSMs are simple, but most of the time you refuse to
>>> solve a problem with a FSM just because it could be too complex to
>>> convert into code. However, if you had this type of tool, you would
>>> consider FSMs for many more problems.
>>
>> You're already building an FSM when you solve such a problem.  The
>> issue is HOW you build it.  If /ad hoc/ then it's more likely to
>> contain errors and be harder for folks to understand -- without
>> digging deep.
>
> This is the reason why an FSM generation tool could help. You don't
> need to build it with an /ad hoc/ approach.
But if the tool doesn't build the *entire* state machine portion of the code, then what good is it? If it just generates a skeleton and relies on the developer to "flesh it out", then it's just a labor saver and still leaves the application vulnerable to design omissions.
>>> For example, a simple calculator can be modeled as a FSM: the
>>> transitions are keystrokes. However it isn't a simple FSM, because
>>> there are many subtle details that must be addressed. It is somewhat
>>> simple to make a diagram and solve some problems with arcs, arrows
>>> and rectangles, but it's much more complex to do the same things in
>>> C code.
>>
>> A calculator is a toy example.
>>
>> Imagine, instead, if that calculator (keys + display) also had to
>> indicate when a key was *stuck*, when the batteries were low
>> (without a dedicated "low battery" indicator), the current time
>> of day (and date), implement an "egg timer" functionality as well
>> as a traditional "alarm", etc.
>>
>> A calculator is *just* a calculator.  Few embedded systems have such
>> limited -- and "regular" -- functionality.
>>
>> Imagine being in the middle of a calculation and the display is
>> commandeered by the alarm function signalling the programmed time
>> has been reached.  The value you had been typing is overwritten
>> AND your next keystroke is expected (but not guaranteed) to be
>> an interaction with the "alarm clock" -- acknowledge the alarm,
>> request an additional 10 minutes, leave it active (but not
>> monopolizing the display) for later attention, etc.
>>
>> This condition can persist indefinitely (unless the design
>> times it out).  So, what happens when the battery fails
>> some time later?  Does that message overlay the alarm
>> message?  How do you interact with THAT?  And, when you've
>> "cleared" it, does the alarm display reappear?  Do you
>> even *remember* the calculation that you were making when
>> this started?
>>
>> [And let's not get into the possibilities for races because
>> you hadn't considered how one "process/interaction" could be
>> interrupted/preempted by another!]
>
> I didn't get the point of all this. There are simple applications and
> complex applications; I don't think you need to convince me.
The point is that most (embedded) applications are much more substantial than a limited-domain calculator.

I drew parallels in the calculator example to the oven example I posted elsewhere. It *appears* to be a simple application:

- press big button to wake up display
- turn to select cooking mode
- press big button to make that choice
- turn to select cooking temperature
- press button to make that choice
- turn button to select next action (cook, specify time, etc.)
- perform that step

Repeat for the second oven.

Ah, but, while you are specifying the temperature for the second oven, the cook timer for the first oven may expire (because you started to specify the second oven's temperature and then dashed off to remove the toast from the toaster). Now what? Do you reply to the query (from the first oven) asking if you want to shut the oven off or leave it on? Or, do you continue trying to specify the temperature for the second oven -- which is your memory of your most recent interaction with the oven?

The calculator is a closed box. Nothing interacts with it other than the user. It can wait forever for the user to perform the next action (keypress) without fear of having its resources re-assigned to some other activity.

If, in the example I posted, the calculator had to indicate battery failures, expired timers, etc., then it's considerably more involved in its design. This needs to be expressible in the state machine in a way that makes the correctness (or not) of the design apparent to the designer.

[My oven fails this test! So, whatever tools the multi-BILLION dollar corporation that designed it used were inadequate for the task.]
> Anyway, even a simple application such as a standard calculator can be
> too complex to implement as a flat state machine without any tool.
>
> However, the complexity can be managed and reduced on a diagram of a
> UML/hierarchical state machine. After that, click a build button and
> your error-free code is done.
On 2023-03-14 16:54, pozz wrote:
> On 13/03/2023 17:29, Don Y wrote:
>> On 3/13/2023 8:35 AM, pozz wrote:
>>> I think the complexity of a FSM is not only related to the number of
>>> states, but also to the transitions/inputs.
>>
>> Of course.  A 0..99 counter can have oodles of states... but the
>> interactions between them are trivial to the point of being boring.
>>
>> Interconnectedness is the source of *all* complexity.  E.g., the more
>> modules your code interacts with, the more complex the code is likely
>> to be!
>>
>>> It's much simpler to detect errors on a diagram instead of on a
>>> cryptic list of switch/case instructions.
>>
>> That depends on the problem and how expressive (and intuitive!) the
>> drawing.
>
> A good diagram is always much more expressive than good code, for
> developers and for non-developers.
In my experience, diagrams that describe all the details of the code, as would be required for generating the code from the diagram, are usually much too complex to comprehend easily ("visually"). They tend to be mazes where one can perhaps trace out some significant paths with a careful finger, unless too many lines cross at one point.

To get a good, visually graspable diagram, IME one must almost always simplify and elide details. Such diagrams are then very good entry points into the code, if one has to read the code to get a complete understanding.

I remember one case where the SW for the central computer of a satellite was generated from state-and-message diagrams by an automatic "model-based design" tool. In graphical form, the diagrams covered numerous A4 pages, and each page had several cryptically labelled inter-page links for messages coming from other pages and going to other pages. It was very difficult to get any kind of overall understanding of the SW.

I admit that there are some domains -- for example, servo-control systems -- where it is possible to generate significant amounts of code from readable diagrams, e.g. SIMULINK diagrams. But I don't think it works well for most code in embedded systems.
On Mon, 13 Mar 2023 22:59:25 -0700, Don Y
<blockedofcourse@foo.invalid> wrote:

> On 3/13/2023 8:07 PM, George Neuner wrote:
>> On Mon, 13 Mar 2023 15:41:32 -0700, Don Y
>> <blockedofcourse@foo.invalid> wrote:
>>
>>> lex/yacc are more applicable to parsing strings of characters
>>> as you'd encounter in a (programming) language.
>>>
>>> AFAICT, none of these tools knows how to optimize states by
>>> noting equivalences (?) [George would know for sure]
>>> OTOH, when dealing with hardware machines, it's a common step
>>> to reduce via implication tables, etc.
>>
>> yacc and bison will remove unreachable PDA states. Unreachable states
>> in the parser typically are not deliberate but rather result from
>> ambiguities in the grammar. yacc, being LALR(1), does not tolerate
>> much ambiguity - however bison can generate more powerful LR(1) and
>> GLR(1) parsers, and these are far more likely to have unreachable
>> states accidentally generated.
>>
>> Similarly, lex will remove unreachable DFA states. Again, these
>> typically are not deliberate, but rather result from lex combining the
>> various input regex into a single DFA having multiple entry points and
>> multiple accepting states.
>
> But, the FSM parallel would be an "orphan" state that would (*likely*)
> be readily apparent in a state diagram. "You can't get there from here".
>
> In hardware designs, it is not uncommon to have multiple equivalent
> states that aren't easily recognized as such (because you tend to
> ignore "don't cares" -- which are ripe for collapsing equivalents).
When you merge two FSMs you often get redundant "don't care" nodes, but you can also get nodes which either are impossible to enter [dead code], or impossible to leave [halt], because there are no legal transitions that will permit it. Joining FSMs involves identifying and pruning both types of nodes.
>> Then there is flex.
>>
>> flex has some DFA optimizations available. First, flex can compress
>> the data tables which encode its DFA states. Second, flex can
>> discover "equivalence" classes: groups of characters which result in
>> the same action. And finally, flex can [sometimes] discover
>> "meta-equivalence" classes: commonly expected sequences of characters
>> and/or other equivalence classes.
>
> But, those are mapping equivalent *inputs* together, not equivalent
> *states*. I.e., treating "BEGIN" and "begin" as equivalent.
No. Consider the case of just recognizing a decimal digit: compare the graph using the alternation (0|1|2|3|4|5|6|7|8|9) vs the graph using the class [:digit:].

Using the OR alternation, including start you have 11 nodes. Start has 10 transitions exiting, and each digit node has a single transition entering. Using the digit class, you have 2 nodes, with 10 transitions that all get you from start to the digit-class node.

Obviously this is simplistic, because the members of the character class form a subgraph which itself has to be recognized. The important point here is that the subgraph as a whole can represent a /single/ node in a much more complex graph - its constituent characters need not be repeated in the complex graph. More on this below.

A complex DFA that combines many different regex may present other opportunities to recognize given (possibly arbitrary) sets of characters - opportunities that may not be apparent from looking at the constituent regex.
> Would it recognize a common sequence of state transitions
> that occurs in two different places in the grammar? E.g.,
> specifying the syntax for a numeric quantity in two different
> places only to realize that it's actually the same part of
> the grammar expressed as two different instances?
When given the option to find equivalence classes, flex can identify sets of characters that are used repeatedly. Those characters are gathered into an "equivalence" that then can be a node in the DFA instead of redundantly repeating individual characters.

Remember DFA are deterministic - a node can't take different actions depending on which of multiple transitions entered (or left) it ... so if you want the same character to be recognized in a different context (leading to a different action), you must repeat it in a different node.

This is where being able to identify essentially arbitrary sets of characters and coalesce them into a recognizer "class" is useful. If a given set of N(>1) characters is used M times in the graph, then by coalescing them you remove M(N-1) nodes from your graph. The number of /transitions/ in the graph remains the same, but recall that it is the /nodes/ that consume space in the lexer tables.
> <number> ::= <digit><digits>
>
> <value> ::= <digit><digits>
>
> <expr> ::= <number> <op> <number> | <value> <op> <value>
>
> (silly example; but the inputs in each case are the same)
You're mixing abstraction levels here: <digit>, <digits>, <number>, and <value> are lexical tokens, whereas <expr> is syntax. However ...

Knowing that yacc and bison CAN handle characters as tokens, and assuming you have defined <digit> and <digits> elsewhere in your grammar, neither yacc nor bison can find this kind of equivalence. In yacc it will result in a reduce/reduce error. In bison what happens depends on the kind of parser you asked for (LALR, SLR, LR, GLR), but in any case the result won't be pretty.

Assuming instead that you meant for <number> and <value> to be recognized by the lexer rather than the parser ... flex (not lex) could discover that <number> and <value> are equivalent, but since they would lead to different actions - returning a different token - both would be included in the DFA. However, whichever one happened to be tried first would be the only one that ever was recognized, and your parser would only ever get one of the expected tokens.
>> yacc and lex are ancient and haven't been maintained for decades. They
>> have limitations that will never be addressed, and bugs that will
>> never be fixed. If you really need backward compatibility with them,
>> it is available using bison and flex ... and if you don't need backward
>> compatibility, bison and flex are much more powerful and modern tools
>> for use with new projects.
>> [There are other modern alternatives to yacc and lex also, but
>> discussion of them is beyond the scope of this missive.]
>
> Now, to the question at hand (or, a version thereof).
>
> It's relatively common to design an algorithm or a hardware machine
> as a "state transition diagram" and then reduce it to an implementation.
> (The question, posed here, is how much this can be automated and
> the benefits that might accrue.)
Algorithms for turning graphs into table-driven FSMs, or equivalently a switch/case statement, are well known.

Assuming an appropriate graphical IDE, a designer certainly could specify a state graph and code for actions, and have a program generated from it. Given the right input from the designer, a great deal of checking could be done against the graph to verify that it covers enumerated inputs and transitions, that specified inputs lead to specified actions, that action code exists, etc.

But what is NOT possible is to verify that all /possible/ inputs and state transitions have been enumerated. Nor is it possible to verify that required actions have been specified, or necessarily that the actions are being taken in proper context ... those are things for which the tool simply MUST trust the graph designer.

George
On 3/14/2023 6:29 PM, George Neuner wrote:
> On Mon, 13 Mar 2023 22:59:25 -0700, Don Y
> <blockedofcourse@foo.invalid> wrote:
>
>> On 3/13/2023 8:07 PM, George Neuner wrote:
>>> On Mon, 13 Mar 2023 15:41:32 -0700, Don Y
>>> <blockedofcourse@foo.invalid> wrote:
>>>
>>>> lex/yacc are more applicable to parsing strings of characters
>>>> as you'd encounter in a (programming) language.
>>>>
>>>> AFAICT, none of these tools knows how to optimize states by
>>>> noting equivalences (?) [George would know for sure]
>>>> OTOH, when dealing with hardware machines, it's a common step
>>>> to reduce via implication tables, etc.
>>>
>>> yacc and bison will remove unreachable PDA states. Unreachable states
>>> in the parser typically are not deliberate but rather result from
>>> ambiguities in the grammar. yacc, being LALR(1), does not tolerate
>>> much ambiguity - however bison can generate more powerful LR(1) and
>>> GLR(1) parsers, and these are far more likely to have unreachable
>>> states accidentally generated.
>>>
>>> Similarly, lex will remove unreachable DFA states. Again, these
>>> typically are not deliberate, but rather result from lex combining the
>>> various input regex into a single DFA having multiple entry points and
>>> multiple accepting states.
>>
>> But, the FSM parallel would be an "orphan" state that would (*likely*)
>> be readily apparent in a state diagram. "You can't get there from here".
>>
>> In hardware designs, it is not uncommon to have multiple equivalent
>> states that aren't easily recognized as such (because you tend to
>> ignore "don't cares" -- which are ripe for collapsing equivalents).
>
> When you merge two FSMs you often get redundant "don't care" nodes, but
> you can also get nodes which either are impossible to enter [dead
> code], or impossible to leave [halt], because there are no legal
> transitions that will permit it. Joining FSMs involves identifying and
> pruning both types of nodes.
Then how did you decide they were equivalent? Clearly, (at least) one had a different set of options/transitions that is not supported in the "merged" implementation.
>>> Then there is flex.
>>>
>>> flex has some DFA optimizations available. First, flex can compress
>>> the data tables which encode its DFA states. Second, flex can
>>> discover "equivalence" classes: groups of characters which result in
>>> the same action. And finally, flex can [sometimes] discover
>>> "meta-equivalence" classes: commonly expected sequences of characters
>>> and/or other equivalence classes.
>>
>> But, those are mapping equivalent *inputs* together, not equivalent
>> *states*. I.e., treating "BEGIN" and "begin" as equivalent.
>
> No. Consider the case of just recognizing a decimal digit: compare
> the graph using the alternation (0|1|2|3|4|5|6|7|8|9) vs the graph
> using the class [:digit:].
>
> Using the OR alternation, including start you have 11 nodes. Start has
> 10 transitions exiting, and each digit node has a single transition
> entering.
Are you using "node" as a synonym for "state"?  E.g.,

State Powered_Up
    On '1' Goto Entry Executing Accept_Digit()
    On '2' Goto Entry Executing Accept_Digit()
    On '3' Goto Entry Executing Accept_Digit()
    On '4' Goto Entry Executing Accept_Digit()
    On '5' Goto Entry Executing Accept_Digit()
    On '6' Goto Entry Executing Accept_Digit()
    On '7' Goto Entry Executing Accept_Digit()
    On '8' Goto Entry Executing Accept_Digit()
    On '9' Goto Entry Executing Accept_Digit()
    ..

State Operation_Complete
    On '1' Goto Entry Executing Accept_Digit()
    On '2' Goto Entry Executing Accept_Digit()
    On '3' Goto Entry Executing Accept_Digit()
    On '4' Goto Entry Executing Accept_Digit()
    On '5' Goto Entry Executing Accept_Digit()
    On '6' Goto Entry Executing Accept_Digit()
    On '7' Goto Entry Executing Accept_Digit()
    On '8' Goto Entry Executing Accept_Digit()
    On '9' Goto Entry Executing Accept_Digit()
    ..

In each of these, I would have a single transition exiting the state, labeled "0+1+2+3+4+5+6+7+8+9", invoking Accept_Digit() and destined for "Entry".

Operation_Complete (i.e., the *end* of some undisclosed sequence of actions, preparing to restart on another iteration) is equivalent to Powered_Up -- presumably the state just before that sequence of actions is first started (assuming the ".." are equivalent).

I can detect this with an implication table. Even if a series of states is involved (step1, step2, step3), an algorithm can similarly detect it. And, it can remove one of these two states (or series of states) from the machine (for example, "Operation_Complete"), replacing all references to it in the machine with "Powered_Up".

Doing so *silently* represents a risk; it could be that the two states are intended to be different -- because something in the ".." is handled differently AND THE DEVELOPER HAS FORGOTTEN TO ADD THAT! So, any tool that makes that optimization has to alert the developer that it is altering the given machine definition in a way that it *thinks* is permissible.
> Using the digit class, you have 2 nodes, with 10 transitions that all > get you from start to the digit class node.
I don't see where the extra nodes (states) come from.
> Obviously this is simplistic, because the members of the character
> class form a subgraph which itself has to be recognized.  The
> important point here is that the subgraph as a whole can represent a
> /single/ node in a much more complex graph - its constituent
> characters need not be repeated in the complex graph.  More on this
> below.
>
> A complex DFA that combines many different regex may present other
> opportunities to recognize given (possibly arbitrary) sets of
> characters - opportunities that may not be apparent from looking at
> the constituent regex.
>
>> Would it recognize a common sequence of state transitions
>> that occurs in two different places in the grammar? E.g.,
>> specifying the syntax for a numeric quantity in two different
>> places only to realize that it's actually the same part of
>> the grammar expressed as two different instances?
>
> When given the option to find equivalence classes, flex can identify
> sets of characters that are used repeatedly. Those characters are
> gathered into an "equivalence" that then can be a node in the DFA
> instead of redundantly repeating individual characters.
OK, but that just replaces N *identical*, "parallel" transitions with a single one labeled with a class instead of a discrete value. Just like "0+1+2+3+4+5+6+7+8+9". It hasn't reduced the number of states.
> Remember DFA are deterministic - a node can't take different actions
> depending on which of multiple transitions entered (or left) it ... so
> if you want the same character to be recognized in a different context
> (leading to a different action), you must repeat it in a different
> node.
>
> This is where being able to identify essentially arbitrary sets of
> characters and coalesce them into a recognizer "class" is useful. If a
> given set of N(>1) characters is used M times in the graph, then by
> coalescing them you remove M(N-1) nodes from your graph. The number
> of /transitions/ in the graph remains the same, but recall that it is
> the /nodes/ that consume space in the lexer tables.
>
>> <number> ::= <digit><digits>
>>
>> <value> ::= <digit><digits>
>>
>> <expr> ::= <number> <op> <number> | <value> <op> <value>
>>
>> (silly example; but the inputs in each case are the same)
>
> You're mixing abstraction levels here: <digit>, <digits>, <number>,
> and <value> are lexical tokens, whereas <expr> is syntax.
I'm making the point that <number> and <value> are equivalent results. A machine that determines if an input satisfies one would similarly satisfy the other. So, the two machines would be duplicates of each other and subject to being optimized into one (because <expr> treats <number> and <value> as equivalent in *its* machine).
> However ...
>
> Knowing that yacc and bison CAN handle characters as tokens, and
> assuming you have defined <digit> and <digits> elsewhere in your
> grammar, neither yacc nor bison can find this kind of equivalence. In
> yacc it will result in a reduce/reduce error. In bison what happens
> depends on the kind of parser you asked for (LALR, SLR, LR, GLR), but
> in any case the result won't be pretty.
>
> Assuming instead that you meant for <number> and <value> to be
> recognized by the lexer rather than the parser ... flex (not lex)
Then the question would be whether or not the lexer could recognize that these "little machines" were equivalent and rewrite the grammar so only one was present.
> could discover that <number> and <value> are equivalent, but since
> they would lead to different actions - returning a different token -
> both would be included in the DFA. However, whichever one happened to
> be tried first would be the only one that ever was recognized, and
> your parser would only ever get one of the expected tokens.
>
>>> yacc and lex are ancient and haven't been maintained for decades. They
>>> have limitations that will never be addressed, and bugs that will
>>> never be fixed. If you really need backward compatibility with them,
>>> it is available using bison and flex ... and if you don't need backward
>>> compatibility, bison and flex are much more powerful and modern tools
>>> for use with new projects.
>>> [There are other modern alternatives to yacc and lex also, but
>>> discussion of them is beyond the scope of this missive.]
>>
>> Now, to the question at hand (or, a version thereof).
>>
>> It's relatively common to design an algorithm or a hardware machine
>> as a "state transition diagram" and then reduce it to an implementation.
>> (The question, posed here, is how much this can be automated and
>> the benefits that might accrue.)
>
> Algorithms for turning graphs into table-driven FSMs, or equivalently a
> switch/case statement, are well known.
Of course. But, they can't *create* information. Anything that they synthesize has to be present in the input.

E.g., I can convert an EBNF grammar to a railroad diagram. But, I can't add information on the actions to be performed when each rule in the grammar is applied (because the EBNF doesn't include that!) Working backwards, you can't extract information from that railroad diagram -- to include in the generated code -- that isn't there!
> Assuming an appropriate graphical IDE, a designer certainly could
> specify a state graph and code for actions, and have a program
> generated from it. Given the right input from the designer, a great
> deal of checking could be done against the graph to verify that it
> covers enumerated inputs and transitions, that specified inputs lead
> to specified actions, that action code exists, etc.
>
> But what is NOT possible is to verify that all /possible/ inputs and
> state transitions have been enumerated. Nor is it possible to verify
> that required actions have been specified, or necessarily that the
> actions are being taken in proper context ... those are things for
> which the tool simply MUST trust the graph designer.
And, as above, if you want the code to do X, then, somehow, you must put "X" into the graph. In the examples, here, we've lumped everything that needs to be "done" into single function calls associated with each of the transitions. As a single verb-noun is unlikely to be capable of describing much detail, you're still stuck with code -- that the developer had to explicitly write.

The dream that you can convert a drawing into an arbitrary *application* is just that: a pipe dream.
Hi Don,

Sorry for the delay here ... you know what's going on.


On Tue, 14 Mar 2023 19:40:07 -0700, Don Y
<blockedofcourse@foo.invalid> wrote:

>On 3/14/2023 6:29 PM, George Neuner wrote:
>> On Mon, 13 Mar 2023 22:59:25 -0700, Don Y
>> <blockedofcourse@foo.invalid> wrote:
>>
>>> On 3/13/2023 8:07 PM, George Neuner wrote:
>>
>> When you merge two FSM you often get redundant "don't care" nodes, but
>> you also can get nodes which either are impossible to enter [dead
>> code], or impossible to leave [halt], because there are no legal
>> transitions that will permit it. Joining FSM involves identifying and
>> pruning both types of nodes.
>
>Then how did you decide they were equivalent? Clearly, (at least)
>one had a different set of options/transitions that is not supported
>in the "merged" implementation.
You merge multiple DFA by constructing an equivalent NDFA where all the transitions that lead to the same action are coalesced into a single node (effectively eliminating the redundancies). Some of the impossible halt states may also be eliminated as redundancies. Once the minimum state NDFA is built, you turn /that/ back into a DFA to increase performance.
>>>> Then there is flex.
>>>>
>>>> flex has some DFA optimizations available. First, flex can compress
>>>> the data tables which encode its DFA states. Second, flex can
>>>> discover "equivalence" classes: groups of characters which result in
>>>> the same action. And finally, flex can [sometimes] discover
>>>> "meta-equivalence" classes: commonly expected sequences of characters
>>>> and/or other equivalence classes.
>>>
>>> But, those are mapping equivalent *inputs* together, not equivalent
>>> *states*. I.e., treating "BEGIN" and "begin" as equivalent.
>>
>> No. Consider the case of just recognizing a decimal digit: compare
>> the graph using the alternation: (0|1|2|3|4|5|6|7|8|9), vs the graph
>> using the class [:digit:].
>>
>> Using the OR alternation, including start you have 11 nodes. Start has
>> 10 transitions exiting, and each digit node has a single transition
>> entering.
>
>Are you using "node" as a synonym for "state"?
I am using graph terminology - mostly because all FA builders start by constructing a state /graph/. After the graph is complete, it may be implemented using tables, or Duff's device, etc. ... but it doesn't start that way. 8-)
>E.g.,
>
>    State Powered_Up
>        On '1' Goto Entry Executing Accept_Digit()
>        On '2' Goto Entry Executing Accept_Digit()
>        On '3' Goto Entry Executing Accept_Digit()
>        On '4' Goto Entry Executing Accept_Digit()
>        On '5' Goto Entry Executing Accept_Digit()
>        On '6' Goto Entry Executing Accept_Digit()
>        On '7' Goto Entry Executing Accept_Digit()
>        On '8' Goto Entry Executing Accept_Digit()
>        On '9' Goto Entry Executing Accept_Digit()
>..
>
>    State Operation_Complete
>        On '1' Goto Entry Executing Accept_Digit()
>        On '2' Goto Entry Executing Accept_Digit()
>        On '3' Goto Entry Executing Accept_Digit()
>        On '4' Goto Entry Executing Accept_Digit()
>        On '5' Goto Entry Executing Accept_Digit()
>        On '6' Goto Entry Executing Accept_Digit()
>        On '7' Goto Entry Executing Accept_Digit()
>        On '8' Goto Entry Executing Accept_Digit()
>        On '9' Goto Entry Executing Accept_Digit()
>..
>
>In each of these, I would have a single transition exiting the
>state labeled "0+1+2+3+4+5+6+7+8+9" invoking Accept_Digit()
>and destined for "Entry"
>
>Operation_Complete (i.e., the *end* of some undisclosed sequence
>of actions, preparing to restart on another iteration) is equivalent
>to Powered_Up -- presumably the state just before that sequence of
>actions is first started (assuming "..." are equivalent)
>
>I can detect this with an implication table. Even if a series of
>states are involved (step1, step2, step3) An algorithm can
>similarly detect it. And, remove one of these two states (or
>series of states) from the machine (for example, "Operation_Complete"),
>replacing all references to it in the machine with "Powered_Up".
Absolutely it /could/ be done with table based algorithms, but I've never seen it done that way in any real (non toy) implementation ... graph based solutions scale better to large problems.
>> Using the digit class, you have 2 nodes, with 10 transitions that all
>> get you from start to the digit class node.
>
>I don't see where the extra nodes (states) come from.
Think about the "class" as a state in an /NDFA/ ... there are multiple transitions that can get you in - one transition (action) that gets you out. When the "class" is implemented it becomes (for lack of better term) a subroutine (subgraph) in the resulting DFA. The more the subgraph can be reused, the more savings in the DFA.
>> Obviously this is simplistic, because the members of the character
>> class form a subgraph which itself has to be recognized. The
>> important point here is that the subgraph as a whole can represent a
>> /single/ node in a much more complex graph - its constituent
>> characters need not be repeated in the complex graph. More on this
>> below.
>>
>> A complex DFA that combines many different regex may present other
>> opportunities to recognize given (possibly arbitrary) sets of
>> characters - opportunites that may not be apparent from looking at the
>> constituent regex.
>>
>>> Would it recognize a common sequence of state transitions
>>> that occurs in two different places in the grammar? E.g.,
>>> specifying the syntax for a numeric quantity in two different
>>> places only to realize that it's actually the same part of
>>> the grammar expressed as two different instances?
>>
>> When given the option to find equivalence classes, flex can identify
>> sets of characters that are used repeatedly. Those characters are
>> gathered into an "equivalence" that then can be a node in the DFA
>> instead of redundantly repeating individual characters.
>
>OK, but that just replaces N *identical*, "parallel" transitions
>with a single one labeled with a class instead of a discrete value.
>Just like "0+1+2+3+4+5+6+7+8+9". It hasn't reduced the number of states.
It will reduce the number of states IF the class is reused with the same action. Consider

   integer:  [+-]?[:digit:]+
   float:    (((integer)\.)|((integer)?\.(integer)))(\e(integer))?

The [:digit:] class is reused 5 times. The definition of "integer" also forms a (5x) reused meta-class that flex could recognize if told to look for them.

Since, in this example, the [:digit:] class is always used with the same action, it will be implemented in the DFA state graph just once. Since the class itself consists of ~13 states: START that waits for input, 0..9 that accept input, common EXIT, and common ERROR out if something else is input ... treating it AS a class saves 52 states (13 x 4) in the state graph.

The common exit and error states may be eliminated from the final FA (assuming no conflicting uses of [:digit:]), but they will be included in initial construction of the state graph (think of them like subroutine preamble/postamble).
>> Remember DFA are deterministic - a node can't take different actions
>> depending on which of multiple transitions entered (or left) it ... so
>> if you want the same character to be recognized in a different context
>> (leading to a different action), you must repeat it in a different
>> node.
>>> <number> ::= <digit><digits>
>>>
>>> <value> ::= <digit><digits>
>>>
>>> <expr> ::= <number> <op> <number> | <value> <op> <value>
>>>
>>> (silly example; but the inputs in each case are the same)
>>
>> You're mixing abstraction levels here: <digit>, <digits>, <number>,
>> and <value> are lexical tokens, whereas <expr> is syntax.
>
>I'm making the point that <number> and <value> are equivalent
>results. A machine that determines if an input satisfied
>one would similarly satisfy the other.
>
>So, the two machines would be duplicates of each other and
>subject to being optimized into one (because <expr> treats
><number> and <value> as equivalent in *its* machine.
Yes they are duplicates at some level of abstraction, but that level is above what the tool can recognize and deal with. The programmer labeled them differently and so the tool has to assume both are required, even if in use there is no practical way to distinguish them in the input.
>> Knowing that yacc and bison CAN handle characters as tokens, and
>> assuming you have defined <digit> and <digits> elsewhere in your
>> grammar, neither yacc nor bison can find this kind of equivalence. In
>> yacc it will result in a reduce/reduce error. In bison what happens
>> depends on the kind of parser you asked for (LALR,SLR,LR,GLR), but in
>> any case the result won't be pretty.
>>
>> Assuming instead that you meant for <number> and <value> to be
>> recognized by the lexer rather than the parser ... flex (not lex)
>
>Then the question would be whether or not the lexer could
>recognize these "little machines" were equivalent and rewite
>the grammar so only one was present.
You'd need a combination tool that produced parser and lexer from a unified spec. There are such tools: e.g., ANTLR ... but I'm not aware of any tools that do /that/ kind of optimization.

It's all about the context: there's no practical way to merge identical recognizers if they directly lead to different actions. In the examples above, [:digit:] could be reused only because every use of it simply accumulated another input character ... the differences occurred when a non-digit character was entered.

If instead you did something like:

   integer:  [:digit:]              return 'i';
   hex:      [:digit:]|['a'-'f']    return 'h';

This would blow up in your face because 0..9 would never be recognized as a hex digit, but more importantly the 2 uses of the class lead /immediately/ to different actions so the class subroutine (subgraph) would have to be repeated in the FA with different exit actions.
>> could discover that <number> and <value> are equivalent, but since
>> they would lead to different actions: returning a different token -
>> both would included in the DFA. However, whichever one happened to be
>> tried first would be the only one that ever was recognized, and your
>> parser would only ever get one of the expected tokens.
>>> Now, to the question at hand (or, a version thereof).
>>>
>>> It's relatively common to design an algorithm or a hardware machine
>>> as a "state transition diagram" and then reduce to an implementation.
>>> (The question, posed here, is how much this can be automated and
>>> the benefits that might acrue).
>>
>> Algorithms for turning graphs into table driven FSM, or equivalently a
>> switch / case statement, are well known.
>
>Of course. But, they can't *create* information. Anything that
>they synthesize has to be present in the input.
>
>E.g., I can convert an ebnf to a railroad diagram. But, I can't
>add information on the actions to be performed when each rule
>in the grammar is applied (because the ebnf doesn't include that!)
>
>Working backwards, you can't extract information from that railroad
>diagram -- to include in the generated code -- that isn't there!
>
>> Assuming an appropriate graphical IDE, a designer certainly could
>> specify a state graph and code for actions, and have a program
>> generated from it. Given the right input from the designer, a great
>> deal of checking could be done against the graph to verify that it
>> covers enumerated inputs and transitions, that specified inputs lead
>> to specified actions, that action code exists, etc.
>>
>> But what is NOT possible is to verify that all /possible/ inputs and
>> state transitions have been enumerated. Nor is it possible to verify
>> that required actions have been specified, or necessarily that the
>> actions are being taken in proper context ... those are things for
>> which the tool simply MUST trust the graph designer.
>
>And, as above, if you want the code to do X, then, somehow, you must
>put "X" into the graph. In the examples, here, we've lumped everything
>that needs to be "done" into single function calls associated with each
>of the transitions. As a single verb-noun is unlikely to be capable of
>describing much detail, you're still stuck with code -- that the developer
>had to explicitly write.
>
>The dream that you can convert a drawing into an arbitrary *application*
>exists as a pipe.
Absolutely: somehow actions have to be specified, whether as arbitrary user entered code attached to graph nodes, or as predefined "action" nodes linked into the graph. But as I said above, there are things that simply can't be checked without embedding significant domain knowledge into the tool itself. That essentially precludes any notion of a generic tool ... even if the tool included an expert system, it's likely that no generic interface to the expert system could be designed that would satisfactorily deal with the needs of many different domains.
On 3/22/2023 1:37 PM, George Neuner wrote:
>>>>> Then there is flex.
>>>>>
>>>>> flex has some DFA optimizations available. First, flex can compress
>>>>> the data tables which encode its DFA states. Second, flex can
>>>>> discover "equivalence" classes: groups of characters which result in
>>>>> the same action. And finally, flex can [sometimes] discover
>>>>> "meta-equivalence" classes: commonly expected sequences of characters
>>>>> and/or other equivalence classes.
>>>>
>>>> But, those are mapping equivalent *inputs* together, not equivalent
>>>> *states*. I.e., treating "BEGIN" and "begin" as equivalent.
>>>
>>> No. Consider the case of just recognizing a decimal digit: compare
>>> the graph using the alternation: (0|1|2|3|4|5|6|7|8|9), vs the graph
>>> using the class [:digit:].
>>>
>>> Using the OR alternation, including start you have 11 nodes. Start has
>>> 10 transitions exiting, and each digit node has a single transition
>>> entering.
>>
>> Are you using "node" as a synonym for "state"?
>
> I am using graph terminology - mostly because all FA builders start by
> constructing a state /graph/. After the graph is complete, it may be
> implemented using tables, or Duff's device, etc. ... but it doesn't
> start that way. 8-)
I build FSMs similarly. But, you can't commit graphs to ASCII text whereas tables are a natural consequence.

The difference seems largely to be that DFA are geared towards expressing "languages" (sequences of symbols) whereas FSMs are geared towards expressing sequences of events/actions.

E.g., you can build an expression to describe the legal symbol sequences to create a particular type of token (NUMBER, BEGIN, END, etc.) with the assumption that every symbol *in* those tokens is handled by a single action (accumulate_digit).

By contrast, an FSM will often have a variety of very different symbols recognized in a given state and acted upon differently (POWER_FAIL, POWER_RESTORED, LOW_BATTERY, CLEAR, ENTER, BARCODE_DETECTED, etc.). These tend to have more "work" associated with their recognition than a set of equivalent symbols (e.g., digits).

And, while each may be handled differently from the others, they tend to be handled the same way when encountered in different states. I.e., a POWER_FAIL is processed the same each place it is considered significant. So, you want a way of expressing this common set of conditions/symbols/events in a way that can be reused in many different states.
>> E.g.,
>>
>>     State Powered_Up
>>         On '1' Goto Entry Executing Accept_Digit()
>>         On '2' Goto Entry Executing Accept_Digit()
>>         On '3' Goto Entry Executing Accept_Digit()
>>         On '4' Goto Entry Executing Accept_Digit()
>>         On '5' Goto Entry Executing Accept_Digit()
>>         On '6' Goto Entry Executing Accept_Digit()
>>         On '7' Goto Entry Executing Accept_Digit()
>>         On '8' Goto Entry Executing Accept_Digit()
>>         On '9' Goto Entry Executing Accept_Digit()
Check Exceptions
>> ..
>>
>>     State Operation_Complete
>>         On '1' Goto Entry Executing Accept_Digit()
>>         On '2' Goto Entry Executing Accept_Digit()
>>         On '3' Goto Entry Executing Accept_Digit()
>>         On '4' Goto Entry Executing Accept_Digit()
>>         On '5' Goto Entry Executing Accept_Digit()
>>         On '6' Goto Entry Executing Accept_Digit()
>>         On '7' Goto Entry Executing Accept_Digit()
>>         On '8' Goto Entry Executing Accept_Digit()
>>         On '9' Goto Entry Executing Accept_Digit()
Check Exceptions
>> ..
    State Exceptions (this is a misnomer but used for syntactic simplicity)
        On Barcode_Detected Goto HandleBarcode Executing OverwriteAccumulator()
        On Power_Fail Goto SignalOutage Executing AlertOperator()
        On Low_Battery Goto SafeguardData Executing OrderlyShutdown()
>> In each of these, I would have a single transition exiting the
>> state labeled "0+1+2+3+4+5+6+7+8+9" invoking Accept_Digit()
>> and destined for "Entry"
>>
>> Operation_Complete (i.e., the *end* of some undisclosed sequence
>> of actions, preparing to restart on another iteration) is equivalent
>> to Powered_Up -- presumably the state just before that sequence of
>> actions is first started (assuming "..." are equivalent)
>>
>> I can detect this with an implication table. Even if a series of
>> states are involved (step1, step2, step3) An algorithm can
>> similarly detect it. And, remove one of these two states (or
>> series of states) from the machine (for example, "Operation_Complete"),
>> replacing all references to it in the machine with "Powered_Up".
>
> Absolutely it /could/ be done with table based algorithms, but I've
> never seen it done that way in any real (non toy) implementation ...
> graph based solutions scale better to large problems.
>
>>> Using the digit class, you have 2 nodes, with 10 transitions that all
>>> get you from start to the digit class node.
>>
>> I don't see where the extra nodes (states) come from.
>
> Think about the "class" as a state in an /NDFA/ ... there are multiple
> transitions that can get you in - one transition (action) that gets
> you out.
>
> When the "class" is implemented it becomes (for lack of better term) a
> subroutine (subgraph) in the resulting DFA. The more the subgraph can
> be reused, the more savings in the DFA.
>
>>> Obviously this is simplistic, because the members of the character
>>> class form a subgraph which itself has to be recognized. The
>>> important point here is that the subgraph as a whole can represent a
>>> /single/ node in a much more complex graph - its constituent
>>> characters need not be repeated in the complex graph. More on this
>>> below.
>>>
>>> A complex DFA that combines many different regex may present other
>>> opportunities to recognize given (possibly arbitrary) sets of
>>> characters - opportunites that may not be apparent from looking at the
>>> constituent regex.
>>>
>>>> Would it recognize a common sequence of state transitions
>>>> that occurs in two different places in the grammar? E.g.,
>>>> specifying the syntax for a numeric quantity in two different
>>>> places only to realize that it's actually the same part of
>>>> the grammar expressed as two different instances?
>>>
>>> When given the option to find equivalence classes, flex can identify
>>> sets of characters that are used repeatedly. Those characters are
>>> gathered into an "equivalence" that then can be a node in the DFA
>>> instead of redundantly repeating individual characters.
>>
>> OK, but that just replaces N *identical*, "parallel" transitions
>> with a single one labeled with a class instead of a discrete value.
>> Just like "0+1+2+3+4+5+6+7+8+9". It hasn't reduced the number of states.
>
> It will reduce the number of states IF the class is reused with the
> same action. Consider
>
>    integer:  [+-]?[:digit:]+
>    float:    (((integer)\.)|((integer)?\.(integer)))(\e(integer))?
>
> The [:digit:] class is resused 5 times. The definition of "integer"
> also forms a (5x) reused meta-class that flex could recognize if told
> to look for them.
OK. It's a special case of alternation *and* equivalence. I build "state subroutines" to handle sets of symbols that are handled the same way but "invoked" from different states (see Exceptions). But, the individual symbols can invoke different actions *and* different next states -- as long as they are consistent in each "application".
> Since, in this example, the [:digit:] class is always used with the
> same action, it will be implemented in the DFA state graph just once.
> Since the class itself consists of ~13 states: START that waits for
> input, 0..9 that accept input, common EXIT, and common ERROR out if
> something else is input ... treating it AS a class saves 52 states (13
> x 4) in the state graph.
>
> The common exit and error states out may be eliminated from the final
> FA (assuming no conflicting uses of [:digit:], but they will be
> included in initial construction of the state graph (think of them
> like subroutine preamble/postamble).
>
>>> Remember DFA are deterministic - a node can't take different actions
>>> depending on which of multiple transitions entered (or left) it ... so
>>> if you want the same character to be recognized in a different context
>>> (leading to a different action), you must repeat it in a different
>>> node.
>
>>>> <number> ::= <digit><digits>
>>>>
>>>> <value> ::= <digit><digits>
>>>>
>>>> <expr> ::= <number> <op> <number> | <value> <op> <value>
>>>>
>>>> (silly example; but the inputs in each case are the same)
>>>
>>> You're mixing abstraction levels here: <digit>, <digits>, <number>,
>>> and <value> are lexical tokens, whereas <expr> is syntax.
>>
>> I'm making the point that <number> and <value> are equivalent
>> results. A machine that determines if an input satisfied
>> one would similarly satisfy the other.
>>
>> So, the two machines would be duplicates of each other and
>> subject to being optimized into one (because <expr> treats
>> <number> and <value> as equivalent in *its* machine.
>
> Yes they are duplicates at some level of abstraction, but that level
> is above what the tool can recognize and deal with.
That was what I was hoping *might* be possible. I can automate implication analysis to detect and *suggest* these equivalence reductions (cuz it's a simple process and my tables reduce to just simple blocks of tuples). But, I don't want the tool to automatically make that reduction because the developer may not have *intended* the two states to be equivalent. I.e., claiming that they are equivalent alerts the developer of the fact that he may have forgotten something (or, incorrectly specified a symbol/action).

OTOH, for nontrivial FSM, it is often too hard for a developer to recognize sequences of states that *can* be equivalenced -- esp with the notational shorthands that I've developed. E.g.,

    State Operation_Finished
        On Barcode_Detected Goto HandleBarcode Executing OverwriteAccumulator()
        On '1' Goto Entry Executing Accept_Digit()
        On '2' Goto Entry Executing Accept_Digit()
        On '3' Goto Entry Executing Accept_Digit()
        On Power_Fail Goto SignalOutage Executing AlertOperator()
        On Low_Battery Goto SafeguardData Executing OrderlyShutdown()
        On '4' Goto Entry Executing Accept_Digit()
        On '5' Goto Entry Executing Accept_Digit()
        On '6' Goto Entry Executing Accept_Digit()
        On '7' Goto Entry Executing Accept_Digit()
        On '8' Goto Entry Executing Accept_Digit()
        On '9' Goto Entry Executing Accept_Digit()
    .

is equivalent to Operation_Complete. And, would be so if it additionally (redundantly) listed "Check Exceptions"

Finally, mnemonics for symbols resolve to specific values. So, there's nothing that prevents a developer from assigning two mnemonics to the same value -- intentionally or accidentally.

[I have some mechanisms that minimize the chance of this happening. But, there's nothing to prevent me from defining two semantically equivalent symbols that get bound to different values -- e.g., Power_Fail and Power_Die]
> The programmer
> labeled them differently and so the tool has to assume both are
> required, even if in use there is no practical way to distinguish them
> in the input.
>
>>> Knowing that yacc and bison CAN handle characters as tokens, and
>>> assuming you have defined <digit> and <digits> elsewhere in your
>>> grammar, neither yacc nor bison can find this kind of equivalence. In
>>> yacc it will result in a reduce/reduce error. In bison what happens
>>> depends on the kind of parser you asked for (LALR,SLR,LR,GLR), but in
>>> any case the result won't be pretty.
>>>
>>> Assuming instead that you meant for <number> and <value> to be
>>> recognized by the lexer rather than the parser ... flex (not lex)
>>
>> Then the question would be whether or not the lexer could
>> recognize these "little machines" were equivalent and rewite
>> the grammar so only one was present.
>
> You'd need a combination tool that produced parser and lexer from a
> unfied spec. There are such tools: e.g., ANTLR ... but I'm not aware
> of any tools that do /that/ kind of optimization.
>
> It's all about the context: there's no practical way to merge
> identical recognizers if they directly lead to different actions. In
> the examples above, [:digit:] could be reused only because every use
> of it simply accumulated another input character ... the differences
> occurred when a non-digit character was entered.
Yes. My point wrt FSM having lots of different input symbols and different associated next-states/actions.
> If instead you did something like:
>
>    integer:  [:digit:]              return 'i'
>    hex:      [:digit:]|['a'-'f']    return 'h';
>
> This would blow up in your face because 0..9 would never be recognized
> a hex digit, but more importantly the 2 uses of the class lead
> /immediately/ to different actions so the class subroutine (subgraph)
> would have to be repeated in the FA with different exit actions.
Yes. If the tool places an implicit priority on the rules based on the order in which they are encountered. I intentionally don't specify this in the design of the tables, leaving the "post processor" some latitude in how it implements them and the runtime some potential efficiencies.

[This also lets me embed ambiguities -- for certain clients :> ]

E.g.,

    State Operation
        On Barcode_Detected Goto HandleBarcode Executing OverwriteAccumulator()
        On '1' Goto Entry Executing Accept_Digit()
        On '2' Goto Entry Executing Accept_Digit()
        On '3' Goto Entry Executing Accept_Digit()

and

    State Operation
        On '1' Goto Entry Executing Accept_Digit()
        On '2' Goto Entry Executing Accept_Digit()
        On '1' Goto Entry Executing Reject_Digit()
        On '3' Goto Entry Executing Accept_Digit()
        On Barcode_Detected Goto HandleBarcode Executing OverwriteAccumulator()

can be made equivalent -- or different. Barcodes could take priority over individual digits -- or not. Etc. The dispatch can trade speed for space -- or vice versa.

[Of course, any parser generator can offer similar options. But, as most parsers are used in a different context, the tradeoffs aren't as productive]
>>> Assuming an appropriate graphical IDE, a designer certainly could
>>> specify a state graph and code for actions, and have a program
>>> generated from it. Given the right input from the designer, a great
>>> deal of checking could be done against the graph to verify that it
>>> covers enumerated inputs and transitions, that specified inputs lead
>>> to specified actions, that action code exists, etc.
>>>
>>> But what is NOT possible is to verify that all /possible/ inputs and
>>> state transitions have been enumerated. Nor is it possible to verify
>>> that required actions have been specified, or necessarily that the
>>> actions are being taken in proper context ... those are things for
>>> which the tool simply MUST trust the graph designer.
>>
>> And, as above, if you want the code to do X, then, somehow, you must
>> put "X" into the graph. In the examples, here, we've lumped everything
>> that needs to be "done" into single function calls associated with each
>> of the transitions. As a single verb-noun is unlikely to be capable of
>> describing much detail, you're still stuck with code -- that the developer
>> had to explicitly write.
>>
>> The dream that you can convert a drawing into an arbitrary *application*
>> exists as a pipe.
>
> Absolutely: somehow actions have to be specified, whether as arbitrary
> user entered code attached to graph nodes, or as predefined "action"
> nodes linked into the graph.
And, the actions must be *complete*. Or, you have to drag in another abstraction mechanism. A state diagram for a hardware machine is complete as it spells out all of the gazintas and cumzoutas. A railroad diagram effectively describes *syntax* but rarely much more. A "programming tool" with the same level of visibility would likely have a boatload of cryptic syntax/notations that would make reliable use of the tool dubious (unless a full-time job).
> But as I said above, there are things that simply can't be checked
> without embedding significant domain knowledge into the tool itself.
> That essentially precludes any notion of a generic tool ... even if
> the tool included an expert system, it's likely that no generic
> interface to the expert system could be designed that would
> satisfactorily deal with the needs of many different domains.
On Wed, 22 Mar 2023 18:15:43 -0700, Don Y
<blockedofcourse@foo.invalid> wrote:
>On 3/22/2023 1:37 PM, George Neuner wrote:
>I build FSMs similarly. But, you can't commit graphs to
>ASCII text whereas tables are a natural consequence.
Actually you can serialize graphs as text, but the result may be varying degrees of difficult to read for a human.

Trivial example: fire up Racket and enter

  #lang racket
  (require racket/serialize)

  (let ( [outer (mcons 0 (mcons 1 (mcons 2 (mcons 3 '()))))]
         [inner (mcons 9 (mcons 8 (mcons 7 '())))]
         [cycle '()]
       )
    (writeln outer)
    (writeln inner)
    (set-mcdr! (mcdr (mcdr (mcdr outer))) outer)
    (set-mcdr! (mcdr (mcdr inner)) inner)
    (set-mcar! (mcdr outer) inner)
    (writeln outer)
    (writeln (serialize outer))
    (writeln (deserialize (serialize outer)))
  )

This creates a simple graph having 2 cycles and prints it. See what happens. The result of (serialize _) is a string that can be saved to a file. You can substitute structs for mcons (mutable cons) if you want to experiment with making more complicated graphs.

Tables are great as implementation ... but for a construction tool that needs to store and modify graphs, it /can/ be a pain to reconstruct complicated graphs from table representations. It's a hell of a lot easier to read back a suitably serialized form.
>The difference seems largely to be that DFA are geared towards
>expressing "languages" (sequences of symbols) whereas FSMs
>are geared towards expressing sequences of events/actions.
The terms "FA" (finite automaton) and "FSM" (finite state machine) are, in fact, synonymous. What is confusing is that we got to this point through discussion of parsing and lexing tools - which ARE geared toward languages. Moreover, yacc and bison do NOT implement a general FA, but rather a particular variety of FA that is useful for language parsing and which involves an auxiliary stack.

Purely as a technical matter, (f)lex can create general FA assuming that transition conditions can be represented as character input to the reader. The "reader" function is completely redefinable: the default is to read from STDIN, but, in fact, a custom reader can do absolutely anything under the hood so long as it returns a character (or EOF) when called.

In practice you would not want to do this. A decent UML tool would be a much better choice.
>By contrast, an FSM will often have a variety of very different >symbols recognized in a given state and acted upon differently >(POWER_FAIL, POWER_RESTORED, LOW_BATTERY, CLEAR, ENTER, BARCODE_DETECTED, >etc.). These tend to have more "work" associated with their >recognition than a set of equivalent symbols (e.g., digits). > >And, while each may be handled differently from the others, >they tend to be handled the same way when encountered in >different states. I.e., a POWER_FAIL is processed the same >each place it is considered significant.
Yes. And you could (at least theoretically) represent this in flex by encoding POWER_FAIL, etc. as characters or strings and sending those characters or strings as input to the reader when those events occur. Internal state transitions can be handled the same way: send characters to the reader. Again, this is an abuse of the tool. Just because you can do it does not mean you should do it.
>I build "state subroutines" to handle sets of symbols that >are handled the same way but "invoked" from different states >(see Exceptions). But, the individual symbols can invoke >different actions *and* different next states -- as long as >they are consistent in each "application".
flex (not lex) permits defining contextual "start" states, which the code can arbitrarily switch among. The same input can be treated differently in different start states. These really are coroutines - not subroutines - and the user code decides which state to switch to next, but flex does provide a stack so you can use them as subroutines (without having to track the nesting yourself).
>> If instead you did something like: >> >> integer: [:digit:] return 'i' >> hex: [:digit:]|['a'-'f'] return 'h'; >> >> This would blow up in your face because 0..9 would never be recognized >> a hex digit, but more importantly the 2 uses of the class lead >> /immediately/ to different actions so the class subroutine (subgraph) >> would have to be repeated in the FA with different exit actions. > >Yes. If the tool places an implicit priority on the rules >based on the order in which they are encountered. I intentionally >don't specify this in the design of the tables, leaving the >"post processor" some latitude in how it implements them >and the runtime some potential efficiencies.
The tool places priority on the longest, most precise match. It falls back on definition order when the input - as given - matches multiple patterns. But again, start states can (sometimes) be used to get around this behavior. George
On 3/25/2023 9:45 PM, George Neuner wrote:
>> The difference seems largely to be that DFA are geared towards >> expressing "languages" (sequences of symbols) whereas FSMs >> are geared towards expressing sequences of events/actions. > > The terms "FA" (finite automaton) and "FSM" (finite state machine) > are, in fact, synonymous.
Yes, though (IME) taught to different audiences and in different ways. My hardware classes talked about FSMs, Mealy/Moore, "state diagrams" and optimization techniques. My software classes talked about DFAs, EBNFs, *railroad* diagrams but never a mention of optimization tools or techniques.

They also seem to be applied differently. E.g., in a (hardware) FSM, it is not uncommon to list a logical expression as the stimulus for a transition (e.g., "LineFeed & /LastLine" vs. "LineFeed & LastLine" directing the machine to two different states with two different outputs or actions). In DFAs, it was always just sequences of symbols -- the sorts of things that would specify a grammar (inherently serial, one-at-a-time "conditions").
> What is confusing is that we got to this point through discussion of > parsing and lexing tools - which ARE geared toward languages.
From my earlier post: "AFAICT, none of these tools knows how to optimize states by noting equivalences (?) [George would know for sure] OTOH, when dealing with hardware machines, it's a common step to reduce via implication tables, etc."

I.e., one could express a FSM as a yacc grammar -- write an "action routine" (admittedly, probably very terse due to the format of a yacc file) for each "symbol", as if symbols were events/input conditions. So, conceivably, the lexer could generate LF_Not_LastLine and LF_LastLine symbols that the yacc parser could act on (with suitable actions assigned on the "right hand side").

Given this duality, the pertinence of my comment is evident: could yacc (et al.) identify the same sorts of equivalent states that I would identify and eliminate with implication table analysis if it were an FSM?
> Moreover, yacc and bison do NOT implement a general FA, but rather a > particular variety of FA that is useful for language parsing and which > involves an auxiliary stack. > > Purely as a technical matter, (f)lex can create a general FA, assuming > that transition conditions can be represented as character input to > the reader. The "reader" function is completely redefinable: the > default is to read from STDIN but, in fact, a custom reader can do > absolutely anything under the hood so long as it returns a character > (or EOF) when called.
Therein lies a notable limitation. In a (hardware) FSM, there are no limits to the number of inputs that can CONCURRENTLY be examined by the machine. E.g., I could label a transition with:

   A*/B*/C*D*E*F*/G*H*I*J*/K*/L*/M + N*O*P*Q + /R*/S*/T*U*V + W + X*Y*Z

To represent this to lex/yacc, I would have to reduce it to a "narrow" symbol -- possible if there are only a limited number of such combinations in the grammar (as sourced by the lexer).
> In practice you would not want to do this. A decent UML tool would be > a much better choice. > >> By contrast, an FSM will often have a variety of very different >> symbols recognized in a given state and acted upon differently >> (POWER_FAIL, POWER_RESTORED, LOW_BATTERY, CLEAR, ENTER, BARCODE_DETECTED, >> etc.). These tend to have more "work" associated with their >> recognition than a set of equivalent symbols (e.g., digits). >> >> And, while each may be handled differently from the others, >> they tend to be handled the same way when encountered in >> different states. I.e., a POWER_FAIL is processed the same >> each place it is considered significant. > > Yes. And you could (at least theoretically) represent this in flex by > encoding POWER_FAIL, etc. as characters or strings and sending those > characters or strings as input to the reader when those events occur. > Internal state transitions can be handled the same way: send > characters to the reader. > > Again, this is an abuse of the tool. Just because you can do it does > not mean you should do it.
It would have appeal *if* it could perform reductions/optimizations that would otherwise have to be done by hand (or with another tool).
>> I build "state subroutines" to handle sets of symbols that >> are handled the same way but "invoked" from different states >> (see Exceptions). But, the individual symbols can invoke >> different actions *and* different next states -- as long as >> they are consistent in each "application". > > flex (not lex) permits defining contextual "start" states, which the > code can arbitrarily switch among. The same input can be treated > differently in different start states. These really are coroutines - > not subroutines - and the user code decides which state to switch to > next, but flex does provides a stack so you can use them as > subroutines (without having to track the nesting yourself). > >>> If instead you did something like: >>> >>> integer: [:digit:] return 'i' >>> hex: [:digit:]|['a'-'f'] return 'h'; >>> >>> This would blow up in your face because 0..9 would never be recognized >>> a hex digit, but more importantly the 2 uses of the class lead >>> /immediately/ to different actions so the class subroutine (subgraph) >>> would have to be repeated in the FA with different exit actions. >> >> Yes. If the tool places an implicit priority on the rules >> based on the order in which they are encountered. I intentionally >> don't specify this in the design of the tables, leaving the >> "post processor" some latitude in how it implements them >> and the runtime some potential efficiencies. > > The tool places priority on the longest, most precise match. It falls > back on definition order when the input - as given - matches multiple > patterns.
In a (hardware) FSM, one would see all of the "possible exits" from a particular state and could notice ambiguities:

   X*Y*/Z
   X*Y

clearly overlap. Furthermore, one could detect these conflicts with a simple tool; it need not understand the entire machine, just look at a single state and the transitions leaving it.
> But again, start states can (sometimes) be used to get around this > behavior.
What's interesting (being a hardware-software person) is that, despite the obvious duality, the approaches taken to these technologies are so disjointed. DFAs tend to be built with a parser generator of preference, while FSMs (in software) have a variety of different implementations with dramatic differences in design and runtime efficiency. Similarly, hardware FSMs tend to be designed with total disregard for the possible applicability of parser generators, regex compilers, etc. It's as if each domain has its own notion of how the technology should be applied and implemented.