EmbeddedRelated.com

Parsers for extensible grammars?

Started by Don Y October 22, 2014
On 2014-10-23, Don Y <this@is.not.me.com> wrote:
> I hadn't considered the idea of expressing "new" commands in terms
> of "old"/common commands.
>
> What I am looking for is the ability to define a set of commands
> that will "always" be used/useful. Then, allow others to add
> to that set of commands to address specific needs that are
> necessary/unique to their environment. (i.e., the environment
> in which these common commands are deployed)
>
> Contrived example:
>
> MESSAGE <text>
> PAUSE <delay>
> STDIN <device>
> STDOUT <device>
[snip]

Are you familiar with how the VMS operating system handles a similar issue for its CLI command language?

The VMS CLI is called DCL, and DCL has an extendable command language which allows users/programmers to add new commands, which follow certain DCL-imposed syntax rules, to the process-specific command table.

Unlike with Unix and friends, much of a DCL command line (for example, whether a qualifier/option is supported for this command, or whether a required value for an option is missing) is validated by DCL before the executable behind the command is actually run.

This is possible because each command has a command definition file which describes the list of options, their types, and which combinations are disallowed. This command definition file is compiled into a binary form which is directly accessible from DCL.

While this isn't a direct match for what you are describing, it sounds pretty close and you might get some ideas from it.

The VMS documentation is online at:

http://h71000.www7.hp.com/doc/os84_index.html

and you want the first part of the "HP OpenVMS Command Definition, Librarian, and Message Utilities Manual", which is available from:

http://h71000.www7.hp.com/doc/82final/6100/aa-qsbde-te.PDF

(Ignore the Message Utilities and Librarian sections; they are not relevant here.)

Any other manuals mentioned in this manual are also available from the above os84_index.html link. I recommend you follow the PDF link for each manual; the HTML documentation is not as well done as it should be.

Simon.

PS: And yes, this is a part of my day job. :-) (For now...)

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
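To make the command-definition idea above a little more concrete, here is a minimal Python sketch of a table that lists a command's options, their types, and the combinations that are disallowed, plus a check that runs before the program behind the command is started. It is not DCL or CLD syntax and does not reproduce any VMS interface; `CommandDef`, `validate`, and the `COPYLOG` command are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CommandDef:
    """Hypothetical stand-in for a compiled command definition."""
    name: str
    options: dict                                   # option -> value type, None for a flag
    required: set = field(default_factory=set)
    disallowed: list = field(default_factory=list)  # pairs that may not appear together


def validate(cmd_def, given):
    """Check `given` (option -> raw string, or None for a bare flag).

    Returns (errors, parsed). The program would only be started when
    `errors` is empty, and would then read its values from `parsed`.
    """
    errors, parsed = [], {}
    for opt, raw in given.items():
        if opt not in cmd_def.options:
            errors.append(f"unknown option: {opt}")
            continue
        typ = cmd_def.options[opt]
        if typ is None:
            if raw is not None:
                errors.append(f"option takes no value: {opt}")
            else:
                parsed[opt] = True
        elif raw is None:
            errors.append(f"missing value for option: {opt}")
        else:
            try:
                parsed[opt] = typ(raw)
            except ValueError:
                errors.append(f"bad value for option {opt}: {raw!r}")
    for opt in sorted(cmd_def.required - set(given)):
        errors.append(f"missing required option: {opt}")
    for a, b in cmd_def.disallowed:
        if a in given and b in given:
            errors.append(f"options conflict: {a} and {b}")
    return errors, parsed


# Hypothetical command used only to exercise the sketch.
COPYLOG = CommandDef(
    name="COPYLOG",
    options={"OUTPUT": str, "LINES": int, "APPEND": None, "OVERWRITE": None},
    required={"OUTPUT"},
    disallowed=[("APPEND", "OVERWRITE")],
)

ok_errors, ok_parsed = validate(COPYLOG, {"OUTPUT": "run.log", "LINES": "20"})
bad_errors, _ = validate(
    COPYLOG, {"LINES": "many", "APPEND": None, "OVERWRITE": None}
)
print(ok_errors, ok_parsed)
print(bad_errors)
```

The point of the design is that adding a command means adding a table entry, not touching the checking code; whether that fits a compile-time "bolt-on" scheme depends on constraints not stated in the thread.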
On 10/24/2014 7:02 PM, Simon Clubley wrote:
> On 2014-10-23, Don Y <this@is.not.me.com> wrote:
>>
>> I hadn't considered the idea of expressing "new" commands in terms
>> of "old"/common commands.
>>
>> What I am looking for is the ability to define a set of commands
>> that will "always" be used/useful. Then, allow others to add
>> to that set of commands to address specific needs that are
>> necessary/unique to their environment. (i.e., the environment
>> in which these common commands are deployed)
>
> [snip]
>
> Are you familiar with how the VMS operating system handles a similar
> issue for it's CLI command language ?
I$DISCARDED$MY$3000AXP$A$DECADE$AGO
> The VMS CLI is called DCL and DCL has an extendable command language
> which allows users/programmers to add new commands, which follow
> certain DCL imposed syntax rules, to the process specific command table.
>
> Unlike with Unix and friends, much of a DCL command line, such as if a
> qualifier/option is supported for this command, or if required values
> for a option is missing, is actually validated by DCL before the
> executable behind the command itself is actually run.
Ah, I didn't realize that! Interesting approach -- assuming there are no other ways to invoke binaries that bypass this.
> This is possible because each command has a command definition file
> which describes what the list of options, their types, and what
> combinations are disallowed, for each command. This command definition
> file is compiled into a binary form which is directly accessible from
> DCL.
>
> While this isn't a direct match for what you are describing, it sounds
> pretty close and you might get some ideas from it.
It's worth a closer look!
> The VMS documentation is online at:
>
> http://h71000.www7.hp.com/doc/os84_index.html
>
> and you want the first part of the "HP OpenVMS Command Definition,
> Librarian, and Message Utilities Manual", which is available from:
>
> http://h71000.www7.hp.com/doc/82final/6100/aa-qsbde-te.PDF
>
> (Ignore the Message Utilities and Librarian sections; they are not
> relevant here.)
Thanks!
> Any other manuals mentioned in this manual are also available from
> the above os84_index.html link. I recommend you follow the PDF link
> for each manual; the HTML documentation is not as well done as it
> should be.
>
> Simon.
>
> PS: And yes, this is a part of my day job. :-) (For now...)
On 2014-10-25, Don Y <this@is.not.me.com> wrote:
> On 10/24/2014 7:02 PM, Simon Clubley wrote:
>> The VMS CLI is called DCL and DCL has an extendable command language
>> which allows users/programmers to add new commands, which follow
>> certain DCL imposed syntax rules, to the process specific command table.
>>
>> Unlike with Unix and friends, much of a DCL command line, such as if a
>> qualifier/option is supported for this command, or if required values
>> for a option is missing, is actually validated by DCL before the
>> executable behind the command itself is actually run.
>
> Ah, I didn't realize that! Interesting approach -- assuming there are no
> other ways to invoke binaries that bypass this.
Commands can exist outside of the above infrastructure if one wishes; they are called foreign commands and are treated pretty much like a Unix command would be. There's no validation of options before the executable starts, and hence there are no nicely pre-parsed options and option values ready to be read from within the program itself.

Foreign commands are used, for example, when porting Unix tools to VMS, as it means the existing command line parsing code in the tool can be used pretty much unchanged.

On VMS, if your executable takes no command line input, you can also just run the executable with the run command.

However, it's your choice: you can have DCL do some of the validation work for you by integrating your program into the above DCL infrastructure, or you can do it all yourself in your program just as you do on Unix and friends.

IOW, having foreign commands available as an option doesn't stop you from also having the native DCL-integrated approach. If you try to run a DCL-integrated executable as a foreign command, no values will be available to read from within your program, so you are forced to run it via the DCL mechanism above, which means you also get the DCL-level validation.

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
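The two launch paths described above can be pictured with a small Python sketch. It only models the idea (a shell that either validates and pre-parses options, or hands the raw text straight to the program); it is not how DCL is implemented and does not use any VMS routine names. `Invocation`, `Status`, `run_integrated`, `run_foreign`, and both example tools are hypothetical, and reporting a missing value through a status rather than an exception is a choice made for the sketch.

```python
import shlex
from enum import Enum


class Status(Enum):
    PRESENT = "present"
    ABSENT = "absent"


class Invocation:
    """What a program sees at start-up: raw text, plus pre-parsed values if any."""

    def __init__(self, raw_line, parsed=None):
        self.raw_line = raw_line
        self._parsed = parsed          # None when launched as a foreign command

    def get_value(self, name):
        """Return (status, value); never raises, so callers can fail cleanly."""
        if self._parsed is None or name not in self._parsed:
            return Status.ABSENT, None
        return Status.PRESENT, self._parsed[name]


def integrated_tool(inv):
    """Relies on the shell having parsed and validated its options already."""
    status, count = inv.get_value("COUNT")
    if status is not Status.PRESENT:
        return "error: COUNT not available"
    return f"counted to {count}"


def foreign_tool(inv):
    """Parses its own command line, as a ported Unix tool would."""
    args = shlex.split(inv.raw_line)
    return f"self-parsed {len(args)} argument(s)"


def run_integrated(tool, raw_line, allowed):
    """Shell-side path: validate NAME=VALUE options first, then start the tool."""
    parsed = {}
    for item in raw_line.split():
        name, sep, value = item.partition("=")
        if not sep or name not in allowed:
            return f"rejected before start: {item}"
        parsed[name] = value
    return tool(Invocation(raw_line, parsed))


def run_foreign(tool, raw_line):
    """Shell-side path: no validation, hand over the raw text unchanged."""
    return tool(Invocation(raw_line))


print(run_integrated(integrated_tool, "COUNT=3", {"COUNT"}))
print(run_integrated(integrated_tool, "COLOUR=red", {"COUNT"}))
print(run_foreign(foreign_tool, "-n 3 'two words'"))
print(run_foreign(integrated_tool, "COUNT=3"))
```

The last call shows the mismatch case: a tool written for the integrated path finds no pre-parsed values when it is launched the foreign way, and reports that instead of misbehaving.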
On 10/25/2014 4:15 AM, Simon Clubley wrote:
> On 2014-10-25, Don Y <this@is.not.me.com> wrote:
>> On 10/24/2014 7:02 PM, Simon Clubley wrote:
>>> The VMS CLI is called DCL and DCL has an extendable command language
>>> which allows users/programmers to add new commands, which follow
>>> certain DCL imposed syntax rules, to the process specific command table.
>>>
>>> Unlike with Unix and friends, much of a DCL command line, such as if a
>>> qualifier/option is supported for this command, or if required values
>>> for a option is missing, is actually validated by DCL before the
>>> executable behind the command itself is actually run.
>>
>> Ah, I didn't realize that! Interesting approach -- assuming there are no
>> other ways to invoke binaries that bypass this.
>
> Commands can exist outside of the above infrastructure if one wishes;
> they are called foreign commands and are treated pretty much like a
> Unix command would be - there's no validation of options prior to
> the executable starting and hence there's no nicely pre-parsed options
> and option values ready to be read from within the program itself.
You can *freely* mark any command as either type? I.e., can I mark a command that relies on the DCL stuff for option parsing as a FOREIGN command? And, thus, screw it up (at runtime)?
> Foreign commands are used for example when porting Unix tools to VMS as
> it means the existing command line parsing code in the tool can be
> used pretty much unchanged.
>
> On VMS, if your executable takes no command line input, you can also
> just run the executable with the run command.
>
> However it's your choice - you can have DCL do some of the validation
> work for you by having your program integrated into the above DCL
> infrastructure or you can do it all yourself in your program just as
> you do on Unix and friends.
>
> IOW, having foreign commands available as an option doesn't stop you
> from also having the native DCL integrated approach. If you try to run
> a DCL integrated executable as a foreign command, no values will be
> available to read from within your program so you are forced to run it
> via the DCL mechanism above which means you also get the DCL level
> validation as well.
So, this (DCL) mechanism is meant as an *aid*/service -- not as a means of ensuring software integrity (?).
On 10/24/2014 4:56 AM, Paul E Bennett wrote:
> Don Y wrote:
>
>> Hi,
>>
>> [I probably should direct this at George... :> ]
>>
>> I'm writing a command parser. Some set of commands are "common".
>> But, other instances of the parser are augmented with additional
>> commands/syntax.
>>
>> [These additions are known at compile time, not "dynamic"]
>>
>> Ideally, I want a solution that allows folks developing those
>> "other" commands to just "bolt onto" what I have done. E.g.,
>> creating a single "grammar definition" (lex/yacc) is probably
>> not a good way to go (IME, most small parsers tend to be ad hoc).
>>
>> [Note: I can't even guarantee that the extensions will be
>> consistent or "harmonious" with the grammar that I implement]
>>
>> A naive approach (?) might be for my code to take a crack at
>> a particular "statement"/command and, in case of FAIL, invoke
>> some bolt-on parser to give it a chance to make sense out of
>> the input. If *it* FAILs, then the input is invalid, etc.
>>
>> This sounds like an incredible kluge. Any better suggestions?
>
> Hi Don,
>
> Have you looked at Forth? What you are describing sounds like the problem
> that Forth solved back in 1968. It is the sort of thing that Forth
> programmers have been doing for nearly 50 years and it works very
> successfully for us.
>
> See <http://www.mpeforth.com/> and <http://www.forth.com/>
Don is a hard person to suggest things to. I'm actually surprised that his response was as positive as it was.

I have had great success at solving many problems with Forth. There are text file formats that can be read by treating various words as commands: you define those words in Forth so that they execute while the file is being read and store the numeric data where it needs to be stored. Easy peasy. Not quite the same thing as what Don is likely doing, but very similar.

Don may be looking for something that will let him check the input for the correct syntax, number of values, etc. But as usual he has not really defined the problem he is trying to solve.

--
Rick
On 10/23/2014 11:30 PM, Les Cargill wrote:
> Don Y wrote:
>> Hi,
>>
>> [I probably should direct this at George... :> ]
>>
>> I'm writing a command parser. Some set of commands are "common".
>> But, other instances of the parser are augmented with additional
>> commands/syntax.
>>
>> [These additions are known at compile time, not "dynamic"]
>>
>> Ideally, I want a solution that allows folks developing those
>> "other" commands to just "bolt onto" what I have done. E.g.,
>> creating a single "grammar definition" (lex/yacc) is probably
>> not a good way to go (IME, most small parsers tend to be ad hoc).
>>
>> [Note: I can't even guarantee that the extensions will be
>> consistent or "harmonious" with the grammar that I implement]
>>
>> A naive approach (?) might be for my code to take a crack at
>> a particular "statement"/command and, in case of FAIL, invoke
>> some bolt-on parser to give it a chance to make sense out of
>> the input. If *it* FAILs, then the input is invalid, etc.
>>
>> This sounds like an incredible kluge. Any better suggestions?
>
>
> If all else fails, strstr() keywords, then have a corresponding
> parser for each keyword.
>
> A better approach is a container (struct) of keywords or arguments ( if
> it's not a keyword, it's an argument ) , an integer for the index of
> keywords in a string table, and an index for the position of the token
> within the command.
>
> Extensibility should not be difficult.
That is starting to sound a lot like Forth.

--
Rick
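The table-driven layout described in the quoted text (a token record holding the text, an index into a keyword string table, and the token's position in the command) could look roughly like this in Python. This is one reading of that description, not code from any poster. The four core keywords come from the contrived example earlier in the thread; `BEEP`, the handler functions, and the case-insensitive matching are assumptions added for illustration.

```python
from dataclasses import dataclass

CORE_KEYWORDS = ["MESSAGE", "PAUSE", "STDIN", "STDOUT"]  # from the contrived example
SITE_KEYWORDS = ["BEEP"]                                 # hypothetical bolt-on command


@dataclass
class Token:
    text: str
    keyword_index: int   # index into the string table, or -1 for an argument
    position: int        # position of the token within the command

    @property
    def is_keyword(self):
        return self.keyword_index >= 0


def tokenize(line, string_table):
    """Classify each whitespace-separated token as keyword or argument."""
    lookup = {word: i for i, word in enumerate(string_table)}
    return [
        Token(text, lookup.get(text.upper(), -1), pos)
        for pos, text in enumerate(line.split())
    ]


def dispatch(line, string_table, handlers):
    """Run the handler for a leading keyword; the remaining tokens are its arguments."""
    tokens = tokenize(line, string_table)
    if not tokens or not tokens[0].is_keyword:
        return "FAIL"
    keyword = string_table[tokens[0].keyword_index]
    return handlers[keyword]([t.text for t in tokens[1:]])


core_handlers = {
    "MESSAGE": lambda args: "message: " + " ".join(args),
    "PAUSE": lambda args: f"pause {int(args[0])}",
    "STDIN": lambda args: f"stdin <- {args[0]}",
    "STDOUT": lambda args: f"stdout -> {args[0]}",
}

# Extending the command set is a matter of appending to both tables.
site_table = CORE_KEYWORDS + SITE_KEYWORDS
site_handlers = {**core_handlers, "BEEP": lambda args: "beep"}

print(dispatch("PAUSE 5", CORE_KEYWORDS, core_handlers))
print(dispatch("BEEP", CORE_KEYWORDS, core_handlers))
print(dispatch("BEEP", site_table, site_handlers))
```

Here the extension never touches `tokenize` or `dispatch`; it only contributes table entries, which is compatible with additions that are known at build time.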
rickman wrote:
> On 10/23/2014 11:30 PM, Les Cargill wrote:
>> Don Y wrote:
>>> Hi,
>>>
>>> [I probably should direct this at George... :> ]
>>>
>>> I'm writing a command parser. Some set of commands are "common".
>>> But, other instances of the parser are augmented with additional
>>> commands/syntax.
>>>
>>> [These additions are known at compile time, not "dynamic"]
>>>
>>> Ideally, I want a solution that allows folks developing those
>>> "other" commands to just "bolt onto" what I have done. E.g.,
>>> creating a single "grammar definition" (lex/yacc) is probably
>>> not a good way to go (IME, most small parsers tend to be ad hoc).
>>>
>>> [Note: I can't even guarantee that the extensions will be
>>> consistent or "harmonious" with the grammar that I implement]
>>>
>>> A naive approach (?) might be for my code to take a crack at
>>> a particular "statement"/command and, in case of FAIL, invoke
>>> some bolt-on parser to give it a chance to make sense out of
>>> the input. If *it* FAILs, then the input is invalid, etc.
>>>
>>> This sounds like an incredible kluge. Any better suggestions?
>>
>>
>> If all else fails, strstr() keywords, then have a corresponding
>> parser for each keyword.
>>
>> A better approach is a container (struct) of keywords or arguments ( if
>> it's not a keyword, it's an argument ) , an integer for the index of
>> keywords in a string table, and an index for the position of the token
>> within the command.
>>
>> Extensibility should not be difficult.
>
> That is starting to sound a lot like Forth.
Forth could easily be a better choice. I need to spend more time on Forth.

I tend towards Tcl because I have a large codebase of scripts for it. It also excels at socket/serial port handling.

--
Les Cargill
On 10/25/2014 2:14 PM, Les Cargill wrote:
> rickman wrote:
>> On 10/23/2014 11:30 PM, Les Cargill wrote:
>>> Don Y wrote:
>>>> Hi,
>>>>
>>>> [I probably should direct this at George... :> ]
>>>>
>>>> I'm writing a command parser. Some set of commands are "common".
>>>> But, other instances of the parser are augmented with additional
>>>> commands/syntax.
>>>>
>>>> [These additions are known at compile time, not "dynamic"]
>>>>
>>>> Ideally, I want a solution that allows folks developing those
>>>> "other" commands to just "bolt onto" what I have done. E.g.,
>>>> creating a single "grammar definition" (lex/yacc) is probably
>>>> not a good way to go (IME, most small parsers tend to be ad hoc).
>>>>
>>>> [Note: I can't even guarantee that the extensions will be
>>>> consistent or "harmonious" with the grammar that I implement]
>>>>
>>>> A naive approach (?) might be for my code to take a crack at
>>>> a particular "statement"/command and, in case of FAIL, invoke
>>>> some bolt-on parser to give it a chance to make sense out of
>>>> the input. If *it* FAILs, then the input is invalid, etc.
>>>>
>>>> This sounds like an incredible kluge. Any better suggestions?
>>>
>>>
>>> If all else fails, strstr() keywords, then have a corresponding
>>> parser for each keyword.
>>>
>>> A better approach is a container (struct) of keywords or arguments ( if
>>> it's not a keyword, it's an argument ) , an integer for the index of
>>> keywords in a string table, and an index for the position of the token
>>> within the command.
>>>
>>> Extensibility should not be difficult.
>>
>> That is starting to sound a lot like Forth.
>>
>
> Forth could easily be a better choice. I need to spend
> more time on Forth.
>
> I tend towards Tcl because I have a large codebase of scripts
> for it. It also excels at socket/serial port handling.
In many ways Forth is amazingly simple. You define words (subroutines) that have actions. Words are stored in the dictionary. Forth has a built-in "parser" that scans the input for words and numbers (in that order). The dictionary is searched for each word in the input stream, and when a word is found it is executed. If no word is found, Forth checks whether the "word" is actually a number. If it is a number, it is pushed onto the stack. Pretty simple, no?

The action of a word can make use of system words to further parse the input stream. This is done when the input "grammar" is not RPN style, with the values (nouns) first and the words (verbs) last. It is done even for some Forth words like "TO", which is used to store a value in a variable, e.g. "99 TO BottlesOfBeer".

Most people find the use of the stack to be a problem, while it is really no big deal. It's just different. Forth has other issues, which relate to the fact that not so many people use it. But it seems to be a very useful tool to me.

--
Rick
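The loop described above (look the token up in the dictionary, execute it if found, otherwise try it as a number and push it) is short enough to sketch in Python. This is a toy written for illustration, not a real Forth: the word set is tiny, name matching is made case-insensitive as an assumption, and this `TO` simply creates the name if it does not exist, whereas a real Forth expects the name to have been defined beforehand. `make_interpreter` and its return values are hypothetical.

```python
def make_interpreter():
    """Build a tiny Forth-flavoured interpreter; returns (interpret, stack, output)."""
    stack, output, dictionary = [], [], {}

    def define(name, action):
        dictionary[name.upper()] = action

    def word_to(stream):
        # A word that parses ahead: it takes the *next* token from the
        # input stream as a name instead of finding its operand on the stack.
        name = next(stream, None)
        if name is None:
            raise LookupError("TO needs a name")
        value = stack.pop()
        define(name, lambda _stream: stack.append(value))

    define("+", lambda _stream: stack.append(stack.pop() + stack.pop()))
    define("DUP", lambda _stream: stack.append(stack[-1]))
    define(".", lambda _stream: output.append(stack.pop()))
    define("TO", word_to)

    def interpret(text):
        stream = iter(text.split())
        for token in stream:
            action = dictionary.get(token.upper())
            if action is not None:       # 1. dictionary search comes first
                action(stream)
                continue
            try:                         # 2. otherwise, is it a number?
                stack.append(int(token))
            except ValueError:
                raise LookupError(f"{token} ?") from None

    return interpret, stack, output


interpret, stack, output = make_interpreter()
interpret("2 3 + .")               # RPN: values first, then the verbs
interpret("99 TO BottlesOfBeer")   # TO consumes the following token itself
interpret("BottlesOfBeer 1 + .")
print(output)
```

Adding a command here means adding a dictionary entry, and a command whose syntax is not value-first can pull further tokens from the stream itself, which is the extensibility property being discussed.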
rickman wrote:

> On 10/24/2014 4:56 AM, Paul E Bennett wrote:
>> Don Y wrote:
>>
>>> Hi,
>>>
>>> [I probably should direct this at George... :> ]
>>>
>>> I'm writing a command parser. Some set of commands are "common".
>>> But, other instances of the parser are augmented with additional
>>> commands/syntax.
>>>
>>> [These additions are known at compile time, not "dynamic"]
>>>
>>> Ideally, I want a solution that allows folks developing those
>>> "other" commands to just "bolt onto" what I have done. E.g.,
>>> creating a single "grammar definition" (lex/yacc) is probably
>>> not a good way to go (IME, most small parsers tend to be ad hoc).
>>>
>>> [Note: I can't even guarantee that the extensions will be
>>> consistent or "harmonious" with the grammar that I implement]
>>>
>>> A naive approach (?) might be for my code to take a crack at
>>> a particular "statement"/command and, in case of FAIL, invoke
>>> some bolt-on parser to give it a chance to make sense out of
>>> the input. If *it* FAILs, then the input is invalid, etc.
>>>
>>> This sounds like an incredible kluge. Any better suggestions?
>>
>> Hi Don,
>>
>> Have you looked at Forth? What you are describing sounds like the problem
>> that Forth solved back in 1968. It is the sort of thing that Forth
>> programmers have been doing for nearly 50 years and it works very
>> successfully for us.
>>
>> See <http://www.mpeforth.com/> and <http://www.forth.com/>
>
> Don is a hard person to suggest things to. I'm actually surprised that
> his response was as positive as it was.
>
> I have had great success at solving many problems with Forth. There are
> text file formats can be read by treating various words as commands,
> defining those words in Forth to execute while reading the file and
> storing the numeric data where it needs to be stored. Easy peasy. Not
> quite the same thing as what Don is likely doing, but very similar.
>
> Don may be looking for something that will let him check the input for
> the correct syntax, number of values, etc. But as usual he has not
> really defined the problem he is trying to solve.
I don't know if Don has tried playing with Forth or not; I just suggested that what he was seeking to do sounded somewhat Forth-like in nature. With somewhat more difficulty he could probably look at re-creating an MS-DOS type environment, which would also fit the bill (new commands in batch files etc., or added programmes in the COMMAND directory).

You are probably right about the lack of a Clear, Concise, Correct, Coherent, Complete & Confirmable (Testable) specification of the requirements.

--
********************************************************************
Paul E. Bennett IEng MIET.....<email://Paul_E.Bennett@topmail.co.uk>
Forth based HIDECS Consultancy.............<http://www.hidecs.co.uk>
Mob: +44 (0)7811-639972
Tel: +44 (0)1235-510979
Going Forth Safely ..... EBA. www.electric-boat-association.org.uk..
********************************************************************
On 2014-10-25, Don Y <this@is.not.me.com> wrote:
> On 10/25/2014 4:15 AM, Simon Clubley wrote:
>>
>> Commands can exist outside of the above infrastructure if one wishes;
>> they are called foreign commands and are treated pretty much like a
>> Unix command would be - there's no validation of options prior to
>> the executable starting and hence there's no nicely pre-parsed options
>> and option values ready to be read from within the program itself.
>
> You can *freely* mark any command as either type? I.e., can I mark
> a command that relies on the DCL stuff for option parsing as a FOREIGN
> command? And, thus, screw it up (at runtime)?
Yes, but it would fail in a clean way, with a status code returned from the CLI routine that the program calls to retrieve the (non-existent) pre-parsed information.

The idea behind pointing you to the DCL CLD material was to give you some possible ideas about how extendable CLIs with validated options are handled in another environment (i.e. VMS).

Like I said at the time, it's not an exact match for your requirements, but I thought you might be interested in seeing how a similar problem was handled in VMS, including the syntax used in the command definition file (and in comparing it to how the problem is pushed entirely onto the executable itself in Unix land).
>>
>> IOW, having foreign commands available as an option doesn't stop you
>> from also having the native DCL integrated approach. If you try to run
>> a DCL integrated executable as a foreign command, no values will be
>> available to read from within your program so you are forced to run it
>> via the DCL mechanism above which means you also get the DCL level
>> validation as well.
>
> So, this (DCL) mechanism is meant as an *aid*/service -- not as a means
> of ensuring software integrity (?).
It's a way of expressing an expected structure for a command line, which DCL can partly validate before the program even starts. It also provides a robust, operating-system-level method for a program to obtain command line parameters and options in a way (and with functionality) that leaves getopt and friends standing in the dust.

The relevance here is that I've encountered people who have never been exposed to the VMS way of handling this and who think that ad hoc getopt-style functionality is the only _possible_ way to parse command lines. I just wanted to mention another way of doing this in case it gave you some ideas, even though there's probably nothing you can _directly_ use here.

Sometimes, Don, you leave your questions wide open, presumably to invite a wide range of options in response, including ones you had never even considered. The difficulty with that is that it's sometimes hard to understand what additional unspoken constraints might exist. :-)

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world