EmbeddedRelated.com
Forums

Parser, again!

Started by jmariano December 14, 2013
Dear All,

I'm in need of some advice (not the full solution)! A word of caution: I'm not a computer science guy, so I'm probably not using the correct terms!

I'm a part-time prototype developer at my university. My latest project is a box with a microcontroller that measures and actuates on stuff. The box is under the command of a PC, using RS232 or USB, in a master-slave model, the PC being the master. I want to use a message-based command language, similar to SCPI, but not so complicated (no tree structure). I was thinking of something like START, STOP, SETADC 1000, REAADC 1, etc.

I have to define the syntax and program the parser (in C) on the uC side. Since I don't have a very strict specification for the syntax, I can define it in a way that makes it easier to analyse, more robust, etc.

So, my questions are:

1 - Regarding the language definition: Are there good examples of such a language that I can get inspiration from? And references? I'm sure someone has already thought about this in a formal way. I'm looking for practical advice, like: shall I use fixed-length commands (6 characters, for example), or a start-of-message character (#)? Why? And should the arguments be separated by commas, spaces, etc.?

2 - Regarding the parser: Is it really a parser that I need, or is it something else? Where can I read about this? I just don't want to read the full dragon book only to conclude that it was the wrong book!

Any thoughts are welcome.

Regards

Mariano 
On Saturday, December 14, 2013 11:08:50 PM UTC+2, jmariano wrote:
[ ... ]
Here is a protocol I defined some 5-10 years ago to talk to a slave. It was meant for machine use (i.e. commands do not have to sound like natural language), but they were still ASCII so one could talk/debug etc. using a terminal. It might be useful reading, and pretty short, I hope.

http://tgi-sci.com/misc/sdvctl.txt

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/
jmariano wrote:
[ ... ]
> I'm a part-time prototype developer at my university. My latest project is
> a box with a microcontroller that measures and actuates on stuff. The box
> is under the command of a PC, using RS232 or USB, in a master-slave model,
> the PC being the master. I want to use a message based command language,
> similar to SCPI, but not so complicated (no tree structure). I was
> thinking in something like START, STOP SETADC 1000, REAADC 1, etc.
>
> I have to define the syntax and program the parser (in C) on the uc side.
> Since I don't have very strict specification on the syntax, I can define
> it in such a way that makes it more easy to analyse, or more robust or
> etc.
>
[ ... ]

I've got a dead-simple "language" that I've used a few times in uC systems. It can accumulate a single numeric value from successive input digits, and apply postfix operators to that value. A simple example might code START and STOP as A and Z, for instance, and SET and READ as S and R; then a command sequence could be

    A1000S1RZ

Usually I use "!" as a reset that clears any accumulated value that might be left over from anything that may have happened before; whitespace characters are explicit no-ops in case people are typing this stuff in. Generally speaking, an operator using the numeric value consumes it, and zeros it so the next value can be accumulated.

It's dead simple to code the interpreter, and it's dead simple to create these command strings in a PC-based program, as long as the actions you have to perform are equally simple.

Mel.
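The accumulate-then-apply scheme described above can be sketched in a few lines of C. This is only an illustration under assumptions: the `box_state` struct and the `box_feed`/`box_run` names are hypothetical, the "actions" just record values instead of touching hardware, every operator clears the accumulator (a simplification), and overflow of the accumulated number is not checked.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical device state, standing in for real hardware actions. */
typedef struct {
    int  running;       /* set by 'A' (start), cleared by 'Z' (stop) */
    long adc_setting;   /* last value applied by 'S' (set)           */
    long read_channel;  /* last channel requested by 'R' (read)      */
    long acc;           /* number accumulated from incoming digits   */
} box_state;

/* Feed one received character to the interpreter.
   Returns 0 on success, -1 for an unknown operator. */
int box_feed(box_state *s, char c)
{
    assert(s != NULL);
    if (isdigit((unsigned char)c)) {
        s->acc = s->acc * 10 + (c - '0');   /* accumulate the value */
        return 0;
    }
    if (isspace((unsigned char)c))
        return 0;                           /* whitespace is a no-op */
    switch (c) {
    case '!': break;                        /* reset: only clears acc */
    case 'A': s->running = 1; break;
    case 'Z': s->running = 0; break;
    case 'S': s->adc_setting = s->acc; break;
    case 'R': s->read_channel = s->acc; break;
    default:  s->acc = 0; return -1;        /* unknown operator */
    }
    s->acc = 0;                             /* operators consume the value */
    return 0;
}

/* Run a whole command string; returns -1 if any character was rejected. */
int box_run(box_state *s, const char *cmds)
{
    int rc = 0;
    for (; *cmds != '\0'; cmds++)
        if (box_feed(s, *cmds) != 0)
            rc = -1;
    return rc;
}
```

Because `box_feed` takes one character at a time and keeps no buffer, it could be called directly from the receive path; feeding it the string `A1000S1RZ` starts, sets 1000, reads channel 1, and stops.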
jmariano wrote:
[ ... ]
> 1 - Regarding the language definition: Are there god examples of such
> language that I can get inspiration from? And references? I'm sure
> someone as already thought about this in a formal way. I'm looking
> for practical advice like shall I use fixed length commands (6
> character, for example), start-of-message character (#)? Why? And the
> arguments, separated by commas, spaces? etc.
>
So this is a rough sketch of how I'd attack this. I may not even use this style (bracing and whatnot) as production code, but it's shortened to fit in a Usenet post. I have not run it through a compiler.

    #include <ctype.h>
    #include <stdlib.h>
    #include <string.h>

    enum { OK, WTF, RANGE } errcode;

    // almost certainly incorrect
    int procSETADC(const char *cmd)
    {
        // parse out the argument
        const int LARGEST = 4200;
        const int SMALLEST = 12;
        const char SETADC[] = "SETADC ";

        const char *rhs = strstr(cmd, SETADC);
        if (rhs == NULL) return WTF;

        const char *converter = rhs + strlen(SETADC);
        if (!isdigit((unsigned char)*converter)) return WTF;

        int number = atoi(converter);
        if (number < SMALLEST) return RANGE;
        if (number > LARGEST) return RANGE;

        // Now actually write the ADC, with the error checking
        // and the validation and the glaivin...
        ...
        return OK;
    }

    int procSTART(const char *cmd)
    {
        ...
    }

    typedef struct {
        const char *command;
        int (*callback)(const char *command);
    } msgentry;

    static const msgentry table[] = {
        { "START ", procSTART },
        { "SETADC ", procSETADC },
        ...
    };

    static const int TABLESIZE = sizeof(table) / sizeof(table[0]);

    int eval(const char *cmdstr)
    {
        int i;
        for (i = 0; i < TABLESIZE; i++) {
            const char *cstr = table[i].command;
            const size_t len = strlen(cstr);
            if (strncmp(cmdstr, cstr, len) != 0) {
                continue;
            }
            return table[i].callback(cmdstr);
        }
        return WTF;
    }
> 2 - Regarding the parser: Is it really a parser that I need or is it
> something else? Where can I read about this? I just don't what to
> read the full dragon book just to get to the conclusion that it was
> the wrong book!
>
> Any thoughts are welcome.
>
> Regards
>
> Mariano
>
--
Les Cargill
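For anyone who wants a variant of the table-driven idea that compiles as it stands, here is a minimal sketch. It is not the code from the post: the names (`cmd_eval`, `cmd_table`, `parse_long`, the `CMD_*` codes) are made up for illustration, the ADC and run flag are plain variables standing in for hardware, and `strtol` replaces `atoi` so that trailing junk and overflow can be detected. The 12..4200 range is taken from the sketch above. Keywords are matched as whole words, so a bare `START` works and `STARTX` is rejected.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum { CMD_OK = 0, CMD_UNKNOWN, CMD_BADARG, CMD_RANGE };

/* Hypothetical state standing in for the real ADC and run flag. */
static long adc_value;
static int  running;

/* Parse a decimal argument and check it against [lo, hi]. */
static int parse_long(const char *arg, long lo, long hi, long *out)
{
    char *end;
    errno = 0;
    long v = strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || errno == ERANGE)
        return CMD_BADARG;      /* no digits, trailing junk, or overflow */
    if (v < lo || v > hi)
        return CMD_RANGE;
    *out = v;
    return CMD_OK;
}

static int do_start(const char *arg)
{
    if (*arg != '\0') return CMD_BADARG;   /* START takes no argument */
    running = 1;
    return CMD_OK;
}

static int do_stop(const char *arg)
{
    if (*arg != '\0') return CMD_BADARG;
    running = 0;
    return CMD_OK;
}

static int do_setadc(const char *arg)
{
    return parse_long(arg, 12, 4200, &adc_value);
}

typedef struct {
    const char *name;
    int (*handler)(const char *arg);
} cmd_entry;

static const cmd_entry cmd_table[] = {
    { "START",  do_start  },
    { "STOP",   do_stop   },
    { "SETADC", do_setadc },
};

/* Look the keyword up in the table and call its handler with the
   rest of the line (leading spaces skipped). */
int cmd_eval(const char *line)
{
    assert(line != NULL);
    size_t klen = strcspn(line, " ");      /* length of the keyword */
    const char *arg = line + klen;
    while (*arg == ' ')
        arg++;
    for (size_t i = 0; i < sizeof cmd_table / sizeof cmd_table[0]; i++) {
        const char *name = cmd_table[i].name;
        if (strlen(name) == klen && strncmp(line, name, klen) == 0)
            return cmd_table[i].handler(arg);
    }
    return CMD_UNKNOWN;
}
```

Adding a command is then one handler function plus one table row; the lookup loop never changes.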
On 12/14/2013 3:08 PM, jmariano wrote:
>
> The box is under the command of a PC, using RS232 or USB, in a
> master-slave model, the PC being the master. I want to use a message
> based command language, similar to SCPI, but not so complicated (no
> tree structure). I was thinking in something like START, STOP SETADC
> 1000, REAADC 1, etc.

MODBUS protocol?

Vladimir Vassilevsky
DSP and Mixed Signal Designs
www.abvolt.com
On Sat, 14 Dec 2013 13:08:50 -0800 (PST), jmariano
<jmariano65@gmail.com> wrote:

[ ... ]
Assuming you control both devices, and your commands and responses are as simple as you're describing, don't overthink it. You don't want to reinvent XML here.

You can make things pretty simple. Make a command a line, delimited by a CR (makes it easy to type these from a terminal emulator for testing), and the command is then everything from one CR to the next.

Set a reasonable limit on the length of commands (and responses). Make the command start in the first column, then a space, then comma-delimited parameters. Allow both numbers and quoted strings. Set a reasonable limit on the number of parameters. Parsing that is pretty simple; remember to allow escapes for quotes if you want those in your strings.

Keep the command handling fairly ad hoc: each handler routine would start by verifying the number and type of received parameters, then validate the parameters themselves, then execute the action and format the response.

Make the responses simple too. Start with a numeric result code ("0" = OK, anything else is an error of some sort), followed by a space, and any returned values (again, numbers and strings, comma separated). Again, minimal work to parse. If it's an error, you can leave the rest of the line as descriptive text.

Absolutely include a version command that returns an identification string and a version code for the communications protocol/command set the device supports. The controlling PC should use that at startup to adapt to the device (or refuse to communicate if it doesn't know the protocol version).

Be somewhat strict in what you accept (on both sides), and don't worry too much.

Now if you need high performance, complicated commands or responses, really long commands/responses, high enough reliability that you need to detect transmission errors, authentication, etc., you'd probably want to be more sophisticated than that.
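A possible shape for that line format in C, as a sketch only. The names `split_params` and `handle_line`, the limit of 4 parameters, the backslash as escape character, the `"DEMOBOX"` identification string and the specific error codes 1 and 2 are all assumptions made for the example; the handlers are inlined to keep it short, and SETADC just echoes its value instead of driving hardware. The caller is assumed to have stripped the CR already.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PARAMS 4   /* assumed limit on the number of parameters */

typedef struct {
    int         is_string;  /* 1 = quoted string, 0 = number */
    long        num;        /* valid when is_string == 0     */
    const char *str;        /* valid when is_string == 1     */
} param;

/* Split a comma-delimited parameter list in place.
   Returns the parameter count, or -1 on a syntax error. */
int split_params(char *p, param out[], int max)
{
    int n = 0;
    assert(p != NULL && out != NULL);
    if (*p == '\0')
        return 0;                       /* no parameters at all */
    for (;;) {
        if (n == max)
            return -1;                  /* too many parameters */
        if (*p == '"') {                /* quoted string */
            char *dst = ++p;
            out[n].is_string = 1;
            out[n].num = 0;
            out[n].str = dst;
            while (*p != '"') {
                if (*p == '\\')         /* escape: next char is literal */
                    p++;
                if (*p == '\0')
                    return -1;          /* unterminated string */
                *dst++ = *p++;
            }
            p++;                        /* skip the closing quote */
            *dst = '\0';
        } else {                        /* number */
            char *end;
            out[n].is_string = 0;
            out[n].str = NULL;
            out[n].num = strtol(p, &end, 10);
            if (end == p)
                return -1;              /* not a number */
            p = end;
        }
        n++;
        if (*p == '\0')
            return n;
        if (*p != ',')
            return -1;                  /* junk after a parameter */
        p++;
    }
}

/* Handle one command line and write the reply: a numeric result
   code, a space, then returned values or descriptive text. */
void handle_line(char *line, char *reply, size_t size)
{
    param prm[MAX_PARAMS];
    char *args = strchr(line, ' ');
    if (args != NULL)
        *args++ = '\0';                 /* terminate the command word */
    else
        args = line + strlen(line);     /* empty parameter list */
    int n = split_params(args, prm, MAX_PARAMS);

    if (n < 0)
        snprintf(reply, size, "2 malformed parameters");
    else if (strcmp(line, "VERSION") == 0 && n == 0)
        snprintf(reply, size, "0 \"DEMOBOX\",1");
    else if (strcmp(line, "SETADC") == 0 && n == 1 && !prm[0].is_string)
        snprintf(reply, size, "0 %ld", prm[0].num);
    else
        snprintf(reply, size, "1 unknown command or wrong parameters");
}
```

With this sketch, the line `SETADC 1000` produces the reply `0 1000`, and `VERSION` produces `0 "DEMOBOX",1`.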
On 14/12/13 21:08, jmariano wrote:
[ ... ]
On trap that I fell into myself once, and have seen other people fall into /many/ times is:
 - it is a very limited requirement, just a very few "peek/poke" commands
 - but we don't know everything at the outset, so we'll put in hooks to add new commands
   Let's implement it as nothing more than a set of macros
 - oh, we need to have arguments evaluated
 - it would be much neater if we had if-then-else
 - and loops

At which point the very limited requirements have mutated to produce something that a bastard language that grows like Topsy
<http://cjewords.blogspot.co.uk/2009/08/growd-like-topsy.html>

If that is a possibility, it is probably much cleaner simpler, faster (speed and soon) to embed a Forth interpreter from the outset.

Yes, I know the XP/agile fraternity will frown on that. Tough; some of that brigade doesn't know their limits!

Alternatively, I'm merely reminding you of the aphorism "Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp."
Sigh, it escaped a little too fast; minor edits below.

On 15/12/13 08:53, Tom Gardner wrote:
> On 14/12/13 21:08, jmariano wrote:
>> [ ... ]
One trap that I fell into myself once, and have seen other people fall into /many/ times, is:
 - it is a very limited requirement, just a very few "peek/poke" commands
 - but we can't define everything at the outset, so we'll put in hooks to add new commands
   Let's implement it as nothing more than a set of macros
 - oh, we need to have arguments evaluated
 - it would be much neater if we had if-then-else
 - and loops

At which point the very limited requirements have mutated to produce a bastard language that grows like Topsy
<http://cjewords.blogspot.co.uk/2009/08/growd-like-topsy.html>
especially the last paragraph.

If that is a possibility, it is probably much cleaner, simpler, and faster (i.e. execution speed and delivery date) to embed a Forth interpreter from the outset.

Yes, I know the XP/agile fraternity will frown on that. Tough; some of that brigade doesn't know their limits!

Alternatively, I'm merely reminding you of the aphorism "Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp."
On Sun, 15 Dec 2013 00:51:06 -0600, Robert Wessel
<robertwessel2@yahoo.com> wrote:

>On Sat, 14 Dec 2013 13:08:50 -0800 (PST), jmariano
><jmariano65@gmail.com> wrote:
>
>>[ ... ]
>
>Assuming you control both devices, and your commands and responses are
>as simple as you're describing, don't over think it. You don't want
>to reinvent XML here.
>
>You can make things pretty simple. Make a command a line, delimited
>by a CR (makes it easy to type these from a terminal emulator for
>testing), and the command is then everything from one CR to the next.
There are advantages to also having a separate character as the start-of-message indicator, such as SOH, "!" (exclamation mark), ":" (colon), etc. A line on which the other end has just powered up (transient), or a line that has been idle for a long time, can collect garbage characters, which can easily be ignored until the next valid start character (Modbus RTU is an example that violates this principle).

If CR is used as the message terminator, an additional LF can be appended after a valid message to tidy up the display on a terminal emulator. This does not harm the decoding of the next message, since everything (including the LF) is ignored until the start character.

With manual command entry, there is an issue with BCC/CRC, since it is quite hard to generate those manually on the fly. There should be a way to switch this feature on and off, either with a separate command or by specifying that some specific CRC value, such as 00 or FFh, means that the CRC check should not be performed (compare with the UDP header checksum, which IPv4 allows to be sent as zero to mean "not computed").
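The framing rule described here (discard everything until a start character, collect until CR, ignore the trailing LF) fits in a small byte-at-a-time receiver. The sketch below assumes ":" as the start character and an 80-character payload limit; the `framer` struct and `framer_feed` name are hypothetical, and the optional BCC/CRC check is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_START ':'   /* assumed start-of-message character */
#define FRAME_END   '\r'  /* CR terminates a message            */
#define FRAME_MAX   80    /* assumed maximum payload length     */

typedef struct {
    int    in_frame;              /* 0 = hunting for FRAME_START */
    size_t len;                   /* payload bytes collected     */
    char   buf[FRAME_MAX + 1];    /* payload plus terminating NUL */
} framer;

/* Feed one received byte. Returns 1 when buf holds a complete,
   NUL-terminated message, 0 otherwise. */
int framer_feed(framer *f, char c)
{
    assert(f != NULL);
    if (c == FRAME_START) {       /* (re)start, dropping any partial frame */
        f->in_frame = 1;
        f->len = 0;
        return 0;
    }
    if (!f->in_frame)
        return 0;                 /* garbage, LF, etc. between frames */
    if (c == FRAME_END) {
        f->buf[f->len] = '\0';
        f->in_frame = 0;
        return 1;
    }
    if (f->len == FRAME_MAX) {    /* overlong: abandon and hunt again */
        f->in_frame = 0;
        return 0;
    }
    f->buf[f->len++] = c;
    return 0;
}
```

A start character arriving in the middle of a frame simply restarts collection, so a message truncated by noise cannot swallow the one that follows it. The price is that the start character cannot appear in the payload unless some escaping is added.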
>Set a reasonable limit on the length of commands (and responses). Make
>the command start in the first column, then a space, then comma
>delimited parameters. Allow both numbers and quoted strings. Set a
>reasonable limit on the number of parameters. Parsing that is pretty
>simple; remember to allow escapes for quotes if you want those in your
>strings.
I have used one (or more) spaces and/or tabs as token separators to help readability. On the receiving side, the message function code and the positional parameters are easily separated into tokens (e.g. null-terminated strings), after handling the escape sequences if those are needed for strings containing spaces.

Since CPU power is not going to be an issue, after splitting the message into token strings, try to decode each token into a decimal/hexadecimal value regardless of whether it is needed. Then perform a table search for the supported function codes, such as "W" (Write), "RD" (ReaD) or "ACK", to get a function code index, and branch through a table on that index. Once the function, and hence the required parameters, are known, just fetch the pre-decoded numeric parameters with a single assignment for each parameter.

If you are really low on RAM (but have plenty of ROM), i.e. you can't accommodate an 80-255 character command/message line, you have to do the parsing on a token-by-token basis. In the worst case, with _very_ little RAM, you might even have to do the ASCII decimal (or hexadecimal) to integer conversion for each received digit in the receiver interrupt service routine. However, this requires quite a complex state machine, and the error recovery might be quite nasty.

Depending on the capabilities of the slave device, the available solutions might be quite different.

While in the past devices communicated at very low data rates such as 110, 300 or 1200 bit/s, the absolute minimum these days seems to be 9600 bit/s (about 1 byte/ms), while 115k2 (about 12 bytes/ms) with multibyte receiver FIFOs seems to be normal. Thus, there is not much point in trying to optimize message frame sizes with binary (or even compressed binary) communication for ad hoc protocols. Just use an ASCII protocol, which simplifies development and helps problem solving with simple tools such as terminal emulators.
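The split, pre-decode, then table-search sequence might look like the following in C. Treat it as a sketch under assumptions: the function names, the 8-token limit and the 16-entry `regs` array are invented for the example, quoted strings with embedded spaces are not handled, and `strtok` is used for brevity (it is not reentrant). Passing base 0 to `strtol` accepts decimal and `0x`-prefixed hexadecimal, with the side effect that a leading `0` means octal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TOKENS 8   /* assumed limit on tokens per message */

typedef struct {
    char *text;    /* NUL-terminated token inside the line buffer */
    long  value;   /* pre-decoded number, valid when is_num != 0  */
    int   is_num;
} token;

/* Split a line in place on spaces/tabs and pre-decode every token.
   Returns the token count, or -1 if there are more than max tokens. */
int tokenize(char *line, token tok[], int max)
{
    int n = 0;
    assert(line != NULL && tok != NULL);
    char *t = strtok(line, " \t");
    while (t != NULL) {
        char *end;
        if (n == max)
            return -1;
        tok[n].text = t;
        tok[n].value = strtol(t, &end, 0);   /* decimal or 0x hex */
        tok[n].is_num = (end != t && *end == '\0');
        n++;
        t = strtok(NULL, " \t");
    }
    return n;
}

enum { FN_WRITE, FN_READ, FN_ACK, FN_UNKNOWN };

static const char *const fn_names[] = { "W", "RD", "ACK" };

/* Table search: map the first token to a function code index. */
int lookup_fn(const char *name)
{
    for (int i = 0; i < FN_UNKNOWN; i++)
        if (strcmp(name, fn_names[i]) == 0)
            return i;
    return FN_UNKNOWN;
}

/* Hypothetical register file written by "W <addr> <value>". */
static long regs[16];

/* Returns 0 if the message was accepted, -1 otherwise. */
int dispatch(char *line)
{
    token tok[MAX_TOKENS];
    int n = tokenize(line, tok, MAX_TOKENS);
    if (n <= 0)
        return -1;
    switch (lookup_fn(tok[0].text)) {   /* branch on the function index */
    case FN_WRITE:
        if (n != 3 || !tok[1].is_num || !tok[2].is_num ||
            tok[1].value < 0 || tok[1].value >= 16)
            return -1;
        regs[tok[1].value] = tok[2].value;  /* one assignment per parameter */
        return 0;
    case FN_READ:
        if (n != 2 || !tok[1].is_num ||
            tok[1].value < 0 || tok[1].value >= 16)
            return -1;
        return 0;   /* a real handler would send regs[...] back */
    case FN_ACK:
        return n == 1 ? 0 : -1;
    default:
        return -1;
    }
}
```

Because every token is decoded up front, each handler only checks `is_num` and the value range; no handler has to call a conversion routine itself.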
>Absolutely include a version command that returns some an
>identification string and a version code for the communications
>protocol/command set the device supports. The controlling PC should
>use that at startup to adapt to the device (or refuse to communicate
>if it doesn't know the protocol version).
Since the PC software is easy to update, but the slave firmware is not, the master must know how to handle those old PLCs.
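On the PC side, the version check could be as small as the sketch below. It assumes the reply layout suggested earlier in the thread (result code, space, quoted identification string, comma, protocol version); the `negotiate` name and the supported range 1..3 are placeholders, not part of any real device.

```c
#include <assert.h>
#include <stdio.h>

#define PROTO_MIN 1   /* oldest protocol version this master still speaks (assumed) */
#define PROTO_MAX 3   /* newest protocol version this master knows (assumed)        */

/* Parse a version reply such as:  0 "DEMOBOX",2
   Copies the identification string into id (at most 31 characters).
   Returns the protocol version to use, or -1 to refuse to communicate. */
int negotiate(const char *reply, char id[32])
{
    int code, version;
    assert(reply != NULL && id != NULL);
    if (sscanf(reply, "%d \"%31[^\"]\",%d", &code, id, &version) != 3)
        return -1;                /* not a well-formed version reply */
    if (code != 0)
        return -1;                /* device reported an error */
    if (version < PROTO_MIN || version > PROTO_MAX)
        return -1;                /* unknown protocol version: refuse */
    return version;
}
```

The master would then switch on the returned version to pick the command set for that device, which keeps old firmware usable after the PC software has moved on.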
>Be somewhat strict in what you accept (on both sides), and don't worry
>too much.

In the real world, you need to be quite strict about what you send, but you should expect any interpretation (according to the standards) of what the other party wants to tell you. In many standards, read carefully which features are designated as Mandatory/Optional.
On 15.12.13 02:10, Vladimir Vassilevsky wrote:
> On 12/14/2013 3:08 PM, jmariano wrote:
>>
>> The box is under the command of a PC, using RS232 or USB, in a
>> master-slave model, the PC being the master. I want to use a message
>> based command language, similar to SCPI, but not so complicated (no
>> tree structure). I was thinking in something like START, STOP SETADC
>> 1000, REAADC 1, etc.
>
> MODBUS protocol?
>
>
> Vladimir Vassilevsky
> DSP and Mixed Signal Designs
> www.abvolt.com

Before he starts on it - DO NOT!

Modbus has been designed with far too many networking blunders.

--
-Tauno Voipio