| Register | FAQ | Calendar | Search | Today's Posts | Mark Forums Read |
|
#1
| |||
| |||
| I've been using BNF. up until now, things have been OK with the optional symbols. what is the proper way to implement an optional symbol where <code40> is optional? <b> ::= <code40> "r" | "r" but I would prefer <b> ::= <a> "r" <a> ::= <code40> | <nullsymbol???> I don't know. I don't think this is right when it occurs 0 or 1 times. I seem to remember there was a symbol in BNF used to represent the null terminal or something - it's not listed in wikipedia (lousy wikipedia). what was it? Anyway, the problem I am up against is that I have to define a symbol that has 18 options. fully expanded, that makes 2^18 options, and I have 200 more symbols to define for autocad, many with more options... [Either is OK for optional symbols. For your Autocad problem, I would suggest a much looser grammar that just accepts a list of symbols, and check semantically to knock out duplicates. That both keeps the grammar managable, and lets you produce much better errors, e.g. "duplicate foo" or "foo not allowed after bar" rather than just "syntax error". -John] |
|
#2
| |||
| |||
| jmichae3@yahoo.com writes: > I've been using BNF. > up until now, things have been OK with the optional symbols. > > what is the proper way to implement an optional symbol where <code40> > is optional? > <b> ::= <code40> "r" | "r" > but I would prefer > <b> ::= <a> "r" > <a> ::= <code40> | <nullsymbol???> > > I don't know. I don't think this is right when it occurs 0 or 1 > times. I seem to remember there was a symbol in BNF used to represent > the null terminal or something - it's not listed in wikipedia (lousy > wikipedia). what was it? I suspect you may be remembering two different things, either of which might be useful to you: (1) If you use your second option, then you are looking for some conventional symbol to use for the empty string, in the place where you have <nullsymbol???>. In grammars that are written in relatively mathematical contexts, the conventional symbol to use for this is the Greek letter epsilon. On the other hand, people who use the Greek letter also tend to use an arrow symbol where you have ::=, etc. So, what do people do who are limiting themselves to the ASCII character set? The answer is that most adopt some particular nonterminal like <epsilon> or <empty> or <null> to mean the same thing -- but there is no standardization of this. Whichever one you use, you don't need to just rely on readers to understand it, nor do you need to include a prose explanation. Instead, you can define your nonterminal using a production with a totally empty right-hand side: <nullsymbol> ::= Of course, that may need a prose explanation too, depending on your audience. But the good news is that it appears only once in your grammar, and elsewhere you use the relatively self-documenting <nullsymbol> (or whatever you call it) as opposed to having the whole grammar strewn with productions with empty right-hand sides, as in <a> ::= <code40> | where the second alternative for <a> is empty, but looks like a mistake. (2) The other thing you may be remembering is that there are extended versions of BNF, known as EBNF, where additional notations can be included on the right-hand side of productions in order to indicate such matters as optional portions. This would allow a third alternative way of writing your grammar, conceptually similar to your third one but a little tidier, because it doesn't require introducing the nonterminal <a>. Unfortunately, there is again no standardization of EBNF notation. The two most common ways of indicating optional elements are probably enclosing them in square brackets or following them with question marks. So, your example would wind up looking like <b> ::= [<code40>] "r" or <b> ::= <code40>? "r" > > Anyway, the problem I am up against is that I have to define a symbol > that has 18 options. fully expanded, that makes 2^18 options, and I > have 200 more symbols to define for autocad, many with more options... > > [Either is OK for optional symbols. For your Autocad problem, I would > suggest a much looser grammar that just accepts a list of symbols, and > check semantically to knock out duplicates. That both keeps the > grammar managable, and lets you produce much better errors, e.g. > "duplicate foo" or "foo not allowed after bar" rather than just > "syntax error". -John] John's use of the phrase "produce much better errors" sounds like he thinks you are using your grammar to produce a parser, as opposed to just writing the grammar down as documentation. I'm not clear whether that is in fact the case. But even if you are just writing the grammar down for humans to read, there's a lot to what John says. John's sugestion is particularly apt if the options can appear in any order -- which I suspect is the case. In that case, you have far more than 2^18 fully-expanded possibilities -- not that you would want to write even 2^18 productions. Moreover, there is no nice trick of BNF or EBNF that would come to your rescue. Your approach using <a> to represent an optional <code40>, defined using <nullsymbol> (or some other notation for the empty string) would work OK for 18 options, which need to appear in a fixed order if used. It would be mildly clutzy, since you'd need 18 new nonterminals like your <a>. (I'd name them <optionalCode40> and so forth.) If you are fortunate enough to use EBNF, but again have a required fixed order, the clutziness would go away. You could just write the 18 options out with each in brackets or followed by a question mark, and life would be good. Or to adopt John's phrase, you would have kept your grammar manageable. But nothing like that is possible if the order is variable. In that case, even if you are writing for humans, your best bet is to escape from the BNF or EBNF into some more general means of communication -- which for a human, might well be a natural-language explanation. -Max Hailperin Professor of Computer Science Gustavus Adolphus College |
|
#3
| |||
| |||
| On Aug 16, 2:22 pm, Max Hailperin <m...@gustavus.edu> wrote:> > John's sugestion is particularly apt if the options can appear in any > order -- which I suspect is the case. In that case, you have far more > than 2^18 fully-expanded possibilities -- not that you would want to > write even 2^18 productions. Moreover, there is no nice trick of BNF > or EBNF that would come to your rescue. Well, theoretically one could write a program to generate the output but I think with 2^18 alternatives no one is going to read that file anyway. > Your approach using <a> to represent an optional <code40>, defined > using <nullsymbol> (or some other notation for the empty string) would > work OK for 18 options, which need to appear in a fixed order if used. > It would be mildly clutzy, since you'd need 18 new nonterminals like > your <a>. (I'd name them <optionalCode40> and so forth.) > > If you are fortunate enough to use EBNF, but again have a required > fixed order, the clutziness would go away. You could just write the > 18 options out with each in brackets or followed by a question mark, > and life would be good. Or to adopt John's phrase, you would have > kept your grammar manageable. > > But nothing like that is possible if the order is variable. In that > case, even if you are writing for humans, your best bet is to escape > from the BNF or EBNF into some more general means of communication -- > which for a human, might well be a natural-language explanation. In other grammars I have seen the preferred method to deal with this situation is to write something like this: rule: modifier+; modifier: A | B; One has to check afterwards though, if a modifier is never repeated (also add a note in the grammar). I actually would choose this solution even for a fixed order, as one can provide a better error message if you provide the input in the wrong order. Johannes |
|
#4
| |||
| |||
| > If you are fortunate enough to use EBNF, but again have a required > fixed order, the clutziness would go away. You could just write the > 18 options out with each in brackets or followed by a question mark, > and life would be good. Or to adopt John's phrase, you would have > kept your grammar manageable. > > But nothing like that is possible if the order is variable. In that > case, even if you are writing for humans, your best bet is to escape > from the BNF or EBNF into some more general means of communication -- > which for a human, might well be a natural-language explanation. > > -Max Hailperin > Professor of Computer Science > Gustavus Adolphus College I did not know about following an optional symbol with ? but I did miss in wikipedia where they surround optionals with []. My options, as far as I can tell from the reference, are in a sequence - fortunately for me - I think. otherwise I would write it as <option> := "a" | "b" | "c" | "d" "e" | "f" <sequence> ::= <option>* I guess this means I am going to have to "look ahead" |
|
#5
| |||
| |||
| On minor point is response to what Max Hailperin <max@gustavus.edu> writes: > But nothing like that is possible if the order is variable. In that > case, even if you are writing for humans, your best bet is to escape > from the BNF or EBNF into some more general means of communication -- > which for a human, might well be a natural-language explanation. That's not completely true. I distinctly recall reading a paper where the author introduced an EBNF notation for generating permuations, which is exactly what this problem calls for. Now, permutations don't have trivial tranformations into BNF, in the sense that to describe permutations in BNF one must list each of the alternatives and that grows quite rapidly as the number of symbols to be permuted increases. However, it is quite feasible to generate code that recognizes permutations, and thus adding an operator that does such to one's EBNF is reasonable to my mind. Others may object that EBNF should be restricted to allowing only things expressible as regular expressions on the right-hand-side. However, as I remember the notation, it was restricted to combining fixed lists of symbols, so it technically was a regular expression, just not a traditional one. I did a Google search looking for permutation grammars but didn't find anything on the first page, although the link to Royal Holloway seemed promising, as Adrian Johnstone and Elizabeth Scott could easily be interested in such an extension. Note, if someone find this paper, it would be nice to see the reference posted here. I wouldn't mind reading that paper again. Hope this helps, -Chris ************************************************** **************************** Chris Clark Internet: christopher.f.clark@compiler-resources.com Compiler Resources, Inc. or: compres@world.std.com 23 Bailey Rd Web Site: http://world.std.com/~compres Berlin, MA 01503 voice: (508) 435-5016 USA fax: (978) 838-0263 (24 hours) |
|
#6
| |||
| |||
| > > But nothing like that is possible if the order is variable. In that > > case, even if you are writing for humans, your best bet is to escape > > from the BNF or EBNF into some more general means of communication -- > > which for a human, might well be a natural-language explanation. > > That's not completely true. I distinctly recall reading a paper where > the author introduced an EBNF notation for generating permuations, > which is exactly what this problem calls for. Now, permutations don't > have trivial tranformations into BNF, in the sense that to describe > permutations in BNF one must list each of the alternatives and that > grows quite rapidly as the number of symbols to be permuted increases. > However, it is quite feasible to generate code that recognizes > permutations, and thus adding an operator that does such to one's EBNF > is reasonable to my mind. Others may object that EBNF should be > restricted to allowing only things expressible as regular expressions > on the right-hand-side. However, as I remember the notation, it was > restricted to combining fixed lists of symbols, so it technically was > a regular expression, just not a traditional one. I believe the reference Chris is referring to is: [Cameron 1993] Robert D. Cameron, "Extending Context-Free Grammars with Permutation Phrases," LOPLAS, Vol. 2 No. 1-4, pp. 85-94, 1993. Meta-S grammars have both strict and relaxed permutation phrases. S ::= <<A B C>>; A ::= "amor"; B ::= "vincit"; C ::= "omnia"; Ignoring whitespace rules, the above would match any permutation of "amor vincit omnia". A, B, and C must all generate strings, once and only once, and must not generate the empty string, but the order of A, B, and C is free. S ::= <<{A B C}>>; // the { } syntax means that one of the terms can generate lambda A ::= "amor"; B ::= "vincit"; C ::= ["omnia"]; In the above, C is optional. So, the grammar can generate "vincit amor" (for instance). Several other grammar formalisms have allowed for permutation phrases, as well. I am not certain if they allow for both the lambda-generating and non-lambda stricter form. Cheers, -- Quinn Tyler Jackson |
![]() |
| Thread Tools | |
| Display Modes | |
In an effort to better serve ads to our visitors, cookies are used on objectmix.com. For more information, check out our Privacy Policy.