Extended BNF grammar for Formalized English (FE)

The grammar below is not up to date. For example, coreferences may now be written before or after the quantifier and concept type: "Tom the cat", "the cat Tom" and "the cat named Tom" are all allowed. Unless it is preceded by ",", "and" or "(", each relation must now be preceded by "is", "are", "has" or "have", as in "Tom is on a table that is on a mat that is near a bed". This is needed to obtain an LR(1) grammar, and it also prevents human misreadings of which relation is connected to which concept.

Notation: "?" means 0 or 1 times, "*" means 0 to N times, and "+" means 1 to N times.

FE           := (Tree ("."|"?"))+
Tree         := Concept Branches*
QuotedTree   := "~"? "`" Tree "'" Context?
Context      := "(" Branches2 ")"
Branches     := With Relation1 Tree (And With? Relation Tree)*
              | With? Relation2 Tree (And With? Relation Tree)*
              | "is" ("a"|"an") Tree (And "is" ("a"|"an") Tree)*
Branches2    := With? Relation Tree (And With? Relation Tree)*
              | "is" ("a"|"an") Tree (And "is" ("a"|"an") Tree)*
With         := ("with"|"at"|"has for"|"have for"|"for"|"is"|"are"
                | ("can"|"may") ("be"|"have for")) "the"?
And          := "and" | ","
Relation     := Relation1 | Relation2
Relation1    := (RelationType|Coreference) "of"? Annotation? Context? "<="?
Relation2    := ("=>" | "<=>" | "<=") Concept
              | ("=" | "!=" | "<" | "=<" | ">" | ">=" | "or") Concept
RelationType := Term_or_string
Coreference  := "*" Term_or_number
Concept      := ConceptCore Annotation?
ConceptCore  := CorefOrIndiv Quantifier Restrictor CQ?
              | Quantifier Restrictor CorefOrIndiv? CQ?
              | GroupOf Quantifier? Restrictor CorefDecl? Collection?
              | GroupOf Quantifier? CorefDecl? Collection
              | (Number | "~" Coreference | CQ | CorefOrIndiv CQ?)
CorefOrIndiv := CorefDecl | "named"? Term_or_string ("\\" ConceptType)?
                //Term_or_string: individual (ex: Tom) or attribute (ex: high)
CorefDecl    := "*" Term_or_number
              | "*" Term_or_number "!=" "*" Term_or_number
              | "*" Term_or_number "!=" Term_or_number
CQ           := Collection | QuotedTree
Restrictor   := Qualifier? ConceptType
              | Qualifier? "[" ConceptType Branches "]"
ConceptType  := Term_or_string
Qualifier    := "good" | "bad" | "important" | "small" | "big" | "great" | "certain"
Quantifier   := "a" | "an" | "some" | "the" | "any" | "every"
              | "most" "of"? "the"?
              | "at" "least" Number "%"? "of"? "the"?
              | "at" "most" Number "%"? "of"? "the"?
              | "between" Number "%"? "and" Number "%"? "of"? "the"?
              | Number "to" Number "%"? "of"? "the"?
              | "from" Number "to" Number "%"? "of"? "the"?
              | "mostly"
              | "several" "of"? "the"?
              | Number "%"? "of"? "the"?
              | ("many"|"few"|"dozens"|"hundreds"|"thousands"|"millions"|"billions") "of"? "the"?
GroupOf      := CorefOrIndiv? ("a"|"the") ("group" "of" | "bag" "of" | "set" "of" | "sequence" "of" | "alternative")
              | "together"
Collection   := "{" (Set|Bag|OrderedSet|OrderedBag|XOR_Set|OR_Bag) "}" CollSize?
Set          := Element ("," Element)*
Bag          := Element ("&" Element)*
OrderedSet   := Element ("<" Element)*
OrderedBag   := Element ("=<" Element)*
XOR_Set      := Element ("/" Element)*
OR_Bag       := Element ("|" Element)*
Element      := Concept | "*"
CollSize     := "@" Number
Term_or_number := Term | number
Term_or_string := Term | string
Term         := TermLetter1 TermLetter*
TermLetter1  := [a-z] | "#"[a-z]
TermLetter   := [a-z] | "#" | "_" | "-" | "/" | "?" | "&" | "~" | Digit
              | [.?][a-z0-9?#~]   //thus "." ok within a term but not at the end
              | "://"             //thus a URL may be a term
Number       := ("+"|"-")? Digit+ ("." Digit*)?
Digit        := [0-9]

//Additional notes on the lexical parsing:
- uppercase letters are parsed as if they were lowercase letters
- white spaces and the HTML non-breaking space encoding "&nbsp;" are ignored
- Java/C++ comments ("/* ... */" and "//...") are ignored
- HTML tags are ignored but the content of HTML comments is parsed
- annotations are enclosed within "(^" and "^)"
- strings may be double quoted or enclosed within "$(" and ")$" (because of the use of quotes and backquotes to embed sentences, strings cannot also be single quoted in FE)
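
The Term and Number rules and the first three lexical notes can be illustrated with a short Python sketch. It is not the FE parser: the names NUMBER_RE, TERM_RE, normalize, is_term and is_number are hypothetical, and the lookbehind that keeps "//" inside "://" from being read as a comment is an assumption, since the notes do not say how the two are told apart. Strings, annotations and HTML tags are not handled here.

```python
import re

# Number := ("+"|"-")? Digit+ ("." Digit*)?
NUMBER_RE = re.compile(r"[+-]?[0-9]+(?:\.[0-9]*)?")

# Term := TermLetter1 TermLetter*
TERM_RE = re.compile(
    r"(?:[a-z]|#[a-z])"        # TermLetter1: a letter, or "#" followed by a letter
    r"(?:[a-z#_\-/?&~0-9]"     # TermLetter: single-character alternatives and Digit
    r"|[.?][a-z0-9?#~]"        # "." or "?" followed by one more term character
    r"|://)*"                  # "://", so that a URL may be a term
)


def normalize(source: str) -> str:
    """Apply the case, "&nbsp;" and comment rules (simplified sketch)."""
    text = source.lower()                               # uppercase parsed as lowercase
    text = text.replace("&nbsp;", " ")                  # HTML non-breaking space encoding
    text = re.sub(r"/\*.*?\*/", " ", text, flags=re.S)  # /* ... */ comments
    # "//..." comments; ASSUMPTION: "//" right after ":" belongs to a URL term
    text = re.sub(r"(?<!:)//[^\n]*", " ", text)
    return text


def is_term(token: str) -> bool:
    """True if the whole token matches the Term rule."""
    return TERM_RE.fullmatch(token) is not None


def is_number(token: str) -> bool:
    """True if the whole token matches the Number rule."""
    return NUMBER_RE.fullmatch(token) is not None
```

With these definitions, is_term("tom") and is_term("http://www.example.org/x") hold, while is_term("tom.") does not, which matches the remark that "." is allowed within a term but not at its end.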