cl-lexer (1-4) README

Summary

 README |   74 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)

    

Patch contents

--- cl-lexer-1.orig/README
+++ cl-lexer-1/README
@@ -0,0 +1,74 @@
+LEXER package
+
+The LEXER package implements a lexical-analyzer-generator called DEFLEXER,
+which is built on top of both REGEX and CLAWK. Many of the optimizations in the
+recent rewrite of the regex engine went into optimizing the sorts of patterns
+generated by DEFLEXER.
+
+The default lexer doesn't implement full greediness. If you have a rule for
+ints followed by a rule for floats, the int rule will match on the part before
+the decimal before the float rule gets a chance to look at it. You can fix this
+by specifying :flex-compatible as the first rule. This gives all patterns a
+chance to examine the text and takes the one that matches the longest string
+(first pattern wins in case of a tie). The downside of this option is that it
+slows down the analyzer. If you can solve the issue by reordering your rules,
+that's the way to do it.
+
+I'm currently writing an AWK->CLAWK translator using this as the lexer, and
+it's working fine. As far as I can tell, the DEFLEXER-generated lexing
+functions should be fast enough for production use.
+
+Currently, the LEX/FLEX/BISON feature of switching productions on and off using
+state variables is not supported, but it's a pretty simple feature to add. If
+you're using LEXER and discover you need this feature, let me know.
+
+It also doesn't yet support prefix and postfix context patterns. This isn't
+quite so trivial to add, but it's planned for a future release of regex, so
+LEXER will be getting it someday.
+
+Anyway, here's a simple DEFLEXER example:
+
+  (deflexer test-lexer
+    ("[0-9]+([.][0-9]+([Ee][0-9]+)?)"
+      (return (values 'flt (num %0))))
+    ("[0-9]+"
+      (return (values 'int (int %0))))
+    ("[:alpha:][:alnum:]*"
+      (return (values 'name %0)))
+    ("[:space:]+") )
+
+  > (setq *lex* (test-lexer "1.0 12 fred 10.23e45"))
+  <closure>
+ 
+  > (funcall *lex*)
+  FLT
+  1.0
+ 
+  > (funcall *lex*)
+  INT
+  12
+ 
+  > (funcall *lex*)
+  NAME
+  "fred"
+
+  > (funcall *lex*)
+  FLT
+  1.0229999999999997E46
+
+  > (funcall *lex*)
+  NIL
+  NIL
+
+You can also write this lexer using the :flex-compatible option, in which case
+you can write the int and flt rules in any order.
+
+  (deflexer test-lexer
+    :flex-compatible
+    ("[0-9]+"
+      (return (values 'int (int %0))))
+    ("[0-9]+([.][0-9]+([Ee][0-9]+)?)"
+      (return (values 'flt (num %0))))
+    ("[:space:]+")
+  )
+