2.1: adda/syntax/parser algorithm for postfix operators:
. when checking for existence,
the word (is) could be used in 2 ways:
(is x) checks for existence,
(x is t) checks for x's type = t .
. another idea is having support for
english-style postfix operators:
then you could write (x exists)
and you'd parse this like an infix
except you're not looking for a 2nd arg .
. its syntax for being declared
could be similar to that of other functions:
myPrefix(arg.argType).ReturnType,
(arg.t)myPostfix.ReturnType,
myInfix(arg1,arg2:t).ReturnType .
-- anything with 2 args is an infix operator .
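. a minimal sketch of parsing a postfix operator "like an infix, except you're not looking for a 2nd arg". The operator tables and the tuple shape of the tree are assumptions made for illustration, not adda's actual design:

```python
# Hypothetical operator tables; the words come from the examples above.
POSTFIX = {"exists"}
INFIX = {"is"}

def parse_expr(words):
    """Parse a flat list of words into a nested tuple tree, left to right."""
    words = list(words)
    if not words:
        raise ValueError("empty expression")
    # (is x): the prefix use of "is", checking for existence
    if words[0] == "is" and len(words) == 2:
        return ("is", words[1])
    tree = words.pop(0)
    while words:
        op = words.pop(0)
        if op in POSTFIX:
            # parsed like an infix, but no 2nd arg is consumed
            tree = (op, tree)
        elif op in INFIX:
            if not words:
                raise ValueError(f"infix {op!r} is missing its 2nd arg")
            tree = (op, tree, words.pop(0))
        else:
            raise ValueError(f"unknown operator {op!r}")
    return tree
```

Under these assumptions, `parse_expr(["x", "exists"])` yields a one-argument node and `parse_expr(["x", "is", "t"])` a two-argument node.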
2013-03-31
2012-08-01
generic parser
2.23: adda/translate/generic parser:
. if there is a generic parser,
then it has to record spaces(n) and new lines,
because the meaning of symbol sequences
will often depend on whether a space
is separating them .
. this may not help me anyway since my interpretation
can depend on whether a name is a type id or not?
. one thing that does simplify things
is a parser that finds the enclosures and delimiters
-- {} () [] , ; : . --
and builds a tree from them,
along with some other un-redefinables .
. it also supports implicit enclosures
(the use of indentation to indicate a grouping)
so it needs to translate ( newline & (some spaces) )
into the beginning or continuation of an enclosure;
and, there needs to be a parameter for
what the currently expected indenting level is .
. if it can find (symbol.type) as being a type def,
then it can also find ( .symbol ) as being a type; [2.29:
but, types shouldn't require a preceding dot .]
. strings of alphanum's that start with a letter
are always names .
. enclosures include set generators .
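. a rough sketch of the two ideas above: a tokenizer that records spaces(n) and new lines, and a pass that turns ( newline & (some spaces) ) into the opening or closing of an implicit enclosure. The token kinds, the function names, and the ("open", n) / ("close", n) markers are all assumptions for illustration:

```python
import re

# Token kinds are hypothetical; names start with a letter, as in the notes.
TOKEN = re.compile(r"""
    (?P<newline>\n)
  | (?P<spaces>[ ]+)
  | (?P<name>[A-Za-z][A-Za-z0-9]*)
  | (?P<number>[0-9]+)
  | (?P<delim>[{}()\[\],;:.])
  | (?P<other>.)
""", re.VERBOSE)

def tokenize(text):
    """Return (kind, value) pairs; a run of spaces is recorded as its count."""
    tokens = []
    for m in TOKEN.finditer(text):
        kind = m.lastgroup
        value = len(m.group()) if kind == "spaces" else m.group()
        tokens.append((kind, value))
    return tokens

def implicit_enclosures(tokens):
    """Replace line-leading indentation with ("open", n) / ("close", n) markers."""
    levels = [0]          # stack of currently expected indenting levels
    out = []
    at_line_start, pending = True, 0
    for kind, value in tokens:
        if kind == "newline":
            at_line_start, pending = True, 0
            continue
        if at_line_start and kind == "spaces":
            pending = value           # indentation of the line being started
            continue
        if at_line_start:
            if pending > levels[-1]:
                levels.append(pending)
                out.append(("open", pending))
            while pending < levels[-1]:
                levels.pop()
                out.append(("close", levels[-1]))
            at_line_start = False
        out.append((kind, value))     # spaces inside a line are kept
    while len(levels) > 1:            # close whatever is still open at the end
        levels.pop()
        out.append(("close", levels[-1]))
    return out
```

Spaces inside a line stay in the token stream, so a later pass can still ask whether two symbols were separated by a space.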
Labels:
adda,
parse,
syntax,
translation
2012-03-31
parser uses string descriptors
3.3: adda/translate/parser/use string descriptors:
. one way to parse a string
is to use the lexical analyzer to find the relevant substrings,
and then make lexical nodes that point at copies of those substrings;
the way I'm thinking of now
uses string descriptors instead of copies:
put the string in an array of characters,
and then a descriptor is an index into that array,
along with the intended length .
. the advantage to this way is that
if the parser gets confused,
a repair algorithm can work with the result
searching not only the tree,
but also the string,
and seeing how they relate .
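. a small sketch of string descriptors, assuming a descriptor is just (start index, length) into the source. The class name, its fields, and the whitespace-splitting lexer are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Descriptor:
    start: int     # index into the source character array
    length: int    # intended length of the lexeme

    def text(self, source):
        """Recover the lexeme without ever having copied it."""
        return source[self.start:self.start + self.length]

def lex(source):
    """Return a descriptor for each run of non-space characters."""
    out, i = [], 0
    while i < len(source):
        if source[i].isspace():
            i += 1
            continue
        j = i
        while j < len(source) and not source[j].isspace():
            j += 1
        out.append(Descriptor(i, j - i))
        i = j
    return out
```

Because each node keeps its position, a repair pass could also inspect what lies between two nodes, for example `source[d0.start + d0.length : d1.start]`, which is not possible with copied substrings.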
Labels:
adda,
parse,
translation
2012-01-31
parsing unmatching opening characters
1.5: adda/parse/algorithm for parsing containers:
. there can be non-matching container parts,
because they can be surrounded by complete containers,
eg, ( ... [ ...{} ),
. the recursive way is this:
find an opening to some container,
and then call the function that reads that container;
what else that function will need, though,
is knowing whether it was sent by
another container-reading function;
so, in the example above, ( ... [ ...{} ),
getParen is calling getBracket( ")" ),
so then getBracket knows it needs to be stopping for
both {")", "]"}, whichever comes first .
. if getBracket returns nil,
then getParen will know that the "[" should be taken as
a character, not the beginning of a subtree .
. any 3rd possibility?
no, but it does involve recursive backtracking,
with multiple readers on a single stream: [1.31:
. both caller and called could
fail to find their closing match .
. then to not have to do work over again,
it should have some way of remembering
whether a given opening character has a match .
. another needed dimension is being fault tolerant:
if the coder forgot a closing character,
then there should be other ways of parsing
that will be able to guess something is missing;
eg, if a line has found a new function definition
in the middle of a function body,
it could guess that the body above is
missing a closing character .]
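. a sketch of the recursive way described above, with one generic reader standing in for getParen, getBracket, and so on. The function name, the tuple tree, and returning None for "nil" are assumptions; remembering which openers have a match, and the fault-tolerant guessing, are left out to keep it short:

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}

def read_container(text, start, stops=frozenset()):
    """Read the container opened at text[start].

    `stops` holds the closers of the calling readers. Returns
    (tree, index just past our closer), or None when a caller's closer
    or the end of the text arrives before our own closer.
    """
    closer = PAIRS[text[start]]
    items, i = [], start + 1
    while i < len(text):
        ch = text[i]
        if ch == closer:
            return (text[start] + closer, items), i + 1
        if ch in stops:
            return None               # a caller's closer came first
        if ch in PAIRS:
            inner = read_container(text, i, stops | {closer})
            if inner is not None:
                items.append(inner[0])
                i = inner[1]
                continue
            # no match: the opener is taken as a character, not a subtree
        items.append(ch)
        i += 1
    return None                       # ran out of text without a closer
```

With the example above written without spaces, `read_container("(a[b{})", 0)` treats the "[" as a plain character and still reads "{}" as a subtree; the inner "{}" does get read twice, which is the repeated work that remembering matches would avoid.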
Labels:
adda,
enclosures,
parse
2009-12-28
custom precedence rules
10.6: adda/syntax/prec'order:
. unlike the etree parser
where precedence was table-driven,
adda must know precedence also by type;
and then each type can additionally provide
a prec'table that can reduce the need for writing paren's .
. how does a type`declaration specify precedence?
it can list the symbols in groups of
{values, uniop's, biop's, others},
and then under biop's,
the order given is also the prec'order .
. do the highest precedence last,
so they are listed in the same order they would be
in a parse tree
when no paren's are involved; eg:
( +, -
; *, /
; ** ).
[10.9: just noticed some biop's will have equal precedence,
so there needs to be both comma's and semi-colons
to distinguish equal from unequal levels ] .
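. a sketch of how such a listing could drive parsing: semicolons separate levels (lowest first) and commas separate biop's of equal precedence. The function names are hypothetical, and treating every biop as left-associative is a simplifying assumption:

```python
def prec_table(spec):
    """Turn '+, - ; *, / ; **' into {'+': 0, '-': 0, '*': 1, '/': 1, '**': 2}."""
    table = {}
    for level, group in enumerate(spec.split(";")):
        for op in group.split(","):
            table[op.strip()] = level
    return table

def parse(tokens, table):
    """Precedence climbing over a token list; returns nested (op, left, right)."""
    pos = 0

    def climb(min_level):
        nonlocal pos
        left = tokens[pos]
        pos += 1
        # keep absorbing operators at or above the level we were asked for
        while pos < len(tokens) and table.get(tokens[pos], -1) >= min_level:
            op = tokens[pos]
            pos += 1
            right = climb(table[op] + 1)   # +1 makes equal levels group leftward
            left = (op, left, right)
        return left

    return climb(0)
```

For example, with `table = prec_table("+, - ; *, / ; **")`, the tokens of `a + b * c` group the `*` first, while `a - b + c` groups left to right because `+` and `-` share a level.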
override:
. a place where paren's need be part of parse.tree
is explicit paren's when none is needed?
but still don't need a paren' op[etree symbol for explicit paren's],
rather, when paren's are optional yet user-preferred,
use multiple operations vs a single multiarg operation:
eg,
a+b+c: +(a,b,c),
vs:
a+(b+c): +(a, +(b,c)),
[10.9:
. must consider not only user's preference,
but also how eval' order of numbers can affect precision .]
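. a sketch of keeping the user's grouping without a paren' op: an unparenthesized chain becomes one multiarg node, while explicit paren's produce a nested node. The parser below handles only names, "+", and balanced paren's, and its name and tuple shape are assumptions:

```python
def parse_sum(tokens):
    """Return a name, or ('+', arg, arg, ...) with explicit grouping kept as nesting."""
    pos = 0

    def term():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = total()
            pos += 1              # skip the matching ")"
            return inner
        return tok

    def total():
        nonlocal pos
        args = [term()]
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            args.append(term())
        # a lone term needs no node, so paren's around a single name vanish
        return args[0] if len(args) == 1 else ("+", *args)

    return total()
```

Here `parse_sum(list("a+b+c"))` gives one three-argument node and `parse_sum(list("a+(b+c)"))` gives a nested one. The distinction matters for precision: with IEEE 754 doubles, `(0.1 + 0.2) + 0.3` and `0.1 + (0.2 + 0.3)` are not equal.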
tolerant and precise parsing
10.8: adda/tolerant and precise:
. part of being a tolerant lang'
can be to have a strict syntax simply for verifying
what the user meant;
ie, in case there's some doubt,
then they can ask to see the code in its strict form .
. this can be folded into html
where the selection or entire file
can be shown as adda code .
. when the debugger has a link to code,
it can show both the strict version
and the user's version,
with the lax dialect and comments .