Americium Dream Documents: parse


2013-03-31

parser algorithm for postfix operators

2.1: adda/syntax/parser algorithm for postfix operators:
. when checking for existence,
the word (is) could be used in 2 ways:
(is x) checks for existence,
(x is t) checks whether x's type is t .
. another idea is having support for
english-style postfix operators:
then you could write (x exists)
and you'd parse this like an infix
except you're not looking for a 2nd arg .
. its syntax for being declared
could be similar to that of other functions:
myPrefix(arg.argType).ReturnType,
(arg.t)myPostfix.ReturnType,
myInfix(arg1,arg2:t).ReturnType .
-- anything with 2 args is an infix operator .
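
. a minimal sketch of the parsing idea above, in Python; the operator tables and the function name parse_simple are hypothetical, chosen only for illustration, and the input is assumed to be an already-split token list:

```python
# Hypothetical operator tables; the names are illustrative, not adda's real ones.
PREFIX = {"is"}        # (is x)     -> existence check
POSTFIX = {"exists"}   # (x exists) -> English-style postfix
INFIX = {"is"}         # (x is t)   -> type check


def parse_simple(tokens):
    """Parse a flat token list into a (shape, operator, args...) tuple.

    Handles only the three shapes discussed: prefix, postfix, and infix.
    """
    if len(tokens) == 2 and tokens[0] in PREFIX:
        return ("prefix", tokens[0], tokens[1])
    if len(tokens) == 2 and tokens[1] in POSTFIX:
        # Parsed like an infix, except no 2nd argument is looked for.
        return ("postfix", tokens[1], tokens[0])
    if len(tokens) == 3 and tokens[1] in INFIX:
        return ("infix", tokens[1], tokens[0], tokens[2])
    raise SyntaxError(f"cannot parse {tokens!r}")
```

. here the same word (is) lands in two tables, and the number of arguments found decides which reading applies .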

2012-08-01

generic parser

2.23: adda/translate/generic parser:

. if there is a generic parser,
then it has to record spaces(n) and new lines,
because the meaning of symbol sequences
will often depend on whether a space
is separating them .
. this may not help me anyway since my interpretation
can depend on whether a name is a type id or not?
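
. a minimal sketch of a tokenizer that records spaces(n) and new lines rather than discarding them; the token kinds and the name tokenize are assumptions made for illustration, not part of adda:

```python
import re

# One token per match: newline, run of spaces, name, number, or any other single symbol.
TOKEN_RE = re.compile(
    r"(?P<newline>\n)"
    r"|(?P<spaces> +)"
    r"|(?P<name>[A-Za-z][A-Za-z0-9]*)"
    r"|(?P<number>[0-9]+)"
    r"|(?P<symbol>.)"
)


def tokenize(text):
    """Return (kind, value) pairs; a run of n spaces is recorded as ('spaces', n)."""
    tokens = []
    for m in TOKEN_RE.finditer(text):
        kind = m.lastgroup          # name of the alternative that matched
        if kind == "spaces":
            tokens.append(("spaces", len(m.group())))
        else:
            tokens.append((kind, m.group()))
    return tokens
```

. with this, (a .b) and (a.b) produce different token sequences, so a later pass can still decide whether the space matters .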

. one thing that does simplify things
is a parser that finds the enclosures and delimiters
-- {} () [] , ; : . --
along with some other un-redefinables,
and builds a tree from them .

. it also supports implicit enclosures
(the use of indentation to indicate a grouping)
so it needs to translate ( newline & (some spaces) )
into the beginning or continuation of an enclosure;
and, there needs to be a parameter for
what the currently expected indenting level is .
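
. a minimal sketch of implicit enclosures, assuming indentation is made of spaces only and that each line is one item; the stack holds the currently expected indenting levels, and the name group_by_indent is hypothetical:

```python
def group_by_indent(lines):
    """Turn indented lines into nested lists (implicit enclosures).

    A deeper indent begins a new enclosure; an equal indent continues the
    current one; a shallower indent closes enclosures until the levels fit.
    """
    root = []
    # Stack of (expected indent level, list collecting items at that level).
    stack = [(0, root)]
    for line in lines:
        if not line.strip():
            continue                          # blank lines do not affect grouping
        indent = len(line) - len(line.lstrip(" "))
        while len(stack) > 1 and indent < stack[-1][0]:
            stack.pop()                       # close finished enclosures
        if indent > stack[-1][0]:
            child = []                        # begin a new enclosure
            stack[-1][1].append(child)
            stack.append((indent, child))
        stack[-1][1].append(line.strip())     # continue the current enclosure
    return root
```

. for example, the lines "a", "  b", "  c", "    d", "e" group as ["a", ["b", "c", ["d"]], "e"]; a dedent to a level that was never opened is simply treated as a new enclosure here, which a real parser might prefer to report .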

. if it can find (symbol.type) as being a type def,
then it can also find ( .symbol ) as being a type; [2.29:
but, types shouldn't require a preceding dot .]

. strings of alphanum's that start with a letter
are always names .
. enclosures include set generators .

2012-03-31

parser uses string descriptors

3.3: adda/translate/parser/use string descriptors:
. one way to parse a string
is to use the lexical analyzer to find the relevant substrings,
and then make lexical nodes that point at copies of the string;
the way I'm thinking of now
uses string descriptors instead of copies:
put the string in an array of characters,
and then a descriptor is an index into that array,
along with the intended length .
. the advantage to this way is that
if the parser gets confused,
a repair algorithm can work with the result,
searching not only the tree
but also the string,
and seeing how they relate .
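
. a minimal sketch of string descriptors, assuming the only lexemes of interest are names; the class Descriptor and the function lex_names are hypothetical names used for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    """A substring identified by position in the shared text, not by a copy."""
    start: int    # index into the character array
    length: int   # intended length of the substring

    def text(self, source):
        return source[self.start:self.start + self.length]


def lex_names(source):
    """Yield a Descriptor for each run of alphanumerics that starts with a letter."""
    i, n = 0, len(source)
    while i < n:
        if source[i].isalpha():
            j = i
            while j < n and source[j].isalnum():
                j += 1
            yield Descriptor(i, j - i)
            i = j
        elif source[i].isdigit():
            # Skip digit-led runs, which are not names.
            while i < n and source[i].isalnum():
                i += 1
        else:
            i += 1
```

. because each node keeps its position, a repair pass can look at the characters on either side of descriptor.start in the original string, which a copied substring would not allow .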

2012-01-31

parsing unmatching opening characters

1.5: adda/parse/algorithm for parsing containers:
. there can be non-matching container parts,
because they can be surrounded by complete containers,
eg, ( ... [ ...{} ) .
. the recursive way is this:
find an opening to some container,
and then call the function that reads that container;
what else that function will need, though,
is knowing whether it was sent by
another container-reading function;
so, in the example above, ( ... [ ...{} ),
getParen is calling getBracket( ")" ),
so then getBracket knows it needs to be stopping for
both {")", "]"}, whichever comes first .
. if getBracket returns nil,
then getParen will know that the "[" should be taken as
a character, not the beginning of a subtree .
. any 3rd possibility?
no, but it does involve recursive backtracking,
with multiple readers on a single stream: [1.31:
. both caller and called could
fail to find their closing match .
. then to not have to do work over again,
it should have some way of remembering
whether a given opening character has a match .
. another needed dimension is being fault tolerant:
if the coder forgot a closing character,
then there should be other ways of parsing
that will be able to guess something is missing;
eg, if a line has found a new function definition
in the middle of a function body,
it could guess that the body above is
missing a closing character .]
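
. a minimal sketch of the recursive reader, with one generic function standing in for getParen, getBracket, and so on; the name read_container, the tuple shape of a node, and the cache keyed on (position, stop set) are assumptions for illustration, and the fault-tolerant guessing is not attempted:

```python
CLOSER = {"(": ")", "[": "]", "{": "}"}


def read_container(text, pos, outer_stops=frozenset(), memo=None):
    """Read the container opened at text[pos].

    Returns ((opener, children), next_pos) on success, or None when a
    closer belonging to an enclosing container (one of outer_stops), or
    the end of the text, comes before this container's own closer.
    memo remembers outcomes so the same read is not done twice.
    """
    if memo is None:
        memo = {}
    key = (pos, outer_stops)
    if key in memo:
        return memo[key]
    opener = text[pos]
    closer = CLOSER[opener]
    children = []
    result = None
    i = pos + 1
    while i < len(text):
        ch = text[i]
        if ch == closer:
            result = ((opener, children), i + 1)
            break
        if ch in outer_stops:
            break                       # the caller's closer came first: no match
        if ch in CLOSER:
            sub = read_container(text, i, outer_stops | {closer}, memo)
            if sub is not None:
                children.append(sub[0])
                i = sub[1]
                continue
        children.append(ch)             # plain character, or an unmatched opener
        i += 1
    memo[key] = result
    return result
```

. on the example shape, read_container("(a[b{})", 0) gives (("(", ["a", "[", "b", ("{", [])]), 7): the reader for "[" stops at ")" and returns None, so the "[" is kept as a character while the "{}" still becomes a subtree .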

2009-12-28

custom precedence rules

10.6: adda/syntax/prec'order:

. unlike the etree parser
where precedence was table-driven,
adda must know precedence also by type;
and then each type can additionally provide
a prec'table that can reduce the need for writing paren's .
. how does a type`declaration specify precedence?
it can list the symbols in groups of
{values, uniop's, biop's, others},
and then under biop's,
the order given is also the prec'order .
. do the highest precedence last,
so they are listed in the same order they would be
in a parse tree
when no paren's are involved; eg:
( +, -
; *, /
; ** ).
[10.9: just noticed some biop's will have equal precedence,
so there needs to be both comma's and semi-colons
to distinguish equal from unequal levels ] .
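
. a minimal sketch of reading such a biop list into precedence levels and using it to drive a parse; the function names are hypothetical, and treating every operator as left-associative (including **) is a simplifying assumption:

```python
def parse_prec_table(spec):
    """Map each operator to its precedence level (0 binds loosest).

    Semicolons separate unequal levels; commas separate equal ones.
    """
    body = spec.strip()
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1]
    table = {}
    for level, group in enumerate(body.split(";")):
        for op in group.split(","):
            table[op.strip()] = level
    return table


def parse_expr(tokens, table, min_level=0):
    """Precedence climbing; consumes tokens from the front of the list."""
    left = tokens.pop(0)
    while tokens and tokens[0] in table and table[tokens[0]] >= min_level:
        op = tokens.pop(0)
        # The right side may only contain operators that bind tighter than op.
        right = parse_expr(tokens, table, table[op] + 1)
        left = (op, left, right)
    return left
```

. with the table from "( +, - ; *, / ; ** )", the tokens a + b * c parse as ("+", "a", ("*", "b", "c")), so the later-listed operator ends up lower in the tree .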

override:

. a place where paren's need to be part of the parse.tree
is explicit paren's when none are needed?
but that still doesn't need a paren' op [etree symbol for explicit paren's];
rather, when paren's are optional yet user-preferred,
use multiple operations vs a single multiarg operation:
eg,
a+b+c: +(a,b,c),
vs:
a+(b+c): +(a, +(b,c)) .
[10.9:
. must consider not only user's preference,
but also how eval' order of numbers can affect precision .]
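
. a minimal sketch of keeping the user's grouping without a paren' op, assuming single-character tokens and only the + operator; parse_sum and parse_term are hypothetical names:

```python
def parse_sum(tokens):
    """Parse names joined by '+', with optional parentheses.

    An unparenthesized chain becomes one multi-arg node:
        a+b+c   -> ('+', 'a', 'b', 'c')
    Explicit parentheses survive as a nested node:
        a+(b+c) -> ('+', 'a', ('+', 'b', 'c'))
    """
    args = [parse_term(tokens)]
    while tokens and tokens[0] == "+":
        tokens.pop(0)
        args.append(parse_term(tokens))
    return args[0] if len(args) == 1 else ("+", *args)


def parse_term(tokens):
    """Parse a name or a parenthesized sum."""
    tok = tokens.pop(0)
    if tok == "(":
        inner = parse_sum(tokens)
        tokens.pop(0)               # drop the matching ")"
        return inner
    return tok
```

. the precision point is easy to see with floating-point numbers: in Python, (0.1 + 0.2) + 0.3 and 0.1 + (0.2 + 0.3) are not equal, so collapsing a user's grouping can change the result .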

tolerant and precise parsing

10.8: adda/tolerant and precise:

. part of being a tolerant lang'
can be to have a strict syntax simply for verifying
what the user meant;
ie, in case there's some doubt,
then they can ask to see the code .
. this can be folded into html
where the selection or entire file
can be shown as adda code .
. when the debugger has a link to code,
it can show both the strict version
and the user's version,
with the lax dialect and comments .
