2011-02-28

literals with type-defined syntax

2.28: adda/dstr/literals with type-defined syntax:
. one way to allow custom syntax
-- elegantly, without a bolt-on --
is to say that types can define
their own literal reader;
ie, a mini' type-specific compiler .
. this allows the grammar of literals to be
something other than a composite of native types,
defined instead as whatever is accepted
by the type's reader .
. during the adda.compiler's first pass,
it sorts out what's adda code
from what is either a string literal,
or a type-specific literal
having a reader-defined grammar .
. in subsequent passes,
it then uses the appropriate reader
to complete the translation of literals .
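
. a minimal python sketch of that first pass,
assuming literals look like "( ... )
and that parentheses nested inside a literal are balanced
(both are assumptions, as are the names Segment and first_pass):

```python
from dataclasses import dataclass


@dataclass
class Segment:
    kind: str  # "code" or "literal"
    text: str


def first_pass(source: str) -> list[Segment]:
    """Separate adda code from the raw bodies of "( ... ) literals.

    Literal bodies are not interpreted here; a later pass hands each
    one to the reader of the relevant type.
    """
    segments = []
    i = 0
    code_start = 0
    while i < len(source):
        if source.startswith('"(', i):
            if i > code_start:
                segments.append(Segment("code", source[code_start:i]))
            depth = 1          # assumed rule: nested parens must balance
            j = i + 2
            while j < len(source) and depth > 0:
                if source[j] == "(":
                    depth += 1
                elif source[j] == ")":
                    depth -= 1
                j += 1
            if depth != 0:
                raise SyntaxError("unterminated literal")
            # drop the opening "( and the closing )
            segments.append(Segment("literal", source[i + 2 : j - 1]))
            i = code_start = j
        else:
            i += 1
    if code_start < len(source):
        segments.append(Segment("code", source[code_start:]))
    return segments
```

. the literal segments stay opaque text until a later pass
picks the reader that matches the declared type .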

. a custom reader is not a security threat,
because although it returns adda binary code,
that code itself is not runnable;
it must still be translated by trusted app's .

. if a type defines more than one reader,
then not mentioning a reader simply calls the default .
it can also be explicit in the usual way:
eg, .t`yet-another-reader
is a reader belonging to type t .

. the pattern:  x.anytype = "(...)
results in calling anytype's default reader
-- just as with: x.string = "(string's literal);
( notice .string's default reader
doesn't have to be trivial;
in c.lang, the reader treats "(\)
as an escape character;
eg, \n -> newline, \t -> tab, ... ).
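
. as an illustration of a non-trivial string reader,
here is a python sketch of c-style escape handling;
the escape table and the choice to leave unknown escapes untouched
are assumptions, not something adda specifies:

```python
# Assumed escape table; only \n and \t are named in the notes above.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\"}


def c_style_reader(raw: str) -> str:
    """Translate backslash escapes in the raw literal text."""
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            # unknown escapes are kept as written (an assumption)
            out.append(ESCAPES.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```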

. to help adda readily identify
all the type-defined syntax readers,
they should all return the same special type,
say, .addb (adda binary),
so then in type t's interface,
any functions of type .addb
will be registered as readers for type t;
eg,
( read(x.$).addb
, another-style(x.$).addb, ...)
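
. the registration rule can be sketched in python,
with the return annotation standing in for the .addb return type;
Addb, T, readers_of and invoke are hypothetical names,
and treating the function named "read" as the default is an assumption:

```python
import inspect


class Addb:
    """Stand-in for .addb: translated output that is not itself runnable."""

    def __init__(self, payload):
        self.payload = payload


def readers_of(type_cls):
    """Collect every function in the type's interface that returns Addb."""
    found = {}
    for name, fn in inspect.getmembers(type_cls, inspect.isfunction):
        if inspect.signature(fn).return_annotation is Addb:
            found[name] = fn
    return found


class T:
    """Example type with two readers and one ordinary function."""

    @staticmethod
    def read(x: str) -> Addb:
        return Addb(("default", x))

    @staticmethod
    def another_style(x: str) -> Addb:
        return Addb(("another", x))

    @staticmethod
    def length(x: str) -> int:  # not a reader: does not return Addb
        return len(x)


def invoke(type_cls, text, reader="read"):
    """Call the named reader, or the assumed default when none is named."""
    return readers_of(type_cls)[reader](text)
```

. not mentioning a reader calls the default,
while naming one selects it explicitly, as with .t`yet-another-reader .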

. a quote lexeme -- '{}, '[], '() --
means do a read (translate text to adda code)
but don't eval,
whereas, a double-quote means don't even read:
it is to be considered { .string, .$ };
so, in the case of readers,
their parameter must be a double-quoted enclosure,
or some expression that returns .string;
otherwise, it's eval'd as adda code
before being given to the reader
and then it might not even be the expected .string type .

. places where literals are encountered:
# static typing:
. the var's are declared to have a particular type;
here the type is obvious;
so the type qualifier is not needed;
eg,
x.anytype`= "(this is greek to adda)
-- that invokes .anytype's default reader;
x.string`= c-style"(string's literal\n)
-- .string's c-style reader is invoked .
# dynamic typing:
-- the type is discovered at run-time --
. adda can't find a reader at compile time
unless the reader's type is specified:
eg,
say .tall can point to all types:
xc.tall`= .string`c-style"(featuring escapes\n) .
-- now xc can point at a .string at compile`time .
. if an undeclared function is given:
eg,
g.tall`= f "(something like x );
then the work is left to the run-time exec:
it sees if the current object assigned to g
does have a type that includes
the function type: f(x.$).*
(where * can be any type);
if so, then g gets whatever obj' f returns,
of whatever type that is .
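
. a rough python sketch of that run-time lookup;
Celsius, parse and run_time_assign are hypothetical,
and the lookup by name on the object's type is only one way
the run-time exec might do it:

```python
class Celsius:
    """Hypothetical type whose interface includes a function f(x.$)."""

    def __init__(self, degrees: float):
        self.degrees = degrees

    @staticmethod
    def parse(text: str) -> "Celsius":
        return Celsius(float(text.strip().rstrip("C")))


def run_time_assign(current, fname: str, text: str):
    """Look for fname in the type of the object currently held by g.

    If the type has it, the result of calling it on the literal text
    becomes g's new value, whatever type that result has.
    """
    fn = getattr(type(current), fname, None)
    if not callable(fn):
        raise TypeError(f"{type(current).__name__} has no function {fname!r}")
    return fn(text)
```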

. string literals can have the same problem as comments:
it's easy to lose the boundaries of multi-line constructs;
and, when that happens,
the code can act strangely because
the compiler thought half the code
was a comment or string;
conversely,
if the comment or string accidentally contains
the string delimiter,
some of the comment or string will be
compiled as if it were code;
and if that succeeds,
the results will certainly be unintended!
. to assist the human reader,
there could be a redundant enclosure syntax:
if a string can't fit on one line,
the enclosure boundaries should be
on their own separate lines;
"(
example with
2 lines .
);
. if the text includes quote enclosures on their own line,
then the text could go in its own file:
x`= "( [!]myliteral.txt )
-- when {adda, adde} sees [ ! ] as one word,
then what follows is a command for generating text .
. a .txt file would be taken as literal text;
whereas an .adda file would be eval'd to an object
that would then be printed if not already .string .