6.21: adda/lexicon/$symbol can indicate
literal symbol just like quotes mean literal string:
[7.2:
. there are 3 categories of symbols:
# names composed of digits are literal values of type number;
# names preceded by double-quote are literal values of type character string;
# names preceded by single-quote are unevaled symbols
(references to a symbol that was previously declared);
# names preceded by ($) are declaring literal values of type symbol;
(it's not a variable or constant representing some other value).
. in english we would tend to put a word in quotes to mean this,
but in a programming language, we want to reserve quotes for
values of type character string .
. this use for $ was first mentioned as
a namespace for value literals .]
Showing posts with label literals. Show all posts
2012-07-03
2012-05-31
numeric literals
5.12: adda/syntax/value literals:
. since numeric literals from an arbitrary base
will be using the usual symbols,
we could declare them to be a numeric sub type:
11.B = 2*1+1,
11.B8= 8*1+1,
11.H = 16*1+1 .
. however, that doesn't work because then we are
reserving that symbol for the number literals;
what we need is a way to say BEEF.someType,
and still be able to say BEEF(base16)
in the same context .
. I had previously noted that
[@] 5.10: mis.adda/type/number's base like a dimension?
math's traditional way for expressing base
is with a subscript, hence BEEF#16;
my problem with that was that it
precluded many symbols from being array names;
but, now I see it could still be possible
if we use type names instead of the base's number:
H.type: number; BEEF#H .
but that does get noisy when combining with arrays!
eg, BEEF# BEEF#H = BEEF#48879 .
number sign for value of a type:
. how about a new context notation
for accessing a type's value:
Typemark#valueLiteral,
and then as a special case of that,
numeric bases are types:
eg, H#BEEF
-- not unlike Ada's 16#BEEF#;
but, we can't use 16#BEEF because
it's confusing when used as an array subscript:
eg, A# 16#10 ambiguously means either
A#(16,10) or A#(16#10) (returns an array) .
eg, B#10 for base 2, H#10 for hexadecimals,
O#10 for octals, and T#10 for tetroctals
(base 32 = 4*8 = tetra-oct-al = tetroctal).
review of multi-subscript arrays:
. AT.type: #.int; -- a named array type .
A#.AT; -- an array of array;
means the same as A#.#.int .
. then A is accessed as either A#i#j, or A#(i,j) .
currency sign for value literal spaces:
. another idea is that bases are not really subtypes,
so what we need is a new syntax for value literals:
. it could be like the above except replacing (#) with ($)
-- currency is the sign of value (as in worth);
eg, B$10 = 2, O$10 = 8, and H$10 = 16,
while $10 still means usa cash;
but, $green can be an abbreviation for
color$green since green is obviously not a number .
. by having the option of using ($) on enumerations,
we can have separate name spaces for them,
so that I can use both the variable green
and the literal color $green in the same context .
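. the B$10, O$10, H$10 idea can be tried out with a minimal sketch in Python (not adda). the BASES table, the LITERAL pattern, and the function name parse_base_literal are assumptions made for illustration; only the type names B, O, H, T and their radixes come from the notes above.

```python
import re

# Hypothetical table: base-type names from the text mapped to their radix.
BASES = {"B": 2, "O": 8, "H": 16, "T": 32}

# A typed literal is TypeName$digits, eg H$BEEF.
LITERAL = re.compile(r"^([A-Za-z]+)\$([0-9A-Za-z]+)$")

def parse_base_literal(text):
    """Parse 'H$BEEF'-style text into an int; raise ValueError otherwise."""
    m = LITERAL.match(text)
    if m is None:
        raise ValueError(f"not a typed literal: {text!r}")
    type_name, digits = m.groups()
    if type_name not in BASES:
        raise ValueError(f"unknown base type: {type_name!r}")
    # int() itself rejects digits that are invalid for the given base.
    return int(digits, BASES[type_name])

print(parse_base_literal("B$10"))    # 2
print(parse_base_literal("O$10"))    # 8
print(parse_base_literal("H$10"))    # 16
print(parse_base_literal("H$BEEF"))  # 48879
```

. because the ($) marks what follows as a literal, BEEF can still be an ordinary symbol elsewhere in the same context.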
TypeId#value vs ValueType$valueLiteral:
. types can be thought of as arrays of values,
so t#x evals x as one of t's values .
. B$10 = the value 10 as parsed by the binary value type .
Color#green works only if green is not redefined;
because in the expression (aType#x),
x can be any expression, not just a literal;
Color$green is always unambiguous;
because the ($) says what follows is
one of a type Color's literals .
. RGB$(0,0,1) -- RGB color model for color literals .
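. the difference between Color#green and Color$green can be modeled roughly in Python. this is only a sketch under assumptions: the env dictionary stands in for the current scope, and hash_access and dollar_access are hypothetical names for the two notations.

```python
from enum import Enum

class Color(Enum):
    RED = "red"
    GREEN = "green"
    BLUE = "blue"

def hash_access(enum_type, expr, env):
    """Model of Type#x: evaluate x in the environment first, then select."""
    value = env.get(expr, expr)   # a redefined name shadows the literal
    return enum_type(value)

def dollar_access(enum_type, name):
    """Model of Type$name: name is always taken as one of the type's literals."""
    return enum_type(name)

env = {"green": "blue"}                    # the variable green has been redefined
print(hash_access(Color, "green", env))    # Color.BLUE
print(dollar_access(Color, "green"))       # Color.GREEN
```

. with an empty env the two agree, which matches the remark that Color#green works only if green is not redefined.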
5.10: mis.adda/type/number's base like a dimension?:
. can the notation indicating a number's base
be unified with the dimension system ?
not elegantly, because it confuses a concept:
. the number is a measure,
and the dimension indicates
which property is being measured;
whereas,
in the case of the numeric base indicator,
it indicates how to parse the text .
. we need special syntax for an identifier
to differentiate a typed symbol
from a typed literal value;
because,
the set of literals is indefinitely large from being
defined as a regular expression,
such that there are no identifiers left for symbols .
. numeric literals are from a reserved set of identifiers:
ie, no other symbols than numbers
can start with the digit characters;
but then in bases above 10,
the numeric literal can be confused with a symbol .
. notice too how bases share identifiers:
10 means something different in every base .
. therefore, each literal for non-default base
must have a syntax that says not only the base,
but the fact that this is a numeric literal, not a symbol .
. the core system expects ( [, " ) to mean something special,
and all other symbols mean a symbol name .
. a string literal starts with (")
a numeric literal starts with a digit,
an operator symbol starts with non-alphanumeric;
and any other symbol starts with
([),(_), or an alphabetic .
-- literal expressions start with (')
but that's a 2nd meaning of literal because
other literals mean adda should neither parse nor eval,
whereas literal expressions should be parsed .
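. the first-character rules above can be sketched as a small classifier in Python. the function name classify_lexeme and the category labels are assumptions for illustration, not part of adda.

```python
def classify_lexeme(text):
    """Classify a lexeme by its first character, following the rules above."""
    if not text:
        raise ValueError("empty lexeme")
    first = text[0]
    if first == '"':
        return "string literal"        # neither parsed nor evaluated
    if first == "'":
        return "literal expression"    # parsed but not evaluated
    if first in "0123456789":
        return "numeric literal"
    if first in "[_" or first.isalpha():
        return "symbol"
    return "operator"

for lexeme in ['"(text)', "'(a+b)", "42", "BEEF", "_tmp", "+"]:
    print(lexeme, "->", classify_lexeme(lexeme))
```

. note that BEEF is classified as a symbol, which is exactly why a base-16 literal needs extra syntax.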
5.10: adda/type/
defining value literals with regular expressions:
. the general way of handling expressions of literals
is to let the type mgt define it,
so numbers need not be part of the core lang at all,
instead being a module that comes with the core system .
. to generalize the way numeric types define literals,
a type mgt can define its literals either as
enumerations or as regular expressions .
5.31:
. but if type mgt's are handling all reading of literals,
then constants' types aren't known by their assigned values;
because, the value could be a literal,
and no longer implies a particular type .
. one way the above idea could still be useful,
is in talking about compilers that are partial,
and are completed by importing a set of native types,
so then these native types complete the definition of the compiler;
eg, the compiler generator's job is to
analyze all the given native types,
making sure that their literal definitions
are not overlapping .
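. a rough Python sketch of such an overlap check follows. everything in it is an assumption for illustration: the NATIVE_TYPES table, its four example types, and the function names. it only spot-checks enumerated literals and caller-supplied probe strings; proving that two regular expressions never overlap would need an automata-intersection test, which is not attempted here.

```python
import re

# Hypothetical native types: each defines its literals either as an
# enumeration (a set of names) or as a regular expression.
NATIVE_TYPES = {
    "color":   {"enum": {"red", "green", "blue"}},
    "decimal": {"regex": r"[0-9]+"},
    "hex":     {"regex": r"[0-9A-F]+"},
    "mood":    {"enum": {"blue", "calm"}},
}

def accepts(defn, text):
    """True if the type definition accepts text as one of its literals."""
    if "enum" in defn:
        return text in defn["enum"]
    return re.fullmatch(defn["regex"], text) is not None

def find_overlaps(types, probes=()):
    """Return (literal, type_a, type_b) triples accepted by two types.

    Candidates are every enumerated literal plus the given probes,
    so this is a spot check, not a proof of disjointness.
    """
    candidates = set(probes)
    for defn in types.values():
        candidates |= defn.get("enum", set())
    names = sorted(types)
    overlaps = []
    for text in sorted(candidates):
        owners = [n for n in names if accepts(types[n], text)]
        for i, a in enumerate(owners):
            for b in owners[i + 1:]:
                overlaps.append((text, a, b))
    return overlaps

print(find_overlaps(NATIVE_TYPES, probes=["10", "BEEF"]))
# [('10', 'decimal', 'hex'), ('blue', 'color', 'mood')]
```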
. it would be a lot simpler to stay classical,
and make some native literals be part of the core language .
5.10: adda/type/review of dimension systems:
. a number can have 3 parts
qty, dimension of qty, and thing being measured:
3 liter water
-- water is a noun measurable by either
volume, mass, moles, or monetary value .
5.12:
. dimensions are used by type physical;
the physical measure type includes a pointer to
a symbol of type physical .
2011-02-28
literals with type-defined syntax
2.28: adda/dstr/literals with type-defined syntax:
. one way to allow custom syntax
-- elegantly, without a bolt-on --
is to say that types can define
their own literal reader;
ie, a mini' type-specific compiler .
. this allows the grammar of literals to be
something other than a composite of native types,
defined instead as whatever is accepted
by the type's reader .
. during the adda.compiler's first pass,
it sorts out what's adda code
from what is either a string literal,
or a type-specific literal
having a reader-defined grammar .
. in subsequent passes,
it then uses the appropriate reader
to complete the translation of literals .
. a custom reader is not a security threat;
because while it is returning adda binary code,
that code itself is not runnable;
it must still be translated by trusted app's .
. if a type defines more than one reader,
then not mentioning a reader simply calls the default .
it can also be explicit in the usual way:
eg, .t`yet-another-reader
is a reader belonging to type t .
. the pattern: x.anytype = "(...)
results in calling anytype's default reader
-- just as with: x.string = "(string's literal);
( notice .string's default reader
doesn't have to be trivial;
in c.lang, the reader treats "(\)
as an escape character;
eg, \n -> newline, \t -> tab, ... ).
. to help adda readily identify
all the type-defined syntax readers,
they should all return the same special type,
say, .addb (adda binary),
so then in type t's interface,
any functions of type .addb
will be registered as readers for type t;
eg,
( read(x.$).addb
, another-style(x.$).addb, ...)
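. the registration rule can be sketched in Python, using return-type annotations as a stand-in for "functions of type .addb". the names Addb, collect_readers, read_literal, and the example type Point are all hypothetical; the sketch only shows the idea that a shared return type is enough to find the readers.

```python
import inspect

class Addb:
    """Stand-in for the .addb (adda binary) result type; it just wraps a value."""
    def __init__(self, payload):
        self.payload = payload

def collect_readers(type_cls):
    """Register every function in the type's interface that is declared to return Addb."""
    readers = {}
    for name, fn in inspect.getmembers(type_cls, inspect.isfunction):
        if fn.__annotations__.get("return") is Addb:
            readers[name] = fn
    return readers

def read_literal(type_cls, text, reader="read"):
    """Call the named reader, or the default one ('read') when none is mentioned."""
    return collect_readers(type_cls)[reader](text)

class Point:
    """Hypothetical type with two readers and one ordinary function."""
    @staticmethod
    def read(text: str) -> Addb:           # default reader, eg "1,2"
        x, y = text.split(",")
        return Addb((int(x), int(y)))

    @staticmethod
    def polar_style(text: str) -> Addb:    # another reader, eg "2@0.5"
        r, theta = text.split("@")
        return Addb((float(r), float(theta)))

    @staticmethod
    def describe() -> str:                 # not a reader: wrong return type
        return "a point"

print(sorted(collect_readers(Point)))                            # ['polar_style', 'read']
print(read_literal(Point, "1,2").payload)                        # (1, 2)
print(read_literal(Point, "2@0.5", reader="polar_style").payload)  # (2.0, 0.5)
```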
. a quote lexeme -- '{}, '[], '() --
means do a read (translate text to adda code)
but don't eval,
whereas, a double-quote means don't even read:
it is to be considered { .string, .$ };
so, in the case of readers,
their parameter must be a double-quoted enclosure,
or some expression that returns .string;
otherwise, it's eval'd as adda code
before being given to the reader
and then it might not even be the expected .string type .
. places where literals are encountered:
# static typing:
. the var's are declared to have a particular type;
here the type is obvious;
so the type qualifier is not needed;
eg,
x.anytype`= "(this is greek to adda)
-- that invokes .anytype's default reader;
x.string`= c-style"(string's literal\n)
-- .string's c-style reader is invoked .
# dynamic typing:
-- the type is discovered at run-time --
. adda can't find a reader at compile time
unless the reader's type is specified:
eg,
say .tall can point to all types:
xc.tall`= .string`c-style"(featuring escapes\n) .
-- now xc can point at a .string at compile`time .
. if an undeclared function is given:
eg,
g.tall`= f "(something like x );
then the work is left to the run-time exec:
it sees if the current object assigned to g
does have a type that includes
the function type: f(x.$).*
(where * can be any type);
if so, then g gets whatever type obj'
that f returns .
. string literals can have the same problem as comments:
it's easy to lose the boundaries of multi-line constructs;
and, when that happens,
the code can act strangely because
the compiler thought half the code
was a comment or string;
conversely,
if the comment or string accidentally contains
the string delimiter,
some of the comment or string will be
compiled as if it were code;
and if that succeeds,
the results will certainly be unintended!
. to assist the human reader,
there could be a redundant enclosure syntax:
if a string can't fit on one line,
the enclosure boundaries should be
on their own separate lines;
"(
example with
2 lines .
);
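. a checker for this layout rule might look like the following Python sketch. it is a simplification under stated assumptions: at most one literal per line, no nested parentheses inside a literal, and the function name check_multiline_literals is hypothetical.

```python
def check_multiline_literals(source):
    """Return 1-based line numbers that break the layout rule:
    a "( ... ) literal spanning several lines must open with a line
    holding only "( and close with a line holding only ) or ); .
    """
    problems = []
    inside = False
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if not inside:
            start = stripped.find('"(')
            if start == -1:
                continue
            if ")" in stripped[start + 2:]:
                continue                   # opens and closes on one line
            inside = True
            if stripped != '"(':
                problems.append(number)    # opener shares its line with other text
        elif ")" in stripped:
            inside = False
            if stripped not in (")", ");"):
                problems.append(number)    # closer shares its line with other text
    return problems

good = '"(\nexample with\n2 lines .\n);'
bad = 'x`= "(example with\n2 lines . );'
print(check_multiline_literals(good))   # []
print(check_multiline_literals(bad))    # [1, 2]
```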
. if the text includes quote enclosures on their own line,
then the text could go in its own file:
x`= "( [!]myliteral.txt )
-- when {adda, adde} sees [ ! ] as one word,
then what follows is a command for generating text .
. a .txt file would be taken as literal text;
whereas an .adda file would be eval'd to an object
that would then be printed if not already a .string .
2010-03-31
function type's literal
3.30: mis.adda/dstr/function type's literal as @:
. just as #.t is the array type's literal,
the function should have something besides ()
because then how do you say function's type
symbolically rather than literally ?
isn't it tacky having mandatory useless paren's?
. the @ is english's symbol for function application,
so then @.t could be function type's literal .
then to define the arg type symbolically,
say rec.type: (v.t, ...),
you have @rec.t or #rec.t .
. but then you'd expect
f@x
whereas the english expectation from email syntax is
x@f .
syntax for sets, powersets, bags, lists, hypermatrix
3.25: adda/dstr/literals:
. when a var is of type set,
it really means its values are a
powerset of some given enum type.
. another approach is using {a | b | c | ...}
to mean a value space,
while {,,,} means a set that involves multiple values .
. let dots do hyper-matrix structuring:
( ,,,; ,,,; . ,,,; . ...)
while in set braces, dots delimit items in a bag:
items in a bag are not ordered but duplicates are allowed;
eg, { 2 . 2 . 3 } is the bag of prime factors of 12 .
vs list { ;;; } which has duplicates but is ordered .
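. the set / bag / list distinction can be illustrated in Python, where collections.Counter serves as a bag. this is a sketch only; the function name prime_factor_bag is an assumption, and Python's own types merely stand in for adda's { , , }, { . . }, and { ; ; } literals.

```python
from collections import Counter

def prime_factor_bag(n):
    """Return the bag (multiset) of prime factors of n as a Counter."""
    bag = Counter()
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            bag[divisor] += 1
            n //= divisor
        divisor += 1
    if n > 1:
        bag[n] += 1        # whatever remains is itself prime
    return bag

bag_12 = prime_factor_bag(12)           # models { 2 . 2 . 3 }
print(sorted(bag_12.elements()))        # [2, 2, 3]
print(bag_12 == Counter([3, 2, 2]))     # True: order does not matter in a bag
print(set(bag_12) == {2, 3})            # True: a set drops the duplicate
print([2, 2, 3] == [3, 2, 2])           # False: a list is ordered
```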
2009-12-29
user-defined literals
11.7: adda/type"literals/symbol trees:
. most literals can be described abstractly
as symbol expressions:
literals are just given in english-symbol trees
and system sends the etree to the type'mgt,
who then has private ways of interpreting it .
. one of the things a type like that gives the system
is a literal-checking routine,
so then instead of enumerating the literals
with simple or branched enum's,
the compiler uses the literal-checking routine
during compile-time (vs the usual
where user routines are not available during compile-time) .
11.14: adda/type/modular literals:
. the type'face provides an etree description of
acceptable literals (a tall tree in
{ints, floats, strings, symbols})
. the type'body then has to provide
2 system-defined callback functions
that explain
how they want the etree literal represented as binary
and how to express its binary value back to etree .
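. the pair of callbacks can be sketched in Python for a hypothetical RGB type. the etree is modeled here as a nested tuple ("RGB", (r, g, b)), and the names etree_to_binary and binary_to_etree are assumptions; the actual callback names and etree representation are not specified in these notes.

```python
import struct

def etree_to_binary(etree):
    """First callback: encode the literal's etree as the type's binary form."""
    tag, (r, g, b) = etree
    if tag != "RGB":
        raise ValueError(f"not an RGB literal: {etree!r}")
    return struct.pack("3B", r, g, b)   # struct.error if a value is outside 0..255

def binary_to_etree(data):
    """Second callback: express a binary value back as an etree."""
    r, g, b = struct.unpack("3B", data)
    return ("RGB", (r, g, b))

literal = ("RGB", (0, 0, 1))
encoded = etree_to_binary(literal)
print(encoded)                              # b'\x00\x00\x01'
print(binary_to_etree(encoded) == literal)  # True
```

. requiring both directions lets the system print a value back in the same form the user wrote it.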