Americium Dream Documents: lexeme

2012-04-30

typed enclosures

4.29: mis.adda/syntax/typed enclosures:
. adda currently reserves the square brackets
as a way to allow more freedom in identifier spellings:
. the space and all printable characters are allowed
as long as every use of square brackets is paired,
because then the ending bracket is well-defined;
eg, [*thi{ i$ ()ne [eg@l iden]ifier] .
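. a minimal python sketch of why pairing makes the ending bracket well-defined:
scan with a depth counter, and the identifier ends where the depth returns to zero;
the function name is hypothetical, and this is only an illustration, not adda's actual lexer:

```python
def read_bracketed_identifier(text: str, start: int = 0) -> tuple[str, int]:
    """Return (identifier, index just past its closing bracket).

    Assumes text[start] is '[' and that inner square brackets are paired.
    """
    if start >= len(text) or text[start] != "[":
        raise ValueError("expected '[' at start")
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                # everything between the outermost brackets is the name
                return text[start + 1 : i], i + 1
    raise ValueError("unpaired '[' in identifier")
```

. given the example above, the returned name keeps the inner [eg@l iden] pair intact,
and the scan stops at the outermost closing bracket .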
. one way to allow multiple uses for square brackets
is to reserve typed brackets: eg,
[this is a single symbol] --. no typing syntax .
[.M(2,3): ...] --. this is typed as being a 2x3 matrix .
{.R: [a,b)} -- a <= x < b for x in reals .
mis:
. this seems too arbitrary, with no compelling use cases ...
sci:
. however, look at the angle brackets:
while the type is .< ... >
the other use is as a value:
<. ... > or <.type: ...>
and of course the 3rd use is as the less-than,
without any dots nearby .

[::] as a context clarifier

4.9: adda/syntax/type/[::] as a context clarifier:
. confusing having syntax"(type`value) when
(`) already has a specific meaning: (x`f);
so, maybe a good idea to use (::) as a
context specifier (type::value) ?
t.type: .< .(fields), .{values}, f(x).t, `method.t >
-- so then (x.t) lets me call { x`method, f(x) }
but if (::) was the context operator,
then we have { t::f, t::method }.

2011-10-31

double-equal to mean prolonged equality

10.12: adda/lexicon/==:
. the == could be like the japanese way of
2-trees representing many trees: a forest .
. 2 equals would say the values are
equal for many moments,
ie, they are pointers to the same obj; [10.13:
or they are part of a notify&update pattern .
it could also be used as a special form of
assignment stmt:
whatever you're getting access to,
you have some way of maintaining synchronization,
as would occur if using aliases .]
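. the difference between equal-for-a-moment and equal-for-many-moments
can be illustrated in python, as an analogy only:
python spells identity as (is) and uses (==) for value equality,
so the spelling differs from what is proposed here for adda;
the variable names are made up for the example:

```python
a = [1, 2, 3]
alias = a          # same object: stays equal through every later change
copy = list(a)     # equal right now, but a separate object

same_now = (a == copy)        # momentary equality of values
same_always = (a is alias)    # identity: equal "for many moments"

a.append(4)        # the alias follows the change; the copy does not
```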

2011-02-28

self as used by interfaces and anonymous algorithms

2.28: adda/lexeme/self/for anon'algor's and interfaces:
. the word"self is useful in 2 contexts:
# anonymous algorithm:
it means the current algorithm,
# type`interface:
it means a current instance of that type;
eg, t.type: ( self!: defines a postfix operator).
. if there is an algorithm in the header
(eg, to declare algorithms that won't change
for some reason...)
then the usual scope rules prevail .

literals with type-defined syntax

2.28: adda/dstr/literals with type-defined syntax:
. one way to allow custom syntax
-- elegantly, without a bolt-on --
is to say that types can define
their own literal reader;
ie, a mini' type-specific compiler .
. this allows the grammar of literals to be
something other than a composite of native types,
defined instead as whatever is accepted
by the type's reader .
. during the adda.compiler's first pass,
it sorts out what's adda code
from what is either a string literal,
or a type-specific literal
having a reader-defined grammar .
. in subsequent passes,
it then uses the appropriate reader
to complete the translation of literals .

. a custom reader is not a security threat;
because while it is returning adda binary code,
that code itself is not runnable;
it must still be translated by trusted app's .

. if a type defines more than one reader,
then not mentioning a reader simply calls the default .
it can also be explicit in the usual way:
eg, .t`yet-another-reader
is a reader belonging to type t .
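. a rough python sketch of per-type readers with a default;
the registry, the function names, and the reader names are all hypothetical,
and the readers here return plain python strings
as a stand-in for the adda binary a real reader would return:

```python
READERS = {}   # hypothetical registry: type name -> {reader name: function}

def register_reader(type_name, reader_name, func, default=False):
    """Attach a reader to a type; the first one registered is the default."""
    readers = READERS.setdefault(type_name, {})
    readers[reader_name] = func
    if default or "default" not in readers:
        readers["default"] = func

def read_literal(type_name, text, reader_name="default"):
    """Not naming a reader simply calls the type's default reader."""
    return READERS[type_name][reader_name](text)

def plain(text):
    return text

def c_style(text):
    """Treat backslash as an escape character, as C string literals do."""
    escapes = {"n": "\n", "t": "\t", "\\": "\\"}
    out, i = [], 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            out.append(escapes.get(text[i + 1], text[i + 1]))
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

register_reader("string", "plain", plain, default=True)
register_reader("string", "c-style", c_style)
```

. calling read_literal("string", text) uses the default reader,
while read_literal("string", text, "c-style") names one explicitly .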

. the pattern: x.anytype = "(...)
results in calling anytype's default reader
-- just as with: x.string = "(string's literal);
( notice .string's default reader
doesn't have to be trivial;
in c.lang, the reader treats "(\)
as an escape character;
eg, \n -> newline, \t -> tab, ... ).

. to help adda readily identify
all the type-defined syntax readers,
they should all return the same special type,
say, .addb (adda binary),
so then in type t's interface,
any functions of type .addb
will be registered as readers for type t;
eg,
( read(x.$).addb
, another-style(x.$).addb, ...)

. a quote lexeme -- '{}, '[], '() --
means do a read (translate text to adda code)
but don't eval,
whereas, a double-quote means don't even read:
it is to be considered { .string, .$ };
so, in the case of readers,
their parameter must be a double-quoted enclosure,
or some expression that returns .string;
otherwise, it's eval'd as adda code
before being given to the reader
and then it might not even be the expected .string type .

. places where literals are encountered:
# static typing:
. the var's are declared to have a particular type;
here the type is obvious;
so the type qualifier is not needed;
eg,
x.anytype`= "(this is greek to adda)
-- that invokes .anytype's default reader;
x.string`= c-style"(string's literal\n)
-- .string's c-style reader is invoked .
# dynamic typing:
-- the type is discovered at run-time --
. adda can't find a reader at compile time
unless the reader's type is specified:
eg,
say .tall can point to all types:
xc.tall`= .string`c-style"(featuring escapes\n) .
-- now xc can point at a .string at compile`time .
. if an undeclared function is given:
eg,
g.tall`= f "(something like x );
then the work is left to the run-time exec:
it sees if the current object assigned to g
does have a type that includes
the function type: f(x.$).*
(where * can be any type);
if so, then g gets whatever type obj'
that f returns .

. string literals can have the same problem as comments:
it's easy to lose the boundaries of multi-line constructs;
and, when that happens,
the code can act strangely because
the compiler thought half the code
was a comment or string;
conversely,
if the comment or string accidentally contains
the string delimiter,
some of the comment or string will be
compiled as if it were code;
and if that succeeds,
the results will certainly be unintended!
. to assist the human reader,
there could be a redundant enclosure syntax:
if a string can't fit on one line,
the enclosure boundaries should be
on their own separate lines;
"(
example with
2 lines .
);
. if the text includes quote enclosures on their own line,
then the text could go in its own file:
x`= "( [!]myliteral.txt )
-- when {adda, adde} sees [ ! ] as one word,
then what follows is a command for generating text .
. a .txt file would be taken as literal text;
whereas an .adda file would be eval'd to an object
that would then be printed if not already a .string .

unary operators not always taking precedence over binary

2.7: news.adda/math/unary not always taking precedence:
. negation has the same precedence as
multiplication and division;
because, negation means mult by -1.
So -a^b should be -1*a^b = -(a^b).
details:
. programmers were accustomed to the C language,
in which unary operators such as negation
have higher precedence than any binary operator;
(and there was no exponent operator in C
to cause them to think twice about the matter).
so, when programmers use an exponent operator,
they may have wished to remain consistent with C;
however, for centuries,
the polynomial -x^2
has meant -1*x^2 = -(x^2)
not (-x)^2 = x^2 .
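. python is one language that follows the math convention here:
its power operator (**) binds tighter than a unary minus on its left;
the variable names below are just for illustration:

```python
# python parses -2**2 as -(2**2), matching the polynomial convention
negated_power = -2**2

# parentheses are needed to square the negative number itself
power_of_negative = (-2)**2
```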

. look at the HP48G User Guide/order of operations:
priority#1:
Prefix functions (such as sin, ln, ...)
and Postfix functions (such as ! (factorial)).
--[. many could say negation is a prefix -();
2.16: nevertheless,
notice the way math has superscripted powers
(rather than using an operator);
as if it was an extension of the symbol's name
like the way subscripts actually are,
and thus intuitively having higher precedence
than any operation applied to the name .]
priority#2: Power (^) and square root.
priority#3: Negation (-), multiplication, and division.
--[. here is the 2nd place -() fits;
but, only because of its equivalence to -1*();
many think it's obvious that the negative
is part of the number's value .]
priority#4: Addition and subtraction.

. clarity should take precedence over correctness;
so, the system needs to ask new users
-- at least those who use the form (-x^n):

"( how would you eval -2^2 ?
{ 4, -4 } ??
. -1*2^2 is definitely = -1(2^2) = -4 .
whereas (-2)^2 = (-2)(-2) = 4 . )

. furthermore, when exporting adda`binary,
or allowing copies to text
always write it unambiguously { -(2^2), (-2)^2 }.
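. one way to write expressions unambiguously on export
is to parenthesize every operand that is not a bare non-negative number;
this python sketch assumes a made-up tuple form for expressions,
("neg", x) and ("pow", base, exponent),
which is not adda's binary format:

```python
def render(node):
    """Render a tiny expression tree with explicit grouping."""
    if isinstance(node, (int, float)):
        return str(node)
    op = node[0]
    if op == "neg":
        inner = node[1]
        text = render(inner)
        # a bare number needs no parentheses; anything else gets them
        return f"-{text}" if isinstance(inner, (int, float)) else f"-({text})"
    if op == "pow":
        return f"{wrap(node[1])}^{wrap(node[2])}"
    raise ValueError(f"unknown operator: {op!r}")

def wrap(node):
    """Parenthesize operands of ^ unless they are non-negative numbers."""
    text = render(node)
    if isinstance(node, (int, float)) and node >= 0:
        return text
    return f"({text})"
```

. the two readings of -2^2 then come out as distinct text:
render(("neg", ("pow", 2, 2))) and render(("pow", ("neg", 2), 2)) .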

vertical bar for divisions not xor

2.7, 2.15: adda/vertical bar operator uses:
. the vertical bar operator has several uses
that are quite similar across most fields:
number theory:
. n|m is a truth function about divisibility
(number theory reads it as: n divides m;
the reading used here is: does m divide n evenly?)
because of that,
I wondered if it could double as the div operator;
ie, (int / int) -> Q (float or rational);
whereas (int | int) -> Z {truncate, floor, ceiling, ...}
. if you consider context, there is indeed
precedent for (n|m) as a div operator:
in number theory, (n|m) assumes
the result is being passed to a truth variable .
. this is also how (=) doubles as {equals, becomes}
in the basic.lang (eg, if a=b then a=c);
case depending on type:
# number? integer quotient;
# truth? mod = 0 .
. notice though, that the question of divisibility
is actually depending on the mod operation ...;
set theory:
{ f(x) | x`range } -- set generators
f(x) | (x in range) -- definite integrals .
-- like number theory, it concerns division:
the set is infinite until divided by
the finite range of its control variable .
linguistics:
. (a|b|c) means pick exactly one,
reminding me of unique existence;
however, in systems programming
there is often a need to apply both div's
and bit-wise xor's to the same ints;
eg, n xor (n div 31);
so, I'm wondering if I can find
something besides (|) for xor,
since it already fits nicely with div .
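. the type-dependent reading of (|) from the number theory case
can be sketched in python;
the function name and the explicit context argument are assumptions for illustration
(adda would presumably infer the context from the expected type),
and floor is picked arbitrarily from {truncate, floor, ceiling}:

```python
def bar(n: int, m: int, context: type):
    """Evaluate n|m according to the type the result is passed to."""
    if context is bool:
        # truth context: does m divide n evenly?
        return n % m == 0
    if context is int:
        # number context: integer quotient (floor division here)
        return n // m
    raise TypeError("context must be bool or int")
```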

. math's existence operator(∃) adds a bang(!)
to express a unique existence:
eg, ( for some! x: p(x,y) )
would mean
( for exactly one x: p(x,y) );
[2.28:
. notice how math uses separate operators
whenever there is quantification
or control var's involved;
even though it could be more elegant
if it reused set generators:
eg, +(i.int|i = 1...n)
is a summation over 1..n;
math would prefer to present this as:
$\sum_{i=1}^{n} i$ .
. likewise, math's ∃x(p(x)) can also be
expressed as
or( p(x)| x in universe );
\/ is math's operator for logical-or;
therefore,
in the pursuit of minimal language,]
a good symbol for unique existence
could be \! .
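. python's generator expressions happen to work the way this note suggests:
summation, existence, and unique existence
are all reductions over one generator form;
the function names are hypothetical:

```python
def summation(n):
    """+(i.int | i = 1...n): sum over a generated range."""
    return sum(i for i in range(1, n + 1))

def exists(p, universe):
    """or( p(x) | x in universe ): true if p holds for some x."""
    return any(p(x) for x in universe)

def exists_unique(p, universe):
    """Unique existence: p holds for exactly one x."""
    return sum(1 for x in universe if p(x)) == 1
```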
. xor is used often in computing and is like unique existence
but only for the special pair-wise case:
. the composition of xor's for reducing
a list larger than a pair
is not generally equivalent to an (\!);
because, recursively, an xor's return of false
may indicate either {many, none};
while a return of true indicates uniqueness;
so then here is a counter example
(existence but non-unique):

(true xor true) xor (true xor false)
(many) xor (unique) =
(false) xor (true) = true

but true was supposed to mean unique;
whereas this was an example of many,
which should have registered as false
if testing for unique existence .
. math's symbol for xor is a circled plus
reminding us it functions as
clock addition modulo 2 .
. if restricting the arg of (\!) to pairs
a computer's xor operation can implement
unique existence .
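. the counter example can be checked mechanically;
in this python sketch the helper names are made up,
and folding xor over a list is compared against counting the true values:

```python
from functools import reduce
from itertools import product
from operator import xor

def xor_fold(values):
    """Compose xor across the whole list (parity of the true values)."""
    return reduce(xor, values, False)

def exactly_one(values):
    """Unique existence: exactly one value is true."""
    return sum(1 for v in values if v) == 1

# three trues: existence, but not unique
counterexample = [True, True, True, False]

# for pairs, xor and unique existence agree on every input
pairs_agree = all(
    xor_fold([a, b]) == exactly_one([a, b])
    for a, b in product([False, True], repeat=2)
)
```

. xor_fold reports the parity of the true values, so any odd count passes,
which is why it only matches unique existence in the pair-wise case .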

2009-12-28

wordcode

9.27: todo.adde/wordcode/decomposing word`parts:
. teasing words apart could get done pretty fast,
once you have a list of basic parts,
and then have a list of
all words in your db that have those parts .
. it lists the words, you study them,
and it makes it easy for you to catch the exceptions
or find new patterns in words for more efficient coding .
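. a small python sketch of the listing step, assuming the parts and the word db
are plain lists of strings (the function name and the sample data are made up):

```python
def words_by_part(parts, words):
    """Map each basic part to every word in the db that contains it."""
    return {part: [w for w in words if part in w] for part in parts}

listing = words_by_part(["un", "ing"], ["undo", "running", "sun", "king"])
```

. studying the list for "un" shows the kind of exception to catch:
"sun" and "running" contain the letters without using them as a prefix .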

10.22: adda/unix/tools communicating with binary pipes:
. the unix way
is to have tools communicating with text pipes,
whereas, the goal of adda
is to have a comm'standard that's binary;
. unix is the primary target platform;
so, I'm wondering how to efficiently pack binary
into unix text strings
(where there can be no null's; ie, no bytes = 00) .
sockets:
. use of string may be a requirement of
tool communications within a std unix shell;
but, for connecting tools within your own shell,
unix sockets can provide binary app-to-app pipes .
. that way,
you can have your app's talk to each other in binary
while exports to others can be done by
translating your binary to their {unicode, xml, ...} .
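. a minimal python sketch of a binary pipe over a socket pair,
showing that null bytes pass through untouched;
both ends live in one process here only to keep the example short
(so the payload must be small enough to fit the socket buffer),
and the function name is hypothetical:

```python
import socket

def roundtrip(payload: bytes) -> bytes:
    """Send raw bytes through a connected socket pair and read them back."""
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(payload)
        sender.shutdown(socket.SHUT_WR)   # signal end of stream
        chunks = []
        while True:
            chunk = receiver.recv(4096)
            if not chunk:                 # empty read means the peer is done
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        sender.close()
        receiver.close()
```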

10.22: adda/unix/wrapping binary files in text:
. the new std is to use unicode,
and these values
can be reused for a binary std's wordcodes
(similar to the way chinese text
has a separate character for each word) .
. a more efficient way
is to think of each byte as being one digit of a number
(there are 255 non-zero values in a byte) .
. if practicality requires your number be in base 2**n
then a byte can support a number system of base 128:
(having 1..127 map to the same,
and zero is quickly flipped to be FF#16 (-1) )
. that still leaves each byte's other negative values
to mean something else;
eg, when finding a negative byte,
get the binary complement;
and if not 0,
then have the byte represent n+1 consecutive zeroes .
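. a python sketch of this null-free base-128 packing, under one reading of the scheme:
digits 1..127 are stored as themselves,
and a byte with the high bit set stands for a run of zero digits
whose length is the byte's complement plus one
(so FF#16 is a single zero and 80#16 is 128 zeroes);
the function names are made up:

```python
def encode_digits(digits):
    """Pack base-128 digits (0..127) into bytes that never contain 0x00."""
    out = bytearray()
    i = 0
    while i < len(digits):
        d = digits[i]
        if not 0 <= d <= 127:
            raise ValueError("digits must be in 0..127")
        if d != 0:
            out.append(d)
            i += 1
            continue
        # count consecutive zero digits, at most 128 per byte
        run = 1
        while i + run < len(digits) and digits[i + run] == 0 and run < 128:
            run += 1
        out.append(0xFF - (run - 1))   # complement of (run - 1)
        i += run
    return bytes(out)

def decode_digits(data):
    """Inverse of encode_digits."""
    digits = []
    for b in data:
        if b == 0:
            raise ValueError("null byte is not part of the encoding")
        if b < 0x80:
            digits.append(b)
        else:
            digits.extend([0] * ((0xFF - b) + 1))
    return digits
```

. for example, the digits [5, 0, 0, 0, 127, 0] pack into four bytes,
with the run of three zeroes taking a single byte .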

[10.28:
. or more likely,
they could be reserved for indicating
the type or length of the next digit sequence;
eg, then your number stream could be variable-length
like unicode,
except it could have string descriptors,
where a negative would say that until the next descriptor,
the default number length would be 4 bytes instead of 1 .
(unlike unix where everything was byte-based,
this would be word-based,
so upon reading the next element of a file,
it uses these descriptors to find complete elements)
] .

11.9: engl/word.ules:
A lexeme is an abstract unit of morphological analysis in linguistics,
that roughly corresponds to a set of forms taken by a single word.
For example, in the English language,
run, runs, ran and running are forms of the same lexeme, run .
Lexemes are often composed of smaller units
with individual meaning called morphemes .
