2011-02-28

modulus vs remainder

2.7, 2.16: Using the mod() function with negative numbers
"modulo" as a relation:
[pointing in the same direction on a clock]
any two numbers a and b are congruent modulo m
if (a - b) is a multiple of m.
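. a minimal Python sketch of the relation above; the function name congruent is only illustrative:

```python
def congruent(a: int, b: int, m: int) -> bool:
    """True when (a - b) is a multiple of m."""
    return (a - b) % m == 0

# 14:00 and 2:00 point the same way on a 12-hour clock
print(congruent(14, 2, 12))     # True
print(congruent(-340, 20, 60))  # True
print(congruent(340, 20, 60))   # False
```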
. math's idea of "integer division":
x . . . . : 2.7, -2.7
floor(x) .: 2.0, -3.0
ceiling(x): 3.0, -2.0 .

. for both mod (modulus) and rem (remainder),
they are related to div by:
A = ( A DIV B ) * B + A % B
where % is either { rem, mod };
. {mod, rem} are similar in that
they are both consistent with a div function;
but mod's div truncates towards -∞ (negative infinity);
whereas, rem's div truncates towards zero .
. mod(-n, d) -- vs rem -- is the complement
of mod(n, d); eg,
MOD(-340,60)= 20
MOD(340,60)= 40
(40 and 20 are complementary modulo 60;
ie, 40+20 = 60).
. truncating toward -∞ (negative infinity)
means that if the quotient n/d is negative and not exact
(ie, n,d differ in sign and d doesn't divide n),
then the usual integer div needs to be decremented:
div = int(n/d)-1
-- so that the truncation is consistent by
always reducing the value instead of
changing it willy-nilly towards nil
(that'd be adding value when truncating negatives
while subtracting value when truncating positives).
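
. the two div conventions can be compared side by side; this is a sketch in Python, where floor_divmod and trunc_divmod are illustrative names (Python's own // and % already follow the mod convention):

```python
def floor_divmod(n, d):
    """mod-consistent pair: the quotient rounds toward -infinity."""
    return n // d, n % d

def trunc_divmod(n, d):
    """rem-consistent pair: the quotient rounds toward zero."""
    q = abs(n) // abs(d)
    if (n < 0) != (d < 0):
        q = -q
    return q, n - q * d

for n, d in [(340, 60), (-340, 60), (-360, 60)]:
    fq, fr = floor_divmod(n, d)
    tq, tr = trunc_divmod(n, d)
    # both pairs satisfy A = (A DIV B) * B + A % B
    assert n == fq * d + fr == tq * d + tr
    print(n, d, (fq, fr), (tq, tr))
# 340 60 (5, 40) (5, 40)
# -340 60 (-6, 20) (-5, -40)
# -360 60 (-6, 0) (-6, 0)
```

. the quotients differ by one only in the inexact negative case (-340/60); the exact case (-360/60) needs no decrement.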

Ada's "mod" (modulus) and "rem" (remainder):
. notice that while Ada supports both {mod, rem}
it has only rem-consistent div (truncate toward zero)
ie, observing the identity (-A)/B = -(A/B) = A/(-B)
for A,B in positives .
by contrast, Python truncates toward -infinity .
. here is Python's mod-consistent div ( % means mod;
/ is Python 2's integer division, written // in Python 3 )
. 123 / 10 = 12,  123 % 10 = 3
-123 / 10 = -13, -123 % 10 = 7
. 123 /-10 = -13, 123 % -10 = -7
-123 / -10 = 12, -123 % -10 = -3

translation from ada to c:
. ada's {rem, mod} is defined for (n,d) in Z
(integers, numerator and denominator can be negatives);
let c`% = abs(n) % abs(d):
-- (%) is c's symbol for remainder function --
then depending on the original signs of n,d,
use the following table to know whether to
{complement, negate} c`% .
-- complement (~) means abs(modulus) - x;
so for modulus = 5, the complement
of 1..4
is 4..1, respectively
(and a zero c`% stays zero) .
for rem:
. (n rem d)`sign = n`sign
details:
. when n,d are both positive,
or only d(modulus) is negative:
eg, 1...4 rem -5 = 1..4
or 1...4 rem 5 = 1..4
--> c`% .
. when n,d are both negative,
or only n is negative:
eg, -1...-4 rem -5 = -1 .. -4;
or -1...-4 rem 5 = -1..-4
--> -c`% .
for mod:
. (n mod d)`sign = d`sign;
if only n or only d is negative,
then complement .
details:
. when n,d are both positive:
eg, 1...4 mod 5 = 1..4
--> c`% .
. when n,d are both negative:
eg, -1...-4 mod -5 = -1 .. -4
--> -c`% .
. when only d(modulus) is negative:
eg, 1...4 mod -5 = -4..-1
--> -~c`% .
. when only n is negative:
eg, -1...-4 mod 5 = 4..1
--> ~c`% .
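
. the table can be sketched in Python, working only from the magnitudes as c`% does; ada_rem, ada_mod, and complement are illustrative names, and keeping a zero remainder at zero is an assumption (the table above only lists 1..4):

```python
def complement(x, d):
    """abs(modulus) - x, except that 0 stays 0."""
    return (abs(d) - x) % abs(d)

def ada_rem(n, d):
    c = abs(n) % abs(d)        # plays the role of c`%
    return -c if n < 0 else c  # result takes n's sign

def ada_mod(n, d):
    c = abs(n) % abs(d)
    if (n < 0) != (d < 0):     # only n or only d is negative
        c = complement(c, d)
    return -c if d < 0 else c  # result takes d's sign

print([ada_rem(n, -5) for n in (1, 2, 3, 4)])      # [1, 2, 3, 4]
print([ada_rem(n, 5) for n in (-1, -2, -3, -4)])   # [-1, -2, -3, -4]
print([ada_mod(n, -5) for n in (1, 2, 3, 4)])      # [-4, -3, -2, -1]
print([ada_mod(n, 5) for n in (-1, -2, -3, -4)])   # [4, 3, 2, 1]
```

. since Python's % takes the sign of the divisor, ada_mod(n, d) should agree with n % d for every nonzero d, which makes a convenient cross-check.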

self as used by interfaces and anonymous algorithms

2.28: adda/lexeme/self/for anon'algor's and interfaces:
. the word"self is useful in 2 contexts:
# anonymous algorithm:
it means the current algorithm,
# type`interface:
it means a current instance of that type;
eg, t.type: ( self!: defines a postfix operator).
. if there is an algorithm in the header
(eg, to declare algorithms that won't change
for some reason...)
then the usual scope rules prevail .

literals with type-defined syntax

2.28: adda/dstr/literals with type-defined syntax:
. one way to allow custom syntax
-- elegantly, without a bolt-on --
is to say that types can define
their own literal reader;
ie, a mini' type-specific compiler .
. this allows the grammar of literals to be
something other than a composite of native types,
defined instead as whatever is accepted
by the type's reader .
. during the adda.compiler's first pass,
it sorts out what's adda code
from what is either a string literal,
or a type-specific literal
having a reader-defined grammar .
. in subsequent passes,
it then uses the appropriate reader
to complete the translation of literals .

. a custom reader is not a security threat;
because while it is returning adda binary code,
that code itself is not runnable;
it must still be translated by trusted app's .

. if a type defines more than one reader,
then not mentioning a reader simply calls the default .
it can also be explicit in the usual way:
eg, .t`yet-another-reader
is a reader belonging to type t .

. the pattern:  x.anytype = "(...)
results in calling anytype's default reader
-- just as with: x.string = "(string's literal);
( notice .string's default reader
doesn't have to be trivial;
in c.lang, the reader treats "(\)
as an escape character;
eg, \n -> newline, \t -> tab, ... ).

. to help adda readily identify
all the type-defined syntax readers,
they should all return the same special type,
say, .addb (adda binary),
so then in type t's interface,
any functions of type .addb
will be registered as readers for type t;
eg,
( read(x.$).addb
, another-style(x.$).addb, ...)
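
. a rough Python sketch of reader registration and default lookup; everything here (READERS, reader, read_literal, and returning plain Python strings instead of .addb) is a stand-in for illustration, not adda's actual mechanism:

```python
READERS = {}  # type name -> {reader name: function}; first registered is the default

def reader(type_name, reader_name):
    """Register the decorated function as a literal reader for type_name."""
    def register(fn):
        READERS.setdefault(type_name, {})[reader_name] = fn
        return fn
    return register

@reader("string", "read")
def read_plain(text):
    return text  # the trivial reader: text is taken as-is

@reader("string", "c-style")
def read_c_style(text):
    escapes = {"n": "\n", "t": "\t", "\\": "\\"}
    out, i = [], 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            out.append(escapes.get(text[i + 1], text[i + 1]))
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

def read_literal(type_name, text, reader_name=None):
    readers = READERS[type_name]
    if reader_name is None:
        reader_name = next(iter(readers))  # dicts keep insertion order
    return readers[reader_name](text)

print(repr(read_literal("string", r"a\tb")))             # 'a\\tb'
print(repr(read_literal("string", r"a\tb", "c-style")))  # 'a\tb'
```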

. a quote lexeme -- '{}, '[], '() --
means do a read (translate text to adda code)
but don't eval,
whereas, a double-quote means don't even read:
it is to be considered { .string, .$ };
so, in the case of readers,
their parameter must be a double-quoted enclosure,
or some expression that returns .string;
otherwise, it's eval'd as adda code
before being given to the reader
and then it might not even be the expected .string type .

. places where literals are encountered:
# static typing:
. the var's are declared to have a particular type;
here the type is obvious;
so the type qualifier is not needed;
eg,
x.anytype`= "(this is greek to adda)
-- that invokes .anytype's default reader;
x.string`= c-style"(string's literal\n)
-- .string's c-style reader is invoked .
# dynamic typing:
-- the type is discovered at run-time --
. adda can't find a reader at compile time
unless the reader's type is specified:
eg,
say .tall can point to all types:
xc.tall`= .string`c-style"(featuring escapes\n) .
-- now xc can point at a .string at compile`time .
. if an undeclared function is given:
eg,
g.tall`= f "(something like x );
then the work is left to the run-time exec:
it sees if the current object assigned to g
does have a type that includes
the function type: f(x.$).*
(where * can be any type);
if so, then g gets whatever type obj'
that f returns .

. string literals can have the same problem as comments:
it's easy to lose the boundaries of multi-line constructs;
and, when that happens,
the code can act strangely because
the compiler thought half the code
was a comment or string;
 conversely,
if the comment or string accidentally contains
the string delimiter,
some of the comment or string will be
compiled as if it were code;
and if that succeeds,
the results will certainly be unintended!
. to assist the human reader,
there could be a redundant enclosure syntax:
if a string can't fit on one line,
the enclosure boundaries should be
on their own separate lines;
"(
example with
2 lines .
);
. if the text includes quote enclosures on their own line,
then the text could go in its own file:
x`= "( [!]myliteral.txt )
-- when {adda, adde} sees [ ! ] as one word,
then what follows is a command for generating text .
. a .txt file would be taken as literal text;
whereas an .adda file would be eval'd to an object
that would then be printed if not already .string .

unary operators not always taking precedence over binary

2.7: news.adda/math/unary not always taking precedence:
. negation has the same precedence as
multiplication and division;
because, negation means mult by -1.
So -a^b should be -1*a^b = -(a^b).
details:
. programmers were accustomed to the C language,
in which unary operators such as negation
have higher precedence than any binary operator;
(and there was no exponent operator in C
to cause them to think twice about the matter).
so, when programmers use an exponent operator,
they may have wished to remain consistent with C;
however, for centuries,
the polynomial -x^2
has meant -1*x^2 = -(x^2)
not (-x)^2 = x^2 .

. look at the HP48G User Guide/order of operations:
priority#1:
Prefix functions (such as sin, ln, ...)
and Postfix functions (such as ! (factorial)).
--[. many could say negation is a prefix -();
2.16: nevertheless,
notice the way math has superscripted powers
(rather than using an operator);
as if it was an extension of the symbol's name
like the way subscripts actually are,
and thus intuitively having higher precedence
than any operation applied to the name .]
priority#2: Power (^) and square root.
priority#3: Negation (-), multiplication, and division.
--[. here is the 2nd place -() fits;
but, only because of its equivalence to -1*();
many think it's obvious that the negative
is part of the number's value .]
priority#4: Addition and subtraction.

. clarity should take precedence over correctness;
so, the system needs to ask new users
-- at least those who use the form (-x^n):
"( how would you eval -2^2 ?
{ 4, -4 } ??
. -1*2^2 is definitely = -1(2^2) = -4 .
whereas (-2)^2 = (-2)(-2) = 4 . )
. furthermore, when exporting adda`binary,
or allowing copies to text
always write it unambiguously { -(2^2), (-2)^2 }.
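
. Python happens to follow the math convention here, which makes it handy for checking; the sketch below also shows one way to always export the unambiguous form (export_power is a hypothetical helper, not part of adda):

```python
import ast

print(-2 ** 2)    # -4
print((-2) ** 2)  # 4

# the parse tree shows negation applied to the whole power
tree = ast.parse("-x ** n", mode="eval").body
print(type(tree).__name__, type(tree.operand).__name__)  # UnaryOp BinOp

def export_power(base, exponent, negate_base):
    """Write a negated power with explicit parentheses."""
    if negate_base:
        return f"(-{base})^{exponent}"
    return f"-({base}^{exponent})"

print(export_power(2, 2, negate_base=False))  # -(2^2)
print(export_power(2, 2, negate_base=True))   # (-2)^2
```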

vertical bar for divisions not xor

2.7, 2.15: adda/vertical bar operator uses:
. the vertical bar operator has several uses
that are quite similar across most fields:
number theory:
. n|m is a truth function: does n divide m evenly?
because of that,
I wondered if it could double as the div operator;
ie, (int / int) -> Q (float or rational);
whereas (int | int) -> Z {truncate, floor, ceiling, ...}
. if you consider context, there is indeed
precedent for (n|m) as a div operator:
in number theory, (n|m) assumes
the result is being passed to a truth variable .
. this is also how (=) doubles as {equals, becomes}
in the basic.lang (eg, if a=b then a=c);
case depending on type:
# number? integer quotient;
# truth? mod = 0 .
. notice though, that the question of divisibility
is actually depending on the mod operation ...;
set theory:
{ f(x) | x`range } -- set generators
f(x) | (x in range) -- definite integrals .
-- like number theory, it concerns division:
the set is infinite until divided by
the finite range of its control variable .
linguistics:
. (a|b|c) means pick exactly one,
reminding me of unique existence;
however, in systems programming
there is often a need to apply both div's
and bit-wise xor's to the same ints;
eg, n xor (n div 31);
so, I'm wondering if I can find
something besides (|) for xor,
since it already fits nicely with div .
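
. for comparison, here is how the three roles look when they each get their own spelling, sketched in Python (divides is an illustrative name; Python uses % for mod, // for integer quotient, ^ for bit-wise xor):

```python
def divides(n, m):
    """number theory's n|m: is m a multiple of n?"""
    return m % n == 0

print(divides(3, 12))  # True
print(divides(5, 12))  # False
print(12 // 5)         # 2, the integer quotient

n = 1000
print(n // 31)         # 32
print(n ^ (n // 31))   # 968, xor and div applied to the same int
```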

. math's existence operator (∃) adds a bang(!)
to express a unique existence:
eg, ( for some! x: p(x,y) )
would mean
( for exactly one x: p(x,y) );
[2.28:
. notice how math uses separate operators
whenever there is quantification
or control var's involved;
 even though it could be more elegant
if it reused set generators:
eg, +(i.int|i = 1...n)
is a summation over 1..n;
math would prefer to present this as:
   n
   ∑ (i)
 i = 1 .
. likewise, math's ∃x(p(x)) can also be
expressed as
or( p(x)| x in universe );
\/ is math's operator for logical-or;
therefore,
in the pursuit of minimal language,]
a good symbol for unique existence
could be \! .
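
. the reuse of set generators suggested above is close to what Python's generator expressions do; a sketch, where p, universe, and exists_unique are illustrative names:

```python
n = 5
print(sum(i for i in range(1, n + 1)))  # 15, the summation over 1..n

universe = range(10)

def p(x):
    return x * x == 9

# existence: or( p(x) | x in universe )
print(any(p(x) for x in universe))  # True

def exists_unique(pred, xs):
    """Unique existence: exactly one x satisfies pred."""
    return sum(1 for x in xs if pred(x)) == 1

print(exists_unique(p, universe))                     # True (only x = 3)
print(exists_unique(lambda x: x % 2 == 0, universe))  # False (many)
```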
. xor is used often in computing and is like unique existence
but only for the special pair-wise case:
. the composition of xor's for reducing
a list larger than a pair
is not generally equivalent to an (\!);
because, recursively, an xor's return of false
may indicate either {many, none};
while a return of true indicates uniqueness;
so then here is a counter example
(existence but non-unique):
(true xor true) xor (true xor false)
(many) xor (unique) =
(false) xor (true) = true
but true was supposed to mean unique;
whereas this was an example of many,
which should have registered as false
if testing for unique existence .
. math`s symbol for xor is a circled plus
reminding us it functions as
clock addition modulo 2 .
. if restricting the arg of (\!) to pairs
a computer's xor operation can implement
unique existence .
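
. the counter-example can be checked mechanically; in this Python sketch xor_chain and exactly_one are illustrative names, and the chained xor turns out to be a parity test rather than a uniqueness test:

```python
from functools import reduce
from operator import xor

def xor_chain(values):
    """Fold xor over the list: true when an odd number of items are true."""
    return reduce(xor, values, False)

def exactly_one(values):
    """Unique existence: true when exactly one item is true."""
    return sum(bool(v) for v in values) == 1

many = [True, True, True, False]
print(xor_chain(many))    # True, yet three items are true
print(exactly_one(many))  # False

# restricted to pairs, the two tests agree
for a in (False, True):
    for b in (False, True):
        assert xor_chain([a, b]) == exactly_one([a, b])
```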

menucode

2.4: adda/dstr/menucode:
. menucode is a variation of the index-style pointer:
it's a finite enumeration that is then mapped to
pointers or indexes .
2.14:
. what pointers and indexes have in common is
being addresses of a memory implementation .
. this is in contrast to opcodes and menucodes
which are specifying what you want to address
rather than where its address is .
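
. a small Python sketch of the distinction; MenuCode, resolve, and the two layouts are made up for illustration, and only show that the menucode stays fixed while the index it maps to can move:

```python
from enum import Enum

class MenuCode(Enum):  # says what you want
    OPEN = 1
    SAVE = 2
    QUIT = 3

def resolve(code, index_of, memory):
    """Map a menucode to an index, then fetch what lives at that index."""
    return memory[index_of[code]]

memory_a = ["open-routine", "save-routine", "quit-routine"]
index_a = {MenuCode.OPEN: 0, MenuCode.SAVE: 1, MenuCode.QUIT: 2}

# the same items relocated: the indexes change, the menucodes don't
memory_b = ["quit-routine", "open-routine", "save-routine"]
index_b = {MenuCode.OPEN: 1, MenuCode.SAVE: 2, MenuCode.QUIT: 0}

print(resolve(MenuCode.SAVE, index_a, memory_a))  # save-routine
print(resolve(MenuCode.SAVE, index_b, memory_b))  # save-routine
```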

clarity in a unicode world

2.2: adda/clarity in a unicode world:
oreilly commenters mention rust:
. Rust is Mozilla's new lower-level language;
it has a policy of ASCII-only lexemes.
. all platforms have Unicode libraries,
so why go ASCII-only ?  [2.9:
At the moment it’s just a matter of
keeping the lexer simple
(no character-class tables to consult)
as well as making it accessible to all unix tools .]

. I first conjectured ASCII-only was for security:
it's for the same reason that
website names should be ascii;
if you have 2 names that look the same,
but one uses european letters,
then people could get them confused .
. the question for a lang then is
how can the compiler give writers freedom
and still help readers stay unconfused?
# highlighting:
. whenever non-ascii could be confused with ascii,
it would be in a contrasting font or style .
# substitution:
. the editor makes it easy to
change variable names:
after parsing, a name is identified as a var';
and, the var links to its scope,
so, after a user has changed the var's name,
the entire text for that scope is re-displayed
so that all instances of the var's name
reflect the user's preference .
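
. the highlighting idea can be approximated in a few lines of Python; this sketch simply brackets every non-ASCII character with its code point (highlight_non_ascii is an illustrative name, and a real editor would change the font or style instead):

```python
import unicodedata

def highlight_non_ascii(name):
    """Wrap each non-ASCII character in [..] along with its code point."""
    parts = []
    for ch in name:
        if ord(ch) < 128:
            parts.append(ch)
        else:
            parts.append(f"[{ch} U+{ord(ch):04X}]")
    return "".join(parts)

latin = "paypal"
mixed = "p\u0430yp\u0430l"  # same look, but with Cyrillic letters for 'a'

print(latin == mixed)               # False
print(highlight_non_ascii(latin))   # paypal
print(highlight_non_ascii(mixed))   # both Cyrillic letters are bracketed
print(unicodedata.name("\u0430"))   # CYRILLIC SMALL LETTER A
```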

integrate with translations:
. like chrome.browser is integrated with
google-translate,
have the editor translate any foreign characters
to the user's {alphabet, vocabulary} .