2011-04-30

rethinking aggregates

4.28: adda/dstr/agg's in literal mode:

. the expression (www.the.com)
is same as (www#the#com)
except that with (www.the.com)
you know the component selectors are literals,
whereas with (www#the#com),
(the) and (com) are likely variables .

. a record field works like a based pointer,
just as an array index does,
except that the distance between components
is not constant .
. records (and even arrays)
can have their components reached literally
via something other than (#)
and the example of (www.the.com)
suggests the dot should be reused for that .
. this way it works like numbers, too:
they have 2 literal parts {frac, int}
connected by a dot
-- a model even more common than url's .
. the use of dots is well-understood,
it looks neat,
and it's easy to reach on the kybd
-- much more so than (`)
(the possessive operator).
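
. a minimal sketch of the based-pointer view, in python rather than adda;
the field sizes are hypothetical, chosen only for illustration:
an array index is a constant stride from the base,
whereas a record field needs its own precomputed offset .

```python
# Hypothetical record layout: (field name, size in bytes); sizes are assumed.
RECORD_FIELDS = [("www", 4), ("the", 2), ("com", 8)]

def array_offset(index, elem_size):
    """Array component: the distance between components is constant."""
    return index * elem_size

def record_offsets(fields):
    """Record component: distances vary, so each field gets its own offset."""
    offsets = {}
    position = 0
    for name, size in fields:
        offsets[name] = position
        position += size
    return offsets

# Literal selectors such as (www.the.com) can be resolved to offsets up front.
OFFSETS = record_offsets(RECORD_FIELDS)
```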

. how can reuse of the dot then avoid
confusion with type decl's?
-- eg, A.1 vs A.anArrayType --
it requires that the entire type`name`space
is barred from being reused as a component`name;
ie, if a name already stands for a type,
it can't be used as the name for any component .
. that would be easy if there was a rule like
(type`names must be capitalized),
-- as it is in math --
but most would rather write .int than .Z .

4.29: web.adda/terminology/based pointer:
. what is the term for what I like to call
offset pointer? based pointer .

4.28: adda/dstr/array-record equivalence:

. wondering why records and arrays
should vary their syntax from each other:
an array is simply a record where
all components are the same type,
yet array#field
vs record`field ?
. but, in database design -- a fundamental --
there's the idea of the multiples (arrays)
vs the units (records);
just as both fractions and integers
may have the same use of digits
but nevertheless very different roles .

. look at the entire lifetime of their use,
including decl's, and the other use of (`),
is it complementary ?
. t.type: <. `f, `+, + .>
; a#.t -- a#.<. `f, `+, + .>
; r.(x.t, y.s) --. a record .
; p/.(x.t, y.s) --. pointer to same .
; b.t --. declares { b`f, b`+, b+... }.
; a#1, r#x, p/x, b`x .

. where did I get the idea that
the operator(`) could be reserved for
self-modifiers?
. can't an aggregate component be a function?
then it works just like in a type def':
when a function is part of an agg'
it has access to every other component of that agg' .
b`++, b#f(x)
. the agg' declarations themselves
don't have hidden locals;
but, if a type def includes an agg' def,
then it defines an agg whose
component functions could be accessing hidden locals .

4.19: adda/cstr/rom-address-mode params:
. [pass by ref] (aka address-mode)
is often more efficient
but in some lang's [pass by copy] (aka value-mode)
is the only sure way to know inputs aren't modified .
. the interface should make clear to the compiler
whether a certain operation is modifying or not;
only then can the compiler be efficiently helpful .
4.20:
functionals have an interface like this:
f(x), f(l,r),
whereas mutators appear like this:
`f, `++, `*(x), `+(x) .
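
. a rough python analogy for the two interface shapes
(the names scaled and Tally are hypothetical, not adda):
the functional returns a new value and leaves its inputs alone,
while the mutators change their own object in place .

```python
def scaled(values, factor):
    """Functional, like f(l, r): returns a new list, the input is untouched."""
    return [v * factor for v in values]

class Tally:
    """Holds a count; its methods are mutators that change it in place."""

    def __init__(self, count=0):
        self.count = count

    def increment(self):
        """Mutator, in the spirit of `++ ."""
        self.count += 1

    def times(self, factor):
        """Mutator, in the spirit of `*(x) ."""
        self.count *= factor
```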

more 2nd-thoughts for use of dot notation:

4.22: todo.adda/type/filter-class generic types:
. studying c++'s generic types, eg atomic:
http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/
adda's syntax for the parameterized type would be:
atomic(your.type).type: ...
. a generic is normally used like this:
i.atomic(int)
-- i is a version of atomic.type --
but why not i.atomic.int ?
atomic is an important example
of a special class of generic type
in which it offers as output
the same type as was input
providing a modified but nevertheless compatible semantics .
. they would be modeled after arrays,
a#.int
which has so far escaped definition
due to its being a primitive type .
todo:
review how #(x).t, (x).t, /.t
are similar and different,
and how there can be a syntax
for defining a generic array type
had the system not already done it .
. review the named pointer theory
since it has the most in common with that .

4.20: adda/type/more use of multiple dots:
. here's another place where double typing is needed:
msg.Channel.String;
. the var"msg represents a channel
through which is passed obj's of type"string .
. it's like the array, msg#.String,
being a sort of container of strings
but numerous in the time domain
vs space domain .

other lexical change ideas:

4.22: adda/operators/ancestor scope:
. my current system defines a local as
a symbol that includes a type specification;
but, aren't there times when you'd like to
rename or otherwise access an external
as something like:
"( give me the x from however many levels above
that has this particular type) ?
. one way to do that is with a new symbol:
just as the current system allows
../x.t to mean parent scope's object of type x.t,
.../x.t could mean the same but
for any ancestor scope .
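
. one possible reading of (../x.t) vs (.../x.t), sketched in python;
the Scope class and its method names are assumptions for illustration:
the parent lookup checks only one level up,
the ancestor lookup walks upward until both name and type match .

```python
class Scope:
    """A scope holding symbols as name -> (type_name, value)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.symbols = {}

    def declare(self, name, type_name, value):
        self.symbols[name] = (type_name, value)

    def _match(self, name, type_name):
        entry = self.symbols.get(name)
        if entry is not None and entry[0] == type_name:
            return entry[1]
        return None  # sketch: a stored value of None is not distinguished

    def parent_lookup(self, name, type_name):
        """Like ../x.t : only the immediate parent scope is searched."""
        if self.parent is None:
            return None
        return self.parent._match(name, type_name)

    def ancestor_lookup(self, name, type_name):
        """Like .../x.t : the nearest ancestor holding x with type t."""
        scope = self.parent
        while scope is not None:
            found = scope._match(name, type_name)
            if found is not None:
                return found
            scope = scope.parent
        return None
```

. with a root scope declaring an int named x,
and a middle scope declaring a string named x,
a leaf scope asking for the int x
gets nothing from the parent lookup
but finds the root's x through the ancestor lookup .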

4.20: adda/numeric base syntax:
. hex could follow both dimensional syntax and subscript syntax
by treating (space)#(integer) as dimension;
eg, 7FFF #16 or (ART)#32 .
-- using either the space or the parenthetical
is needed to avoid confusing A#16
with the 16th item of an array named A .
. an array would allow spacing the other way:
A# 16 -- rather than:
A #16 .
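
. a small python sketch of reading that notation;
parse_based is a hypothetical helper,
and the digit values for bases above 16 are assumed to follow
python's int() convention (0-9 then a-z) :

```python
def parse_based(text):
    """Parse 'DIGITS #BASE' or '(DIGITS)#BASE' into an integer."""
    head, sep, base = text.partition("#")
    if not sep:
        raise ValueError("no # found")
    parenthesized = head.startswith("(") and head.endswith(")")
    spaced = head.endswith(" ")
    if not (parenthesized or spaced):
        # Without the space or the parentheses, A#16 reads as a subscript.
        raise ValueError("ambiguous with an array subscript")
    digits = head[1:-1] if parenthesized else head.strip()
    return int(digits, int(base))
```

. here parse_based("7FFF #16") gives 32767,
while parse_based("A#16") is rejected as a subscript .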

4.12: adda/cstr/pointer arithmetic:
. my first idea for distinguishing between
arithmetic on pointers vs their targets
was to use an explicit dereference operator;
but ambiguity happens rarely
-- only when the target is a numeric type
and the operation is one that applies to pointers --
so, another idea is to use attribute syntax
to indicate when the operation is on the address .
. my first idea was ptr`addr + i,
but `addr should take the addr of the symbol;
my subsequent idea was to use array notation,
since that is what pointer arithmetic is doing:
ptr#i = ptr + i .
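
. a toy model of that distinction in python;
the Ptr class is hypothetical, memory is just a list,
and it assumes a plain (+) is meant for the numeric target
while the array notation moves the address :

```python
class Ptr:
    """A pointer into a list that stands in for memory."""

    def __init__(self, memory, address):
        self.memory = memory
        self.address = address

    @property
    def target(self):
        return self.memory[self.address]

    def at(self, i):
        """ptr#i : arithmetic on the address, yielding a new pointer."""
        return Ptr(self.memory, self.address + i)

    def __add__(self, n):
        """ptr + n : arithmetic on the numeric target, not the address."""
        return self.target + n
```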

avoiding parentheticals for the lisp-allergic

4.9: mis.adda/syntax/semicolon-comma combination:
. (f x) can be extended with comma
if the semicolon is required for
terminating a function call;
this is good for avoiding paren's .
...
but then you can't tell whether
the ,,,; is for the fx
or for the list that contains the fx .
...
. in places that expect (;)
an optional comma-separated list is also possible;
conversely, in places where commas delimit,
such as a list, or vector,
a semicolon provides a shortcut to lists of lists
(,,,;,,,) = ((,,,), (,,,)).
with semicolons as param terminators,
(f(,,,), g(,,,)) = (f,,,; g ,,,)
but that is confused with (f(); g())?
speaking of confusion,
if (,,,;,,,) = ((,,,), (,,,))
then you can't have sequences in param's?
you have to parenthesize them:
(,,,; ,,,) means a matrix;
(,,,(;),,,) means a list containing a sequence .
(,;
,,,;
,,) -- ragged array .
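
. the matrix reading, (,,,;,,,) = ((,,,), (,,,)),
can be sketched in python as a two-level split;
parse_rows is a hypothetical helper
and it does not handle nested parentheticals :

```python
def parse_rows(text):
    """Read '(a,b; c,d)' as a list of lists: ';' ends a row, ',' ends an item."""
    inner = text.strip()
    if inner.startswith("(") and inner.endswith(")"):
        inner = inner[1:-1]
    return [
        [item.strip() for item in row.split(",")]
        for row in inner.split(";")
    ]
```

. rows need not be the same length,
so the ragged array falls out of the same rule .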
4.30: another way:
. after a symbol (whether a function or not)
if it's followed by a colon:
f:,,,; -- then it can be terminated by a (;).

hybrid of efficiency and encapsulation for oop

4.8: adda/oop/hybrid of efficiency and encapsulation:

. oop's inheritance is notorious for
sharing instance var's (ivar's);
but, why can't direct access still be
more controlled, like ada param's are ?

4.10: review:
. in typical (popular) oop`inheritance,
efficiency is gained when the interface
commits to a particular list of ivar's;
the inheritor's ivar's get tacked on to
the ivar record being inherited (super's) .
. encapsulation can be maintained anyway
despite the lack of privacy,
because the super can opt to mandate
that only the super's methods
can operate on the super's ivar's .
. if opting instead to share ivar's with inheritors,
then their accesses can be done quickly
since they bypass calling a function;
but, everyone in the inheritance chain
is communicating via shared var's;
and in this way, new bugs can be caused whenever
any party of the inheritance chain gets modified .
4.10:
. one way to allow direct but controlled access
is by having optional watch functions:
. inheritors have direct access
but it's confined to reads and writes;
ie, rather than having continuous access,
it's like the ada`parameter model
where the inout param's are modeled by
copying the initial value,
working on one's own copy,
and then overwriting the param`target
when the function is returning .
. if the super wants more control over
its own ivar's,
it can put a watch on them:
after a write, it can do range testing
or check for internal consistency;
if it raises an error,
the system can know who did that last write .
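
. a minimal python sketch of watched ivar's;
the class names, the (level) ivar, and its 0..100 range
are all assumptions for illustration:
the inheritor copies in, works on its own copy, and writes back,
and the super's watch runs after the write .

```python
class WatchError(Exception):
    pass

class Super:
    def __init__(self):
        self._ivars = {"level": 0}
        # Hypothetical watch: a range test run after each write.
        self._watches = {"level": lambda v: 0 <= v <= 100}
        self.last_writer = None

    def read(self, name):
        return self._ivars[name]

    def write(self, name, value, writer):
        self.last_writer = writer          # the system records who wrote last
        self._ivars[name] = value
        watch = self._watches.get(name)
        if watch is not None and not watch(value):
            raise WatchError(f"{writer} wrote bad {name}={value}")

class Inheritor(Super):
    def bump(self, amount):
        level = self.read("level")         # copy the initial value
        level += amount                    # work on one's own copy
        self.write("level", level, writer="Inheritor.bump")  # overwrite on return
```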

. the interface shouldn't have to list ivar's;
the ivar's that are listed are simply
those meant for sharing with inheritors .
. the ivar's that actually model object state
are known only to the init functions .

SML (standard metalanguage)

4.24: news.adda/lang"sml (standard metalanguage):
. standard ml is influenced by
ISWIM (If you See What I Mean)
which influenced not only ML,
but also many other functional languages
such as SASL, Miranda, and Haskell .
Landin's SECD machine used call-by-value;
if the imperative features are stripped out
(assignment and the J operator)
leaving a purely functional language,
it then becomes possible to switch to
lazy evaluation (vs eager evaluation);
that was the path of SASL,
KRC (Kent Recursive Calculator),
Hope, Miranda, Haskell, and Clean.
. A goal of ISWIM was to look more like
mathematical notation,
so it replaced ALGOL's ways with
the pythonic off-side rule
(newlines take the place of semicolons;
indentation represents parentheticals or begin-end pairs )
. abc and python are hardly the only off-siders:
* Boo * BuddyScript * Cobra * CoffeeScript
* Curry * F# (if #light "off" is not specified)
* Genie * Miranda * Nemerle * Occam
* PROMAL * Spin * XL * YAML .

news.adda/lang"CoffeeScript/a hll-to-hll translator:
. CoffeeScript compiles to JavaScript
adding syntactic sugar inspired by Ruby and Python
to enhance JavaScript's brevity and readability,
as well as adding more sophisticated features like
array comprehension and pattern matching.

read-only, constance, and uniqueness

4.22: adda/type/const/for shared-link parameters:

. what is the diff'tween {const, rom}?
and did you notice that in formal params,
where rom is needed,
this const syntax is not very applicable ?

. the reason for being concerned with rom
is that people would like to know
whether they can safely pass a reference .
. this should be handled as it is in Ada:
it is safe unless the parameter is
specifically marked as out-mode .
. so the question then becomes,
what is an intuitive symbol for out-mode?
. ada's goto enclosure has been written about before;
and out-mode param's are like goto's;
because they are transferring control to
some surprising places .
todo:
. how was it shown
where the out-mode is happening?
eg, for an array of pointers to strings,
is the array being modified (dangling pointer risk)
or are the strings being modified ?
the out-mode syntax needs to
make this distinction easy to express .

. what if the caller provides a shared link;
and, while the caller meant for the ref'
to provide the value it represented
at the time of the call,
in fact it will be changing dynamically,
because even though the caller is suspended
the caller's co.programs are still modifying the link;
therefore:
when a link represents a value
then the shared value needs a soft-locking system:
it doesn't prevent co.programs from writing
but requires the first writer
to ask the system for a lock-removal service;
in doing that,
the system sees who the lock belongs to
and copies the current value .
. if it knows the lock owner is going to be quick
it can just high-prioritize that call,
and suspend the unlocker until then .
. this is a lot of efficiency glue
so the system needs to know at call time
whether an input is large eno' to link to,
since copying makes life much simpler .
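
. a single-threaded python sketch of the soft lock;
SharedLink and Callee are hypothetical names,
and real co.programs, priorities, and suspension are left out:
the first write removes the lock
and hands the lock owner a copy of the value as it was .

```python
import copy

_NO_SNAPSHOT = object()

class SharedLink:
    def __init__(self, value):
        self.value = value
        self.lock_owner = None

    def soft_lock(self, owner):
        """The callee takes the link as an input; nothing is copied yet."""
        self.lock_owner = owner

    def write(self, new_value):
        """The first writer triggers lock removal: the owner gets a copy."""
        if self.lock_owner is not None:
            self.lock_owner.snapshot = copy.deepcopy(self.value)
            self.lock_owner = None
        self.value = new_value

class Callee:
    def __init__(self, link):
        self.link = link
        self.snapshot = _NO_SNAPSHOT
        link.soft_lock(self)

    def input_value(self):
        """The value as of call time, whether or not a writer has intervened."""
        if self.snapshot is _NO_SNAPSHOT:
            return self.link.value
        return self.snapshot
```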

4.22: mis.adda/type/const/for non-parameters:
. while const's for param's are not an issue
(see notes about rom above)
[@] adda/type/const/for shared-link parameters
const's for symbol declarations are very useful .
. I've had problems finding a const operator
(below) [@] mis.adda/uniqueness operator
but, having the need restricted to non-parameters
makes the search moot, I thought,
as const's can be initialized with a label
rather than an assignment operator;
nevertheless,
what if uninitialized, but
intending to assign later only once?
then a const symbol would enforce
only one assignment .
. just as math uses (someone!)
to mean (exactly one),
the type system could use (type!)
to mean not just (some
of the values from this type)
but instead (exactly one
of the values from this type
will be used here).
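
. the assign-only-once rule for an uninitialized const
could behave like this python sketch
(the class name Once is hypothetical) :

```python
class Once:
    """A symbol that may be assigned exactly one time."""

    _UNSET = object()

    def __init__(self):
        self._value = Once._UNSET

    def assign(self, value):
        if self._value is not Once._UNSET:
            raise RuntimeError("const already assigned once")
        self._value = value

    @property
    def value(self):
        if self._value is Once._UNSET:
            raise RuntimeError("const not yet assigned")
        return self._value
```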

4.22: mis.adda/uniqueness operator:
. a convenient symbol for rom
is the same as for unique,
since rom maintains a unique value across times .
. notice though, that
when I chose (!) for unique
it was because math uses it for the
unique existence quantifier;
but when (!) is used alone, without (some),
math defines it as factorial;
and what does that have to do with unique?
( recall factorial:
x! = *(^i: i= 1..x) -- very multiplicative!
) . in both of these math cases,
it matches the usual english meaning:
(very *) .
. in the case of (some!),
(exactly one) is a very useful version of
(some one) because,
(mapping to exactly one)
is an essential characteristic of functions
-- one of math's foundations .
. intuitively the name could come from
feeling very responsible:
ie, if "(some one) did it
then there is a degree of anonymity,
but if exactly (!) one member is responsible
then that one member is
very much the reason for the season .

integrating types and dimensions

4.22: todo.adda/urgent/types integrated with dimensions:
just as 9.81 m/s/s
means 9.81 * m/s/s
f(x) is a shorthand for
f @ (x);
thus, space is an overloaded infix operator
meaning either a multiply or an apply .
. it is unambiguous because of the
specific places that an apply is expected .
. symbols must declare whether they expect an arg;
therefore what follows such a symbol
must be either {arg, terminator};
otherwise, it can expect either
{ infix
, multiply`factor or dimensional
, terminator}.
todo:
still need to flesh out
how to dimension things
and declare dimensions
--
. "(dimensional) refers to numeric types:
any renaming of a numeric type
must be a dimension because it implies
a numeric amount of a certain substance;
3 tsp substance
-- tsp is a unit of volume;
hence the declaration:
( tsp.vol: 4.92892159 ml )
-- many things can be measured in 3 ways:
unit counts, volumes, and weights of masses .
. in addition to type being a special type,
there is a special class of primary dimensions:
{ volume, mass, length, time
, electric intensity
, thermal intensity
, radiation intensity }.
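
. a small python sketch of a dimensioned number,
using the (tsp.vol: 4.92892159 ml) declaration above;
the Quantity class and the tsp, ml, gram helpers are assumptions,
with ml and gram taken as the base units for vol and mass :

```python
ML_PER_TSP = 4.92892159  # from the declaration ( tsp.vol: 4.92892159 ml )

class Quantity:
    """A numeric amount tagged with its dimension, stored in a base unit."""

    def __init__(self, amount, dimension):
        self.amount = amount
        self.dimension = dimension

    def __add__(self, other):
        if self.dimension != other.dimension:
            raise TypeError(f"cannot add {self.dimension} to {other.dimension}")
        return Quantity(self.amount + other.amount, self.dimension)

def ml(n):
    return Quantity(n, "vol")

def tsp(n):
    return Quantity(n * ML_PER_TSP, "vol")

def gram(n):
    return Quantity(n, "mass")
```

. so (3 tsp) plus (1 ml) is a volume,
while adding (3 tsp) to (1 gram) is rejected .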

unique pointers

4.21: news.adda/dstr/unique pointer:

. the idea behind the unique pointer
is that they can be safely moved between threads
and they never require locking;
--
[. I first saw this idea in ms`singularity;
the msg's are instantaneous because
they involve moving
only a shared heap pointer
rather than a record between process heaps .

problems with maintenance:
. after a move for a unique pointer,
the source has to be set to null?
--
I thought a simpler idea would be
a handle to a record (pointer, owner ID );
and then, in order to use the pointer,
you had to set the owner ID to
the process you were passing it to .
. anyone using it would first have to check for
whether they actually owned it at the moment .
once they were done with it
they would hand it to the system (ID=0)
or they could communicate it to a co.process .]
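
. the (pointer, owner ID) handle could look like this python sketch;
the Handle class and its method names are hypothetical,
and the check-then-use steps are not made atomic here :

```python
SYSTEM_ID = 0  # handing the record back to the system, as in (ID=0)

class Handle:
    """A record of (pointer, owner ID); only the current owner may use it."""

    def __init__(self, target, owner_id):
        self.target = target
        self.owner_id = owner_id

    def use(self, caller_id):
        """Anyone using it first checks whether they own it at the moment."""
        if caller_id != self.owner_id:
            raise PermissionError(f"process {caller_id} is not the owner")
        return self.target

    def give(self, caller_id, new_owner_id):
        """The owner passes it to a co.process, or to the system."""
        if caller_id != self.owner_id:
            raise PermissionError(f"process {caller_id} is not the owner")
        self.owner_id = new_owner_id
```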

making concurrent programming safer:
. he proposes to do this by
extending the type system.

. two major challenges in multithreaded programming
are identified as:
Avoiding races -- approachable,
Preventing deadlocks -- pie in the sky.

. his main reference is:
Object Types against Races (pdf):
. Cormac Flanagan and Martín Abadi .
This paper investigates an approach for
statically preventing race conditions in an oop language.
The setting of this work is a variant of
Gordon and Hankin’s concurrent object calculus.
We enrich that calculus with a form of
dependent object types
that enables us to verify that threads
invoke and update methods only after
acquiring appropriate locks.
We establish that well-typed programs
do not have race conditions.