2013-11-30

mem'mgt for a top-down stack layout

9.4: adda/mem'mgt/top-down stack layout:
[11.23: intro:
. the adda`oop system (unlike other oop impl's)
seeks to avoid generating garbage,
and that requires that when a function returns an object,
the return-obj's needed memory was allocated by the caller,
and then the function was given a pointer to that memory .
. that means assignments and function calls
are fused into one operation;
y`= f(x) is implicitly:
y`= f(x, ptr to y),
and this is true even when y is a parameter:
g( f(x) ) is implicitly:
stack g(y); stack f(x, ptr to y); call f; call g .


9.4: motivation:
. how does adda explicitely mention
the address of a parameter that is stacked
and waiting for subexpression returns?
we have to name that act'rec's components
as an offset of some sort of stack access .
[11.23: using the example above,
how do we generate (ptr to y) for
"( stack g(y); stack f(x, ptr to y); )? ]
[9.11: yes:
. that is what temporary vars are for;
the designer of a subprogram
is not the only one declaring locals in that scope;
there is also the compiler that is setting up calls,
and decaring temp'vars to put the results in .
9.12:
. in order to point at parameters on the stack,
we are putting the calls' parameters on an adda`stack,
and then in c we are calling our c subprogram
with a parameter that is just a single pointer
to the record on our adda`stack;
or, we can make calls having the usual parameters,
but instead of being values, they are all pointers
to the obj's in the record on our adda`stack .]

5: build it!:

. in adda-style oop, for each expression tree
we are stacking the root operation first
because that call's parameters are providing
the return objects for child calls;
so, to address such a style of temp'vars,
we need to simulate some of the stack work
even in adda on c, as well as in addm .
. how does this fit in with
our stack for addm? [11.12:
adda and addm both place parameters on
a simulated stack;
but then adda actually makes a c call
to run the subprogram on those parameters,
whereas addm has a bytecode version of the subprogram .
11.23: stackless revision:
( see adda/mem'mgt/stackless
for details of how adda might make virtual calls,
so that we can implement pre-emptive multi-tasking .]

. in c, can you take the address of tuple fields?
[... yes, but not bit fields].
[11.12:
so we can impl' adda tuples with c "structs";
because even if the return obj' is a tuple field,
we can still provide a c`pointer to it .]

[11.9: 11.12: 11.23: subheaps vs trailers:
. this explains some terminology:
obj's can be dynamically growable
by having a pointer to a trailer
that shows where some dynamically alloc'd memory is .
. dynamic mem'alloc's are taken from a "heap",
and traditionally taken from a global heap
(one shared by all the subroutines of a process).
. we would rather have each subprogram activation
being in posession of its own heap
(our name for such a private heap is a "subheap"),
because we would like to offer short pointers,
ie, our subprogram's subheap is impl'd as
an array of pointers into the global heap,
and the trailer pointers of our local obj's,
are 16-bit array indices into the subprogram's subheap array .
. a datatype's manager also has the option of
implementing the trailer pointer as:
# an index into the type mgt's local subheap
# an index into the type mgt's global subheap
# a system pointer into the global heap:
. obj's can also have their own subheaps,
either by implementing the trailer pointer as
a global heap pointer instead of a local subheap index;
or, by impl'ing a private node heap;
for instance,
a tree can be represented by an array of nodes,
so that each tree pointer is
not a system.pointer into the global heap
but an index into the tree's private node array .
. to see how a tree can have both a
subheap and a trailer,
imagine the default tree has only 255 nodes,
so that its pointers are byte-sized .
. but if it needs to grow,
then it uses a subheap as its trailer:
the trailer is adding a much larger array,
and that array is considered to be a subheap
because it's serving as the tree's
private node heap .]

how to supply trailer info ?:
[11.12:
. the idea so far has been that trailer pointers are relative to
the act'rec where the obj' was declared;
ie, they index the home-subprogram's subheap;
but then when we pass that obj' as a parameter,
it is residing in a new act'rec,
while in [an obsolete] sharing model
its trailer pointer still refers to
the subheap of the calling act'rec .
. I also had some confusing ideas about
having tuples holding one trailer pointer
for all of its fields;
that would be great for shortening pointers
(most tuples have less than 2^8 fields,
so the pointers could be byte-sized)
but then how can we point at a field
without also pointing the enclosing tuple?
. and if I can never point an obj without also
pointing at the enclosing tuple,
don't I need to point at every nested tuple?
that would give our pointers a varying length,
and we would rather have a space-efficient fixed size .
[ 11.23:
point at every nested tuple? not necessarily:
. we could require only a pointer to
the first-nested tuple
by crafting some fitting rules about tuple types,
and thereby still keep our pointers' size fixed;
but the point still stands because,
we want to remain object-oriented,
and that means the obj's type mgt is always in control
over every impl'detail of the objects it manages;
thus, tuples cannot be deciding
the impl'details of their fields, (that includes
decisions about a field's type of trailer pointer).
. if we didn't want to be obj'-oriented in this case,
we could require trailer pointers to have type.tags,
but that defeats the purpose of giving tuples control:
because, if we let the owning tuple
decide the trailer pointer size,
-- for shortening the trailer pointer --
we still have to leave room for the worst case:
some large tuples will need 16-bit trailer pointers .
. also,
independent of whether we stay obj'-oriented,
we definitely want to support a fixed obj' size;
because we want to avoid "obj's as pointers":
oop assumes all obj's immediate storage must be
a system pointer into the global heap,
but we see most objects as having a usual size,
with only an occassional need for heap space,
and thus they should have in their immediate storage
the value of the object, not the pointer to the obj;
and then optionally there is
a variant of that value record that is providing
a trailer pointer for extending the value .
]
... hence the need to revise my approach .
. when the trailer pointer is a short pointer
then it can't be shared across act'recs
because it's short from being relative to the home act'rec
so if it's shared by parameter with
another act'rec,
it only works if the trailer pointer is
a system pointer into the global heap,
or if we never share trailers with param's
(instead param's are alway full copies),
or if, given a subprogram wants a param to
reference a shared obj,
then the caller needs to pass an adda pointer
that includes the obj's full path: 11.23:
the home act'rec, and the obj's offset
into the act'rec's local tuple .]

[11.9: revised model for component's trailer pointer:
. the obsolete model had this view:
each component or field of a tuple
has its own trailer pointer
but the trailer pointer is an index into a subheap;
and, that subheap could belong to the act'rec,
or it could belong to the enclosing aggregate .
. since that is ambiguous,
we could instead have a system where
each aggregate is local to some act'rec;
and, each of the components of that ag (aggregate)
considers their home to be
the same act'rec that their aggregate belongs to . [11.12:
. so then the trailer pointer of each ag'component
is relative only to an act'rec,
not some base pointer owned by an enclosing ag' .
. on the other hand,
if we did want ag's to have their own subheaps,
then we could redesign adda pointers
to keep track of that context for us?
well, ... ]
. when an adda pointer copies an address,
it has to consider 2 things:
does the enclosing ag' have its own subheap
for isolating the trailers of its components?
if so, then the trailer pointers of the components
might be smaller than 16bits, so what is the size ? ]
[11.11: oops!:
. to stay oo-ish (obj'-oriented),
an ag' cannot have control over the format of
the trailer pointers of its components;
because everything about the object format
belongs to the obj's type mgt .
. it would be a lot simpler to give up on
having subheaps be specific to an ag',
and instead give type mgr's 2 options:
an obj's trailer pointer can be
either a system.pointer to a personal subheap,
or it can be an index into the owning act'rec .][11.12:
. that way our pointers stay simple:
the home act'rec of an obj' is easy to keep track of: [11.23:
it is the root of all obj' creation,
as the scheduler has a list of processes,
each with a stack containing a list of act'recs,
each with a tuple housing the run-time's obj's .]
. if you, as type mgt, have designed your trailer pointer
to be relative to the home act'rec,
or relative to the home act'rec's type-specific subheap,
then this pointer gives you the context you need;
else if your trailer is referenced by a system.pointer, 11.23:
then the obj's home act'rec is of use only for meta info,
such as knowing if there is a declared subtype constraint .]


adda/mem'mgt/aggregate subheaps have to grow in 2 ways:

13: 28: 11.12:
. an ag' (aggregate) may be a tuple, an array,
or a node graph such as a tree;
each component of an ag' can be extended
by growing into its own trailer;
ie, extendable components are variant types;
and, one of those variants can include a trailer pointer .
. some arrays and the tree can have
a growable number of components;
and the ag does this by having its own trailer,
in addition to the trailers of its components;
so, it can be growing in 2 dimensions:
# the number of components,
# the size of each component .
. and then recursively,
each component of an ag' could be an ag' .

28: one subheap per obj:
. whenever an ag' contains another ag' type,
they don't all own their own subheaps:
there is one subheap per object, [11.9:
but each elemental component is an object? . 11.10:
. the overriding principle is that
what an obj's trailer pointer is relative to
is an impl'detail that should be the sole decision
of type mgt, not the enclosing ag's .]

adda/mem'mgt/subheap/reducing pointer size:
16:
. giving each type mgt its own heap space
can help us reduce pointer size;
eg, if each type used less than 2^16 objects,
then pointers could be 2-bytes instead of 4 .
room for tags:
. even slightly smaller pointers are a big help
in making room for subtype.tags .

[11.10: mis:
. the rest of this section forgot to stay oo'd;
it is letting ag's determine the
trailer pointer size of its fields,
when those fields belong to other type mgr's .
. but if you consider hierarchical subheaps to be
what a typical type mgt would be planning
for the internal organization of its own obj
rather than what an act'rec is concerned with,
then this section still has value as a mem'mgt note
relevant to reducing pointer size .
. if we want more control over type mgt,
such as enforcing safe pointers,
then we can bake that into the compiler
through the way it interprets and optimizes
the type mgr's instructions .]

30: moreover: 
. if instead of subheaps being type-specific
we can have them be aggregate-specific,
that would reduce pointer size even further,
and would also facilitate assignment operations
by making it easy to dealloc' the old object,
and less prone to leakage .
[11.9:
. for ag's with a potentially great number of components,
there should be a pointer-shortening scheme;
for instance,
we could agree that, for any large or growable ag,
not only does each component have its own trailer,
but also the ag has its own subheap,
and each trailer of each field in that ag
is local to the subheap of that ag .]
. therefore,
at least in the case of large or growable ag's,
when we provide a pointer to a component
it needs to also include
either a pointer to the parent ag'
or a copy of the parent ag's subheap pointer .]

[11.9: 11.23: moreover/subheap trees:
. a trailer can be a short.pointer,
indexing the home act'rec's subheap,
where we find a system pointer to a block of heap space .
. these heap blocks have a variety of types:
# type leaf:
. a trailer pointer for an elemental datatype
will point to a heap block of type leaf
(meaning the block has no system pointers
in need of deallocation before the leaf is dealloc'd)
# type non-leaf:
. the trailer pointer of a large or growable ag
will point to a block of type non-leaf,
indicating that it is another nested subheap:
an array of system pointers
each pointing to more system mem blocks,
each of which is either a leaf or a subheap .
# .


adda/mem'mgt/subheap/basic types:
30:
. in one idea of "subheap",
a type mgt is given an array of system pointers;
and type mgt can manage memory any way they want
just as long as every system mem'allocation
is attached to the given subheap,
so that in case it messes up,
we have a way of analyzing its mem usage,
and the whole thing is easy to recycle .
. in the following idea of "subheap"
the subheap is again an array of system pointers,
but there is a recursion such that
we can have subheaps nested within subheaps;
and, the run-time deletion process involves
recursive subheap deletions .
16: 30: 11.24:
. a type mgt's subheap has 3 types of mem'alloc's:
# an extendable array of nodes:
( to reduce pointer size, there are several choices
on the limit of node pool size:
256, 2^16, 2^24)
# an extendable byte array:
(implemented as an array of
system pointers to same-size blocks);
. this could implement a packed-byte subheap
esp'ly suitable for a type-specific subheap,
where it sets aside a range of bytes
for each new obj of the given type;
and, pointers to the given type
are an index into the enclosing extendable byte array .
. likewise there could be a twin subheap
(such as an extendable system.pointer array),
for holding the trailers of these new obj's .
30: 11.24:
# an extendable system.pointer array:
. instead of creating a packed-byte subheap,
this describes a subheap as a set of machine addresses
pointing to mem'allocs from the global heap;
the indices into this array can be short handles,
ie, short pointers to pointers impl'd as
16-bit array indices pointing to machine pointers .
. with handles we can change the target object location
without having to update every pointer sharing that obj' .

21: 11.24: mis.adda/mem'mgt/subheap/there are 3 sorts of subheaps:
. the 3 types of subheaps are
tuple, array, and nodular ?
anyway, the nodular description sounds right:
. a pointer to type t
is a pool of nodes of type t;
there can be several pointer types,
and a pool for each one,
all owned by the same object's subheap .
-- eg, a tree might have branch vs leaf pointers .



13: adda/mem'mgt/decomposition:

intro:
. type mgt makes life easy for app coders
by defining the creation, deletion, assignment,
and any other operation an object needs,
along with a description of value literals .
. the creation is done by an init procedure
that inputs an etree for use in defining
the value of the object it is initializing .
[ adda/tree.type/literals:
16: 29:
. literals include etrees (expression trees);
when text is assigned to a tree pointer,
that builds a syntax tree with the compiler's parser .
. the initializer method can accept etrees,
and then it can do its own conversion
from etree to data structure .
]
. the type's specification needs to define
the size of the obj's fixed part,
and whether the object has an extendable part;
ie, whether it has a variant format,
where one variant includes a trailer pointer .
. for instance, a type that needs
4bytes that are extendable,
could define that with this:
size: 4 ...;
and if no extendability was needed, write:
size: 4; .
. extendability requires at least 4 bytes;
and "4..." is the default size .
. a typemark can be given to indicate size;
eg,
size: int32;  -- that says:
the size of this type is the same as the size of int32;
-- that doesn't confine the implementation to using int32;
it simply facilitates named sizing .

. in the body of the type's definition,
where the implementation details are given,
the first thing to do is
describe the types used in the data structure that is
implementing the type being defined;
so, besides the fixed part's implementation
and possibly the extendable part's structure,
there may be some other type declarations;
eg,
trilogic.type: (true, false, unknown);
type: (x.int30, y.trilogic);
-- the "type" label identifies the type mgt's
implementation structure .

. here is one with an implicit trailer:
(one variant's component has a possible trailer)

node.type: (variant.bit?
    # 0:( l,r: /.node; )
    # 1: (leaf.string4)
)-node;
type: /.node .

[11.9: 11.24: pointer to nodes include pointers to subheaps:
. when a growable aggregate is a graph or tree type
(growing by adding nodes to the graph),
there needs to be a root type
that includes not only a pointer to node,
but also a subheap pointer,
showing where all the nodes are located;
this allows the pointers in the tree nodes
to all be short pointers .
. a pointer that is being used as a cursor
into parts of an arbitrary tree,
needs to be structured like a root,
with not only a pointer to a node,
but also a subheap pointer .]

13: 28:
. to simplify implementation of overwrites
a type mgr can erase the entire trailer,
and reallocated a new one;
but type coders are free to reuse their trailer space .
. there are 2 dynamic allocation models:
monolithic (for arrays),
and nodular (for network graphs)
where the pointers are indices
into an extendable array of nodes:
if it writes beyond the bounds of either array,
the mem'mgt can automatically extend the memory .



adda/mem'mgt/subheap/per-object not per-act'rec:

[11.9: intro:
. in our model of concurrency and modularity,
all symbols are local to some act'rec
(an activation record of a process,
either a subprogram or a daemon)
and each act'rec has as its first component
a subheap which is a system.pointer
(eg, something returned by C's malloc call)
pointing to an extendable array of system.pointers .
. any potentially large aggregate (a tree,
any other graph, or any extendable array)
can also have its own subheap,
rather than crowding the act'rec's namespace .
. it's the same principle as in
a hierarchical folder system:
instead of putting all of an ag's trailers
in the act'rec's folder
we let the ag' put a subfolder in the act'rec's folder,
and then that ag's components can have their own namespace,
thereby avoiding collisions with other ag's filenames .
11.24: mis: not oo'd:
. the object-oriented thing to do
is let type mgt decide the details of its obj's;
that means an ag' can't be dictating to its fields
what their trailer pointer is an index into:
. if the type mgt wants the obj's trailer
to not be in the home act'rec,
then it will put the trailer in the global heap,
or in the home act'rec's type-specific subheap .]

13: 21:
. I had thought each act'rec needs its own subheap;
but the act'rec is not actually a source of
dynamically allocated memory:
it allocates a fixed number of objects,
and these objects are then in control of
their own variations in size .
[11.24: but this misses the point:
. the home act'rec's subheap was there to provide type mgt
with the option of impl'ing short pointers
for the trailers of its obj's local to that act'rec .
]
. we need per-object accountability of mem'allocations;
because, objects may need to be moved between act'rec's;
and, the object may need to erase itself
in order to implement an assignment operation .
. that means
[11.24: stay oo'd:
. let the obj's type mgt worry about the details of
object deletion, creation, and copy .
. the type mgt may choose to
give each obj's trailer its own subheap,
but it could also put the obj's trailer
in the home act'rec's type-specific subheap .
]
each object needs its own subheap,
which is where we can find
all memory allocated to that object's components;
it may be just a single trailer,
or it can include several pools of nodes,
all part of a network,
rooted by the object owning the subheap .
[11.9: 11.10: summary: [obs -- see revision below]
. each obj' has its own trailer pointer,
which is an index into some subheap .
[11.24: not necessarily:
. the type's mgt decides the trailer type,
whether index or system pointer .]
. if the obj' is a component of a large ag',
then its ag may have its own subheap
which might be impl'd as either a system.pointer
or an index into the act'rec's top-level subheap;
and, in either case, a subheap is an array of
system.pointers to other mem allocs,
either byte arrays or nested subheaps .
. in the case of a potentially large ag,
it may need its own subheap,
since the act'rec's namespace is limited to 2^16 obj's .
. the ag with its own subheap has its own trailer
that is containing the trailers of its components;
ie, the trailer pointer of such a component
would be an index into the ag's subheap
rather than into the act'rec's subheap .
11.17: revision:
. ag's do not have their own subheap;
only act'rec's and elemental obj's do .
. staying obj'-oriented,
we let the type mgt or the compiler decide
whether the type's nodular subheaps
will be obj'-specific, act'rec-wide, or process-wide .
11.24: the compiler?
. for instance, when working with trees,
there is the option of using either large pointers
all addressing the global heap
where it is easy to combine trees,
or there is the option of using small pointers
all addressing a local subheap,
which saves a lot of space, but costs a lot of time
if any trees need to be combined .
revision II:
11.24: ag's do not have their own subheap?:
. the standing point is only that
while ag's can have a subheap,
they can't dictate to their fields any impl' details
like field trailers being located in the ag's subheap .
11.24: ag's can be objects:
. if tuples are not objects themselves,
what happened to the idea of tuples having bodies,
with private fields located in that body
in addition to the public fields?
. their fields can also be subprograms,
so that subprograms defined within the body
have access to the private and public fields
as if they were globals ?
. what happened to the idea that a type declaration
can define an object and a set of operations
while also specifying that the obj' can have fields?
. so the tuple is an object,
but the tuple's fields are not operated on
by that tuple's type mgt;
rather, each field has an original type mgr,
and that type's mgt has sole control over
that field's internals .]


21: adda/mem'mgt/subheap/subheaps are owned recursively:
. an ag (aggregate: array, tuple, or tree)
may have as components nested ag's,
each with their own subheap .
. this hierarchy of subheaps must be traversable,
so that when an object is to be deleted,
all of its enclosing subheaps are deleted also .
[11.9: 11.10: staying obj'-oriented:
. mem'mgt would be much simpler if we would
just stay oo'd, and rely more on type mgt:
. simply ensure that all locals are traversable,
and ask all of them to delete themselves,
and they will recursively ask their components
to do the same .
. we should give type mgr's the tools they need
to impl' a variety of pointer space sizes
for efficiently matching a target space size
-- from byte-sized addresses to web url's --
but unlike the previous idea for trailer pointers
we don't require type mgt to impl' them as
short pointers (relative to the act'rec) .
. we only need to integrate the pointer.type
so that when it's targeting a component
it knows to keep track of context .][11.17:
. the enclosing ag' was thought to be a context,
but that idea was found to be too complicated;
the only context provided is the home act'rec,
so if type mgt wants to use a short pointer
it has to be an index into
either the subheap of that act'rec,
or the type-specific subheap local to that act'rec .]


adda/mem'mgt/node pools and trailers init:

16: 29:
. a common theme in mem'mgt is having
graphs built of 2 or more sizes of nodes; [11.10:
eg, in trees, there may be 2 node types:
the internal tree nodes,
and the external leaf nodes with data .]
. so, just as a tree pointer's root
is a complete pointer that indicates
which subheap is providing the node pool,
a generalized graph root needs a list of subheaps,
one for each type of node it uses .

[11.11: 11.24: heap mgt toolkit:
. in a heap allocation of conventional oop
(vs the adda style of obj' orientation),
a pointer to an object is created with a call to
something like new TypeOfObject(initialValue),
and that gives us a pointer from the global heap;
whereas in our subheaping system,
most pointers to obj's will be impl'd as
an index into a local subheap,
so we have to know what type
the pointer is targeting,
and then use that type to select the relevant subheap;
for instance in a tree with 2 target types
(the branch nodes, and the leaf nodes),
we could have a variant tuple with a discriminant field,
indicating whether the node is
of type (/.tree, /.tree) or (/.string).]

[11.17: obj-mediated global context register:
. in the current idea of context,
type mgt using a short pointer for the trailer
should expect only the act'rec as the base;
but another module to consider for a base
is some other object's body .
. when object's have many internal components
we can think of them as the root of a file system,
so, just like the act'rec can have
a number of type-specific subheaps,
so too can the body of a complex object .
. we currently have a pointer's structure
as being (home act'rec, obj id);
so now in our tree example
when tree mgt talks to the string mgt
it needs some way to say to string mgt
that it wants the strings trailers to be
located within the tree obj's subheap .
idea#1:
. when we pass control to tree mgt,
it is then updating the global context register
to say what the current subheaps are,
then when the tree gives control to the string mgt
(in charge of tree's string components)
the string mgt looks to the global context register
to find the type-specific subheap it should use .
11.25: idea#oo:
. the oo'd way to look at this,
is to see the tree obj's body as
enclosing the tree method's body;
ie, when we call for an operation on a tree,
we create an act'rec for that operation's method,
and that act'rec's subheap should
inherit from or be composed of
the subheap of the tree obj it is operating on .
. then when the tree mgt asks the string mgt
to operate on a tree-local string,
it is passing to the string mgt
a pointer to the string needing an operation,
and that pointer's act'rec id
should be the caller's act'rec in this case,
because the string is local to this obj-method .]

[11.11: private subheap has type-specific segments:
. every datatype that declares a possible trailer,
is implementing its type within a private subheap;
ie, the subheap of some subprogram or daemon act'rec .
. this could be a really simple idea
working just like the global heap,
but within that private subheap,
we want a separate segment of memory
for each datatype being used
by the act'rec's implementation,
so that our pointers can be smaller,
and finding new space will be easier,
because we are not mixing blocks of different sizes:
each type, and thus each size,
is getting its own segment .

11.25: type-specific subheap is a subfolder for the act'rec's subheap:
. when is a type-specific subheap ever needed?
if the locale has declared a lot of some type
as happens in a large array,
then to ensure we don't run out of short pointers
each type should have its own trailer pointer space;
so, the act'rec's subheap should be
an array of pointer to type-specific subheap,
and then an obj's short.pointer to trailer
will be in the type-specific subheap's
array or associative array .]

16: 29:
. for each nodular type an implementation uses
(eg, a tree has 2 nodular types:
internal nodes and external nodes)
the run-time system is implementing a node pool
that supports both delete node and new node;
if there are no more nodes available,
the array implementing the node pool
needs to be extended, and a free node returned .

16: 29: 11.11: 11.25: similar to nodular types are trailered types:
. if, after inspecting an initializing value,
a type mgt decides an object needs a trailer,
it allocates a new trailer pointer target
from within the type-specific subheap
of the obj's home act'rec .
. if the trailer is composed of nodular types,
the type-specific subheap will need to
allocate space for each node pool .



adda/mem'mgt/trailers/sending subheap access to calls:
16: 21: 29:
. a function's inputs and outputs may include trailers,
so the function needs access to the subheap
that is containing that trailer .
[11.11: 11.26: 11.27:
. most of the time inputs are copies,
so their trailers are local to the callee,
just like the trailers of the callee's other locals;
but, recent notions of adda's mem'mgt had the idea of
sharing inputs by letting the input's trailer pointer
reference the input's original subheap instead of the callee's
(if there was an assurance of read-only ).
. but when the callee wants the shared input to be
an input to an operation,
if callee copies it to an operation's input,
then how do we indicate which in the call chain
owns the subheap that the trailer is in ?
. if it's important that inputs be sharable with the caller,
then the param needs to be a pointer type
rather than the recent idea of
just copying the object's fixed part only
without copying the associated trailer .
therefore:
. if sharing is expected to be important,
the parameter should be a pointer;
and pointers as inputs
will be assumed to have read-only targets;
whereas the possessive notation
will always assume read&write permission;
eg, [f's output]`= ( inout1 inout2 )`f( read-only ).
. the compiler has to check that any input'd pointers
are not subsequently given to the
inout's or outputs of other calls .
]
. if a trailer was known to be in the caller's subheap,
the stack could provide access to the given subheap .
16: mis: nestings:
. that caller's server could be recursive:
so, a return address is not always a local,
but could instead be owned by the caller's caller, etc,
-- how far up the stack should you go?!
29: therefore:
. a var parameter such as a return address
needs to be refered to by a complete pointer,
one that includes the object's trailer's subheap
(11.17: obtainable from the obj's home act'rec:
. a pointer has 2 parts:
the act'rec that declared the obj,
and a reference to that obj relative to the act'rec .
).
. in the case of a return address that is
targeting another call's inputs
( as in a nested call; eg, g(f(x)) )
there won't be a trailer yet
(because it depends on the return's value's size)
but the return's server or callee
would then be providing one  (11.26:
within the subheap of the act'rec referenced by
the given pointer's act'rec field).
[11.17: conclusion:
. the out.mode parameters such as the return
are adda pointers (act'rec, obj),
and then things are simplified by oop:
regardless of where the act'rec is,
the type mgt controls subheap location,
(either in the given act'rec, or private to the obj).]


adda/mem'mgt/trailers/trailer pointers in nested aggregates:
29: 11.17:
. the point of having subheaps is knowing that
when an object or aggregate is deleted
-- such as a subprogram's act'rec --
all related heap objects are recycled,
thereby ensuring incremental garbage collection .
. all object creations start with
the activation of a subprogram or server;
each of these creates a tuple of locals,
-- the act'rec (activation record) --
which in turn are implemented with
aggregates of other objects .
[21:
. the act'rec is the root of a tree of components;
every object can have component objects
each is separately deletable
and the tree must be recursively deleted .]
. recursively, an act'rec or type mgt
knows its list of local components
and can ask each to delete itself .
. finally,
types whose components are based on system mem'allocations
instead of language- or user-defined datatypes,
will arrange their system pointers within
an object-specific subheap,
which is an extendable array of
system pointers (pointing into C's heap).
. this pointer system increases access time;
but, it can potentially reduce pointer size;
and, [21: 29: it avoids pointer cycles:
for instance,
if the system allocation is an array
that implements a pool of graph nodes,
all the graph pointers can be implemented as
a 16-bit array index;
and, instead of detecting pointer cycles,
we simply delete the entire pool of nodes .]
13: [obs]
. when a tuple type has more than one field that is
potentially requiring a trailer,
it may want to have all these fields
sharing the same trailer pointer,
in order save space and preserve locality .
[11.27: obs because:
. that is a non-oo'd design;
an obj's trailer pointer format
should be the responsibility of type mgt .]
13: 29: [obs]
. the compiler can automatically program the
act'rec's deletion routine,
by having all system allocations be tied into
the subheap of the root act'rec;
that is,
any time a type mgt asks for memory,
there is always an implicit act'rec
that is the root of the data structure
that is containing that object .
[11.17: obs because:
. there is always an implicit context,
which could be either the obj's act'rec,
or the obj's private body,
depending on how the type mgt of an obj'
wanted to implement the obj's mem' needs .]
[11.17: [obs:]
. if the type wanted a private subheap,
then it also wants to change the
global context register .
. for example [if type wanted a private subheap],
a subprogram's typical end goal is to
ask some type mgt to modify one of its objects;
and, let's say that is a tree of private strings,
so then the tree will want to call the string mgt
and will change the global context register
to the subheap of the given tree,
and then string mgt will look for its
string-specific subheap
within the current context (a tuple of subheaps).
. this could get indefinitely recursive,
-- sub, sub, sub, tree mgt, string mgt ... --
so that contexts are placed on a stack,
and then the global context register is a
pointer to the top of that stack .
. or we could simply use the call stack,
as it contains pointers to act'recs,
which include the relevant subheaps .
. even if the type mgt's are servers
(like ada`tasks instead of ada`subprograms),
they will still want put some context on the stack,
just like a subprogram would do,
in order to provide subordinate type mgt's
with the context of the object they are servicing .
. if the call to type mgt is asynchronous
then the same thing happens
but on a different stack,
the one belonging to that service's process .]
[11.27: obs because:
. there is no need for a global context register:
if the tree type wants to localize a string heap,
it locally declares strings in the obj's body,
and that obj's body provides the subheap
for the type mgt's operations on that obj .]
13: 29:
. whenever type mgr's ask for memory,
they always have it tied to the act'rec's subheap, [11.17:
or the obj's private subheap; ]
ie, what they are allowed to use
is the index into the subheap
where the needed memory has been located for them .
... on the other hand,
that would be too many pointers in one subheap
(requiring the trailer pointers to be a larger size,
eg, 24bits instead of 16bits )
so, each object should have its own subheap,
and what gets tied to the root act'rec's subheap,
is a list of all the contained subheaps
[11.27: obs because:
. the act'rec subheap includes a set of type-specific subheaps,
such that most of the act'rec's address space
is used by the few types that choose to
not implement a type-specific subheap
(perhaps because it is known that
any given act'rec will declare only one or a few
of that type's obj's .]
... but wouldn't it be just as safe,
given that our compiler controls the type mgrs,
to recursively ask components to delete their self? .
[11.17: amen:
. it is left up to type mgt where the subheap is tied;
ie, objects themselves can have system`pointers
that link directly to c`malloc returns
(thereby forming an object-specific subheap)
or they can have short pointers
that are an index into the home act'rec's subheap .]


21: mis.adda/mem'mgt/trailers/finding recursive components:
[10.2:
. in the evolving view of subheaps,
there are twin systems:
. you see an act'rec has components (locals)
which in turn can be aggregates
that have their own components,
and so they are forming a tree;
but, their components can have trailer pointers
that are indices into the act'rec subheap
that is an array of c`malloc objects
that form a separate tree
with parts of the tree's immediate parts
pointing into parts of subheap .
. and when it comes time to delete,
the system reads system mem'alloc subtype.tags
to know which components contain nested subheaps .
. the preceding obviates the following:
]
. to find recursive components,
go through each node in subheap's array
 looking for leaf of trees,
and see if the leaf obj has subheap to dispose of .
. generally you have to check each type that is a pointer to tuple,
and check if any variant of the tuple
has a component that is known to be extendable,
and if so ask the extension if it has 1 or more subheaps,
or one or more components that could be extendable .
[11.17: all of the above is obsolete:
. the mem'mgt design now is more obj'-oriented,
with deletions happening recursively by
asking each aggregate to delete itself,
and then recursively the ag' asks
each of its components to do the same .]


mis.adda/mem'mgt/we don't have to trust type mgt:
13: 21:
. I had assumed we have to trust type mgt;
but the whole point of adda compiling everything
is that we can then police all code .
. it may be complicated trying to
police how type mgr's interact with malloc,
but it's likely possible .
16: rationale for double pointers:
. regardless of trust,
we need some automatic way of per-act'rec disposal,
and either it takes double pointers,
or [29: a more-confining safe pointer system].
. there could be messy cycles
but we escape that with double pointers .
[29?:
. we need per-obj heap spaces
in order to know which obj' is getting
more space than expected ?
that is going to be too expensive
unless it is trivial to combine
the personal heaps of 2 combined obj's .]
[11.17: process-wide type-specific subheap:
. merges happen when 1 or more copies
are assigned to be added to particular obj;
eg, say we have x tree merged into y;
y`=(x) -- where y and x are both pointers into subtree;
x is filled by handing its address to some init;
that address includes the subheap of
the "y`=(x)" call,
whereas the subheap of y could be anywhere
given recursive subprogram calls relegating an out.mode parameter .
.  if a type mgt wants to be time-efficient
it needs to be allowed to use
a process-wide type-specific subheap
so that the nodes of all local trees
are in the same node pool;
however, what about the client who may prefer
a more space-efficient design? [11.27:
. the way that auto'occurs is when
the trees being combined are declared in the same scope;
eg, most merges might be done by sequential subroutines,
but the trees those subroutines work on,
are all declared in the main program calling the subroutines .]
11.17: the compiler analysis can help us here:
eg, the tree comes in packed and unpacked subtypes,
and the compiler decides which subtype to use
given the client's memory constraints;
eg, for the tree that is used for
parsing a file and then analyzining it
without combining it with other trees,
the compiler would select a packed subtype .]


adda/mem'mgt/isn't etree-building a mass of recopying?:
[11.27: intro:
. the traditional style of oop has it that
every obj is a pointer to its body;
and that body is allocated from
the main program's global heap;
this makes it simple to build trees,
since they are pointer-based data structures .
. in the adda-style of oop,
we want obj's to use pointers less;
and we want our tree nodes to be alloc'd from
the type-specific subheap of some subprogram .
. when I wrote the following,
I was assuming the obsolescent notion that
each tree should have its own node pool,
and then I was seeing that this could be very expensive;
because for each tree-building operation
we are merging node pools .
. adda does allow pointers,
and tree building can be done with pointers,
which involves less copying than in
the situation I worried about in the following .
. adda also allows nodes to be program-wide
from a type-specific subheap,
rather than an obj-specific subheap .]
14: mis:
. if building etrees functionally,
ie, using a tree of function calls
where each call adds one node to the etree,
it seems like everything being output
has to be copied yet again to an input .
. is there any safe way to
short-circuit the modularity,
so that, within the tree of etree-building calls,
instead of copying from outputs to the next inputs
we could have nodes on the outside of the call tree,
so that only pointers are copied ?
. even if that weren't practical or safe-simple,
we still avoid a bit of copying because
our return object actually is the input object
that it seeks to put the return value into,
so the return is not involving a copying assignment .
25:
. in the case of building trees of nodes,
our efficient return assignment
is only saving us a trivial pointer copy;
the problem is that we would like each tree
to own its pool of nodes,
but to allow a scope to build trees
by combining several tree variables,
we need some efficient way of merging the
node pools of separate tree variables .
16: 28: mis:
. a subprogram scope may own several trees;
and, instead of each tree owning its own nodes,
they could all share the same nodes,
but only if they were owned by the same act'rec:
each act'rec has its own type-specific subheap .
. the subprogram that called the tree routines
provided the node space for the obj's
that are flowing through the
ins and outs of the tree routines;
and, only pointers are copied .
29: subtree grafting:
. as was discussed in adda/mem mgt/subheap
/per-object not per-act'rec,
each tree needs to have its own set of nodes
[11.17: ... but that was revised .
]
. we can save time by grafting trees:
the tree's subheap has not just one pool of nodes,
but has a pool for each grafted subtree;
thus we have as one of our leaf types
a reference to another node pool,
a sort of sub-subheap .


adda/mem'mgt/subheaps/merging subheaps:
13: mis:
. if 2 ag's are merged,
can we efficiently merge their subheaps?
but that gets back to needing to be act'rec-centric;
because,
the typical ag is a tree built recursively;
so, when subheaps are obj-specific,
we are merging subheaps for every node added!
21:
. there are some tricks for merging
if the system gives type-mgr's the right tools,
but we can simplify life by thinking functionally,
and not worrying about the cost of copying and deleting .
29: tree-building styles:
. when building a tree one node at a time,
it is best to use system trees,
where each node is a system allocation
( usually from calling C's malloc,
so then all the tree pointers are machine addresses,[11.27:
but being process-wide is the key advantage;
so, in adda, our unpacked tree would be in a
process-wide type-specific subheap,
that might use 24-bit pointers
whereas packed trees would be 8-bit or 16-bit pointers,
depending on the fixed tree size,
and the subheap could local to any scope,
not just global or process-wide .]
)
then when working with large trees,
system trees can be converted into packed trees,
where we implement our own node pools with an array,
and the tree pointers are indices into an array . [11.27:
--
. the key to efficiency not whether the pointers are
array indices or machine addresses;
rather, it has to do with the subheap having a global scope .
]
. when large strings and trees need to be merged
we can use a grafting system
where a string is described as a list of subtrings,
and a tree has some leaves that are
pointers to other internal trees,
ie, using a different node pool,
but considered to be part of the same object .


22: mis.adda/mem'mgt/subheap/the calling module's subheap register:
[10.2: correction:
. subheaps can be more than a trailer pointer array;
they can be nested, like so:
the basic subheap is an array of pointers to
a struct that starts with a variant.tag
which ultimately indicates whether the item is
a leaf mem'alloc,
vs an array of pointers to other mem'allocs .
[11.27:
the basic structure of an act'rec subheap
is to be a set of type-specific subheaps,
along with a trailer pointer array,
for those types that don't have their own subheap .
]
. the act'rec's locals may be nested aggregates
but the fields of those aggregates
are all visible to that act'rec;
it was briefly assumed the trailers of those fields
belong to the top level subheap owned by the act'rec;
but, it should be up to the object's type mgt;
every object pointer comes with a reference to
the subheap of the act'rec that owns the object,
but it's up to object's type mgt to decide
how to make use of the subheap,
whether to use just one subheap entry
and then start nesting within that,
or whether to use several subheap entries .
[11.27:
. that assumes an obsolete system
where the act'rec's subheap had to hold by being a tree,
all the mem'allocs of any of the objects it declared;
the original idea was that
even if type mgt messed up the dealloc,
it would be impl'd by an the owning act'rec's mem'alloc's,
so an act'rec's dealloc would automatically
dealloc all owned obj's including their dynamic alloc's;
but now we are depending type mgt to dealloc correctly .
]
. the point for having subheaps (localized heaps),
is to be saving mem' by abbreviating the trailer pointer size;
but such abbreviation is not required:
a reference passed to a parameter
will include only the owning act'rec's subheap .
. the current model is or has been comfused with
each data structure having its own subheap;
but, the owning unit should be the module;
ie, each act'rec and each type mgt needs its own subheap,
but each aggregate does not .
[10.2: correction:
. it's up to the obj's type mgt; 11.27:
ie, an aggregate can have its own subheap,
just as any other type of obj' can;
but the point that doesn't need correction is that
an ag' cannot dictate to its public components
that they keep their trailers within that ag's subheap .
. we can only keep track of so much context
within a pointer of fixed size,
so we keep track of the owning act'rec,
but not every ag' the obj is nested within .
]
. the purpose of a subheap
is to halve the trailer's pointer size:
we assume each module will have
fewer than 2^16 or 2^24 objects,
so we get an extra byte or 2,
which we can use for variant flags and subtype.tags .
. at any given time in a thread,
one particular module is in control,
and all the objects under its control
will be local to it,
and thus their trailers will be located
in the current module's subheap .
[10.2: correction:
. any subprogram has inout parameters
(a function's return object is actually
a sort of inout parameter)
and, the inout parameter is a pointer to
an external (ie, non-local) object .
. all inout and rom parameters
are paired with a pointer to their owner's subheap .
]
. so, we can put that module's address in a global register,
and when that module asks a type mgr
to operate on a local obj',
type mgt can know from that register
which subheap that obj's trailer will be in, [11.27:
if not local to the obj' itself .].

29: lib.adda/mem'mgt/memorymanagement.org .


30: todo.adda/mem'mgt/act'rec-specific subheaps are also type-specific: [done]
. there are places in adda/mem'mgt
where I recently assumed that the act'rec has the subheap,
but what that subheap is,
is a list of pointers to type-specific subheaps,
so that if there is a mistake in
one type's use of the subheap,
it doesn't affect the subheap of other types .
[11.27:
. but we also allowed for exceptions:
the act'rec has a generic subheap for
any type mgt that doesn't implement a
type-specific subheap .
. the compiler automatically implements this,
and it may determine after seeing a type in use
that it is so little used that it is not worthwhile
to provide a type-specific subheap .]

30: adda/mem'mgt/subheap/per-type is not complete:
. having only per-[type mgt] subheaps
is not a complete theory
because some objects can be the Any.type
in order to support python-style generics .


30: adda/mem'mgt/trees can use 256-node pools:
. trees can use very small pointers
because larger trees can be composed of smaller trees
(that is called here, tree grafting);
but, networks of nodes with loops have multiple roots;
and, what makes tree grafting possible
(or intuitive and easy to correctly implement)
is that a subtree has just one root,
and that root can provide access to the node pool
which all the tree's pointers are relative to .


30: adda/mem'mgt/subheap/per-act'rec is useful:
[mis:]
. by having an act'rec's subheap be
an array of machine pointers to heap allocations
(one pointer for each obj'trailer),
an object's trailer pointers can be reduced .
[11.27:
trailer pointers as array indices to save mem:
. it doesn't save any mem to have a subheap that is
an array of machine addresses, one for each trailer;
what does save mem is having a subheap that
uses the same array of machine addresses
in order to implement an extendable byte array,
and then giving each trailer
a section of byte array on a word boundary .]

[ 11.22: todo: trailer of a return object: [done]
. in 9.x/adda/mem'mgt/top-down stack layout (this article)
the trailer of a return object
should be located in the subheap of the caller
not the callee's local subheap .
. make sure that is already understood .
details:
. the return obj is always a pointer
locating the callee's destination,
such a pointer always has both the id of the object,
and the act'rec that it is local to .
(although the type`mgt decides
whether the given object needs that context,
or else whether the obj's trailer pointer
is locating a private subheap .]



1 comment:

  1. 5: mis.adda/mem'mgt/type-specific subheap has definite scope:
    . it was assumed that the type mgt could
    choose the scope of the type-specific subheap:
    either local to an object's declaration, or globally;
    but that would defeat the principle of
    per-act'rec mem'allocations .
    6:
    -- I did specify [in the above]
    [@] http://amerdreamdocs.blogspot.com/2013/11/memmgt-for-top-down-stack-layout.html
    that type mgt would be given a choice
    of which scope to keep an obj's trailer;
    I recently wondered about the advantages of that,
    when localizing heaps seemed like better mgt .
    . we might consider partial exceptions,
    like modular subprograms need their own heap,
    but nested subprograms always use
    the subheap of the subprogram they are nested within .
    . that seems intuitive considering
    a nested's act'rec is merely an extension of
    the act'rec of the subprogram that contains that nested sub' .

    ReplyDelete