2009-12-27

mem'mgt

7.6: adda/implementing act'rec`subheaps:

. reviewing this:
"(
. consider rules for mem allignment and unions .
. malloc gives a block alligned along widest type .
. how does heap mgt create a mem space that type mgt's can use ?
. should type-mgt use sets of arrays as records
and use separate arrays for each function call?
. instead let mem mgt --not type mgt --
worry about how rec's are impl'd .
. does this mean having all types inherit from supertype mem?
)
response:

. one of the first test programs to try in c,
is to impl' your own malloc
from a file.block you get from c's malloc .

. provide a foundation within the block's sub.heap
for a version of malloc that is type-specific:

. the block`subheap is extensible by being build of file.blocks,
so then a pointer is 2 parts: (the block, offset from start of block) .
. the type-specific subheaps can be extensible this way too .

. being ready for all these various sub.heaps
need not entail much overhead since
c`unions could hold pointers of every type,
and then you build a tree dynamically as needed
the same way a file system is built from just one root pointer .
. the root of a file system is merely a folder.ptr (4bytes?)
. a folder.type is an extensible array or list of
union{file.ptr, folder.ptr} .
. then file.ptr is a list of blocks .


8.21: adda/auto.space recycled efficiently:
. one way to control the location of an obj'
is to declare the obj in one scope,
then use a sub to make assignment,
so if it involves extensible mem,
it will be based in the subheap local to the obj's scope,
( if it were based in a sub,
then any passing of it would require a copy;
whereas, can share when at a root scope .
) .

8.26: adda/mem'mgt/type.tag`efficiency:
. it might be best to not worry about
how much memory is used by the live data;
data not currently in use can be packed,
so that only live data is unpacked along proper size-dependent boundaries .
. another mem'saver is that some machines
aren't concerned about allignment .
. yet another way may be how structs are managed:
the only reason allignment becomes an issue
is when the struct`component needs to be operated on in-place,
without extracting it from the struct;
so, adda could be translating packed adda struct's
into unpacked virtual registers .
. adda creates these packed struct's by declaring an array of byte,
and then moving other sizes of var's as being array of byte .

8.31: adda/mem'mgt:
. size of type has 2 parts:
the act'rec and the heap (rootsize, bodysize) .
. whether or not it grows dynamically,
this gives typemgt options as to where to get mem from:
if client-side, then problem is trivial;
but if init is keeping some of obj in server space
then runtime has to ask type`mgt for help with closing:
calling type`close(obj) .
. given an obj', type`mgt has to find links to server-side alloc's,
and then use those links to recycle the link's target obj' .
. a type`mgt is more than an ada package,
in that it needs a chance to start a process
rather than just run some mgt`init routine .
. the type`mgt`body has locals that can be accessed by
the type's member`operations .
. it needs the option of being a routine
(eg, for occassions when it doesn't do it's own mem'mgt)
but if it decides to be process,
it needs to define some way for the run-time to tell it
when its clients are done, and should terminate .
(eg, for giving it a chance to
free the mem being used by it's own server-side mem'mgt) .
. if it intends on being a process,
it would use process control stmt's,
which would include responding to
the pre-defined entry"(sys``terminate) .

8.31: adda/mem'mgt/efficiency concerns:
. how is interaction between {sys, type`mgt},
when type`mgt needs client space?
. on the client`side what are the op's that change size?
init's and other out-mode op's
that pass mgt a ptr to local space .
. an obj' is expanded either by
being an array given extended length
or being a node-struct'd subheap given a new node .
. the mgt is writing in a subheap within the client heap
and all ptrs are safely relative to that subheap
except for external tagged ptrs (url's, external obj's, ...);
ie, the tag indicates what subheap or volume is meant .
. when would type`mgt want to base an obj' in a type`mgt heapspace?
. if the obj' is partially in {client`space, type`mgt`space}
then when client exits,
runtime has to chase down entire subheap for things to recycle
instead of one clean flush .
. there could be obj's from diffn't act'rec's (local vs main);
so, part of tag must be
{actrec
, ptr to obj's subheap within act'rec's subheap,
} .

9.7: adda/mem'mgt/limited subheaping:
. does there need to be a limit on sub nesting as in c ?
that does increase the possible number of heaps;
how about heap level peaks at 3 -- main, sub, sub^2 ?
. after that, heaps are shared;
ie, sub^{3..n} all use the same heap:
all these deep scopes have their locals tagged with scope#3;
so that, their obj's don't get flushed until scope#3 exits .

9.7: adda/mem'mgt/subheaping for efficiency:
. not only do scopes get heaps,
large structs like arrays get their own heap,
so that a mov can be done by relinking a ptr from one scope`heap to another .
. how are heaps arranged such that
it's easy for assignment to recycle over-writes for the new val?
[9.10:
. in traditional oop, the object system is bolted onto c,
so when you assign one obj' to another,
you are overwriting a pointer .
. in a real oop system, the type'mgt is in control of the assignment operator .
. it's using that assignment's destination.ptr
to access the ptr's target;
. it copies the source`body to the destination`body .
. how does it extend the destination's body
when the source`body is larger?
. also, since the source is a copy
wouldn't it be more efficient to just copy the source pointer
and recycle the destination ?
that will be more efficient only if
the obj is large eno' to warrant it's own subheap .
]

9.7: adda/mem'mgt/heap.tags:
. does an obj' need a heap.tag?
or can it be like the type.cluster tag,
where due to compiler or runtime typechecking,
the tag can be implicit because the supertype is invariant ?
. in these implicit cases,
the function that tells an obj's supertype is impl'd virtually:
. if the code calls that function,
the compiler replaces that call with a
ptr to the supertype`name in the code's symbol table .
. besides a symbol having a supertype,
it also has a scope depth .
...
. another strategy is to start by including the heap tag,
and then during maturity of the compiler design,
you can see when it can be safely optimized away .
...

adda/mem'mgt/subheap not per act'recs:

. the mem design first states its goals:
make it easy to grow and delete obj's dynamically
by making their allocation not part of a scope's alloc,
so it is heaps that are freed, not obj's .

. notice in traditional stack,
todo: maybe that's discussed elsewhere ?
anyway, there can be some large sys-wide ptr's
for doing things by heap;

. maybe instead of a fixed number of scope levels,
compiler could decide amt of heap sharing
by having it depend on how mem-intensive scopes were .
. having too many heaps is inefficient
when act'rec's are small and fixed-size .
. the stack itself is on an extensible heap
which makes it have pages
where the first pages can be disked when mem gets low .

9.9: sci.adda/mem'mgt/seg'd stack:
. how to make stack segmented?
use arrays to make rec's ?
9.12:
. but that idea only works when all the records are the same ...
or did I mean:
. use array of pointer to arrays;
makes it all look contiguous by dividing addresses into
(segment, offset) .

9.9: adda/mem'mgt/stack with trailer and tagged pointers:
. the pages for stack size should be large
while pages for act'rec's should be small
since there could be many of them due to recursion .
. fixed-sized act'rec's should go on the stack
(this includes also those in which
size varies over calls but are fixed-sized at time of call) .
. with most types that are oop'y
(where a member of that typeclass can keep private
whether or not it can grow after the call's arg is laid on the stack)
the only way to do that is to have the act'rec be rom:
ie, when a sub' modifies its arg, it's really declaring a local;
[9.9:
conversely, if the formal param's are known to be fixed-sized,
then oop'y actual params can be converted to the efficient type requested .
9.12:
. I was thinking of simple cases like int range fixed,
where you could assume the type'mgt would alloc' as much needed
for any value in that type;
however, do you really want to complicate things
by requiring the type'mgt to communicate privates to the mem'mgt ?
. better to fall back on idea of using tagged pointers
such that you can have stack-based obj's able to spill into the trailer .
]
or oop'y types must have 2modes:
. the record's fields can all have sizes that vary across calls
because they are dope vectors that point to a place further down the stack .
. at the end of the act'rec is a trailer pointer
which points to a dynamically growable subheap .
. you can tell when the dope vectors are pointing into trailer space
-- instead of into stack space where the initial value was put --
because the dope pointer is negative instead of positive,
(you would reverse the negation
before using it as an offset into trailer space)
. if extending obj'size by node additions
then the polar-ptr idea will let node-base structures
flow seamlessly from the stack to the trailer .
. if array is typed as being growable dynamically
and since they are expected to be contiguous,
these arrays should always be initialized on the trailer .
[9.9:
. it might be possible to routinely have arrays in seg's;
but then for a possibly growing one,
the init on the stack has to be a complete seg,
even if it's very small .
9.12:
. if mem'mgt profiling expects an array to grow much
it will be not only on the trailer's subheap,
but in its own subheap that is a child of the trailer .
. that way, growth of the array will not require
copying the array to a new location . ]

9.10: adda/mem'mgt/unpacking for allignment:
. after reviewing intel performance needing size-alligned data,
http://stackoverflow.com/questions/1054657/memory-alignment-on-a-32-bit-intel-processor
I wondered how to use an un/pack record solution on the stack .
. the stack would keep things packed,
but then the currently displayed scopes
(the act'rec's not deep under the stack due to recursion)
would be moved to an unpacked stack .


9.10: adda/mem'mgt/sub.heaps:

. (9.9.7/ adda/mem'mgt/heap.tags) might be a confused idea;
this thread of ideas started with scopes having heaps,
and then heaps needing scope.tags .
. it's still not clear to me how subheaps work;
so, review:

. the type'mgt is getting ref's to obj's on the stack;
normally this is:
stack`scope#i + [offset where local is] .
. the traditional worry about which scope an obj' is in
occurs when passing a ptr from scope#n to scope#(n-m)
so that when the scope closes, the ptr is dangling .

. if a function returns type"t, then mem'mgt
needs to consider the return.obj's heapspace to be that of it's caller;
ie, the obj'being returned (call that self?)
is going to be addressed as
stack`scope#[caller's level] + [offset of temp'var taking the return] .

. the case where the object is huge, and gets it's own subheap?
. how are subheaps organized ?

. the purpose is to make an act'rec extensible:
it appears to be an array, but it's actually a string of arrays,
and any time mem'mgt wants to make the array longer,
it allocates another segment to the string .

. after that, it's using the array just like it was a slice of the stack .
. unlike recently conjectured,
mem'mgt doesn't have to search the subheap for pointers to more subheaps;
instead, it can keep at it's head, a list of the subheaps it contains;
so then, dealloc just goes down 2 strings instead of one .
. another thing it can do is dynamically decide
whether an obj's base.mem will be { an act'rec`trailer, subheap } .

. mem'mgt is deciding what will be a subheap,
and therefore won't have any surprises;
it's decided on during compilation or run-time profiling .
. how is type'mgt requesting additional space?
. any obj' that can vary in size will have, in addition to its type.tag,
a size.tag that is telling mem'mgt how large the obj' is .
. if the mem'mgt decided to give the obj' its own subheap,
then the size.tag will be some value that flags this is a subheap.ptr .

. when type'mgt accesses one of its obj's components,
the compiler will have translated this into
something that mem'mgt does with its subheap .
. when type'mgt extends an array by concat',
this involves mem'mgt extending the array`subheap if needed
and copying the data .
. when type'mgt extends a tree
by assigning to a subtree.ptr a subtree,
this involves a unique opportunity for mem'mgt:
todo:
. how is subtree copying work efficienty
if you weren't using the src anyway,
couldn't you just link it into the destination?
. the pointers, if they are to be reduced in size,
are relative to the src's subheap .
. if speed were needed over space savings,
then tree.ptr's should not be relative .

9.12: todo.adda/mem'mgt/subheap structure:
. the root string can use itself as space when small
then when out-grows self,
becomes string of ptr's, one of which points to old self .
. 256 pointers with in 8bit ptrs ,
then can have 3story tree with 24bits, etc,
. the only way to insur efficient malloc is to malloc huge chunks
and impl your own?
didn't plauger warn against that ?


9.13: adda/mem'mgt/stack`structure:
. if going the stackless route,
where only trusted code uses the c`stack,
it would be simpler to have 2 stack parts: (control, data)
. the stack#control has (rtn`addr.ptr, act'rec.ptr ).
. the act'rec.ptr may seem like a waste for sub's having no arg,
but the act'rec also holds the locals,
and nearly all sub's have some local state
(rarely all the domain will be externals) .

9.13: adda/mem'mgt/impl'ing subheaps and chunks:
. what plauger said applies only to hoarding:
. if you alloc only large chunks to impl your own malloc,
then the c`malloc can adjust because
even though they are out of order,
all the chunks are the same size,
so there are no large holes in the heap that can't be recycled .
. make the main subheap mgt by
allocating chunk mgt as an array of chunk.ptr;
then alloc chunks as needed .
. chunk'mgt sees a chunk as an array of subheap seg's,
and has ptr's to each or otherwise arranges self to know
what seg's of what chunks are free .


9.21: adda/mem`mgt/cost of segmenting array:
. array access is root + index;
seg aray is divide bits to (upper, lower ):
(upper selects seg
,lower is offset from seg selected
) . seg(upper)+lower .

9.21: adda/mem`mgt/modular act'rec's:
. instead of actual params,
have all params be a pointer to record
so watching stack depth doesnt depend on param .
. should include locals too? yes (top ones),
. those work the same way but separately:
auto's are translated into malloc/free pairs .
. it doesn't have to free what it returns,
any param getting a ptr from a call knows it has to free that pointer .
(again, this is what adda is doing for the programmer
translating adda into safe and possibly efficient c code )
. other ptr assignments involve
overwrites that are done by function, not c assignment;
and, this function frees previous ptr's before overwriting them .

9.21: adda/mem`mgt/operation-rich types done by stack
. instead of calling numeric functions the usual way,
use the calc rpn model .
(num`enter x; num`enter y; num`+; return x) .
. traverse left to right using tree recursion:
use circular array as stack (mod array`size,
pop just moves x.ptr to previous cell) .
. visit left;
data isa?
(value:
push
, biop:
op result replaces args on stack
, uniop:
op result replace one arg
);
cell has 2nd arg?
visit 2nd child .

9.24: adda/mem`mgt/efficient stack watch:
. when using malloc for every arg
then you should also tally what is used
so your system can tell you how much heap the prog is using
. it's just a ballpark figure as some of the actual mem
is heap mgt overhead whose size is depending on c'system
but at least having a big-O idea of the size
along with malloc's telling when mem is out
and having pre-saved mem for activating the heap-empty.handler,
will keep you in control during a mem-out
rather than being core dumped when you overflow the stack .
. alt'ly,
it's more efficient to let args and locals stay on stack
as long as your compiler determines big-O size
and includes that in the number that is tracking your stack depth .
padding:
. the c`compiler's use of padding for efficient allignment
may cause an underestimation of the act'rec's actual size .

10.9: adda/efficient pointer sizing:
. Cerf said [@] news.tech/KurzweilAI.net, Oct. 7, 2009
it was inefficient to have vari-sized internet addresses,
but the way to do it is like phones have an area code:
you first have a fixed-sized area code to match up,
and once you get matching area codes,
you set a sign bit that says
start looking for the right sub.area code (the leaf number) .

10.23: adda/mem'mgt/implicit return address imported:
. there's no worry about whether c can return structs
or can only return arrays as pointers;
because, all functions are translated into
c`functions that also input the return address
(ie, the caller supplies return space) .

11.14: adda/mem'mgt/fundamental trade-offs:
. fundamental trade-offs to the 2 ways to be modular
. if every nodular[pointer-growable] type
has its heap zoned
then easy to find a mem leak by it .
but any moves of sub.obj's must be impl'd as copies
instead of marking node as belonging to new obj .
. how much mem does it cost to obj-id every node
vs how much time does it take to copy ?
. in zoned system,
some copy can be avoided if src is const (in rom)
because there could be a node-variant that is
ptr to node of other obj .
space-savings
. part of space-savings of zoned
is that many nodular type systems have similar-sized nodes
or small size of zone makes compaction easy .
. the tall tree has many variants:
only the internal nodes will be same-size:
the leaf nodes are type-tagged ptr's or immed's .