2011-01-31

byte-sized pointers

1.18: adda/dstr/byte-sized pointers:
. keep pointers small by seeing large lists
as subtrees;
one of the node types of a tree is
"(this subtree root declares a new subheap:
it includes access to a 256-size array of tree node).
. how does that scale? it does because
root of huge subtree had its own subheap
where all terminal nodes were tagged as
(this is a sys'ptr [32bit c pointer]
-- not a pair of byte-sized tree ptrs [byte'ptr])
[1.31:
. as the tree progresses downward with fanout,
it eventually consumes 256 nodes;
any tree-building further down
must be done from a new-subheap node`type
(ie, a node whose purpose is to declare that
the following subtree is based in a new subheap).
. lets say that a tree uses
2 subheaps worth of nodes;
then if the tree is balanced,
it can be built with only 2 new-subheap nodes
by putting the left and right subtrees
each in their own subheap;
if the tree is not balanced,
then space efficiency requires that subheaps be shareable,
so that many smaller subtrees can be
impl'd on the same subheap .
. to make best use of space,
the use of new-subheap node`types
must be minimized because they are overhead .]

. when having a ptr to subtree,
it needs a record:
sys`ptr to the subheap,
byte'ptr to where in the subheap subtree is rooted .

. when building trees from text, [1.31:
or when trees can be modified? ..]
then we need speed more than mem'
whereas figuring out how to pack for byte-ptr's
would be very time-consuming;
so, then these sort of trees should be word-sized .
. both sizes are distinguished from sys'pointers
with the term "(local ptr):
whereas a system pointer is logically a machine address
(practically an index into a process module),
a local ptr is always an index into the current subheap .
1.31:
. our tree node has less than 16 variants,
so the tag can fit in 4bits (a half-byte, nibble,
-- addressing a byte to extract the {hi,lo} 4bits );
the node's data will be sizes larger than a nibble,
and the tag should be at the beginning
to tell us what the rest of the node looks like;
so, in order to accommodate mem'allignment requirements,
without wasteful padding after the tag,
the subheap will have 2 forks
so that tags and data can each be
in their own arrays .

. a subheap supporting byte-ptr's
can hold only 256 nodes,
in turn needing a 128-byte chunk for the 256 * 4bit tags .
. these nodes are word sized:
#branch: (tag, left.byte-ptr, right.byte-ptr);
#leaf: (tag, symbol.word-ptr).

. subheaps are composed of chunks;
so, if a subheap is not enirely used,
it's not entirely alloc'd either .
. both byte-ptr and word-ptr subheaps
can use the same size chunks, 128 bytes,
to match the minimum chunk needed for tags .
. therefore 256*2byte data nodes = 128*2*2 = 4 chunks .
. if less than half of nodes are used,
then it can save on 2 chunks of 128 bytes .

. finally, a byte-ptr subheap needs to be
an array of 5 sys'pointers for access to the
1 tag chunk + 4 data chunks,
plus
whatever else is needed for variant tagging,
or part of an efficiency hack .

. if a large structure has a known minimum at init' time,
then it can be given a superchunk,
which is some arbitrary multiple of contiguous chunks .

No comments:

Post a Comment