12.22: adda/type/short pointer:
pos: kiss:
. build the obvious system first;
don't worry about having to
rebuild from scratch in order to merge an optimization;
it's important for testing that there are
simple, trustworthy substitutes .
summary:
. shorter pointers could save a lot of space;
in c, pointers may be 4-bytes long
in order to address any place on the machine .
. addresses can be byte-sized only if needing
less than 256 addresses; for instance,
an array(1..256)of word can hold 256 tree nodes;
and, each node has 2 bytes,
each of which can point to any place in that array .
. a larger tree would use word-sized pointers,
which means the nodes are 4bytes long
(2pointers * 2bytes).
. the byte- and word-sized addresses
use an array index to do the pointing;
and, are collectively called short pointers;
any use of a short pointer implies a reference to
some particular array;
eg, the root of a tree would be a descriptor
which identifies the tree as using short pointers
and then provides a link to the array being referred to .
. a tree descriptor would also include
a separate array for data;
because, arrays need to contain just one type
(unless its items all have bulky type tags);
and, while all internal tree nodes
are of type (ptr, ptr),
the leaf nodes contain data of varying sizes and types;
so, the data array might contain system pointers;
or, it could use a heap system,
where the pointer refers to several items:
the first item contains a size field
which indicates how many subsequent items
belong to this datum .
. large trees (uses word-pointers)
may contain small trees (uses byte-pointers);
any subtree of a large tree containing less than 256 nodes
could be repacked as a pointer to small tree .
. as the list processor goes into a tree,
it is always keeping the current base addresses
for both the large tree and any small tree .
. a large tree is chosen over a small tree type
not only when it will always be large,
but also when it is likely to grow .
. often it will never approach its max size;
therefore the array being used for the
internal nodes of a large tree
should have a growable size to conserve space .
. one way of being growable is to be
an array of pointers to segments:
eg, suppose the segment is 10 units;
then for item# 78,
follow the 7th pointer to a segment,
and use the 8th item on that segment .
. in practice,
the segments likely should be the size of small trees
( 256 nodes * 2pointers * 1 byte).
. while the large tree is indexing the seg'd array,
a small subtree could index a segment therein .
[12.23:
. since the short pointers of a large tree are
pointing at a seg' pointer,
seg's could vary in type, and be type-tagged,
that way the large tree can point directly at a small tree
without having to redirect to a data array .
. the large tree is an array of
( pointer to seg
, enum"seg nodes are {word, long}
)].
. both the large tree (word-pointers)
and the small tree (byte-pointers)
can have a bit per word to indicate
whether they point to tree or data .
. word-sized pointers use their sign bit for indicating
whether they point to tree or data;
byte-sized pointers exist in pairs on word-sized nodes,
and a byte is too small to sacrifice a bit,
so we'll divide the segment in half;*
if the positive side is used (a byte in 0..127)
that item is an internal node;
else (a byte in 128..255) the item is a leaf node .
*(conveniently, no matter what the shape of a tree,
the number of internal and leaf nodes is nearly the same:
#leaf = #internal + 1 )
A binary tree with N internal nodes has exactly N+1 external (leaf) nodes
Proof: by induction
Base case:
a binary tree with 0 internal nodes has exactly 1 external (leaf) node (root)
Induction step:
. assume a binary tree with m internal nodes
has exactly m+1 external (leaf) nodes
A binary tree with N internal nodes is made up of:
A root (internal node)
A left binary trees with k internal nodes
A right binary trees with (N - 1 - k) internal nodes
According to the induction assumption:
The left binary trees with k internal nodes has exactly
k+1 external nodes
The right binary trees with (N - 1 - k) internal nodes
has exactly (N - k) external nodes
So the total number of external nodes
in a binary tree with N internal nodes is:
k+1 + (N - k) = (N + 1 ) external nodes .
notes:
binary operation polymorphism:
. tree polymorphism works the same way numbers do:
if a biop has to deal with (byte, word),
it converts byte to word for the (word, word) vm;
ie, it makes a copy of the tree but with new pointer types .
. by making small arrays part of larger ones
you would be doing more of your own mem mgt
instead of using c mem mgt
which expects you to de/allocate in stack order;
use c mem mgt for arrays whenever possible;
but use small arrays for tree nodes
to keep tree-pointer size down .
. the segs of a seg'd array
can either come from malloc,
or be pointers into a reserve array
where you're doing your own mem mgt .
. the small tree needs arrays of size = word*256
-- that may be the right size for units used by seg'd arrays .
. of the pointer and its target,
which should contain needed subtype tags?
if the target contains the id,
then there is efficient aliasing;
but in some targets there would be bloat
because targets like system pointers need the whole obj,
and can't spare a few bits for type tagging,
so then an additional array of byte for tagging is needed .
2011-12-31
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment