Americium Dream Documents: pointers

Showing posts with label pointers. Show all posts

2014-11-30

the value of constants

2.20: news.adda/co/the value of constants:
11.30: summary:
. in classical programming constants are encouraged
because it simplifies the programming space:
you are no longer concerned with
accidently modifying a constant object
because the compiler or the hardware
will automatically detect that mistake for you .
. constants have even more usefulness
in the context of cloud computing .

target-owning vs aliasing pointers

12.7: summary:
. this explores types of pointers
needed for adda's space-efficient style of oop,
and support of safe pointers .

semi-type-specific subheaps

14: adda/oop/type-specific heaps:
[29: intro:
. in a type-specific heap,
each object reference has 2 parts:
# an ID of the type it belongs to,
# a serial number unique to that type . 31:
. this promotes encapsulation,
because the reference isn't a pointer,
the only one who knows the object's address
is the type mgt that owns the object .
]
. given the idea of type-specific heaps,
how does that mix with adda's oop?
--that's the efficient form of oop;
the inefficient way of doing oop
uses garbage collection
and that is a choice independent of
which heap an object is owned by,
whether local or global .
. using pointers is not what causes
pointer semantics and garbage:
that's caused when assignments are overwriting
the pointer instead of the pointer's target:
to avoid garbage collection,
assignments need to overwrite the objects themselves
not the pointers to the objects .
. instead of a global type-specific heap
local vars would be placed into
the subprogram's local heap
that can have within it type-specific subheaps .

managing self-modifying code

1.31: adda/managing self-modifying code:
(inspired by python unpickle vulnerability)
. the safe pickle is built by the system .
. it can be compared to the decompile,
how is it extensible? that is to ask
how are objects built in the first place?

explorations of virtual memory

7.1: adda/vmem'mgt/intro:

. in our virtual memory stack system
we are replacing each of the stack's
subprogram activation records (act'rec's)
with a pointer to a resizable object
(ie, it points to an expandable array in the heap );
thus, the stack [8.29:
-- if we didn't have a stackless architecture -- ]
becomes an array of pairs:
( return address
, pointer to act'rec
) . if our allotted ram is getting full,
we can file the obj's attached to earlier parts of the stack
or even file earlier segments of a very long stack .

when to type an object as pointer

4.1: 4.2: adda/dstr/pointer/when to type object as pointer:
implicit conversions:
. in designing syntax, it's been assumed that explicit is good,
hence we encourage the explicit use of pointers;
but that idea is at tension with trying to avoid
unnecessary implementation details .

short pointer

12.22: adda/type/short pointer:

pos: kiss:
. build the obvious system first;
don't worry about having to
rebuild from scratch in order to merge an optimization;
it's important for testing that there are
simple, trustworthy substitutes .

summary:
. shorter pointers could save a lot of space;
in c, pointers may be 4-bytes long
in order to address any place on the machine .
. addresses can be byte-sized only if needing
less than 256 addresses; for instance,
an array(1..256)of word can hold 256 tree nodes;
and, each node has 2 bytes,
each of which can point to any place in that array .
. a larger tree would use word-sized pointers,
which means the nodes are 4bytes long
(2pointers * 2bytes).

. the byte- and word-sized addresses
use an array index to do the pointing;
and, are collectively called short pointers;
any use of a short pointer implies a reference to
some particular array;
eg, the root of a tree would be a descriptor
which identifies the tree as using short pointers
and then provides a link to the array being referred to .

. a tree descriptor would also include
a separate array for data;
because, arrays need to contain just one type
(unless its items all have bulky type tags);
and, while all internal tree nodes
are of type (ptr, ptr),
the leaf nodes contain data of varying sizes and types;
so, the data array might contain system pointers;
or, it could use a heap system,
where the pointer refers to several items:
the first item contains a size field
which indicates how many subsequent items
belong to this datum .

. large trees (uses word-pointers)
may contain small trees (uses byte-pointers);
any subtree of a large tree containing less than 256 nodes
could be repacked as a pointer to small tree .
. as the list processor goes into a tree,
it is always keeping the current base addresses
for both the large tree and any small tree .

. a large tree is chosen over a small tree type
not only when it will always be large,
but also when it is likely to grow .
. often it will never approach its max size;
therefore the array being used for the
internal nodes of a large tree
should have a growable size to conserve space .
. one way of being growable is to be
an array of pointers to segments:
eg, suppose the segment is 10 units;
then for item# 78,
follow the 7th pointer to a segment,
and use the 8th item on that segment .
. in practice,
the segments likely should be the size of small trees
( 256 nodes * 2pointers * 1 byte).

. while the large tree is indexing the seg'd array,
a small subtree could index a segment therein .
[12.23:
. since the short pointers of a large tree are
pointing at a seg' pointer,
seg's could vary in type, and be type-tagged,
that way the large tree can point directly at a small tree
without having to redirect to a data array .
. the large tree is an array of
( pointer to seg
, enum"seg nodes are {word, long}
)].

. both the large tree (word-pointers)
and the small tree (byte-pointers)
can have a bit per word to indicate
whether they point to tree or data .
. word-sized pointers use their sign bit for indicating
whether they point to tree or data;
byte-sized pointers exist in pairs on word-sized nodes,
and a byte is too small to sacrifice a bit,
so we'll divide the segment in half;*
if the positive side is used (a byte in 0..127)
that item is an internal node;
else (a byte in 128..255) the item is a leaf node .

*(conveniently, no matter what the shape of a tree,
the number of internal and leaf nodes is nearly the same:
#leaf = #internal + 1 )
A binary tree with N internal nodes has exactly N+1 external (leaf) nodes
Proof: by induction
    Base case:
        a binary tree with 0 internal nodes has exactly 1 external (leaf) node (root)
    Induction step:
        . assume a binary tree with m internal nodes
        has exactly m+1 external (leaf) nodes
        A binary tree with N internal nodes is made up of:
            A root (internal node)
            A left binary trees with k internal nodes
            A right binary trees with (N - 1 - k) internal nodes
        According to the induction assumption:
            The left binary trees with k internal nodes has exactly
                   k+1 external nodes
            The right binary trees with (N - 1 - k) internal nodes
                   has exactly (N - k) external nodes
            So the total number of external nodes
                in a binary tree with N internal nodes is:
                k+1 + (N - k) = (N + 1 ) external nodes .

notes:

binary operation polymorphism:
. tree polymorphism works the same way numbers do:
if a biop has to deal with (byte, word),
it converts byte to word for the (word, word) vm;
ie, it makes a copy of the tree but with new pointer types .

. by making small arrays part of larger ones
you would be doing more of your own mem mgt
instead of using c mem mgt
which expects you to de/allocate in stack order;
use c mem mgt for arrays whenever possible;
but use small arrays for tree nodes
to keep tree-pointer size down .

. the segs of a seg'd array
can either come from malloc,
or be pointers into a reserve array
where you're doing your own mem mgt .

. the small tree needs arrays of size = word*256
-- that may be the right size for units used by seg'd arrays .

. of the pointer and its target,
which should contain needed subtype tags?
if the target contains the id,
then there is efficient aliasing;
but in some targets there would be bloat
because targets like system pointers need the whole obj,
and can't spare a few bits for type tagging,
so then an additional array of byte for tagging is needed .

2011-07-30

tags, link-trees, and file-trees

7.3: 7.30: adde/tags, link-trees, and file-trees:
. how to integrate tags with folder systems?
placing a file in a folder is tagging it;
a hierarchy is a tag with segments;
ie, each path name is one tag .
. while a file can be in only one folder,
it can be assigned to many tags .

. tags can be thought of as
a list of links to its members;
and, equivalently, as attributes of a file:
each file or file link can list all the tags
to which it is a member,
just as it can tell the folder path it's in .

link-trees vs file-trees:
. a file-tree is the usual folder tree,
it contains actual files rather than links;
so, a file can be a member of only one file-tree .
. the link-tree is like a file-tree
except it contains only links to files
rather than actual files;
thus, a file can belong to many link-trees,
and may be in many places
within the same link-tree .
7.30:
. the main idea for these distinctions
is that file-trees are low-level concerns;
the only reason they need to be accessed
is when the space limits require
deletions or archiving .
. most file system operations
are concerned only with organizing files,
not managing limited space,
and each activity should have its own
customized view of the file system;
so, most file accesses should be done
through link-trees .
. tags can be applied to either files or links,
depending on the tag's scope:
tags on files are global;
tags on links apply only to
a specific link-tree;
ie, each link-tree has its own set of tags:
both the ones induced by the global tags
belonging to files it's linking to,
and the tags specified by the
user of the link-tree .

references for security and speed

7.9: adda/oop/references for security and speed:
. notice how oop would rather
create a new object and modify a pointer
than modify the value of a current object .
. I call this the "(pervasive pointers).

. from an encapsulation view there is no diff',
since only the owning type
can do these mod's .
. what it does seem to uniquely do though,
is encourage the sharing of objects;
ie, if you can modify your value
then I have no interest in sharing its object
because it's no longer synonymous with
the intended value at some past time .
--
. oop is doing for every value
what c does only for arrays .
7.10:
. it does promote strong
capability-based security;
because if everything is through reference,
and ref's are encryptable,
then that is a much stronger encapsulation
unless the system is already protected
by allowing only system-compiled code .
7.11:
. another interesting property of
pervasive pointers
is that by comparing pointers,
one can tell quickly whether
even huge structures are identical;
though if the pointers aren't equal,
they might still represent equal values .
. python has a speed hack where the
first 100 ints have reserved pointers
so in the many loops controlled by small ints,
pointer compares are decisive .

2011-01-31

byte-sized pointers

1.18: adda/dstr/byte-sized pointers:
. keep pointers small by seeing large lists
as subtrees;
one of the node types of a tree is
"(this subtree root declares a new subheap:
it includes access to a 256-size array of tree node).
. how does that scale? it does because
root of huge subtree had its own subheap
where all terminal nodes were tagged as
(this is a sys'ptr [32bit c pointer]
-- not a pair of byte-sized tree ptrs [byte'ptr])
[1.31:
. as the tree progresses downward with fanout,
it eventually consumes 256 nodes;
any tree-building further down
must be done from a new-subheap node`type
(ie, a node whose purpose is to declare that
the following subtree is based in a new subheap).
. lets say that a tree uses
2 subheaps worth of nodes;
then if the tree is balanced,
it can be built with only 2 new-subheap nodes
by putting the left and right subtrees
each in their own subheap;
if the tree is not balanced,
then space efficiency requires that subheaps be shareable,
so that many smaller subtrees can be
impl'd on the same subheap .
. to make best use of space,
the use of new-subheap node`types
must be minimized because they are overhead .]

. when having a ptr to subtree,
it needs a record:
sys`ptr to the subheap,
byte'ptr to where in the subheap subtree is rooted .

. when building trees from text, [1.31:
or when trees can be modified? ..]
then we need speed more than mem'
whereas figuring out how to pack for byte-ptr's
would be very time-consuming;
so, then these sort of trees should be word-sized .
. both sizes are distinguished from sys'pointers
with the term "(local ptr):
whereas a system pointer is logically a machine address
(practically an index into a process module),
a local ptr is always an index into the current subheap .
1.31:
. our tree node has less than 16 variants,
so the tag can fit in 4bits (a half-byte, nibble,
-- addressing a byte to extract the {hi,lo} 4bits );
the node's data will be sizes larger than a nibble,
and the tag should be at the beginning
to tell us what the rest of the node looks like;
so, in order to accommodate mem'allignment requirements,
without wasteful padding after the tag,
the subheap will have 2 forks
so that tags and data can each be
in their own arrays .

. a subheap supporting byte-ptr's
can hold only 256 nodes,
in turn needing a 128-byte chunk for the 256 * 4bit tags .
. these nodes are word sized:
#branch: (tag, left.byte-ptr, right.byte-ptr);
#leaf: (tag, symbol.word-ptr).

. subheaps are composed of chunks;
so, if a subheap is not enirely used,
it's not entirely alloc'd either .
. both byte-ptr and word-ptr subheaps
can use the same size chunks, 128 bytes,
to match the minimum chunk needed for tags .
. therefore 256*2byte data nodes = 128*2*2 = 4 chunks .
. if less than half of nodes are used,
then it can save on 2 chunks of 128 bytes .

. finally, a byte-ptr subheap needs to be
an array of 5 sys'pointers for access to the
1 tag chunk + 4 data chunks,
plus
whatever else is needed for variant tagging,
or part of an efficiency hack .

. if a large structure has a known minimum at init' time,
then it can be given a superchunk,
which is some arbitrary multiple of contiguous chunks .

2010-12-25

managing capabilities without encryption

12.7: adda/cstr/managing capabilities without encryption:
. in a read of capability-based security
I wondered if there was some way to
have capability enforcement
without having to encrypt all the shared pointers .
. related ideas include:
# Singularity's faster task switching
done by using ref's to shared mem'
instead of passing copies between modules;
# how to use pointers to shared resources
so that even though 2 concurrent sharers
were both active at different processors,
only one active sharer at a time
would have a useful link to the shared resource .
sketch of a possible design:
. a process can never reach out directly,
but is always accessing things via
pointers located in their header,
and only the supervisor can modify this header;
eg, the task scheduler .
. it's not enough to have possession of a pointer,
you've got to have a supervisor
copy it to your header;
so, it's like encryption,
in that it requires an authorization .
layers for when bugs happen:
. encrypted cap'pointers are being
another level of security; [12.25:
. cap'based is supposed to include
the soa idea:
. bugs are going to get into the system,
but in a software system that
connected it's modules
the same way https connects computers,
then it wouldn't matter bugs had invaded;
because each of the components is
being separately guarded by modularity
(not allowing direct access to state)
and is working only with ID'd clients
(not servicing anonymous agents).
. the idea of unencrypted header pointers
is assuming that
the system's runtime can be secured
which is not likely
on today's monolithic OS's .]

2010-06-30

pointer syntax depends on typing

6.21: adda/pointer/syntax depends on typing:
. my current ideas of adopting folder syntax
as the syntax for pointers dereferences
seems to go against how folders are
normally used: (dir/) is same as (dir) .
. my semantics have said
(dir/) is body of folder
whereas
(dir) is ptr to folder .
. nevertheless,
the semantics are the same if
considering the type expected:
(eg,
a (copy) or (move) operation
is not expecting to operate on
pointers implementing the container
) .
a var' expecting a pointer
converts either case to a pointer,
whereas, a var' expecting subfolder
converts either case to a subfolder .

2010-03-31

high-level uses of pointer and possessive

3.22: adda/dstr/high-level uses of pointer and possessive:
. both pointers (as seen in subdirectory syntax)
and possessives (as seen as component selection by structs)
can be seen as different ways to express the same thing;
but one way they express different things
is that ptr's don't imply ownership
as they often implement sharing .
. shouldn't adda use the {pointer, possessive} symbols
for the model views?
ie, ownerships can be impl'd with pointers,
but if the understood relation is ownership,
then the pointer should be written as a possessive mark .
that idea may be the same as this:
the important thing in design is to follow an interface .
3.31:
. indeed, this was really an empty debate,
since these operators do have different interfaces,
and it's the writer -- not the lang designer --
who has control of which relation to make structures with .

2009-12-16

safe pointers

7.6:

. pointers all include the id of the scope they are part of,

so that when asking a type-mgt to make an assignment to that pointer,

it can tell whether the scopes differ,

-- looking esp'ly for when a local

is being assigned to an external --

and, then things can be arranged to guard against dangling pointers .

. this needs to include the idea of monads

where each external has a local proxy,

and then it's the responsibility of the run-time system

-- not the type-mgt --

to worry about dangling pointers .

8.9: adda/pointer/types of addressing:

. pointers are addresses;

here is a catalog of addressing types or dimensions:

. persistent pointers can target the name, location, or contents:

{ url -- name of path and file

, filename -- any object or location having that name

, file obj -- the particular object regardless of changes in filename or location

}

. non-persistent addresses -- specific to a time or space domain:

( these are pointers that are indexes or offsets of a private [or local] heap,

ie, a heap used only by a particular

process, type, type.class, or object .

)

. pointer ownership:

clients can refer to obj's by pointers

and servers can use private pointers to structure their obj's .

Americium Dream Documents

2014-11-30

the value of constants

2013-12-07

target-owning vs aliasing pointers

2013-08-31

semi-type-specific subheaps

2013-03-09

managing self-modifying code

2012-08-29

explorations of virtual memory

2012-04-30

when to type an object as pointer

2011-12-31

short pointer

2011-07-30

tags, link-trees, and file-trees

references for security and speed

2011-01-31

byte-sized pointers

2010-12-25

managing capabilities without encryption

2010-06-30

pointer syntax depends on typing

2010-03-31

high-level uses of pointer and possessive

2009-12-16

safe pointers

(As an Amazon Associate I earn from qualifying purchases.); pages of alpha doc's

posts by category

Blog Archive

tags

About Me

Facebook

search Wikipedia

Search This Blog