2012-08-29

explorations of virtual memory

7.1: adda/vmem'mgt/intro:

. in our virtual memory stack system
we are replacing each of the stack's
subprogram activation records (act'rec's)
with a pointer to a resizable object
(ie, it points to an expandable array in the heap );
thus, the stack [8.29:
-- if we didn't have a stackless architecture -- ]
becomes an array of pairs:
( return address
, pointer to act'rec
) . if our allotted ram is getting full,
we can file the obj's attached to earlier parts of the stack
or even file earlier segments of a very long stack .
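
. a minimal sketch in c of the stack as an array of pairs,
assuming a plain growable array; the names Frame, VStack,
vstack_push, and vstack_pop are hypothetical,
and filing of earlier segments is left out:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct ActRec ActRec;   /* opaque: the resizable obj in the heap */

/* one stack entry: the pair (return address, pointer to act'rec) */
typedef struct {
    uintptr_t return_addr;
    ActRec   *act_rec;
} Frame;

typedef struct {
    Frame  *frames;  /* growable array of pairs */
    size_t  top;     /* number of frames in use */
    size_t  cap;     /* allocated capacity */
} VStack;

/* push a frame, doubling the array when full; 0 on success, -1 on failure */
int vstack_push(VStack *s, uintptr_t return_addr, ActRec *act_rec) {
    assert(s != NULL);
    if (s->top == s->cap) {
        size_t new_cap = s->cap ? s->cap * 2 : 8;
        Frame *grown = realloc(s->frames, new_cap * sizeof *grown);
        if (grown == NULL) return -1;
        s->frames = grown;
        s->cap = new_cap;
    }
    s->frames[s->top].return_addr = return_addr;
    s->frames[s->top].act_rec = act_rec;
    s->top++;
    return 0;
}

/* pop the newest frame; the caller must ensure the stack is not empty */
Frame vstack_pop(VStack *s) {
    assert(s != NULL && s->top > 0);
    return s->frames[--s->top];
}
```

. because each frame is a fixed-size pair,
the act'rec it points to can grow or be filed
without moving anything on the stack itself .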

[8.9:
. the problem with implementing this
is that C's malloc gives us an address;
but if we are filing & restoring heap obj's,
then we need to choose the same heap addresses
rather than be told them by malloc;
to solve that problem we need to use handles;
ie, the handle array has pointers to the heap,
and the stack has indices into the handle array .
(8.29:
. what's implied here is that it's the act'recs
that can be filed in their entirety as needed;
but what this article had in mind at 7.1
was something else;
ie, this 8.9 comment of the 7.1 article
is correcting an understanding of the article
that is not what the article meant ) .]
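
. a minimal sketch in c of the handle idea from the 8.9 note
(which the 8.29 notes later set aside);
it assumes a fixed-size table and uses a caller-supplied buffer
as a stand-in for the file; all names here are hypothetical:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_HANDLES 64  /* assumed fixed capacity, for the sketch only */

typedef struct {
    void  *ptr;   /* current heap address; NULL while the obj is filed */
    size_t size;  /* 0 means the slot is free */
} HandleSlot;

static HandleSlot handle_table[MAX_HANDLES];

/* allocate a zeroed obj; returns its handle (a stable index) or -1 */
int handle_alloc(size_t size) {
    assert(size > 0);
    for (int h = 0; h < MAX_HANDLES; h++) {
        if (handle_table[h].size == 0) {
            void *p = calloc(1, size);
            if (p == NULL) return -1;
            handle_table[h].ptr = p;
            handle_table[h].size = size;
            return h;
        }
    }
    return -1;
}

/* current address of the obj, or NULL if it is filed */
void *handle_deref(int h) {
    assert(h >= 0 && h < MAX_HANDLES);
    return handle_table[h].ptr;
}

/* file the obj: copy its bytes to `backing` (stand-in for a file), free it */
void handle_file_out(int h, void *backing) {
    assert(h >= 0 && h < MAX_HANDLES);
    HandleSlot *s = &handle_table[h];
    assert(s->ptr != NULL);
    memcpy(backing, s->ptr, s->size);
    free(s->ptr);
    s->ptr = NULL;
}

/* restore the obj at whatever address malloc picks; the handle is unchanged */
int handle_restore(int h, const void *backing) {
    assert(h >= 0 && h < MAX_HANDLES);
    HandleSlot *s = &handle_table[h];
    assert(s->ptr == NULL && s->size > 0);
    void *p = malloc(s->size);
    if (p == NULL) return -1;
    memcpy(p, backing, s->size);
    s->ptr = p;
    return 0;
}
```

. the stack would store only the int handle,
so it doesn't matter that malloc returns
a different address after the restore .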

. each act'rec is assigned a 64bit id .
we can deallocate all the extended parts of this obj
and then refer to that id to find the file
to get the extended parts back .
. if a segment of the stack gets filed,
then in the file we replace its pointers to obj's,
with the corresponding id numbers .
[8.29: clarification:
. instead of putting the act'recs on handles
we are never paging out act'recs;
because they are of insignificant size,
plus we need their address to be static
so that we can point to them .
. the reason they are assigned an id
is not to be a handle
rather the id is to find the file
that contains the act'rec's extension,
which is a segmented array
(an act'rec can be very small because
it's just got space for a few pointers and type tags;
one of those pointers leads to a heap segment
that is an array of pointers to more heap segments
-- this is how the act'rec is incrementally extensible ).
. so, when we need the body of the act'rec again,
we convert the file into an extensible array on the heap .
. if I finally understand this article's intent,
then we don't really need the 64bit id onboard
because we already have such an id:
we just use the static address of the act'rec head .]
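
. a minimal sketch in c of a small act'rec head
with an incrementally extensible body, as the 8.29 note describes;
the segment sizes, the names, and the use of the head's address
as the file key are assumptions made for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SEG_SLOTS 16  /* assumed words per heap segment */
#define MAX_SEGS  16  /* assumed entries in the directory segment */

/* the heap segment that is an array of pointers to more heap segments */
typedef struct {
    intptr_t *segs[MAX_SEGS];
} SegDir;

/* the small, never-paged head; its address stays static */
typedef struct {
    SegDir *ext;  /* NULL until the extension is needed (or while filed) */
} ActRecHead;

/* address of slot i, allocating segments on demand; NULL on failure */
intptr_t *ext_slot(ActRecHead *ar, size_t i) {
    assert(ar != NULL);
    size_t seg = i / SEG_SLOTS;
    if (seg >= MAX_SEGS) return NULL;
    if (ar->ext == NULL) {
        SegDir *d = malloc(sizeof *d);
        if (d == NULL) return NULL;
        for (size_t k = 0; k < MAX_SEGS; k++) d->segs[k] = NULL;
        ar->ext = d;
    }
    if (ar->ext->segs[seg] == NULL) {
        ar->ext->segs[seg] = calloc(SEG_SLOTS, sizeof(intptr_t));
        if (ar->ext->segs[seg] == NULL) return NULL;
    }
    return &ar->ext->segs[seg][i % SEG_SLOTS];
}

/* key for finding the filed extension: the head's static address */
uintptr_t ext_file_key(const ActRecHead *ar) {
    return (uintptr_t)ar;
}

/* drop the whole extension (as would happen after filing it) */
void ext_free(ActRecHead *ar) {
    if (ar->ext == NULL) return;
    for (size_t k = 0; k < MAX_SEGS; k++) free(ar->ext->segs[k]);
    free(ar->ext);
    ar->ext = NULL;
}
```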

. what do return addresses look like
in a vmem system?
each subprogram template has a unique id,
and the return address is a pair:
( subprogram id
, offset from the start of that subprogram
) .
[8.29: correction:
. in a dynamic system it may not be true that
each subprogram template has a unique id .
. each act'rec is at least this pair:
( subprogram template ptr
, ptr to the extensible array for the ivars
); so a return address needs the pair:
( act'rec ptr
, index into the act'rec's subprogram template
) . the c or obj'c runtime will be providing us with
the subprogram template ptr (a system pointer).
]
. the usual way is to load the program into
a certain area of ram, or vram,
so then the return address implies
both offset and subprogram;
but we want an oop'ish situation
where a subprogram obj is a pair
(template, instantiation)
just like any other object is a pair
(type-tag, instantiation).

. what's actually happening, efficiencies aside,
is that the stack has a subprogram obj,
which is a pair
(subprogram template
, instantiation obj
). then, for each call, there is a pair:
( an index into the prior subprogram
    for where to continue after returning from
    the following call
, the call's subprogram obj
) . [8.9: correction:
. what the stack has as a return address is:
(subprogram instance address
, offset
) and then the subprogram instance is a pair
like any other object (type-tag, instance var's)
but in this case the type-tag = subprogram template .
8.29: re-correction:
. a subprogram's obj (aka act'rec) looks like this:
( type-tag: is a subprogram
, value.template: ptr to subprogram template
, value.ivar: ptr to extensible array on heap
) .]
. in a c-style stack, the main program
does not use a return statement,
but in an ada-style stack,
main's return address could be
a system address that closes processes .
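
. a minimal sketch in c of the 8.29 layout above:
an act'rec as (type-tag, template ptr, ivar ptr),
and a return address as (act'rec ptr, offset);
the Template fields and the validity check are assumptions,
since the notes don't say what a template contains:

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TAG_SUBPROGRAM = 1 } TypeTag;  /* hypothetical tag value */

/* stand-in for a subprogram template; a real one would hold code */
typedef struct {
    const char *name;
    size_t      n_instr;  /* number of valid offsets into the template */
} Template;

/* a subprogram's obj (aka act'rec) */
typedef struct {
    TypeTag         tag;    /* type-tag: is a subprogram */
    const Template *tmpl;   /* value.template */
    void           *ivars;  /* value.ivar: extensible array on the heap */
} ActRec;

/* what the stack holds as a return address */
typedef struct {
    ActRec *caller;  /* the subprogram instance to resume */
    size_t  offset;  /* index into the caller's template */
} ReturnAddr;

/* nonzero if the return address points into a live subprogram obj */
int return_addr_valid(const ReturnAddr *ra) {
    assert(ra != NULL);
    return ra->caller != NULL
        && ra->caller->tag == TAG_SUBPROGRAM
        && ra->caller->tmpl != NULL
        && ra->offset < ra->caller->tmpl->n_instr;
}
```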

7.30: adda/mem'mgt/every obj needs a reference:

. how was it that adda's tagged obj's can't use id's ?
most obj systems implement objects as
pointers into a global heap;
and, they don't support process pickling;
because, if they did
they'd find that unpickling would result in
all the pointers being different
unless they impl'ed their own heap;
because,
the c heap doesn't let you decide
where in the heap your obj goes:
instead, you tell it how much space you need,
and then it tells you where your
new space is located .
[ mis: that is a delusion
-- I need to get into the details .]

. adda doesn't use pointers to heap:
it uses an initial direct obj,
and if it out-grows that space,
then it dope-vectors the rest of it
to the act'rec's trailer space
which is a local heap that adda implements .
. getting the id of an obj
requires a string of symbolic id's;
every obj belongs to some act'rec's local heap;
so,
it's a space-time trade-off:
you can track this symbolic binary path name
or you add an id to the obj
[8.29: clarification:
. the "string of symbolic id's" being used
is always just 2 levels deep:
( act'rec ptr, symbol tree node id).]
--
you may need both the path and id anyway
at the notification center
because, otherwise how do you
find the obj given the id ?
you need a map from id to pathname .
so the lack of addresses is a mess we need to fix .
[8.29: correction:
. why does the system need both?
the consensus for now is that
an obj path is equivalent to an obj id .]
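
. a minimal sketch in c of the 2-level path
( act'rec ptr, symbol tree node id) standing in for an obj id;
the flat node table inside ActRec is an assumption
(the notes don't describe how symbol nodes are stored),
and the names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 32  /* assumed number of symbol tree nodes per act'rec */

typedef struct {
    void *nodes[MAX_NODES];  /* one slot per symbol tree node; NULL if unused */
} ActRec;

/* the obj path: always just 2 levels deep */
typedef struct {
    ActRec  *act_rec;
    uint32_t node_id;
} ObjPath;

/* two paths name the same obj exactly when both parts match */
int objpath_equal(ObjPath a, ObjPath b) {
    return a.act_rec == b.act_rec && a.node_id == b.node_id;
}

/* find the obj given its path; NULL if the path doesn't name one */
void *objpath_resolve(ObjPath p) {
    if (p.act_rec == NULL || p.node_id >= MAX_NODES) return NULL;
    return p.act_rec->nodes[p.node_id];
}
```

. since the path resolves directly to the obj,
no separate map from id to pathname is needed in this sketch .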

7.31:

[8.29: seg registry is obsolete:
. we aren't needing a seg registry;
because, we are paging-out only whole act'recs;
this way there is no need for handles
and the slow maps or large arrays
that would be needed to implement handles .
]
. our local heaps consist of segs
that we can put into a seg registry
so upon unpickling,
we fill our seg registry with segs
and then a pointer consists of seg registry indices .
. we also have a process registry,
so each process has an address we control .
. now the binary path is considerably shorter:
(process#, seg#, offset),
--[
. this reminded me that we may need
both forks of an obj:
the stack place holds the dynoData,
and the symbol node of the subprogram unit
holds the constData .
. so, now we're back to needing a
symbolic binary path name:
( process#
, act'rec ptr(indicates root of symbol tree)
, symbolnode#(indicates (seg#,offset) )
) 8.29: correction:
. our subprograms turned out to be
such concurrency-enabled 1st-class functions
that it was pointless to think of act'rec's
as belonging to a particular process;
therefore, the symbolic binary path name
does not include a process# .
]
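
. a minimal sketch in c of the (seg#, offset) pointers
from the 7.31 seg registry idea (marked obsolete at 8.29),
showing why such a pointer survives a seg being reloaded
at a different address; the process# part is left out,
and all names and sizes here are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEGS 32  /* assumed registry capacity */

typedef struct {
    unsigned char *base;  /* where the seg currently lives */
    size_t         size;  /* seg length in bytes */
} Seg;

typedef struct {
    Seg    segs[MAX_SEGS];
    size_t count;
} SegRegistry;

/* a relocatable pointer: registry index plus offset within the seg */
typedef struct {
    uint32_t seg;
    uint32_t offset;
} SegPtr;

/* register a seg; returns its index, or -1 if the registry is full */
int segreg_add(SegRegistry *r, unsigned char *base, size_t size) {
    assert(r != NULL);
    if (r->count == MAX_SEGS) return -1;
    r->segs[r->count].base = base;
    r->segs[r->count].size = size;
    return (int)r->count++;
}

/* turn a SegPtr into a system pointer; NULL if it is out of range */
void *segptr_resolve(const SegRegistry *r, SegPtr p) {
    assert(r != NULL);
    if (p.seg >= r->count || p.offset >= r->segs[p.seg].size) return NULL;
    return r->segs[p.seg].base + p.offset;
}
```

. unpickling would refill the registry with new base addresses
while every stored SegPtr stays as it was .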

044:
. so, while the system is running,
given an obj, how do we know its location?
we obviously have the current process id ,
what else do we need ? ...
[8.29: resolution:
. after reviewing the mem system,
and then discarding the seg registry idea,
here is how we know an obj:
. locally it's always known by its
symbol tree node id,
and then if it's needed externally,
its pathname is that node id prefixed by
the current act'rec's pointer .
]

. each process has a pointer to stack, and stack top;
we activate main:
the stack is given its first act'rec:
( main's template.sysptr
, main's PLrec(param&locals rec.sysptr)
),
the PLrec is an extendable array of
segIndex ( an index into the seg registry)
[ correction:
. but backtracking again here:
if we do need the symbolic path to be a complete path
then we don't need a seg registry;
we just link directly to segs .
8.29: re-correction:
. not sure what that had in mind;
but, again, we can't link directly to segs;
because, they are relocatable after page-out .
]

111: [8.29: handles and seg registry are obsolete:]
. the seg'd array is the backbone
(holding segs like they were ribs);
this backbone array is using system pointers
to stitch its extensions together,
but its components are segIndex's,
which is another attribute of a process;
ie, when you know a process id,
you then also know:
( the system pointers to a stack
, stack top
, and seg registry
) .

123: [8.29: handles and seg registry are obsolete:]
. when the PLrec extends itself with system pointers
the extra space is not found on the backbone
but on segs in the seg registry,
ie, the backbone's ribs are segIndices not system pointers
(rewrite this by first giving names like sysPtr
vs vseg(index into seg registry) [or segIndex] )