2011-05-30

the state of scripting concurrency

4.19: adda/co/the state of scripting concurrency/intro:
[5.30:
. this excerpt from stackoverflow.com
had me looking at concurrency again:]
"( If you have programmed expertly in
Perl, Python, and Java for 10 years,
then you'll probably write your program in Perl
because you'll complete the program faster,
the program will have fewer lines of code,
and the language will stay more out of your way.
If you are not an expert in Perl, Python, or Java,
and you have to choose one of those languages,
then I recommend that you choose Python.
... except if threading is important (re: GIL)...)"
green threads and the need for GIL:
[5.30:
. processes are full programs running concurrently:
each process has its own space for
both variables and code;
threads are like processes except that
they share the variables and code
of the process that spawned them .
. threads and processes can be either
native -- implemented by the os,
or green -- impl'd by an app (eg, a scripting interpreter).
Erlang provides a green process(vs thread),
which is much more lightweight than a native process
because it does share (read-only) code space .
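
. a minimal sketch of the thread/process difference, in Python:
a thread's change to a variable is visible to its spawner,
while a child process changes only its own copy .
the names (counter, bump, bump_in_thread, bump_in_process) are made up for illustration,
and the "fork" start method is an assumption that holds only on POSIX systems .

```python
import multiprocessing
import threading

# Mutable state that lives in this process's memory.
counter = {"value": 0}


def bump():
    """Increment the counter in whichever memory space this runs in."""
    counter["value"] += 1


def bump_in_thread():
    # A thread shares the spawning process's variables,
    # so the increment is visible here after join().
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    return counter["value"]


def bump_in_process():
    # Assumption: a POSIX system where the "fork" start method exists.
    # The child gets its own copy of `counter`; its increment
    # disappears when the child exits.
    ctx = multiprocessing.get_context("fork")
    p = ctx.Process(target=bump)
    p.start()
    p.join()
    return counter["value"]


if __name__ == "__main__":
    print(bump_in_thread())   # the thread's change is visible
    print(bump_in_process())  # the child's change is not
```

. these are native threads and processes;
a green process would give the same isolation without the os-level cost .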

. a computer with multiple cores
can be truly concurrent:
doing more than one thing at the same time;
by contrast, timeslicing is virtual concurrency:
giving each task a slice of computer time .
. the GIL (Global Interpreter Lock)
is a mutual exclusion lock
that prevents true concurrency:
ensuring that app threads are timesliced,
rather than being mapped to multiple cores .
. it's needed when the interpreter,
its libraries, or its plugins
are not thread-safe because of
threads being able to share variables
that aren't protected with atomic access:
ie, being able to complete a read or write
before being interrupted by the timeslicer .
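
. the same atomic-access problem shows up in app code:
in Python, `value += 1` is a read, an add, and a write,
and a thread switch can land between those steps even with the GIL in place .
a sketch of protecting the update with a mutex
(Counter and hammer are illustrative names, not from any library):

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self):
        # Read, add, write: a thread switch between the steps
        # can make two threads write back the same new value.
        self.value += 1

    def safe_increment(self):
        # The lock makes the read-modify-write a single atomic step
        # with respect to other threads using the same lock.
        with self._lock:
            self.value += 1


def hammer(counter, n_threads=4, n_iters=10000):
    """Increment the counter from several threads; return the final value."""
    def work():
        for _ in range(n_iters):
            counter.safe_increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

. with safe_increment the total always equals n_threads * n_iters;
swapping in unsafe_increment may or may not lose updates on a given run,
which is what makes such bugs hard to reproduce .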

Ruby's support for concurrency:
. IronRuby builds on top of .NET Threads,
so they map 1-1 to OS-threads as well;
JRuby does likewise on the JVM .
[5.30:
. these GIL-free variants of Ruby
provide threads without any warranty:
it's up to you to ensure that
all your dependencies are thread safe .
. concurrency models supported by Ruby
include Threads, Processes and
Fibers (systems-level coroutines).
. other abstractions to consider include
Coroutines, Actor Models, Petri Nets, Process Algebras
(particularly CSP and the Pi-Calculus),
Software Transactional Memory
and distributed Map/Reduce algorithms
-- see Go, Occam-Pi, Clojure and Erlang;
Ruby could impl' these with current libraries;
eg, EventMachine or Revactor .
. Ruby needs a standard actor/executor API
-- not platform-specific impl's of actors .
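
. the coroutine idea behind Fibers can be sketched with Python generators
(this is not Ruby's Fiber API, just an analogous illustration;
task and run_round_robin are made-up names):
each task runs until it voluntarily yields,
and a tiny scheduler resumes the tasks in turn .

```python
from collections import deque


def task(name, steps, log):
    """A coroutine-style task: do one step, then yield control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Resume each task in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run the task up to its next yield
        except StopIteration:
            continue               # finished; do not reschedule
        ready.append(current)      # not finished; back of the line


if __name__ == "__main__":
    log = []
    run_round_robin([task("a", 2, log), task("b", 2, log)])
    print(log)
```

. switching happens only at a yield, so no locks are needed between tasks;
the price is that a task that never yields starves all the others .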

. the future of high performance concurrency
is libdispatch/GCD;
for the Java/Scala folks, there's HawtDispatch:
(JRuby's port of that is at github/jcd).
. HawtDispatch is a thread pooling and
NIO event notification framework API
modeled after the Apple`libdispatch API
that powers Apple's Grand Central Dispatch (GCD).
It allows you to easily develop
multi-threaded applications
without the usual problems .
]-5.30
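
. a rough sketch of the dispatch-queue idea, in Python:
work is handed to a queue rather than to a thread,
and a serial queue runs its tasks one at a time,
so state touched only by those tasks needs no lock .
SerialQueue, dispatch_async and tally are hypothetical names;
this is neither the libdispatch nor the HawtDispatch API,
and it assumes a one-worker thread pool is an acceptable stand-in .

```python
from concurrent.futures import ThreadPoolExecutor


class SerialQueue:
    """Hypothetical stand-in for a serial dispatch queue."""

    def __init__(self):
        # One worker thread: submitted tasks run one at a time,
        # in the order they were submitted.
        self._executor = ThreadPoolExecutor(max_workers=1)

    def dispatch_async(self, fn, *args):
        """Queue fn(*args) and return immediately with a Future."""
        return self._executor.submit(fn, *args)

    def shutdown(self):
        """Wait for all queued tasks to finish."""
        self._executor.shutdown(wait=True)


def tally(words):
    """Count words by dispatching each update to a serial queue."""
    counts = {}

    def record(word):
        # Runs only on the queue's single worker, so no lock is needed.
        counts[word] = counts.get(word, 0) + 1

    q = SerialQueue()
    for w in words:
        q.dispatch_async(record, w)
    q.shutdown()
    return counts
```

. the caller never creates or joins a thread;
it only describes units of work and which queue owns them .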
python's gil:
Juergen Brendel argues against the GIL;
Guido maintained the GIL is here to stay
until someone can prove its removal
doesn't slow down single-threaded Python code.
. the language doesn't require the GIL;
it is the CPython virtual machine
that has historically been unable to shed it.
an earlier experiment in removing the GIL
showed that, even on the platform
with the fastest locking primitive (Windows at the time),
it slowed down single-threaded execution
nearly two-fold .
. removing the GIL complicates life for
extension module writers
by precluding the use of global mutable data .
There might also be changes in the Python/C API
necessitated by the need to lock certain objects
for the duration of a sequence of calls.
Bob Warfield 2007`analysis of gil:
Guido is Right to Leave the GIL in Python,
Not for Multicore but for Utility Computing
considering large scalability issues
in the world of SaaS, Web 2.0,
and utility computing fabrics;
eg, Amazon EC2 (Elastic Compute Cloud).
. a concurrency capability based on threads
has done nothing to access multiple machines
-- for that you need socket-connected processes .
. furthermore,
a simple, safe and reliable concurrency language
should be focused on a [green]process model,
not a thread model.
[5.29:
. concurrent programming has a bad reputation
for being both buggy and undebuggable,
but that reputation is based on work with threads .
5.30:
. to be efficient and safe,
a language needs to pervasively support
green processes:
a unit of concurrency that does share
read-only mem like a thread does
but does not share variable mem .
. pervasive support means that
not only is the standard library thread safe,
but all reusable modules are also .]

another way threads don't scale:
The fundamental problem with threads
is that sharing requires locking
which doesn’t scale (or compose),
and is prone to races and deadlocks .
Erlang features [green]processes
where isolation is enforced by the language
rather than the operating system .
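
. the process-with-mailbox style can be imitated in Python,
though only by convention:
here the "process" is an ordinary native thread,
and nothing in the language stops other code from touching its state .
CounterActor and its message kinds are made-up names for illustration .

```python
import queue
import threading


class CounterActor:
    """Owns its state; others interact with it only through messages."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._total = 0  # touched only by the actor's own thread
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        # Handle one message at a time, in arrival order.
        while True:
            kind, payload = self._mailbox.get()
            if kind == "add":
                self._total += payload
            elif kind == "get":
                payload.put(self._total)  # reply on the caller's queue
            elif kind == "stop":
                break

    def add(self, n):
        self._mailbox.put(("add", n))

    def get(self):
        reply = queue.Queue()
        self._mailbox.put(("get", reply))
        return reply.get()

    def stop(self):
        self._mailbox.put(("stop", None))
        self._thread.join()
```

. because every update goes through the mailbox,
the actor needs no lock around its own state,
and the same message protocol could later run over a socket .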

. Erlang is a functional language
with strict copy semantics
and with no pointers or references.
[. it is merely the semantics
that are pass-by-copy;
the impl'details involve
read-only pass-by-reference .]

Why don’t we all switch to Erlang?
Messages have to be copied.
You can’t deep-copy a large data structure
without some performance degradation,
and not all copying can be optimized away
(it requires behind-the-scenes alias analysis).
so, mainstream languages don’t abandon sharing;
instead, they rely on programmer’s discipline
or try to control aliasing.
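
. the copy-on-send trade-off can be sketched in Python,
assuming a deep copy at the send boundary as the simplest way to get isolation
(send and demo are illustrative names):

```python
import copy
import queue


def send(mailbox, message):
    """Deliver a private deep copy, so sender and receiver share nothing."""
    mailbox.put(copy.deepcopy(message))


def demo():
    mailbox = queue.Queue()
    original = {"items": [1, 2, 3]}
    send(mailbox, original)
    received = mailbox.get()
    received["items"].append(4)  # the receiver mutates only its own copy
    return original, received
```

. the sender's data is untouched by the receiver's append,
but the copy costs time and memory proportional to the message size,
which is the degradation described above .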
