Wish list for component systems in python

January 13, 2008 at 01:32 AM | categories: python, oldblog | View Comments

Interesting wishlist for kamaelia (and others, but I'm interested from a kamaelia perspective) here: http://darkness.codefu.org/wordpress/2007/12/26/295
I'm not sure I buy all the criticisms; they read more as a wish-list than as problems. So I'll treat the specific points raised about Kamaelia as (IMO) potential wishlist items:
  • "the implementation needs to be updated to support the new features of generators in Python 2.5...." - that I don't particularly buy. The new features of generators are: you can now do try..finally (which was a wart before, and is just a matter of using it in new code). Another is that you can send values directly into a generator. If you do that, you effectively make your generator synchronous (or at best isochronous) with the caller. I'm not convinced by a long shot that's a good thing, and it feels like a mistake in the language. However, by chaining with a pub/sub generator you may be able to gain something a bit erlang-ish (to do the matching). I'm not convinced though. The other major thing is that you can throw exceptions in to kill a generator. Now that is useful, but it's something we'll only do when we decide to kill off the Kamaelia port to Python Series 60 completely - which is something I'm not sure I want to do yet. The final point is "how do you model that correctly in other languages" - which is unclear.
  • "....as the current syntax strikes me as rather ugly." - is the other half of the sentence. That can be true at times, though I think the use of WaitComplete and the new STM code makes this somewhat nicer - cf the greylisting server.
  • "In fact, it looks like Kamaelia needs a recent release, period: the last one I saw was from 2006." - ouch! I hadn't realised it had been quite as long as that. That's actually quite embarrassing! It was on my todo list, but I'll take that on board.
  • "Kamaelia also fails to offer multi-process operation" - yep. This is something I don't make any bones about - we wanted to make sure that the current API was useful and used it in a variety of scenarios before moving this forward. The recent experiments I've made here suggest this should be relatively simple to achieve. (multiwindow pygame, first multiprocess test)
  • The idea of a networked pipe seems desirable as well given this: "it needs a way to generalize its “wires” to support communication with, e.g., remote components. You could actually combine the marshalling component with the framing component and a TCP client/server components and make this work" - this is something I've played with in /Sketches - but that's nowhere near the level I'd want to use as yet - partly because we hadn't at that point really got a feel for how this could be useful. (That's changing) However, this isn't something I've had a real chance to move forward at all.
  • The code should be able to migrate from microthread to thread to process as desired. This comes back to the idea that ThreadedComponent should be a chassis (ala FirstProcessBasedComponent) rather than a component you instantiate (that would allow the threaded chassis to accept any component and run it in its thread). This is something I realised early on could be a real problem when it comes to the issue of shared data. However, the recent STM code designed to replace the dictionary in the CAT could be a real boost there. Definitely worth reconsidering. This opens up the general idea of proxy components further.
  • Components in multiple languages - this seems at odds with the request to use more features of Python generators. My guess is there would actually be a core set of constructs, and then a predefined inter-language comms style. I've been leaning towards JSON more and more lately, and it would certainly address the multi-language compatibility and ease-of-working-with issues.
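For what it's worth, the Python 2.5 generator features being discussed - try..finally in generators, send(), and throwing exceptions in - look like this. A standalone sketch, nothing Kamaelia-specific (shown in modern syntax):

```python
def accumulator():
    # try..finally inside a generator body is legal from Python 2.5 onwards
    total = 0
    try:
        while True:
            sent = (yield total)   # a value passed in via send() arrives here
            if sent is not None:
                total += sent
    finally:
        pass                       # cleanup on shutdown would go here

gen = accumulator()
next(gen)           # advance to the first yield
print(gen.send(5))  # the caller is now in lockstep with the generator
print(gen.send(3))
gen.close()         # throws GeneratorExit in, running the finally block
```

Note how send() couples caller and generator in lockstep - which is exactly the synchrony concern mentioned above.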
Beyond that however, there's also an implication of a desire for lower level configuration as well - along the lines of Shards, as experimented with by Tara last summer. All of this is doable, but with limited time it's debatable how much Kamaelia can satisfy all these things, especially when I'm the only person working on it, with little feedback. As a result I'll focus on the areas I'm getting feedback on :-)

All in all though - useful wish list & food for thought.



Read and Post Comments

Axon.STM 1.0.1 - New Release

December 24, 2007 at 11:59 AM | categories: python, oldblog | View Comments

Just posted this to comp.lang.python - but posting here as well :-) I've received some great feedback since the initial beta release of the minimalistic STM code I discussed and released 2 weeks ago. I've incorporated the feedback, and created a couple of examples based on the canonical dining philosophers example. (One based on normal python threads, one based on Kamaelia)

It turns out that there was a potential race hazard during "using" and "usevar" which I'd missed - many thanks to Richard Taylor for pointing out this issue.

Changelog
1.0.1
  • Improved locking. (fixed race hazards during copying for reading - the last release was noted as probably OK for CPython, but maybe not for Jython or IronPython. This version is more robust)
  • Added Dining Philosophers examples (threading & Axon threading)
Getting it
You can download this release version here:
http://thwackety.com/Axon.STM-1.0.1.tar.gz
Installing it
tar zxf Axon.STM-1.0.1.tar.gz
cd Axon.STM-1.0.1/
sudo python setup.py install
What IS it?, Why is it useful?, Docs? Using it?
See website :-)

Dining Philosophers?
See last blog post!

Pure python version in subversion as well.

Thanks for this release
Many thanks go to Richard Taylor for detailed feedback and discussion regarding locking, and for pointing me at MASCOT, which made me think of doing the dining philosophers this way.

Future
This will be merged onto the mainline of Kamaelia with some auxiliary functions, as another feather aimed at making concurrency easy to work with :-) Note: for the moment, to use this alongside Kamaelia you install this, then reinstall Axon over the top.


Dining Philosophers In Kamaelia

December 23, 2007 at 11:36 PM | categories: python, oldblog | View Comments

After playing around with the STM code, and spotting the dining philosopher's code in the MASCOT example/manual that I found, I've now implemented the traditional dining philosophers example using Kamaelia. This is largely about acquiring and releasing resources - something we largely do in a really simplistic way right now, and have explicitly excluded threaded code from directly accessing.

Implementing STM makes it possible to extend this to threaded code as well, and the dining philosophers example is a pretty canonical example.
#!/usr/bin/python

# First some imports...

import random
import time
import Axon
from Axon.STM import Store

# And a single support function to make code clearer :-)

def all(aList, value):
    for i in aList:
        if value != i:
            return False
    return True


# We can then define a dining philosopher as follows:

class Philosopher(Axon.ThreadedComponent.threadedcomponent):
    forks = ["fork.1", "fork.2"] # default for testing :-)
    def getforks(self):
        """
        Essentially what this says is:
           * I have a number of forks I'm expected to try and pick up.
           * So I grab a copy of the global state.
           * Check to see they're not owned by anyone:
              X[fork].value == None
           * If they're not, then setting their owner to "me":
               X[fork].value = self.name
           * Then try to commit that change of ownership
        """
        gotforks = False
        while not gotforks:
            try:
                X = self.store.using(*self.forks)
                if all([ X[fork].value for fork in self.forks], None):
                    for fork in self.forks:
                        X[fork].value = self.name
                    X.commit()
                    gotforks = True
                else:
                    time.sleep(random.random())
            except Axon.STM.ConcurrentUpdate:
                time.sleep(random.random())
        print "Got forks!", self.name, self.forks
        return X

    def releaseforks(self,X):
        """
        If that works, then I have the forks, otherwise I sleep and retry.
        I then later release the forks by setting their values to None and
        committing back.
        """
        print "releasing forks", self.name
        for fork in self.forks:
            X[fork].value = None
        X.commit()

    def main(self):
        while 1:
            X = self.getforks()
            time.sleep(0.2)
            self.releaseforks(X)
            time.sleep(0.3+random.random())

# The final bit of the code simply sets the system in motion:
S = Store()
N = 5
for i in range(1,N):
    Philosopher(store=S,forks=["fork.%d" % i ,"fork.%d" % (i+1)]).activate()

Philosopher(store=S,forks=["fork.%d" % N ,"fork.%d" % 1]).run()

The upshot of this is, personally, I find this relatively clear and relatively simple. It is also the sort of thing that you can wrap up neatly into a relatively reusable piece of code (which is of course the sort of thing I like to do :)

One thing I'm tempted to do is to change this to emit when a philosopher picks up/puts down a fork since you could animate this in a fun way I suspect :-)


Software Transactional Memory updated - Code Review Sought!

December 20, 2007 at 12:49 PM | categories: python, oldblog | View Comments

I've updated the STM code in subversion at present here:
https://kamaelia.svn.sourceforge.net/svnroot/kamaelia/branches/private_MPS_Scratch/Bindings/STM/Axon/STM.py
I would really appreciate a code review from anyone who's capable of doing so - even if you've never done a code review before! The thing I think needs particular attention is the detail around locking. Specifically, the following 4 methods are intended to be used by a user of the code:
class Store(object):
    def usevar(self, key):
    def set(self, key, value):
    def using(self, *keys):
    def set_values(self, D):
The above 4 functions entirely manage the locking. The following functions are internal and all assume that the store is locked appropriately before an update attempt:
    def __get(self, key):
    def __make(self, key):
    def __do_update(self, key, value):
    def __can_update(self,key, value):
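To illustrate the intended split - a hypothetical sketch only, the names and details here are mine rather than the real STM.py - the public methods serialise access with a lock, while the internal methods assume the lock is already held:

```python
import threading

class LockedStore(object):
    """Sketch: public methods manage the lock, __internal ones assume it's held."""
    def __init__(self):
        self.lock = threading.Lock()
        self.store = {}

    def set(self, key, value):          # public: acquires the lock
        self.lock.acquire()
        try:
            self.__do_update(key, value)
        finally:
            self.lock.release()

    def usevar(self, key):              # public: acquires the lock
        self.lock.acquire()
        try:
            return self.__get(key)
        finally:
            self.lock.release()

    def __do_update(self, key, value):  # internal: lock assumed held
        self.store[key] = value

    def __get(self, key):               # internal: lock assumed held
        return self.store.get(key)
```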
Anyhow, why is this useful? Well, it turns out this should make resource management a lot simpler. If I'm right, the following code should be sufficient for the Dining Philosopher's problem:
take forks... (pseudo python, ignoring various details)
not_have_forks = True
while not_have_forks:
   try:
      S = <get the store from somewhere>
      forks = S.using("fork.1", "fork.2")
      if forks["fork.1"].value == None and forks["fork.2"].value == None:
          forks["fork.1"].value = id(self)
          forks["fork.2"].value = id(self)
          forks.commit()
          not_have_forks = False
   except ConcurrentUpdate:
       time.sleep(random()) # random backoff - doesn't guarantee success
   except BusyRetry:
       time.sleep(random()) # random backoff - doesn't guarantee success
eat()
# Put forks back - should always succeed. Should. :-)
try:
    forks["fork.1"].value = None
    forks["fork.2"].value = None
    forks.commit()
except ConcurrentUpdate:
    pass
except BusyRetry:
    pass
... which I think is pretty neat :-)

What inspired me to this? Well, there's been a thread on the Python UK list recently about my STM implementation, which resulted in a discussion about a system called "MASCOT" being discussed. Essentially MASCOT looks incredibly similar to Kamaelia - at least superficially. From my perspective, the mere existence of MASCOT and its various subsystems which looks very similar I think validate a lot of the core ideas in Kamaelia. I want to write a post up about that separately though.... More later!

Anyhow, back to this, if anyone is willing to code review the STM code here:
https://kamaelia.svn.sourceforge.net/svnroot/kamaelia/branches/private_MPS_Scratch/Bindings/STM/Axon/STM.py
I'd greatly appreciate it!


Concurrency - The difference between functional call context & object context

December 17, 2007 at 10:25 PM | categories: python, oldblog | View Comments

A common idiom in functional programming - at least with the functional languages I've used and systems I've needed to build - is a call context. That is, any sufficiently large functional piece of code is at some level modelling interactions with some model. So in pseudo-functional-language you (conceptually) end up doing something like:
fun move_x context distance:
   let
       x,y,z = Parts(context)
   in
       return Context(x+distance,y,z)
   done
This context then gets passed around, and updated. This means you can end up with lots of functions that look like this:
fun move_x context ...
fun move_y context ...
fun move_z context ...

This looks very similar to something you get in python:
def move_x(self, ...
def move_y(self, ...
def move_z(self, ...

However there's a big difference. In the object oriented python (or almost any other OO language really), the value of "self" is a persistent value (ie exists while someone has a reference to it), and "self" is itself mutable.

This has some implications for concurrency that you don't have in a functional world. For example, you could model and pass on an updated indication that something is locked, something like this:
fun ..with_lock context ...:
   let
       x,y,z,_ = Parts(context)
   in
       return Context(x,y,z,"locked")
   done
ie simply update our functional object to say that the state is locked. That sounds remarkably similar to doing something like this:
def ..with_lock(self, ...
     self.locked = True
    ....
However, remember the difference here. In the functional version, a completely new value is (logically at least) created, identical except that one attribute of the structure is different. The OO version, however, is mutating a piece of storage.
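That difference can be made concrete in Python itself - a small sketch (the Context type here is purely illustrative):

```python
from collections import namedtuple

# Immutable, functional-style context
Context = namedtuple("Context", "x y z locked")

def with_lock(context):
    # Functional style: build a brand new value; the original is untouched
    return context._replace(locked=True)

# Mutable, OO-style context
class MutableContext(object):
    def __init__(self):
        self.locked = False
    def with_lock(self):
        # OO style: mutate the one shared piece of storage in place
        self.locked = True

c1 = Context(0, 0, 0, False)
c2 = with_lock(c1)
print(c1.locked, c2.locked)  # the original value is still unlocked

m = MutableContext()
m.with_lock()
print(m.locked)  # every holder of a reference to m now sees it as locked
```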
This is perhaps the simplest example I can see of how a functional style is easier to work with in a concurrent world with shared data, and perhaps also what aspect of shared data is dangerous.

You see, because the functional style essentially takes a value and copies it before mutating it, it naturally means that the only bits of code that see the copy are the pieces of code called next. eg:
def hello(context, count):
    if count > 1:
        return hello("hello" + context, count - 1)
    return context

In this example the context just gets bigger and bigger and bigger. It is however never shared - except explicitly. Just because on the surface it looks similar, it's nothing like it. The coding and design principles look similar, but they're not the same. The fact is it is object oriented, but it's a functional object, and it is only visible within the thread that created it. If someone else has another object that looks the same, that's nice, but it isn't the same object.

(All of this goes back to fundamentals of functional languages)

However, if I create an object in an OO language, I can easily create a collection of threads that can mutate that object. As soon as I do, in an object oriented language, without hackery what one thread of control can see about the object, all threads of control can see about the object. That's right - all attributes become instantly shared data.

This means that suppose I acquire a lock, and wish to pass this onto another function I'm calling. In a functional language, I can merely update the value in the (thread-of-control-private) context, whereas in the OO language, I can't update the context. I have to create a new functional context to pass around if I wish to achieve something similar. ie I have to do this:
def meth(self, callcontext, ...)
Where callcontext contains the new indication saying we're locked.

As a result, this leads me to ponder:
  • The core problem here, as ever, in the OO version isn't just shared state - it's shared mutable state.
  • More than that, it's also the fact that the mutable state can't easily be used to control thread-local state by default.

Doesn't that open the door to something more halfway? Add support for thread-private object values? I guess this is what Unified Functions and Objects - UFO (which I used many years ago) - did - I just never appreciated what it was really doing. (Or if it didn't, it's what it should've done :-)

I think what this also suggests (to me) is that thread local storage actually needs promoting to a much higher level - to the extent that updates to objects default to updating attributes in thread local storage rather than direct unmanaged update - with the idea being that the managed updates can then be managed in a somewhat better manner.

Hm. I suppose what that is saying is that the mutable shared object state should default to being software transactional memory, whereas the version you normally access is thread local, since then you would (hopefully) get the best of both worlds...

Hm, shared self.__dict__ as an STM , and threads accessing it only doing so through special dict proxies which get commited ? hmm.....
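Very roughly, that musing might look something like this - an entirely hypothetical sketch (nothing like this exists in Axon), where each thread mutates a private proxy and only commit() touches the shared dict:

```python
import threading

class SharedState(object):
    # Hypothetical: the shared dict, guarded by a lock
    def __init__(self):
        self.lock = threading.Lock()
        self.data = {}
    def proxy(self):
        return DictProxy(self)

class DictProxy(object):
    # Hypothetical: a thread-private view; writes stay local until commit()
    def __init__(self, shared):
        self.shared = shared
        self.pending = {}

    def __setitem__(self, key, value):
        self.pending[key] = value          # thread-local update only

    def __getitem__(self, key):
        if key in self.pending:
            return self.pending[key]
        with self.shared.lock:
            return self.shared.data[key]

    def commit(self):
        # The managed update: publish local changes under the lock
        with self.shared.lock:
            self.shared.data.update(self.pending)
        self.pending = {}
```

A real version would also need version checks, ala the STM Store, to detect conflicting commits rather than silently last-write-wins.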


The web is stateless, please

December 12, 2007 at 12:50 AM | categories: python, oldblog | View Comments

Arrgh. If I hit "delete" on a form, I'd like the thing I'm deleting to be encoded in the form rather than associated with an opaque sessionid which "helpfully" remembers what I last looked at. The web is meant to be stateless after all. kthxbye.

Simple assistive short term memory

December 10, 2007 at 10:54 PM | categories: python, oldblog | View Comments

OK, here's a challenge. Build a tool to assist with short term memory loss. The idea is this:
  • You go to the top of the stairs to do something
  • You get there, and wonder what you were there for
It'd be neat if you could do this:
  • Wear a simple looking pendant, which acts as a short term memory assistive device
  • Simply mutter to yourself what you're doing
  • When you forget what you're doing you press/tap a button & hear a repeat of the last muttering
    • Press it again in rapid succession, and you get the one before, and so on. (giving you context "what was I doing before...")
  • Aesthetically, it would be nice if the device looked like a pendant or similar.
  • The playback/recording could be a simple earpiece/microphone (ala mobile phone handsfree).
  • It should be sufficiently low power to run happily for as much of a day or more as possible.

image (c) 2007 daniboi1977 (flickr)
(original license cc-nc-nd)
Technologically this sounds like it should be really quite simple & trivial to make today.
  • Now here's the thing - you could do this today with a notebook & pencil hanging from a necklace - except that you replace muttering with scribbling, and pressing a button with looking at the notebook. Or perhaps hanging a dictaphone round your neck and remembering to hit "record" every time you say something.
  • However, not everyone who might find such a thing useful (ie most of us when we get older) would be comfortable with the notebook-on-a-necklace or dictaphone-on-a-necklace approach, but they may find a short-term memory pendant assist socially acceptable.
So, the challenge is - could you make this today (almost certainly), and given that, how would you do so? This isn't entirely a thought experiment - people with Alzheimer's may find such a beast useful, and simply having to press a button to hear what you were doing, and double-tapping to hear what you were doing before that, is probably something learnable. (maybe, dunno in that context really)

I suppose in a way it's a bit like a personal audio twitter with an absolutely minimal interface and zero connectivity.

Incidentally, imagine what you could use such a device for beyond this...

Interesting - Transparent messages instead of modal "monolog" boxes

December 10, 2007 at 01:37 AM | categories: python, oldblog | View Comments

Wins bonus points for looking straight out of a film :-)
http://humanized.com/weblog/2006/09/11/monolog_boxes_and_transparent_messages/
Their point is somewhat less shallow than this, but it does remind me of film interfaces... :-)


Widefinder in Kamaelia?

December 09, 2007 at 05:55 PM | categories: python, oldblog | View Comments

Tim Bray's widefinder project is an interesting little problem. How do you find the top 10 most popular accesses to your website in a way that gains performance when you use more CPUs? Having done log analysis in the past, there's always a tradeoff. Looking at the original Ruby source:
counts = {}
counts.default = 0

ARGF.each_line do |line|
  if line =~ %r{GET /ongoing/When/\d\d\dx/(\d\d\d\d/\d\d/\d\d/[^ .]+) }
    counts[$1] += 1
  end
end

keys_by_count = counts.keys.sort { |a, b| counts[b] <=> counts[a] }
keys_by_count[0 .. 9].each do |key|
  puts "#{counts[key]}: #{key}"
end
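For comparison, a direct single-process Python translation of that Ruby - just a sketch to show the shape of the work being parallelised, not the Kamaelia version:

```python
import re
from collections import defaultdict

PATTERN = re.compile(r"GET /ongoing/When/\d\d\dx/(\d\d\d\d/\d\d/\d\d/[^ .]+) ")

def top_hits(lines, n=10):
    counts = defaultdict(int)
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    # sort keys by descending count and keep the top n
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:n]
```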

This really looks fairly Kamaelia friendly. It requires support not yet written ( :) ), but it certainly fits Kamaelia's model. (The code needing writing is relatively minimal now though, given we have basic multicore support)

So, what would this need to look like? Well, to my mind it'd look something like this: (pseudo kamaelia)
Pipeline(
    readfile,
    SplitWorkers(8,                  # create 8 Analyser workers, ideally on different CPUs
                 Pipeline(Analyser,  # builds a heap for counts, emits the heap
                          Serialiser # outputs values in order from the heap
                 )                   # output from a worker is a tuple (id, value)
    ),
    merge(8),                        # merge heap streams (can't start until it has values from all workers)
    Head(10),                        # output the first 10 items
    ConsoleEchoer()
)

Even then, for this problem, this is suboptimal - we want the merge to not actually output the merged heap, but rather the values from the heaps, so that it doesn't do unnecessary work. Either way though, an interesting problem. May get round to implementing this after I've dealt with STM.

Aside from distribution across CPUs, and collation of results, this looks relatively simple to implement. Mainly blogging about this whilst I remember. The other thing here as well to note is that the results would never need to be completely sorted in this version. In fact, the more CPUs you have, the less sorted the data would be!
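Outside Kamaelia, the merge-plus-Head(10) tail of that pipeline can be sketched with the stdlib - assuming each worker emits (count, key) pairs (the names here are mine):

```python
import heapq
from itertools import chain

def merge_top(worker_outputs, n=10):
    # worker_outputs: one iterable of (count, key) pairs per worker.
    # nlargest does a streaming partial selection, so - matching the
    # observation above - the combined data never needs a full sort.
    return heapq.nlargest(n, chain(*worker_outputs))
```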

Minimalistic Software Transactional Memory

December 08, 2007 at 11:03 PM | categories: python, oldblog | View Comments

I've just posted this message on comp.lang.python looking for comments, but I'll post it here in case there's any feedback this way :-)

I'm interested in writing a simple, minimalistic, non-persistent (at this stage) software transactional memory (STM) module. The idea being it should be possible to write such a beast in a way that can be made threadsafe fairly easily.

For those who don't know, STM is a really fancy way of saying variables with version control (as far as I can tell :-) designed to enable threadsafe shared data.

I'm starting with the caveat here that the following code is almost certainly not threadsafe (not put any real thought into that as yet), and I'm interested in any feedback on the following:
  • Does the API look simple enough?
  • Are there any glaring mistakes in the code ? (It's always harder to see your own bugs)
  • What key areas appear least thread-safe, and any general suggestions around that.
Even if I get no feedback, I hope this is of interest. Since these things get archived, if you're reading this a month, 6 months, a year or more from now, I'll still be interested in feedback...

OK, API.

First of all we need to initialise the store:
S = Store()
We then want to get a value from the store such that we can use the value, and do stuff with it:
greeting = S.using("hello")
Access the value:
print repr(greeting.value)
Update the value:
greeting.set("Hello World")
Commit the value back to the store:
greeting.commit()
If you have concurrent updates of the same value, the following exception gets thrown:
ConcurrentUpdate
cf:
>>> S = Store()
>>> greeting = S.using("hello")
>>> par = S.using("hello")
>>> greeting.set("Hello World")
>>> par.set("Woo")
>>> greeting.commit()
>>> par.commit()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 12, in commit
  File "<stdin>", line 11, in set
__main__.ConcurrentUpdate
That's pretty much the simplest API I can come up with. (I've tried a few others but didn't really like them)

The way this works is we have a Store that manages Values. (Perhaps a better name would be Variables, to avoid clashing with Python's parlance of labels and values?)

Anyhow, you can ask the store for a Value, which it will give you. The Value knows what it's called and where it's from, so when you tell it to commit, it can try to do so. The little detail here is that you get a copy of the Value, not the stored Value. (This is to avoid accidental concurrent update of the same actual object.)

As a result, Store looks like this:
class Store(object):
    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store[key].clone()

    def set(self, key, value):
        if not (self.store[key].version > value.version):
            self.store[key] = Value(value.version+1, value.value, self, key)
            value.version= value.version+1
        else:
            raise ConcurrentUpdate

    def using(self, key):
        try:
            return self.get(key)
        except KeyError:
            self.store[key] = Value(0, None,self,key)
            return self.get(key)

    def dump(self):
        for k in self.store:
            print k, ":", self.store[k]
You'll note that the set method is the greatest candidate for any possible race hazard here - though I can see a possible boundary issue in "using". (I think :-)

Otherwise I think the above code is relatively straightforward. You'll note that this API allows this:
greeting.set("Hello")
greeting.commit()
greeting.set("Hello World")
greeting.commit()
greeting.set("Hello World. Game")
greeting.commit()
greeting.set("Hello World. Game Over")
greeting.commit()
The other class is value that looks like this:
class Value(object):
    def __init__(self, version, value,store,key):
        self.version = version
        self.value = value
        self.store = store
        self.key = key

    def __repr__(self):
        return "Value"+repr((self.version,self.value,self.store,self.key))

    def set(self, value):
        self.value = value

    def commit(self):
        self.store.set(self.key, self)

    def clone(self):
        return Value(self.version, self.value,self.store,self.key)

To me this looks like a pretty complete minimalistic thing, which does seem to work OK, but I'm interested in the three points I mention above - if anyone is willing to comment - specifically:
  • Does the API look simple enough?
  • Are there any glaring mistakes in the code ? (It's always harder to see your own bugs)
  • What key areas appear least thread-safe, and any general suggestions around that.
If anyone comments - thanks!

Full code below.
class ConcurrentUpdate(Exception): pass

class Value(object):
    def __init__(self, version, value,store,key):
        self.version = version
        self.value = value
        self.store = store
        self.key = key

    def __repr__(self):
        return "Value"+repr((self.version,self.value))

    def set(self, value):
        self.value = value

    def commit(self):
        self.store.set(self.key, self)

    def clone(self):
        return Value(self.version, self.value,self.store,self.key)

class Store(object):
    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store[key].clone()

    def set(self, key, value):
        if not (self.store[key].version > value.version):
            self.store[key] = Value(value.version+1, value.value, self, key)
            value.version= value.version+1
        else:
            raise ConcurrentUpdate

    def using(self, key):
        try:
            return self.get(key)
        except KeyError:
            self.store[key] = Value(0, None,self,key)
            return self.get(key)

    def dump(self):
        for k in self.store:
            print k, ":", self.store[k]

S = Store()
greeting = S.using("hello")
print repr(greeting.value)
greeting.set("Hello World")
greeting.commit()
# ------------------------------------------------------
print greeting
S.dump()
# ------------------------------------------------------
par = S.using("hello")
par.set("Woo")
par.commit()
# ------------------------------------------------------
print greeting
S.dump()
# ------------------------------------------------------
greeting.set("Woo")
greeting.commit()

print repr(greeting), repr(greeting.value)

S.dump()

