Implementing Actors - Guild Internals
March 20, 2014 at 12:35 AM | categories: python, actors, guild, concurrency
This post provides an overview of how Guild Actors work. If you missed what Guild is, and how it contrasts with other approaches, it might be a good idea to read the two earlier posts on Guild first.
It starts off with a trivial actor, showing what the basic method decorators implement. This is then expanded to a slightly more complex example. Since the results of the decorators are used by a metaclass to transform the methods in the appropriate way, there's a brief recap of what a metaclass is. I then discuss how the ActorMetaclass is actually used, and give an overview of its logic. Next we walk through what actually happens inside the thread. Finally, the binding of late bindable methods is discussed; thanks to the way Actor methods are implemented, it is remarkably short and clear.
So let's start off with the basics...
Each actor is an instance of a subclass of the Actor class. The Actor class is a subclass of threading.Thread, meaning each Actor is a thread. In order to make calls to methods on the Actor, the user must have decorated the methods using either the actor_method decorator or the actor_function decorator. If the user doesn't do this, then the calls they make are not threadsafe.
In practice, the actor_method decorator effectively operates as follows. The following:
    class example(Actor):
        @actor_method
        def ping(self, callback):
            callback(self)
... means this:
    class example(Actor):
        def ping(self, callback):
            callback(self)
        # The decorator replaces the name 'ping' with a tagged tuple
        ping = ('ACTORMETHOD', ping)
Similarly, all decorators in guild.actor do this - they literally just tag the function to be modified into either an actor method, actor function, process method, late binding, etc.
That means this ...
    class example(Actor):
        @actor_method
        def ping(self, callback):
            callback(self)

        @actor_function
        def unique_id(self):
            return 'example_'+ str(id(self))

        @process_method
        def process(self):
            self.Pling()

        @late_bind_safe
        def Pling(self):
            pass
... is transformed by the decorators to this:
    class example(Actor):
        def ping(self, callback):
            callback(self)

        def unique_id(self):
            return 'example_'+ str(id(self))

        def process(self):
            self.Pling()

        def Pling(self):
            pass

        # Each decorated name now refers to a (tag, function) tuple
        ping = ('ACTORMETHOD', ping)
        unique_id = ('ACTORFUNCTION', unique_id)
        process = ('PROCESSMETHOD', process)
        Pling = ('LATEBINDSAFE', Pling)
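As a rough illustration of the tagging step on its own, here is a minimal sketch written for Python 3. The decorator names mirror the ones in this post, but their bodies are my assumption of the simplest thing that produces the tuples described above, not a copy of guild.actor. The class is deliberately a plain object subclass (the hypothetical name Plain), so no metaclass gets involved and the raw tuples stay visible.

```python
# Hypothetical re-creation of two tagging decorators. The names mirror the
# post, but the bodies are an assumption, not copied from guild.actor.
def actor_method(fn):
    return ("ACTORMETHOD", fn)


def actor_function(fn):
    return ("ACTORFUNCTION", fn)


class Plain(object):  # deliberately NOT an Actor, so no metaclass runs
    @actor_method
    def ping(self, callback):
        callback(self)

    @actor_function
    def unique_id(self):
        return "example_" + str(id(self))


# The class attribute is now a (tag, function) tuple, not a method.
tag, fn = Plain.__dict__["ping"]
print(tag, fn.__name__)  # prints: ACTORMETHOD ping
```

Trying to call Plain().ping(...) at this point fails, because a tuple is not callable.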
If that was all though, this wouldn't be a very useful actor, since none of those methods could be called: each name now refers to a tuple, not a function.
To fix that, Actor uses a metaclass to transform these tagged tuples into something callable.
Recap: What is a metaclass?
In Python, everything is an object. This includes classes. Given this, classes are instances of the class 'type'. A 'type' instance is created and initialised by a call to a function with the following signature:
def __new__(cls, clsname, bases, dct):
The interesting part here is dct.
dct is a dictionary where the keys are names of things within the class, and the values are what those names refer to. Given this dictionary creates a class, any values which are functions become methods. Any other values become the initial values of class attributes. This is also why we call it a 'class statement' and not a class declaration.
This also means that the following:
    class Simple(threading.Thread):
        daemon = True
        def run(self):
            while True:
                print 'Simple'
... is interpreted by python (approximately) like this:
    def run_method(self):
        while True:
            print 'Simple'

    # Note: the bases argument must be a tuple, not a list
    Simple = type('Simple', (threading.Thread,), {
                      'daemon' : True,
                      'run' : run_method
                  })
The neat thing about this is that it means we can intercept the creation of the class itself.
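To make "intercepting the creation of the class" concrete, here is a small sketch written for Python 3. LoggingMeta and its created list are hypothetical names invented for this illustration and have nothing to do with Guild. It shows that calling a metaclass with a name, a tuple of bases and a dictionary builds a class, and that subclasses created with an ordinary class statement go through the same __new__.

```python
class LoggingMeta(type):
    created = []  # hypothetical bookkeeping, just for this illustration

    def __new__(cls, clsname, bases, dct):
        # Record the class name and its non-dunder members, then defer to type.
        members = sorted(k for k in dct if not k.startswith("__"))
        LoggingMeta.created.append((clsname, members))
        return type.__new__(cls, clsname, bases, dct)


def run_method(self):
    return "Simple"


# Equivalent in spirit to a class statement with two members.
Simple = LoggingMeta("Simple", (object,), {"daemon": True, "run": run_method})


class Child(Simple):  # inherits the metaclass, so __new__ runs again
    pass


print(LoggingMeta.created)
# prints: [('Simple', ['daemon', 'run']), ('Child', [])]
```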
ActorMetaclass
Rather than the Actor class being an instance of type, the Actor class is an instance of ActorMetaclass. ActorMetaclass is a subclass of type, so it shares this __new__ method. Given metaclasses are inherited just like anything else, this means any subclass, like our 'example' above, shares this metaclass.
As a result, the above 'example' class statement is (approximately) the same as this:
    def ping_fn(self, callback):
        callback(self)

    def unique_id_fn(self):
        return 'example_'+ str(id(self))

    def process_fn(self):
        self.Pling()

    def Pling_fn(self):
        pass

    # Note: the bases argument must be a tuple, not a list
    example = ActorMetaclass('example', (Actor,), {
                      'ping' : ('ACTORMETHOD', ping_fn),
                      'unique_id' : ('ACTORFUNCTION', unique_id_fn),
                      'process' : ('PROCESSMETHOD', process_fn),
                      'Pling' : ('LATEBINDSAFE', Pling_fn)
                  })
This results in a call to our __new__ method. Our __new__ method eventually has to call type.__new__() as in the section above, but before it does we can replace the values in the dictionary.
The logic in Guild's metaclass is this:
    new_dct = {}
    for name,val in dct.items():
        new_dct[name] = val
        if val.__class__ == tuple and len(val) == 2:
            tag, fn = str(val[0]), val[1]
            if tag.startswith("ACTORMETHOD"):
                # create stub function to enqueue a call to fn within the thread

            elif tag.startswith("ACTORFUNCTION"):
                # create stub function to enqueue a call to fn within the thread,
                # wait for a response and then to return that to the caller.

            elif tag.startswith("PROCESSMETHOD"):
                # create a stub function that repeatedly (effectively) enqueues
                # calls to fn within the thread.

            elif tag == "LATEBIND":
                # create a stub function that when called throws an exception,
                # specifically an UnboundActorMethod exception. The reason is
                # because it allows someone to detect when an 'outbox'/our late
                # bindable method has been used without being bound to.

            elif tag == "LATEBINDSAFE":
                # In terms of the implementation, this actually has the same effect
                # as an actor method. However in terms of interpretation it's a hint
                # to users that this method is expected to be rebound to a different
                # actor's method.

    return type.__new__(cls, clsname, bases, new_dct)
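The outline above can be turned into something runnable with very little code. What follows is a sketch written for Python 3 that fills in only the ACTORMETHOD branch. SketchMetaclass, SketchActor and make_enqueue_stub are hypothetical names for this illustration, and there is no thread here at all: the last two lines play the part of the actor's thread by hand.

```python
import queue


def actor_method(fn):
    # Assumed shape of the tagging decorator, as described earlier in the post.
    return ("ACTORMETHOD", fn)


def make_enqueue_stub(fn):
    # The stub runs in the caller's thread and only queues the request.
    def stub(self, *args, **argd):
        self.inbound.put_nowait((fn, self, args, argd))
    return stub


class SketchMetaclass(type):
    def __new__(cls, clsname, bases, dct):
        new_dct = {}
        for name, val in dct.items():
            new_dct[name] = val
            if isinstance(val, tuple) and len(val) == 2:
                tag, fn = str(val[0]), val[1]
                if tag.startswith("ACTORMETHOD"):
                    new_dct[name] = make_enqueue_stub(fn)
        return type.__new__(cls, clsname, bases, new_dct)


class SketchActor(metaclass=SketchMetaclass):
    def __init__(self):
        self.inbound = queue.Queue()


class Example(SketchActor):
    @actor_method
    def ping(self, callback):
        callback(self)


e = Example()
seen = []
e.ping(seen.append)             # nothing runs yet; the call is only queued
queued = e.inbound.qsize()      # one pending command
fn, zelf, args, argd = e.inbound.get_nowait()
fn(zelf, *args, **argd)         # what the actor's thread would do later
```

The point of the design is that the caller never executes the method body itself; it only ever touches the queue, which is already thread safe.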
Actual implementation of an Actor subclass
The upshot of this is that the decorators tag the functions which need a proxy outside the thread, allowing calls to then be enqueued for sending to the thread to execute.
This means our example class above is (effectively) transformed into this:
    class example(Actor):
        def ping(self, *args, **argd):
            def ping_fn(self, callback):
                callback(self)
            self.inbound.put_nowait( (ping_fn, self, args, argd) )

        def unique_id(self, *args, **argd):
            def unique_id_fn(self):
                return 'example_'+ str(id(self))
            resultQueue = _Queue.Queue()
            self.F_inbound.put_nowait( ( (unique_id_fn, self, args, argd), resultQueue) )
            e, result = resultQueue.get(True, None)
            if e != 0:
                raise e.__class__, e, e.sys_exc_info
            return result

        def process (self):
            def process_fn(self):
                self.Pling()
            def loop(self, *args, **argd):
                x = process_fn(self)
                if x == False:
                    return
                self.core.put_nowait( (loop, self, (),{} ) )
            self.core.put_nowait( (loop, self, (),{} ) )

        def Pling(self, *args, **argd ):
            def Pling_fn(self):
                pass
            self.inbound.put_nowait( (Pling_fn, self, args, argd) )
The Actor class
From these stub methods, it should be clear that the implementation of the Actor class has the following traits:
- Each actor has a collection of queues for sending messages into the thread.
- The thread has a main loop that consists of a simple interpreter (or event dispatcher, if you prefer)
Additionally, Actors may have a gen_process method which returns a generator. This generator is then executed - given a time slice if you will - by the main thread in between checking each of the inbound queues & handling requests.
This is a generator not for performance reasons, but to allow the implementation of an Actor stop() method. That stop method looks like this:
    def stop(self):
        self.killflag = True
The main runloop repeatedly checks this flag, and if set throws a StopIteration exception into the generator.
The upshot of this is that the use of a generator in this way allows the thread to be 'interrupted', to receive and handle messages in a threadsafe manner, and so on.
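The idea of checking a flag between generator steps can be sketched in a few lines of Python 3. Runner and this particular gen_process are hypothetical stand-ins, and everything runs in a single thread to keep the result deterministic. One deliberate difference from the description above: the sketch shuts the generator down with close() instead of throwing StopIteration into it, because on current Python 3 a StopIteration that escapes a generator body is converted into a RuntimeError.

```python
def gen_process():
    # Hypothetical actor behaviour: one unit of work per step.
    n = 0
    while True:
        n += 1
        yield n


class Runner(object):
    def __init__(self):
        self.killflag = False
        self.steps = []

    def stop(self):
        self.killflag = True

    def main(self):
        g = gen_process()
        while True:
            if self.killflag:
                # Assumption: close() stands in for throwing StopIteration.
                g.close()
                return
            # Give the generator one "time slice".
            self.steps.append(next(g))
            if len(self.steps) == 3:
                # Stands in for a stop() call arriving from another thread.
                self.stop()


r = Runner()
r.main()
print(r.steps)  # prints: [1, 2, 3]
```

Because the behaviour only advances one step at a time, the loop gets a chance to notice the flag (and, in Guild, to service the queues) between every step.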
The logic within the thread is as follows:
    def main(self):
        self.process_start()
        self.process()
        g = self.gen_process()              # set to None if fails
        while True:
            if g != None:
                g.next()
            yield 1
            if (self.inbound.qsize() > 0 or self.F_inbound.qsize() > 0
                    or self.core.qsize() > 0):      # any queue had data
                if self.inbound.qsize() > 0:        # handle actor methods
                    command = self.inbound.get_nowait()
                    self.interpret(command)         # if fails, call self.stop()

                if self.F_inbound.qsize() > 0:      # Actor functions
                    command, result_queue = self.F_inbound.get_nowait()
                    result_fail = 0
                    result = None                   # so a failed call still has a result to send
                    try:
                        result = self.interpret(command)
                    except Exception as e:          # Capture exception to throw back
                        result_fail = e
                        result_fail.sys_exc_info = sys.exc_info()[2]
                    result_queue.put_nowait( (result_fail, result) )

                if self.core.qsize() > 0:           # used by 'process method'
                    command = self.core.get_nowait()
                    self.interpret(command)
            else:
                if g == None:                       # Don't eat all CPU if no generator
                    time.sleep(0.01)                # would be better to wait in the queues.
(The above code ignores the error handling inside the code for simplicity)
Finally, the interpret function that executes the actual methods within the thread looks like this: (again ignoring errors)
    def interpret(self, command):
        # print command
        callback, zelf, argv, argd = command
        if zelf:
            result = callback(zelf, *argv, **argd)
            return result
            # if there was a type error exception complain vociferously, and re-raise
        else:
            result = callback(*argv, **argd)
            return result
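Putting the caller-side stub and the thread-side interpreter together, here is a small threaded sketch of the actor function round trip, written for Python 3. TinyActor, its call() helper and the (None, None) shutdown sentinel are all hypothetical names for this illustration. Unlike the loop above, it blocks on a single queue instead of polling several, and it re-raises the captured exception without reattaching the original traceback.

```python
import queue
import threading


def interpret(command):
    callback, zelf, argv, argd = command
    return callback(zelf, *argv, **argd)


class TinyActor(threading.Thread):
    def __init__(self):
        super().__init__()
        self.daemon = True
        self.F_inbound = queue.Queue()

    def run(self):
        while True:
            # Blocks until a request arrives (the loop in the post polls).
            command, result_queue = self.F_inbound.get()
            if command is None:  # hypothetical shutdown sentinel
                return
            failure, result = 0, None
            try:
                result = interpret(command)
            except Exception as e:  # capture the exception to hand back
                failure = e
            result_queue.put_nowait((failure, result))

    def call(self, fn, *args, **argd):
        # Caller-side half of an actor function: enqueue, then wait for a reply.
        result_queue = queue.Queue()
        self.F_inbound.put_nowait(((fn, self, args, argd), result_queue))
        failure, result = result_queue.get(True, None)
        if failure != 0:
            raise failure
        return result


def unique_id(self):
    return "example_" + str(id(self))


def broken(self):
    raise ValueError("boom")


actor = TinyActor()
actor.start()
uid = actor.call(unique_id)      # executed inside the actor's thread
try:
    actor.call(broken)
    raised = False
except ValueError:
    raised = True                # the exception crossed back to the caller
actor.F_inbound.put_nowait((None, None))
actor.join()
```

Each call gets its own private result queue, so replies can never be delivered to the wrong caller even when several threads call the same actor.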
Binding Late Bound Actor Methods
Using our Camera and Display examples from before, binding a late bound method effectively means doing this:
    camera = Camera()
    display = Display()
    camera.output = display.input
However, that last line is changing the state of an object which is owned by another thread. As a result we need to change this attribute within the thread. Using our code above, this is now quite simple:
    @actor_method
    def bind(self, source, dest, destmeth):
        # print "binding source to dest", source, "Dest", dest, destmeth
        setattr(self, source, getattr(dest, destmeth))
That's then used like this:
    camera = Camera()
    display = Display()
    camera.bind('output', display, 'input')
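To show why routing bind through the queue matters, here is a single-threaded sketch written for Python 3. SketchActor and its drain() helper are hypothetical: drain() stands in for the actor's thread loop, and the bind stub is written out by hand rather than generated by a metaclass. Display.input is also a plain method here, whereas in Guild it would itself be a stub that enqueues onto the display's own queue.

```python
import queue


class SketchActor(object):
    def __init__(self):
        self.inbound = queue.Queue()

    def bind(self, source, dest, destmeth):
        # Caller side: queue the rebinding instead of touching the attribute.
        def bind_fn(self, source, dest, destmeth):
            setattr(self, source, getattr(dest, destmeth))
        self.inbound.put_nowait((bind_fn, self, (source, dest, destmeth), {}))

    def drain(self):
        # Hypothetical helper standing in for the actor's thread loop.
        while not self.inbound.empty():
            fn, zelf, args, argd = self.inbound.get_nowait()
            fn(zelf, *args, **argd)


class Camera(SketchActor):
    def output(self, frame):
        raise RuntimeError("output is not bound yet")


class Display(SketchActor):
    def __init__(self):
        SketchActor.__init__(self)
        self.frames = []

    def input(self, frame):
        self.frames.append(frame)


camera = Camera()
display = Display()
camera.bind("output", display, "input")
bound_before = "output" in vars(camera)  # False: the request is only queued
camera.drain()                           # the owning "thread" applies the change
camera.output("frame-1")                 # now reaches display.input
```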
Summary
Guild actors are implemented using a small number of inbound queues per object to allow them to receive messages. These messages are received by the thread, and interpreted as commands to cause specific methods to be called.
Decorators are used by the user to effectively tag the methods, to describe how they will be used, allowing the ActorMetaclass to transform the calls into thread safe calls that enqueue data to the appropriate inbound queues.
The key reason for the use of decorators and the metaclass is to wrap up the thread safety logic in one place. They also act as syntactic sugar, making the logic of Actor threads much clearer and simpler to interpret and use correctly.
The bulk of the logic of the message queue handling, along with user behaviour for an active actor, is implemented using generators, the reason being to allow the threads to be interrupted and shut down cleanly. Beyond that there are a small number of helper functions.
For those interested, take a look at the implementation on GitHub.
As usual, comments welcome.