Implementing Actors - Guild Internals

March 20, 2014 at 12:35 AM | categories: python, actors, guild, concurrency

This post provides an overview of how Guild Actors work. If you missed what Guild is, and how it contrasts with other approaches, it might be a good idea to read the two earlier posts on Guild first.

It starts off with a trivial actor, showing what the basic method decorators implement. This is then expanded to a slightly more complex example. Since the results of the decorators are used by a metaclass to transform the methods in the appropriate way, there's a brief recap of what a metaclass is. I then discuss how the ActorMetaclass is actually used, with an overview of its logic. Next we walk through what actually happens inside the thread. Finally, the binding of late bindable methods is discussed; thanks to the way Actor methods are implemented, this is remarkably short and clear.

So let's start off with the basics...

Each actor is an instance of a subclass of the Actor class. The Actor class is a subclass of threading.Thread, meaning each Actor is a thread. In order to make calls to methods on the Actor, the user must have decorated the methods using either the actor_method decorator or the actor_function decorator. If the user doesn't do this, then the calls they make are not threadsafe.

In practice, the actor_method decorator effectively operates as follows. The following:

class example(Actor):
    @actor_method
    def ping(self, callback):
        callback(self)

... means this:

class example(Actor):
    def ping(self, callback):
        callback(self)

    ping = ('ACTORMETHOD', ping)

Similarly, all decorators in guild.actor do this - they literally just tag the function to be modified into either an actor method, actor function, process method, late binding, etc.

That means this ...

class example(Actor):
    @actor_method
    def ping(self, callback):
        callback(self)

    @actor_function
    def unique_id(self):
        return 'example_'+ str(id(self))

    @process_method
    def process(self):
        self.Pling()

    @late_bind_safe
    def Pling(self):
        pass

... is transformed by the decorators to this:

class example(Actor):
    def ping(self, callback):
        callback(self)

    def unique_id(self):
        return 'example_'+ str(id(self))

    def process(self):
        self.Pling()

    def Pling(self):
        pass

    ping = ('ACTORMETHOD', ping)
    unique_id = ('ACTORFUNCTION', unique_id)
    process = ('PROCESSMETHOD', process)
    Pling = ('LATEBINDSAFE', Pling)
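
As a rough sketch of the tagging idea: the decorator bodies below are an assumption based on the description above, not a copy of guild.actor's source, and the class name Example is made up. It is written to run on Python 3, whereas the listings in this post are Python 2. Each decorator simply returns a tuple:

```python
# Hypothetical re-creation of the tagging step described above.
# These are NOT copied from guild.actor; they only mirror its described effect.
def actor_method(fn):
    return ("ACTORMETHOD", fn)

def actor_function(fn):
    return ("ACTORFUNCTION", fn)

class Example(object):   # a plain object: no metaclass is involved yet
    @actor_method
    def ping(self, callback):
        callback(self)

    @actor_function
    def unique_id(self):
        return "example_" + str(id(self))

# After the class statement has run, each name refers to a (tag, function) tuple.
tag, fn = Example.ping
print(tag, fn.__name__)
```

Without a metaclass, Example.ping is just a tuple and cannot be called.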

If that was all though, this wouldn't be a very useful actor, since none of those methods could be called.

To fix this, Actor uses a metaclass to transform these tagged tuples into callable, thread safe methods.

Recap: What is a metaclass?

In python, everything is an object. This includes classes. Given this, classes are instances of the class 'type'. A 'type' instance is created and initialised by a call to a function with the following signature:

def __new__(cls, clsname, bases, dct):

The interesting part here is dct.

dct is a dictionary where the keys are names of things within the class, and the values are what those names refer to. Given this dictionary creates a class, any values which are functions become methods. Any other values become the initial values of class attributes. This is also why we call it a 'class statement', not a class declaration.

This also means that the following:

class Simple(threading.Thread):
    daemon = True
    def run(self):
        while True:
            print 'Simple'

... is interpreted by python (approximately) like this:

def run_method(self):
    while True:
        print 'Simple'

Simple = type('Simple', (threading.Thread,), {
                 'daemon' : True,
                 'run' : run_method
             })

The neat thing about this is that this means we can intercept the creation of the class itself.
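
To make that equivalence concrete, here is a small runnable sketch that builds a class directly with type(). The class name Greeter and its contents are made up for illustration. Note that the base classes are passed as a tuple.

```python
def greet(self):
    return "hello from " + self.name

# type(name, bases, dict) builds the class; bases must be a tuple.
Greeter = type("Greeter", (object,), {"name": "greeter", "greet": greet})

g = Greeter()
message = g.greet()                    # the function in the dict became a method
is_class = isinstance(Greeter, type)   # the class is itself an instance of type
```

The class statement is just a more convenient way of writing that call.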

ActorMetaclass

Rather than the Actor class being an instance of type, the Actor class is an instance of ActorMetaclass. ActorMetaclass is a subclass of type, so it shares this __new__ method. Given metaclasses are inherited just like anything else, this means any subclass - like our 'example' above - shares this metaclass.

As a result, the above 'example' class statement is (approximately) the same as this:

def ping_fn(self, callback):
    callback(self)

def unique_id_fn(self):
    return 'example_'+ str(id(self))

def process_fn(self):
    self.Pling()

def Pling_fn(self):
    pass

example = ActorMetaclass('example', (Actor,),
                         {
                           'ping' : ('ACTORMETHOD', ping_fn),
                           'unique_id' : ('ACTORFUNCTION', unique_id_fn),
                           'process' : ('PROCESSMETHOD', process_fn),
                           'Pling' : ('LATEBINDSAFE', Pling_fn)
                         })

This results in a call to our __new__ method. Our __new__ method eventually has to call type.__new__() as in the section above, but before it does, we can replace the values in the dictionary.

The logic in Guild's metaclass is this:

new_dct = {}
for name,val in dct.items():
    new_dct[name] = val
    if val.__class__ == tuple and len(val) == 2:
        tag, fn = str(val[0]), val[1]
        if tag.startswith("ACTORMETHOD"):
             # create stub function to enqueue a call to fn within the thread

        elif tag.startswith("ACTORFUNCTION"):
             # create stub function to enqueue a call to fn within the thread,
             # wait for a response and then to return that to the caller.

        elif tag.startswith("PROCESSMETHOD"):
             # create a stub function that repeatedly (effectively) enqueues
             # calls to fn within the thread.

        elif tag == "LATEBIND":
             # create a stub function that when called throws an exception,
             # specifically an UnboundActorMethod exception. The reason is
             # because it allows someone to detect when an 'outbox'/our late
             # bindable method has been used without being bound to.

        elif tag == "LATEBINDSAFE":
             # In terms of the implementation, this actually has the same effect
             # as an actor method. However in terms of interpretation it's a hint
             # to users that this method is expected to be rebound to a different
             # actor's method.

return type.__new__(cls, clsname, bases, new_dct)
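
A minimal runnable sketch of that logic, handling only the ACTORMETHOD tag, might look like the following. The names SketchMetaclass, SketchBase, make_stub and Greeter are made up for this example, the decorator is a stand-in rather than Guild's own, and the code targets Python 3 (hence the queue module name).

```python
import queue  # Python 3 name; the Python 2 listings in this post use Queue

def actor_method(fn):
    # Hypothetical tagging decorator, mirroring the description above.
    return ("ACTORMETHOD", fn)

def make_stub(fn):
    # The stub runs in the caller's thread and only enqueues the call.
    def stub(self, *args, **argd):
        self.inbound.put_nowait((fn, self, args, argd))
    return stub

class SketchMetaclass(type):
    def __new__(cls, clsname, bases, dct):
        new_dct = {}
        for name, val in dct.items():
            new_dct[name] = val
            if isinstance(val, tuple) and len(val) == 2:
                tag, fn = str(val[0]), val[1]
                if tag == "ACTORMETHOD":
                    new_dct[name] = make_stub(fn)
        return type.__new__(cls, clsname, bases, new_dct)

# Calling the metaclass directly avoids version-specific metaclass syntax.
SketchBase = SketchMetaclass("SketchBase", (object,), {})

class Greeter(SketchBase):           # inherits the metaclass from SketchBase
    def __init__(self):
        self.inbound = queue.Queue()
        self.seen = []

    @actor_method
    def greet(self, name):
        self.seen.append(name)

g = Greeter()
g.greet("fred")                      # enqueues; nothing has run yet
seen_before = list(g.seen)
fn, zelf, args, argd = g.inbound.get_nowait()
fn(zelf, *args, **argd)              # what the actor thread would do later
```

The last two lines stand in for the thread: they pull the command off the queue and run it.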

Actual implementation of an Actor subclass

The upshot of this is that the decorators tag the functions which need a proxy outside the thread, so that calls to them can be enqueued and sent to the thread to execute.

This means our example class above is (effectively) transformed into this:

class example(Actor):

    def ping(self, *args, **argd):
        def ping_fn(self, callback):
            callback(self)

        self.inbound.put_nowait( (ping_fn, self, args, argd) )

    def unique_id(self, *args, **argd):
        def unique_id_fn(self):
            return 'example_'+ str(id(self))

        resultQueue = _Queue.Queue()
        self.F_inbound.put_nowait( ( (unique_id_fn, self, args, argd), resultQueue) )

        e, result = resultQueue.get(True, None)
        if e != 0:
            raise e.__class__, e, e.sys_exc_info
        return result

    def process (self):
        def process_fn(self):
            self.Pling()

        def loop(self, *args, **argd):
            x = process_fn(self)
            if x == False:
                return
            self.core.put_nowait( (loop, self, (),{} ) )

        self.core.put_nowait( (loop, self, (),{} ) )

    def Pling(self, *args, **argd ):
        def Pling_fn(self):
            pass

        self.inbound.put_nowait( (Pling_fn, self, args, argd) )

The Actor class

From these stub methods, it should be clear that the implementation of the Actor class has the following traits:

  • Each actor has a collection of queues for sending messages into the thread.
  • The thread has a main loop that consists of a simple interpreter (or event dispatcher, if you prefer)

Additionally, Actors may have a gen_process method which returns a generator. This generator is then executed - given a time slice if you will - by the thread's main loop in between checking each of the inbound queues & handling requests.

The generator is not there for performance reasons. It is there to allow the implementation of an Actor stop() method. That stop method looks like this:

def stop(self):
    self.killflag = True

The main runloop repeatedly checks this flag, and if set throws a StopIteration exception into the generator.

The upshot is that using a generator in this way allows the thread to be 'interrupted', to receive and handle messages in a threadsafe manner, and so on.
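
The idea of time slicing a generator and interrupting it via a flag can be sketched without any threads. Everything here (Loop, counting_process, stop_after) is made up for illustration. One swap to be aware of: this sketch calls close() on the generator rather than throwing StopIteration into it, because on Python 3.7 and later a StopIteration escaping a generator is turned into a RuntimeError (PEP 479).

```python
def counting_process(log):
    try:
        while True:
            log.append("tick")
            yield 1
    finally:
        log.append("cleanup")   # runs when the generator is closed

class Loop(object):
    def __init__(self):
        self.killflag = False

    def stop(self):
        self.killflag = True

    def run(self, gen, stop_after):
        slices = 0
        while True:
            if self.killflag:        # checked once per time slice
                gen.close()
                return slices
            next(gen)                # give the generator one time slice
            slices += 1
            if slices == stop_after:
                self.stop()          # stands in for another thread calling stop()

log = []
loop = Loop()
slices = loop.run(counting_process(log), stop_after=3)
```

Because the generator hands control back at every yield, the loop gets a regular chance to notice the flag and shut things down.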

The logic within the thread is as follows:

def main(self):
    self.process_start()
    self.process()
    g = self.gen_process() # set to None if fails
    while True:
        if g != None:
            g.next()
        yield 1
        if # any queue had data:

           if self.inbound.qsize() > 0: # handle actor methods
                command = self.inbound.get_nowait()
                self.interpret(command) # if fails, call self.stop()

           if self.F_inbound.qsize() > 0: # Actor functions
                command, result_queue = self.F_inbound.get_nowait()
                result_fail = 0
                result = None # so a failed call still has a result to send back
                try:
                    result = self.interpret(command)
                except Exception as e: # Capture exception to throw back
                    result_fail = e
                    result_fail.sys_exc_info = sys.exc_info()[2]
                result_queue.put_nowait( (result_fail, result) )

           if self.core.qsize() > 0: # used by 'process method'
                command = self.core.get_nowait()
                self.interpret(command)
        else:
          if g == None: # Don't eat all CPU if no generator
              time.sleep(0.01) # would be better to wait in the queues.

(The above code ignores the error handling inside the code for simplicity)

Finally, the interpret function that executes the actual methods within the thread looks like this: (again ignoring errors)

def interpret(self, command):
    # print command
    callback, zelf, argv, argd = command
    if zelf:
        result = callback(zelf, *argv, **argd)
        return result
        # if there was a type error exception complain vociferously, and re-raise
    else:
        result = callback(*argv, **argd)
        return result
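
Putting the queue, the interpreter and the kill flag together, a stripped down actor thread can be sketched as below. MiniActor and its note method are hypothetical, only the actor method queue is modelled, and the code is written for Python 3.

```python
import queue
import threading
import time

class MiniActor(threading.Thread):
    def __init__(self):
        super(MiniActor, self).__init__()
        self.daemon = True
        self.inbound = queue.Queue()
        self.killflag = False
        self.seen = []

    def interpret(self, command):
        callback, zelf, argv, argd = command
        return callback(zelf, *argv, **argd)

    def note(self, value):                 # stub: runs in the caller's thread
        def note_fn(self, value):          # runs inside the actor's thread
            in_actor_thread = threading.current_thread() is self
            self.seen.append((value, in_actor_thread))
        self.inbound.put_nowait((note_fn, self, (value,), {}))

    def stop(self):
        self.killflag = True

    def run(self):
        while not self.killflag:
            if self.inbound.qsize() > 0:
                self.interpret(self.inbound.get_nowait())
            else:
                time.sleep(0.01)           # don't eat all the CPU

actor = MiniActor()
actor.start()
actor.note("hello")
actor.note("world")

deadline = time.time() + 5                 # safety net for this demo only
while len(actor.seen) < 2 and time.time() < deadline:
    time.sleep(0.01)

actor.stop()
actor.join()
```

Each recorded entry notes whether it ran in the actor's own thread, which is the whole point of enqueuing the call rather than running it directly.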

Binding Late Bound Actor Methods

Using our Camera and Display examples from before, this means effectively doing this:

camera = Camera()
display = Display()

camera.output = display.input

However, that last line is changing the state of an object which is owned by another thread. As a result we need to change this attribute within the thread. Using our code above, this is now quite simple:

@actor_method
def bind(self, source, dest, destmeth):
    # print "binding source to dest", source, "Dest", dest, destmeth
    setattr(self, source, getattr(dest, destmeth))

That's then used like this:

camera = Camera()
display = Display()

camera.bind('output', display, 'input')
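
The rebinding itself is plain Python, so it can be sketched without threads. Producer and Consumer are made up names, and the UnboundActorMethod class here is a local stand-in for the exception mentioned earlier, not an import from Guild. In Guild the bind call would be enqueued and run inside the actor's thread; this sketch runs it directly.

```python
class UnboundActorMethod(Exception):
    """Local stand-in for the exception a late bind stub raises."""

class Producer(object):
    def output(self, value):
        # Behaves like a late bind stub: unusable until rebound.
        raise UnboundActorMethod("output has not been bound")

    def bind(self, source, dest, destmeth):
        setattr(self, source, getattr(dest, destmeth))

    def produce(self, value):
        self.output(value)

class Consumer(object):
    def __init__(self):
        self.received = []

    def input(self, value):
        self.received.append(value)

producer = Producer()
consumer = Consumer()

try:
    producer.produce("too early")
    raised = False
except UnboundActorMethod:
    raised = True

producer.bind("output", consumer, "input")
producer.produce("frame-1")               # now lands in consumer.input
```

The setattr call stores the consumer's bound method on the producer instance, which then takes priority over the stub defined on the class.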

Summary

Guild actors are implemented using a small number of inbound queues per object to allow them to receive messages. These messages are received by the thread, and interpreted as commands to cause specific methods to be called.

Decorators are used by the user to effectively tag the methods, to describe how they will be used, allowing the ActorMetaclass to transform the calls into thread safe calls that enqueue data to the appropriate inbound queues.

The key reason for the use of decorators and the metaclass is to wrap up the thread safety logic in one place. They also act as syntactic sugar, making the logic of Actor threads much clearer and simpler to interpret and use correctly.

The bulk of the logic of the message queue handling, along with user behaviour for an active actor, is implemented using generators, the reason being to allow the threads to be interrupted and shut down cleanly. Beyond that there are a small number of helper functions.

For those interested, take a look at the implementation on github.

As usual, comments welcome.

Readable concurrency in Python

March 16, 2014 at 05:30 PM | categories: python, actors, concurrency, kamaelia

Last week there were a couple of interesting posts by Glyph Lefkowitz and Rob Miller on concurrency. Both are well worth a read. One of the examples presented by Glyph is the canonical concurrent update problem. This essentially happens when an update takes multiple steps and can be interfered with. Rob's post essentially presents a solution in Go.

The core of this is that unconstrained shared mutable state in a concurrent situation is a bad idea. This is something that I've spoken about in the past with regard to Kamaelia. In fact, in Kamaelia, there were two ways of handling this. One was to essentially funnel all requests for updating values through a "cashier" component. The other was to use software transactional memory. Kamaelia provides tools for both approaches.

Guild also provides tools for both approaches. The key tool for the cashier approach is guild.actor. The key tool for the STM approach is guild.stm.

Given Glyph's and Rob's posts had touched on ideas I've spoken about in the past, I thought it might be nice to work through the example in Guild. Initially we'll model an account as an actor, and test it with some basic non-threaded code. Then we'll test it with 3 actors randomly withdrawing cash and 1 more randomly adding cash. Finally, we'll show what happens when 2 actors both have accounts and are randomly transferring money from each other's accounts. (Interestingly this last one kinda makes certain ideas of banking clearer to me :-)

Because they were the first names that sprang to mind, this post uses the names of characters from the Flintstones. I blame Red Dwarf.

Basic Account Actor

Source: guild/examples/blog/account-1.py

So, first of all the account actor using Guild. As before, first of all the code, then a discussion.

from guild.actor import *

class InsufficientFunds(ActorException):
    pass

class Account(Actor):
    def __init__(self, balance=10):
        super(Account, self).__init__()
        self.balance = balance

    @actor_function
    def deposit(self, amount):
        # I've made this a function to allow the value to be confirmed deposited
        print "DEPOSIT", "\t", amount, "\t", self.balance
        self.balance = self.balance + amount
        return self.balance

    @actor_function
    def withdraw(self, amount):
        if self.balance < amount:
            raise InsufficientFunds("Insufficient Funds in your account",
                                    requested=amount, balance=self.balance)
        self.balance = self.balance - amount
        print "WITHDRAW", "\t", amount, "\t", self.balance
        return amount

This should be pretty readable.

First of all the logic of what's happening:

  • We define an exception InsufficientFunds to raise when someone tries to withdraw more money than the account contains

  • We define a subclass of Actor - Account. Since we need to initialise it with a balance, we must call the superclass initialiser at the start of __init__. Our Account objects have 2 operations that they can perform: deposit and withdraw.

  • withdraw checks that sufficient funds are available. If they are, then the funds are returned as a result, after updating the balance and logging the result. If they are not, an InsufficientFunds exception is raised, which the caller thread will have to deal with.

  • deposit takes the amount of funds, updates the balance, logs the results and returns the new balance to the caller thread.

What's happening in terms of mechanics? (links below take you to the code in github)

  • withdraw is an actor_function. What does this mean? It means that the caller calls the method. In the caller thread, this places the message ((withdraw, self, amount),resultQueue) onto an inbound queue to the actor. The caller thread then waits for a response. The actor receives the message, does the work, and posts the results back down the result queue. The stub method retrieves this, and returns the result to the caller. If there was an exception thrown within the actor, this is re-raised inside the caller thread. As a result our withdraw function can look pretty normal. If there are insufficient funds, the caller gets an exception to deal with. If there are sufficient funds, the balance is updated, a message is logged to the console, and the amount of money is returned to the caller.

  • deposit is also an actor_function. It doesn't need to be because depositing money always succeeds, however it's nice for deposit to return the updated balance to the caller. (If the caller doesn't care, this could be an actor_method instead)

Single threaded Account user

Source: guild/examples/blog/account-1.py

So, let's use this. In the main thread we'll create the account and start it. We'll then define 3 account users who only ever withdraw funds - Fred, Barney and Wilma. In our simulation Betty is the person earning money - but she's not mentioned here.

Anyway, in each iteration through this loop, Betty earns 100, and Fred, Barney and Wilma each randomly pick an amount from 10, 20, 40, 80 or 160. This continues over and over until someone tries to take more money than is in the account. We then report the amount grabbed, stop the account and exit. The code looks like this:

import random # needed for random.choice below

account = Account(1000).go()

fred, barney, wilma = 0,0,0
try:
    while True:
        account.deposit(100)
        fred += account.withdraw(random.choice([10,20,40,80,160]))
        barney += account.withdraw(random.choice([10,20,40,80,160]))
        wilma += account.withdraw(random.choice([10,20,40,80,160]))
except InsufficientFunds as e:
    print e.message
    print "Balance", e.balance
    print "Requested", e.requested
    print account.balance


print "GAME OVER"

print "Fred grabbed", fred
print "Wilma grabbed", wilma
print "Barney grabbed", barney

account.stop()
account.join()

There's not an awful lot to discuss with this. This logic should be pretty clear. The fact that the account is in a different thread isn't particularly interesting - but it shows the basic logic of depositing/withdrawing funds. What does the output look like?

DEPOSIT         100     1000
WITHDRAW        40      1060
WITHDRAW        40      1020
WITHDRAW        40      980
DEPOSIT         100     980
       ... snip ...
WITHDRAW        160     310
WITHDRAW        160     150
DEPOSIT         100     150
WITHDRAW        160     90
Insufficient Funds in your account
Balance 90
Requested 160
90
GAME OVER
Fred grabbed 800
Wilma grabbed 610
Barney grabbed 800

Multiple threads access the Account

Source: guild/examples/blog/account-2.py

In this example, we create two new actors - MoneyDrain and MoneySource.

  • MoneyDrain - This sits there and repeatedly tries to withdraw random amounts of funds from the Account, and keeps track of how much money it's tried to grab. When the account has insufficient funds to withdraw from, the MoneyDrain gives up, complains and stops() itself.
  • MoneySource - This sits there, and repeatedly adds a random amount of funds to the account.

We then rewrite our simulation as follows:

  • We still have one shared account
  • Fred, Betty, and Barney are now all MoneyDrains on Wilma.
  • Wilma is the sole source of income for the group - ie a MoneySource

The system is then started, and runs until Fred, Betty and Barney have all taken as much money as they can. Wilma is then stopped and the total funds reported.

# InsufficientFunds/etc as before

class MoneyDrain(Actor):
    def __init__(self, sharedaccount):
        super(MoneyDrain, self).__init__()
        self.sharedaccount = sharedaccount
        self.grabbed = 0

    @process_method
    def process(self):
        try:
            grabbed = self.sharedaccount.withdraw(random.choice([10,20,40,80,160]))
        except InsufficientFunds as e:
            print "Awww, Tapped out", e.balance, "<", e.requested
            self.stop()
            return
        self.grabbed = self.grabbed + grabbed

class MoneySource(Actor):
    def __init__(self, sharedaccount):
        super(MoneySource, self).__init__()
        self.sharedaccount = sharedaccount

    @process_method
    def process(self):
        self.sharedaccount.deposit(random.randint(1,100))

account = Account(1000).go()

fred = MoneyDrain(account).go()
barney = MoneyDrain(account).go()
betty = MoneyDrain(account).go()

wilma = MoneySource(account).go() # Wilma carries all of them.

wait_for(fred, barney, betty)
wilma.stop()
wilma.join()

account.stop()
account.join()

print "GAME OVER"

print "Fred grabbed", fred.grabbed
print "Barney grabbed", barney.grabbed
print "Betty grabbed", betty.grabbed
print "Total grabbed", fred.grabbed + barney.grabbed + betty.grabbed
print "Since they stopped grabbing..."
print "Money left", account.balance

Things worth noting here - we've got 4 completely separate free running threads all acting on shared state (shared funds) in a 5th thread. We're able to start them all off, they operate cleanly, and at this level we can trust the behaviour of the 5 threads - due to the fact that the Account actor ensures that operations on the shared state are serialised into atomic operations. As a result, we can completely trust this code to operate in the manner which we expect it to.

It's also worth noting that when the withdraw method fails, the exception is thrown inside the appropriate thread. This is visible in the output below because all 3 threads have to run out of access to funds for the program to exit.

What does the output from this look like?

WITHDRAW        10      990
WITHDRAW        20      970
WITHDRAW        40      930
DEPOSIT         74      930
WITHDRAW        20      984
       ... snip ...
WITHDRAW        20      47
DEPOSIT         44      47
WITHDRAW        80      11
WITHDRAW        10      1
Awww, Tapped out 1 < 80
DEPOSIT         93      1
Awww, Tapped outAwww, Tapped out 1 < 160
 1 < 40
DEPOSIT         77      94
GAME OVER
Fred grabbed 220
Barney grabbed 510
Betty grabbed 560
Total grabbed 1290
Since they stopped grabbing...
Money left 171

Multiple threads transferring funds between multiple Accounts

Source: guild/examples/blog/account-3.py

This final example is a bit of fun, but also explicitly shows how to implement a function for transferring funds. Before the main example, let's look at the transfer function:

def transfer(amount, payer, payee):
    funds = payer.withdraw(amount)
    payee.deposit(funds)

This looks deceptively simple. In practice, what happens is someone calls the function with 2 accounts. The appropriate funds are withdrawn from one account and deposited in the other. This is guaranteed to be thread safe due to this translating to the following operations:

  • caller: Create a ResultQueue
  • caller: create message ((withdraw, payer, amount), ResultQueue) and place on payer's F_inbound queue.
  • caller: wait for message on ResultQueue
  • payer: receive message from F_inbound queue
  • payer: perform contents of method withdraw(self, amount) - put result in "result"
  • payer: if an exception is raised put (exception, None) on ResultQueue
  • payer: if an exception is not raised put (0, result) on ResultQueue
  • caller: if result is (exception, None), rethrow exception
  • caller: if result is (0, result), then the result is stored in "funds"
  • caller: create message ((deposit, payee, funds), ResultQueue) and place on payee's F_inbound queue.
  • caller: wait for message on ResultQueue
  • payee: receive message from F_inbound queue
  • payee: perform contents of method deposit(self, amount) - put result in "result"
  • payee: if an exception is raised put (exception, None) on ResultQueue
  • payee: if an exception is not raised put (0, result) on ResultQueue
  • caller: if result is (exception, None), rethrow exception
  • caller: if result is (0, result), then the result is discarded, and the function exits
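
The steps above can be sketched without Guild, using a plain thread and queues. MiniAccount, its _call helper and the None sentinel used to stop it are all assumptions made for this sketch. It also uses None rather than 0 as the "no exception" marker, and is written for Python 3.

```python
import queue
import threading

class InsufficientFunds(Exception):
    pass

class MiniAccount(threading.Thread):
    """Hypothetical stand-in for the Account actor, without Guild."""

    def __init__(self, balance):
        super(MiniAccount, self).__init__()
        self.daemon = True
        self.balance = balance
        self.F_inbound = queue.Queue()

    # --- caller side: runs in whichever thread makes the call ---
    def _call(self, fn, *args):
        result_queue = queue.Queue()                    # create a ResultQueue
        self.F_inbound.put_nowait(((fn, args), result_queue))
        error, result = result_queue.get(True, None)    # wait for the reply
        if error is not None:
            raise error                                 # rethrow in the caller
        return result

    def withdraw(self, amount):
        return self._call(self._withdraw, amount)

    def deposit(self, amount):
        return self._call(self._deposit, amount)

    def stop(self):
        self.F_inbound.put(None)                        # sentinel, sketch only

    # --- actor side: runs only in this account's own thread ---
    def _withdraw(self, amount):
        if self.balance < amount:
            raise InsufficientFunds("Insufficient Funds in your account")
        self.balance -= amount
        return amount

    def _deposit(self, amount):
        self.balance += amount
        return self.balance

    def run(self):
        while True:
            message = self.F_inbound.get()
            if message is None:
                return
            (fn, args), result_queue = message
            try:
                outcome = (None, fn(*args))
            except Exception as e:
                outcome = (e, None)
            result_queue.put_nowait(outcome)

def transfer(amount, payer, payee):
    funds = payer.withdraw(amount)
    payee.deposit(funds)

payer, payee = MiniAccount(1000), MiniAccount(1000)
payer.start()
payee.start()

transfer(250, payer, payee)
try:
    transfer(5000, payer, payee)       # more than the payer holds
    refused = False
except InsufficientFunds:
    refused = True

payer.stop()
payee.stop()
payer.join()
payee.join()
```

Each balance is only ever touched by the thread that owns it, and a failed withdraw surfaces as an ordinary exception in the thread that called transfer.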

This then allows us to create a MischiefMaker. Our MischiefMaker will be given two accounts - their own and a friend's. They will then repeatedly transfer random amounts of funds out of their friend's account. They'll also keep track of how much money they've grabbed from their friend.

An example of tracing the logic here might be this:

  • Barney/Fred balances: 1000,1000
  • Barney grabs 250, Fred grabs 250 - Barney/Fred balances: 1000,1000
  • Barney grabs 250, Fred grabs 250 - Barney/Fred balances: 1000,1000
  • Barney grabs 250, Fred grabs 250 - Barney/Fred balances: 1000,1000
  • Barney grabs 250, Fred grabs 250 - Barney/Fred balances: 1000,1000
  • Barney grabs 250, Fred grabs 250 - Barney/Fred balances: 1000,1000
  • Barney grabs 500, Fred grabs 250 - Barney/Fred balances: 1250,750
  • Barney grabs 500, Fred grabs 250 - Barney/Fred balances: 1500,500
  • Barney grabs 500, Fred grabs 250 - Barney/Fred balances: 1750,250
  • Barney grabs 500, Fred grabs 250 - FAILS, Barney gives up. Fred then continues.

The upshot here is that both Fred and Barney are grabbing what they think is a lot more than 1000 each, even though there's only 2000 in circulation. This seems a bit counterintuitive, but when you consider the banking system does essentially operate this way - just with more actors - it makes more sense.
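
The trace above can be replayed as simple arithmetic. This is a single threaded sketch with made up names (balances, grabbed, grab); it stops at the point where Barney's grab fails, rather than letting Fred continue.

```python
balances = {"barney": 1000, "fred": 1000}
grabbed = {"barney": 0, "fred": 0}

def grab(taker, victim, amount):
    """Move amount from victim to taker; return False if victim can't cover it."""
    if balances[victim] < amount:
        return False
    balances[victim] -= amount
    balances[taker] += amount
    grabbed[taker] += amount
    return True

# (Barney's grab, Fred's grab) per round, as in the trace above.
rounds = [(250, 250)] * 5 + [(500, 250)] * 4
for barney_amount, fred_amount in rounds:
    if not grab("barney", "fred", barney_amount):
        break                      # Barney gives up
    grab("fred", "barney", fred_amount)

total_grabbed = grabbed["barney"] + grabbed["fred"]
total_money = balances["barney"] + balances["fred"]
```

By the time Barney gives up he has grabbed 2750 and Fred 2000, a total of 4750, while the two balances still add up to the original 2000.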

So the MischiefMaker code looks like this:

class MischiefMaker(Actor):
    def __init__(self, myaccount, friendsaccount):
        super(MischiefMaker,self).__init__()
        self.myaccount = myaccount
        self.friendsaccount = friendsaccount
        self.grabbed = 0

    @process_method
    def process(self):
        try:
            grab = random.randint(1,10)*10
            transfer(grab, self.friendsaccount, self.myaccount)
        except InsufficientFunds as e:
            print "Awww, Tapped out", e.balance, "<", e.requested
            self.stop()
            return
        self.grabbed = self.grabbed + grab

As before, this should be fairly clear. We keep track of accounts, and transfers occur bidirectionally as quickly as possible.

account1 = Account(1000).go()
account2 = Account(1000).go()

fred = MischiefMaker(account1, account2).go()
barney = MischiefMaker(account2, account1).go()


wait_for(fred, barney)

account1.stop()
account2.stop()
account1.join()
account2.join()

print "GAME OVER"

print "Fred grabbed", fred.grabbed
print "Barney grabbed", barney.grabbed
print "Total grabbed", fred.grabbed + barney.grabbed
print "Since they stopped grabbing..."
print "Money left", account1.balance, account2.balance

When we run this, all 4 threads are free running. Fred grabs money, Barney grabs money, and the fact withdraw and deposit are actor_functions ensures that the values in each account are valid at all points in time. The upshot of this is that when the simulation ends, we started with a total of 2000 and we finished with a total of 2000. Snipping the now substantial output somewhat:

INITIAL         1000
INITIAL         1000
WITHDRAW        90      910
WITHDRAW        50      950
DEPOSIT         90      910
DEPOSIT         50      950
WITHDRAW        20      980
WITHDRAW        10      990
DEPOSIT         20      980
DEPOSIT         10      990
WITHDRAW        100     900
WITHDRAW        90      910
DEPOSIT         100     910
DEPOSIT         90      900
        ... snip ...
DEPOSIT         50      850
WITHDRAW        90      810
DEPOSIT         90      1100
WITHDRAW        30      780
        ... snip ...
WITHDRAW        10      100
DEPOSIT         10      1890
WITHDRAW        30      70
DEPOSIT         30      1900
WITHDRAW        20      50
DEPOSIT         20      1930
Awww, Tapped out 50 < 100
GAME OVER
Fred grabbed 27560
Barney grabbed 28350
Total grabbed 55910
Since they stopped grabbing...
Money left 50 1950
Ending money 2000

The thing I like about this example incidentally is that it shows Fred and Barney having very large logical incomes from each other, whereas in reality there was a fixed amount of cash. (Essentially this means Fred and Barney are borrowing from each other, much like banks do)

Conclusion

Not only can concurrency be dealt with sanely - as per Rob's point - it can also look nice, and be developer friendly. If you extend the actor model to include actor_functions, complex problems like concurrent update can become much clearer to work with.

In a later post I'll go into the internals of how this is implemented, but the description of how the transfer method operates should make it clearer that essentially each actor serialises actions upon it, ensuring that actor state can only be updated by one thread at a time.

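Links to the three examples:

  • guild/examples/blog/account-1.py
  • guild/examples/blog/account-2.py
  • guild/examples/blog/account-3.py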

If you find this interesting, perhaps give it a try at some point. I personally find it a more practical approach - especially when dealing with things that are naturally concurrent.

Comments welcome.

Guild - pipelinable actors with late binding

March 07, 2014 at 11:51 PM | categories: open source, python, iot, bbc, actors, concurrency, kamaelia

Guild is a python library for creating thread based applications.

Threads are represented using actors - objects with threadsafe methods. Calling a method puts a message on an inbound queue for execution within the thread. Guild actors can also have stub actor methods, representing output. These are stub methods which are expected to be rebound to actor methods on other actors. These stub methods are called late bind methods. This allows pipelines of Guild actors to be created in a similar way to Unix pipelines.

Additionally, Guild actors can be active or reactive. A reactive actor performs no actions until a message is received. An active guild actor can be active in two main ways: it can either repeatedly perform an action, or more complex behaviour can use a generator in a coroutine style. The use of a generator allows Guild actors to be stopped in a simpler fashion than traditional python threads. All Guild actors also provide a default 'output' late-bindable method, to cover the common case of single input, single output.

Finally, Guild actors are just python objects and actors with additional functionality - it's designed to fit in with your code, not the other way round. This post covers some simple examples of usage of Guild, and how it differs (slightly) from traditional actors.

Getting and Installing

Installation is pretty simple:

$ git clone https://github.com/sparkslabs/guild
$ cd guild
$ sudo python setup.py install

If you'd prefer to build, install and use a debian package:

$ git clone https://github.com/sparkslabs/guild
$ cd guild
$ make deb
$ sudo dpkg -i ../python-guild_1.0.0_all.deb

Example: viewing a webcam

This example shows the use of two actors - webcam capture, and image display. The thing to note here is that we could easily add other actors into the mix - for network serving, recording, analysis, etc. If we did, the examples below can be reused as is.

First of all the code, then a brief discussion.

import pygame, pygame.camera, time
from guild.actor import *
pygame.camera.init()

class Camera(Actor):
    def gen_process(self):
        camera = pygame.camera.Camera(pygame.camera.list_cameras()[0])
        camera.start()
        while True:
            yield 1
            frame = camera.get_image()
            self.output(frame)
            time.sleep(1.0/50)

class Display(Actor):
    def __init__(self, size):
        super(Display, self).__init__()
        self.size = size

    def process_start(self):
        self.display = pygame.display.set_mode(self.size)

    @actor_method
    def show(self, frame):
        self.display.blit(frame, (0,0))
        pygame.display.flip()

    input = show

camera = Camera().go()
display = Display( (800,600) ).go()
pipeline(camera, display)
time.sleep(30)
stop(camera, display)
wait_for(camera, display)

In this example, Camera is an active actor. That is, it sits there, periodically grabbing frames from the webcam. To do this, it uses a generator as a main loop. This allows the fairly basic behaviour of grabbing frames for output to be clearly expressed. Note also that this actor uses the normal blocking sleep function.

The Display actor initialises by capturing the passed parameters. Once the actor has started, its process_start method is called, enabling it to create a display. It then sits and waits for messages. These arrive when a caller calls the actor method 'show' or its alias 'input'. When that happens, the show method is called in a threadsafe way, and it simply displays the image.

The setup/tear down code shows the following:

  • Creation of, and starting of, the Camera actor
  • Creation and start of the display
  • Linking the output of the Camera to the Display
  • The main thread then waits for 30 seconds - ie it allows the program to run for 30 seconds.
  • The camera and display actors are then stopped
  • And the main thread waits for the child threads to exit before exiting itself.

This could be simplified (and will be), but it shows that even though the actors had no specific shut down code, they shut down cleanly this way.

Example: following multiple log files looking for events

This example follows two log files, and greps and outputs lines matching a given pattern. In particular, it maps to this kind of command line:

$ (tail -f x.log & tail -f y.log) | grep pants

This example shows that there are still some areas that would benefit from additional syntactic sugar when it comes to wiring together pipelines. In particular, this example should be writable like this:

Pipeline( Parallel( Follow("x.log"), Follow("y.log") ),
          Grep("pants"),
          Printer() ).run()

However, I haven't implemented the necessary chassis yet (they will be).

Once again, first the code, then a discussion.

from guild.actor import *
import re, sys, time

class Follow(Actor):
    def __init__(self, filename):
        super(Follow, self).__init__()
        self.filename = filename
        self.f = None

    def gen_process(self):
        self.f = f = open(self.filename)
        f.seek(0,2)   # seek to end
        while True:
            yield 1
            line = f.readline()
            if not line: # no data, so wait
                time.sleep(0.1)
            else:
                self.output(line)

    def onStop(self):
        if self.f:
            self.f.close()

class Grep(Actor):
    def __init__(self, pattern):
        super(Grep, self).__init__()
        self.regex = re.compile(pattern)

    @actor_method
    def input(self, line):
        if self.regex.search(line):
            self.output(line)

class Printer(Actor):
    @actor_method
    def input(self, line):
        sys.stdout.write(line)
        sys.stdout.flush()

follow1 = Follow("x.log").go()
follow2 = Follow("y.log").go()
grep = Grep("pants").go()
printer = Printer().go()

pipeline(follow1, grep, printer)
pipeline(follow2, grep)
wait_KeyboardInterrupt()
stop(follow1, follow2, grep, printer)
wait_for(follow1, follow2, grep, printer)

As you can see, like the bash example, we have two actors that tail/follow two different log files. These both feed into the same 'grep' actor that matches the given pattern, and matching lines are finally passed to a Printer actor for display. Each actor shows slightly different aspects of Guild's model.

  • Follow is an active actor. It captures the filename to follow in the initialiser, and creates a placeholder for the associated file handle. The main loop then follows the file, calling its output method when it has a line. It continues doing this until its .stop() method is called. When it is, the generator is killed (via a StopIteration exception being passed in), and the actor's onStop method is called, allowing the actor to close the file.

  • Grep is a simple reactive actor with some setup. In particular, it takes the pattern provided and compiles a regex matcher from it. Any actor call to its input method then results in matching lines being passed through via its output method.

  • Printer is a simple reactive actor. Any actor call to its input method results in the data passed in being sent to stdout.
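To make the stop behaviour of an active actor like Follow more concrete, here is a minimal sketch in plain Python 3 that does not use Guild at all. The class MiniActiveActor and its on_stop hook are hypothetical names invented for this illustration, and the sketch shuts the generator down with generator.close(), which may well differ from what Guild does internally. The point is only that a main loop written as a generator hands control back at every yield, which gives the thread a natural place to notice a stop request.

```python
import threading
import time


class MiniActiveActor(threading.Thread):
    """Hypothetical stand-in for an active actor (not Guild's implementation)."""

    def __init__(self):
        super().__init__()
        self._stop_requested = threading.Event()
        self.ticks = 0
        self.stopped_cleanly = False

    def gen_process(self):
        # The actor's main loop, written as a generator.
        while True:
            yield 1              # hand control back to run()
            self.ticks += 1      # the "work" done on each pass
            time.sleep(0.01)

    def on_stop(self):
        # Clean-up hook, playing the role onStop plays in Follow.
        self.stopped_cleanly = True

    def stop(self):
        self._stop_requested.set()

    def run(self):
        gen = self.gen_process()
        for _ in gen:
            # Between yields we get a chance to check for a stop request.
            if self._stop_requested.is_set():
                gen.close()      # assumption: close() rather than a thrown exception
                break
        self.on_stop()


actor = MiniActiveActor()
actor.start()
time.sleep(0.1)                  # let it run for a moment
actor.stop()
actor.join(timeout=2)
```

The actor's own code never has to check a flag: the check lives in the generic run loop, so gen_process stays as simple as the loops in Camera and Follow.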

Work in progress

It is worth noting that Guild at present is not a mature library yet, but is sufficiently useful for lots of tasks. In particular, one area Guild will improve on is specifying coordination more compactly. For example, the Camera example could become:

Pipeline( Camera(),  Display( (800,600) ) ).run()

That's a work in progress however, along with other chassis and other useful parts of Kamaelia.

What are actors?

Actors are threads with a mailbox allowing them to receive and act upon messages. The webcam example above has two threads, one for capturing images and one for display. Images from the webcam end up in the mailbox for the display, which displays the images it receives. Often actor libraries wrap up the action of sending a message to the mailbox of an actor via a method on the thread object.

The examples above demonstrate this via the decorated methods:

  • Display.show, Grep.input, Printer.input

All of these methods - when called by a client of the actor - take all the arguments passed in, along with the function itself, and place them on the actor's mailbox (a thread safe queue). The actor then has a main loop that checks this mailbox and executes the method within the thread.
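The mailbox idea can be sketched in a few lines of plain Python 3 using only the standard library. This is not Guild's code: the mailbox_method decorator and MiniActor class are hypothetical names made up for this illustration, and Guild's real decorators work differently (they tag methods for a metaclass to transform). The sketch only shows the general shape: the caller's thread enqueues the function and its arguments, and the actor's thread dequeues and runs them.

```python
import queue
import threading


def mailbox_method(func):
    """Hypothetical decorator: calls are queued, not run in the caller's thread."""
    def enqueue(self, *args, **kwargs):
        self.mailbox.put((func, args, kwargs))
    return enqueue


class MiniActor(threading.Thread):
    """Hypothetical stand-in for a reactive actor (not Guild's implementation)."""

    def __init__(self):
        super().__init__()
        self.mailbox = queue.Queue()   # thread safe inbound queue
        self.handled = []

    @mailbox_method
    def show(self, frame):
        # This body only ever runs inside the actor's own thread.
        self.handled.append((threading.current_thread().name, frame))

    def stop(self):
        self.mailbox.put(None)         # sentinel asking the main loop to exit

    def run(self):
        # Main loop: take messages off the mailbox and execute them in order.
        while True:
            message = self.mailbox.get()
            if message is None:
                break
            func, args, kwargs = message
            func(self, *args, **kwargs)


display = MiniActor()
display.start()
display.show("frame-1")                # returns immediately; work happens later
display.show("frame-2")
display.stop()
display.join()
```

Because every call goes through one queue and is executed by one thread, the method body never runs concurrently with itself, which is what makes calls like Display.show safe to make from any thread.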

How does Guild differ from the actor model?

In a traditional actor model, the code in the camera Actor might look like this:

import pygame, pygame.camera, time
from guild.actor import *
pygame.camera.init()

class Camera(Actor):
    def __init__(self, display):
        super(Camera, self).__init__()
        self.display = display

    def gen_process(self):
        camera = pygame.camera.Camera(pygame.camera.list_cameras()[0])
        camera.start()
        while True:
            yield 1
            frame = camera.get_image()
            self.display.show(frame)
            time.sleep(1.0/50)
  • NB: This is perfectly valid in Guild. If you don't want to use the idea of late bound methods or pipelining, then it can be used like any other actor library.

If you did this, the display code would not need any changes. The start-up code that links things together though would now need to look like this:

display = Display( (800,600) ).go()
camera = Camera(display).go()
# No pipeline line anymore
time.sleep(30)
stop(camera, display)
wait_for(camera, display)

On the surface of things, this looks like a simplification, and on one level it is - we've removed one line from the program start-up code. Our camera object however now has its destination embedded at object initialisation and it's also become more complex, with zero increase in flexibility. In fact I'd argue you've lost flexibility, but I'll leave why for later.

For example, suppose we want to record the images to disk. We can do this by adding a third actor that sits between the others:

import time, os
class FrameStore(Actor):
    def __init__(self, directory='Images', base='snap'):
        super(FrameStore, self).__init__()
        self.directory = directory
        self.base = base
        self.count = 0

    def process_start(self):
        # Create the output directory, tolerating one that already exists
        try:
            os.makedirs(self.directory)
        except OSError as e:
            if e.errno != 17: raise   # 17 is EEXIST (directory already there)

    @actor_method
    def input(self, frame):
        self.count += 1
        now = time.strftime("%Y%m%d-%H%M%S",time.localtime())
        filename = "%s/%s-%s-%05d.jpg" % (self.directory, self.base, now, self.count)
        pygame.image.save(frame, filename)
        self.output(frame)

This could then be used in a Guild pipeline system this way:

camera = Camera().go()
framestore = FrameStore().go()
display = Display( (800,600) ).go()
pipeline(camera, framestore, display) 
time.sleep(30)
stop(camera, framestore, display) 
wait_for(camera, framestore, display)

It's for this reason that Guild supports late bindable actor methods.

What's happening here is that the definition of Actor includes this:

class Actor(object):
    #...
    @late_bind_safe
    def output(self, *argv, **argd):
        pass

That means every actor has available "output" as a late bound actor method.

This pipeline call:

pipeline(camera, display)

Essentially does this:

camera.bind("output", display, "input")

This transforms to a threadsafe version of this:

camera.output = display.input

As a result, it replaces the call camera.output with a call to display.input for us - meaning that it is as efficient to do camera.output as it is to do self.display.show in the example above - but significantly more flexible.
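The rebinding idea can be shown without any threads at all. The following is a minimal sketch in plain Python 3, not Guild's implementation: Stage, Doubler, Collector and mini_pipeline are hypothetical names invented here, and the bind method simply does the non-threadsafe attribute assignment described above. It assumes, as in the post, that a pipeline links each stage's "output" to the next stage's "input".

```python
class Stage(object):
    """Hypothetical base class with a late-bindable 'output' stub."""

    def output(self, *args, **kwargs):
        pass   # stub: data is discarded until output is rebound

    def bind(self, source_name, target, target_name):
        # Non-threadsafe equivalent of: self.output = target.input
        setattr(self, source_name, getattr(target, target_name))


class Doubler(Stage):
    def input(self, value):
        self.output(value * 2)   # has no idea where output goes


class Collector(Stage):
    def __init__(self):
        self.seen = []

    def input(self, value):
        self.seen.append(value)


def mini_pipeline(*stages):
    # Link each stage's output to the next stage's input.
    for upstream, downstream in zip(stages, stages[1:]):
        upstream.bind("output", downstream, "input")


doubler = Doubler()
collector = Collector()

doubler.input(1)                 # output still unbound, so the result is dropped
mini_pipeline(doubler, collector)
doubler.input(21)                # now delivered to collector.input
```

Doubler never names its destination, so the same class can feed a Collector, a Printer, or another Doubler purely through how the pipeline is wired.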

There are lots of fringe benefits of this - which are best discussed in later posts, but this does indicate best how Guild differs from the usual actor model.

Why write and release this?

About a year ago, I was working on a project with the aim of investigating various ideas relating to the Internet of Things. (In particular, which definition of that really mattered to us, why, and what options it provided.)

As part of that project, I wrote a small/just big enough library suitable for testing some ideas I'd had regarding integrating some ideas in Kamaelia with the syntactic sugar of the actor model. Essentially, the aim was to map Kamaelia's inboxes and messages to traditional actor methods, and to map outboxes to late bound actor methods. Use of standard names and/or aliases would allow pipelining.

Guild was the result, and it's proven itself useful in a couple of projects, hence its packaging as a standalone library. Like all such things, it's a work in progress, but it also has a cleaner-to-use version of Kamaelia's STM code, and includes some of the more useful components like pipelines and backplanes.

If you find it useful or spot a typo, please let me know.
