Minimalistic Software Transactional Memory
December 08, 2007 at 11:03 PM | categories: python, oldblog | View Comments
I've just posted this message on comp.lang.python looking for comments, but I'll post it here in case there's any feedback this way :-)
I'm interested in writing a simple, minimalistic, non persistent (at this stage) software transactional memory (STM) module. The idea being it should be possible to write such a beast in a way that can be made threadsafe fair easily.
For those who don't know, STM is a really fancy way of saying variables with version control (as far as I can tell :-) designed to enable threadsafe shared data.
I'm starting with the caveat here that the following code is almost certainly not threadsafe (not put any real thought into that as yet), and I'm interested in any feedback on the following:
OK, API.
First of all we need to initialise the store:
The way this works is we have a Store that manages Values. (perhaps a better name may be variables to avoid clashing with pythons parlance of labels and values?)
Anyhow, you can ask the store for Value, which it will give you. The Value knows what it's called and where it's from, so when you tell it to commit, it can try to do so. The little detail here is you get a copy of the Value not the stored Value. (This is to avoid accidental concurrent update of the same actual object)
As a result, Store looks like this:
Otherwise I think the above code is relatively straightforward. You'll note that this API allows this:
Full code below.
I'm interested in writing a simple, minimalistic, non persistent (at this stage) software transactional memory (STM) module. The idea being it should be possible to write such a beast in a way that can be made threadsafe fair easily.
For those who don't know, STM is a really fancy way of saying variables with version control (as far as I can tell :-) designed to enable threadsafe shared data.
I'm starting with the caveat here that the following code is almost certainly not threadsafe (not put any real thought into that as yet), and I'm interested in any feedback on the following:
- Does the API look simple enough?
- Are there any glaring mistakes in the code ? (It's always harder to see your own bugs)
- What key areas appear least thread-safe, and any general suggestions around that.
OK, API.
First of all we need to initialise the store:
S = Store()We then want to get a value from the store such that we can use the value, and do stuff with it:
greeting = S.using("hello")Access the value:
print repr(greeting.value)Update the value:
greeting.set("Hello World")Commit the value back to the store:
greeting.commit()If you have concurrent updates of the same value, the following exception gets thrown:
ConcurrentUpdatecf:
>>> S = Store()That's pretty much the simplest API I can come up with. (I've tried a few others but didn't really like them)
>>> greeting = S.using("hello")
>>> par = S.using("hello")
>>> greeting.set("Hello World")
>>> par.set("Woo")
>>> greeting.commit()
>>> par.commit()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 12, in commit
File "<stdin>", line 11, in set
__main__.ConcurrentUpdate
The way this works is we have a Store that manages Values. (perhaps a better name may be variables to avoid clashing with pythons parlance of labels and values?)
Anyhow, you can ask the store for Value, which it will give you. The Value knows what it's called and where it's from, so when you tell it to commit, it can try to do so. The little detail here is you get a copy of the Value not the stored Value. (This is to avoid accidental concurrent update of the same actual object)
As a result, Store looks like this:
class Store(object):You'll note that the set method is the greatest candidate for any possible race hazard here - though I can see a possible boundary issue in "using". (I think :-)
def __init__(self):
self.store = {}
def get(self, key):
return self.store[key].clone()
def set(self, key, value):
if not (self.store[key].version > value.version):
self.store[key] = Value(value.version+1, value.value, self, key)
value.version= value.version+1
else:
raise ConcurrentUpdate
def using(self, key):
try:
return self.get(key)
except KeyError:
self.store[key] = Value(0, None,self,key)
return self.get(key)
def dump(self):
for k in self.store:
print k, ":", self.store[k]
Otherwise I think the above code is relatively straightforward. You'll note that this API allows this:
greeting.set("Hello")The other class is value that looks like this:
greeting.commit()
greeting.set("Hello World")
greeting.commit()
greeting.set("Hello World. Game")
greeting.commit()
greeting.set("Hello World. Game Over")
greeting.commit()
class Value(object):To me this looks like a pretty complete minimalistic thing, which does seem to work OK, but I'm interested in the three points I mention above - if anyone is willing to comment - specifically:
def __init__(self, version, value,store,key):
self.version = version
self.value = value
self.store = store
self.key = key
def __repr__(self):
return "Value"+repr((self.version,self.value,self.store,self.key))
def set(self, value):
self.value = value
def commit(self):
self.store.set(self.key, self)
def clone(self):
return Value(self.version, self.value,self.store,self.key)
- Does the API look simple enough?
- Are there any glaring mistakes in the code ? (It's always harder to see your own bugs)
- What key areas appear least thread-safe, and any general suggestions around that.
Full code below.
class ConcurrentUpdate(Exception): pass
class Value(object):
def __init__(self, version, value,store,key):
self.version = version
self.value = value
self.store = store
self.key = key
def __repr__(self):
return "Value"+repr((self.version,self.value))
def set(self, value):
self.value = value
def commit(self):
self.store.set(self.key, self)
def clone(self):
return Value(self.version, self.value,self.store,self.key)
class Store(object):
def __init__(self):
self.store = {}
def get(self, key):
return self.store[key].clone()
def set(self, key, value):
if not (self.store[key].version > value.version):
self.store[key] = Value(value.version+1, value.value, self, key)
value.version= value.version+1
else:
raise ConcurrentUpdate
def using(self, key):
try:
return self.get(key)
except KeyError:
self.store[key] = Value(0, None,self,key)
return self.get(key)
def dump(self):
for k in self.store:
print k, ":", self.store[k]
S = Store()
greeting = S.using("hello")
print repr(greeting.value)
greeting.set("Hello World")
greeting.commit()
# ------------------------------------------------------
print greeting
S.dump()
# ------------------------------------------------------
par = S.using("hello")
par.set("Woo")
par.commit()
# ------------------------------------------------------
print greeting
S.dump()
# ------------------------------------------------------
greeting.set("Woo")
greeting.commit()
print repr(greeting), repr(greeting.value)
S.dump()