Random musings on random stuff.
I've just posted this message on comp.lang.python looking for comments, but I'll post it here in case there's any feedback this way :-)
I'm interested in writing a simple, minimalistic, non persistent (at this stage) software transactional memory (STM) module. The idea being it should be possible to write such a beast in a way that can be made threadsafe fair easily.
For those who don't know, STM is a really fancy way of saying variables with version control (as far as I can tell :-) designed to enable threadsafe shared data.
I'm starting with the caveat here that the following code is almost certainly not threadsafe (not put any real thought into that as yet), and I'm interested in any feedback on the following:
S = Store()We then want to get a value from the store such that we can use the value, and do stuff with it:
greeting = S.using("hello")Access the value:
print repr(greeting.value)Update the value:
greeting.set("Hello World")Commit the value back to the store:
greeting.commit()If you have concurrent updates of the same value, the following exception gets thrown:
ConcurrentUpdatecf:
>>> S = Store()That's pretty much the simplest API I can come up with. (I've tried a few others but didn't really like them)
>>> greeting = S.using("hello")
>>> par = S.using("hello")
>>> greeting.set("Hello World")
>>> par.set("Woo")
>>> greeting.commit()
>>> par.commit()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 12, in commit
File "<stdin>", line 11, in set
__main__.ConcurrentUpdate
class Store(object):You'll note that the set method is the greatest candidate for any possible race hazard here - though I can see a possible boundary issue in "using". (I think :-)
def __init__(self):
self.store = {}
def get(self, key):
return self.store[key].clone()
def set(self, key, value):
if not (self.store[key].version > value.version):
self.store[key] = Value(value.version+1, value.value, self, key)
value.version= value.version+1
else:
raise ConcurrentUpdate
def using(self, key):
try:
return self.get(key)
except KeyError:
self.store[key] = Value(0, None,self,key)
return self.get(key)
def dump(self):
for k in self.store:
print k, ":", self.store[k]
greeting.set("Hello")The other class is value that looks like this:
greeting.commit()
greeting.set("Hello World")
greeting.commit()
greeting.set("Hello World. Game")
greeting.commit()
greeting.set("Hello World. Game Over")
greeting.commit()
class Value(object):To me this looks like a pretty complete minimalistic thing, which does seem to work OK, but I'm interested in the three points I mention above - if anyone is willing to comment - specifically:
def __init__(self, version, value,store,key):
self.version = version
self.value = value
self.store = store
self.key = key
def __repr__(self):
return "Value"+repr((self.version,self.value,self.store,self.key))
def set(self, value):
self.value = value
def commit(self):
self.store.set(self.key, self)
def clone(self):
return Value(self.version, self.value,self.store,self.key)
class ConcurrentUpdate(Exception): pass
class Value(object):
def __init__(self, version, value,store,key):
self.version = version
self.value = value
self.store = store
self.key = key
def __repr__(self):
return "Value"+repr((self.version,self.value))
def set(self, value):
self.value = value
def commit(self):
self.store.set(self.key, self)
def clone(self):
return Value(self.version, self.value,self.store,self.key)
class Store(object):
def __init__(self):
self.store = {}
def get(self, key):
return self.store[key].clone()
def set(self, key, value):
if not (self.store[key].version > value.version):
self.store[key] = Value(value.version+1, value.value, self, key)
value.version= value.version+1
else:
raise ConcurrentUpdate
def using(self, key):
try:
return self.get(key)
except KeyError:
self.store[key] = Value(0, None,self,key)
return self.get(key)
def dump(self):
for k in self.store:
print k, ":", self.store[k]
S = Store()
greeting = S.using("hello")
print repr(greeting.value)
greeting.set("Hello World")
greeting.commit()
# ------------------------------------------------------
print greeting
S.dump()
# ------------------------------------------------------
par = S.using("hello")
par.set("Woo")
par.commit()
# ------------------------------------------------------
print greeting
S.dump()
# ------------------------------------------------------
greeting.set("Woo")
greeting.commit()
print repr(greeting), repr(greeting.value)
S.dump()
Wouldn't it be more useful having a store (possibly with a mapping
API?) that stored several values. Then a single commit can be done at
the end of the transaction.
If the transaction fails, then the whole process can be repeated with none of the changes having been committed.
The
tricky part is that other threads asking for the value should get the
*old* value until the commit has been done - so you need to know which
thread 'value' is being asked for from.
Michael
Quite probably yes :-)
However initially I thought I'd start off with trying to get single values right :-) (A single value can be a dict of course, but your point holds for arbitrary unrelated values)
I suspect the way to deal with that might be to do something like:
S = Store()Which seems pretty nice/simple. It'd be useful for the commit failure to indicate which value was broken by a concurrent update, but not necessary.
D = S.using("account_one", "account_two", "myaccount")
D["myaccount"].set(D["account_one"].value+D["account_two"].value)
D["account_one"].set(0)
D["account_two"].set(0)
D.commit()
> The tricky part is that other threads asking for the value should getI think this approach deals with this naturally. cf:
> the old value until the commit has been done - so you need to know
> which thread 'value' is being asked for from.
def get(self, key): # in StoreThis means that the value in 2 threads for the same checked out value names can be different, but they get warning of this when they try to commit.
return self.store[key].clone()
...
def clone(self): # in Value
return Value(self.version, self.value,self.store,self.key)
...
def set(self, key, value):
if not (self.store[key].version > value.version):
self.store[key] = Value(value.version+1, value.value, self, key)
value.version= value.version+1
else:
raise ConcurrentUpdate
S = Store()I did think that this would work with the current code but changing the value to something more complex - like this:
D = S.using("account_one", "account_two", "myaccount")
D["myaccount"].set(D["account_one"].value+D["account_two"].value)
D["account_one"].set(0)
E = S.using("account_one", "myaccount")
E["myaccount"].set(E["myaccount"].value-100)
E["account_one"].set(100)
E.commit()
D["account_two"].set(0)
D.commit()
S = Store()But in practice, because I'm not doing a copy here of self.value:
D = S.using("accounts")
D.set({"account_one":50, "account_two":100, "myaccount":0})
D.commit()
print "here"
S.dump()
X = D.value
X["myaccount"] = X["account_one"] + X["account_two"]
X["account_one"] = 0
X["account_two"] = 0
D.set(X)
D.commit()
S.dump()
print "Committed", D.value["myaccount"]
def clone(self): # in ValueI end up with accidental concurrent access, which is easy to fix, for
return Value(self.version, self.value,self.store,self.key)
import copyAs well as here:
....
def clone(self): # in Value
return Value(self.version, copy.deepcopy(self.value),
self.store, self.key)
S = Store()Fails as follows:
D = S.using("accounts")
D.set({"account_one":50, "account_two":100, "myaccount":0})
D.commit() # First
S.dump()
X = D.value
X["myaccount"] = X["account_one"] + X["account_two"]
X["account_one"] = 0
E = S.using("accounts")
Y = E.value
Y["myaccount"] = Y["myaccount"]-100
Y["account_one"]= 100
E.set(Y)
E.commit() # Second
S.dump()
X["account_two"] = 0
D.set(X)
D.commit() # Third
S.dump()
print "Committed", D.value["myaccount"]
accounts : Value(1, {'account_two': 100, 'myaccount': 0, 'account_one': 50})(which is of course the error wanted - since we want to be able to detect failure)
accounts : Value(2, {'account_two': 100, 'myaccount': -100, 'account_one':
100})
Traceback (most recent call last):
File "./NewSTM.py", line 70, in <module>
D.commit() # Third
File "./NewSTM.py", line 20, in commit
self.store.set(self.key, self)
File "./NewSTM.py", line 37, in set
raise ConcurrentUpdate
__main__.ConcurrentUpdate