Hope for the copyright system?
August 20, 2008 at 09:55 PM | categories: python, oldblog

Unit test crib sheet
August 07, 2008 at 12:13 PM | categories: python, oldblog

#!/usr/bin/python
#
# running this as "test_CribSheet.py -v "
# - gives you some cribsheet docs on what's going on and runs all the tests
#
# running this as "./test_CribSheet.py -v LifeCycleOfATest"
# - allows you to just run one of the suites.
#
# This doesn't replace documentation, and there's probably some hidden
# assumptions here, but it's quite useful.
#
import unittest
import os

class DemoDocStringsImpact(unittest.TestCase):
    # Note that the next test doesn't have a doc string. Look at the results in -v
    def test_DefaultVerboseMessage(self):
        pass
    # Note that the next test does have a doc string. Look at the results in -v
    def test_NonDefaultVerboseMessage(self):
        "This message will be shown in -v"
        pass

class LifeCycleOfATest(unittest.TestCase):
    def setUp(self):
        "We get called before every test_ in this class"
        self.value = 2
    def test_test1(self):
        "LifeCycle : 1 - we get called after setUp, but before tearDown"
        self.assertNotEqual(1, self.value)
        self.value = 1
    def test_test2(self):
        """LifeCycle : 2 - self.value wiped from previous test case
        - this is because setUp & tearDown are called before/after every test"""
        self.assertNotEqual(1, self.value)
    def tearDown(self):
        "We get called after *every* test_ in this class"
        # We could for example close the file used by every test, or close
        # a database or network connection

class Escape_tests(unittest.TestCase):
    def test_NullTest1(self):
        "assertNotEquals - fails with AssertionError if Equal"
        self.assertNotEqual(1, 2)
    def test_NullTest2(self):
        "assertEquals - fails with AssertionError if not Equal"
        self.assertEqual(1, 1)
    def test_NullTest3(self):
        "assertEquals, custom error message - fails with AssertionError + custom message if not Equal"
        self.assertEqual(1, 1, "If you see this, the test is broken")
    def test_CallsSelfFailShouldBeCaughtByAssertionError(self):
        "self.fail - fail with AssertionError + custom message - useful for failing if an assertion does not fire when it should"
        try:
            self.fail("Fail!")
        except AssertionError:
            pass
    def test_NullTest4(self):
        "assert_ - for those times when you just want to assert something as true. Can have a custom message"
        self.assert_(1 == 1, "one and one is two...")
    def test_NullTest5(self):
        "fail unless - This is essentially the same as self.assert_ really"
        self.failUnless(1 == 1)
    def test_NullTest6(self):
        "fail unless - code for this shows how to catch the AssertionError"
        try:
            self.failUnless(1 != 1)
        except AssertionError:
            pass
    def test_NullTest7(self):
        "fail unless - how to extract the error message"
        try:
            self.failUnless(1 != 1, "Looks like the test is wrong!")
        except AssertionError, e:
            self.assert_(str(e) == "Looks like the test is wrong!")
    def test_NullTest8(self):
        "assertRaises - can be useful for checking boundary cases of method/function calls."
        def LegendaryFail():
            1/0
        self.assertRaises(ZeroDivisionError, LegendaryFail)
    def test_NullTest9(self):
        "failUnlessRaises - can be useful for checking boundary cases of method/function calls. - can also pass in arguments"
        def LegendaryFail(left, right):
            left/right
        self.failUnlessRaises(ZeroDivisionError, LegendaryFail, 1, 0)
    def test_NullTest10(self):
        "assertRaises - how to simulate this so your assertion error can get a custom message"
        def LegendaryFail(left, right):
            left/right
        try:
            LegendaryFail(1, 0)
            self.fail("This would fail here if LegendaryFail did not raise a ZeroDivisionError")
        except ZeroDivisionError:
            pass

if __name__ == "__main__":
    # Next line invokes voodoo magic that causes all the testcases above to run.
    unittest.main()

Just running this without -v:

~/scratch> python test_CribSheet.py
...............
----------------------------------------------------------------------
Ran 15 tests in 0.002s
OK

Running this with -v:

~/scratch> python test_CribSheet.py -v
test_DefaultVerboseMessage (__main__.DemoDocStringsImpact) ... ok
This message will be shown in -v ... ok
self.fail - fail with AssertionError + custom message - useful for failing if an assertion does not fire when it should ... ok
assertNotEquals - fails with AssertionError if Equal ... ok
assertRaises - how to simulate this so your assertion error can get a custom message ... ok
assertEquals - fails with AssertionError if not Equal ... ok
assertEquals, custom error message - - fails with AssertionError + custom message if not Equal ... ok
assert_ - for those times when you just want to assert something as try. Can have a custom message ... ok
fail unless - This is essentially the same as self.assert_ really ... ok
fail unless - code for this shows how to catch the Assertion error ... ok
fail unless - how to extract the error message ... ok
assertRaises - can be useful for checking boundary cases of method/function calls. ... ok
failUnlessRaises - can be useful for checking boundary cases of method/function calls. - can also pass in arguments ... ok
LifeCycle : 1 - we get called after setUp, but before tearDown ... ok
LifeCycle : 2 - self.value wiped from previous test case ... ok
----------------------------------------------------------------------
Ran 15 tests in 0.003s
OK

Running this for just one group of tests:

~/scratch> python test_CribSheet.py -v DemoDocStringsImpact
test_DefaultVerboseMessage (__main__.DemoDocStringsImpact) ... ok
This message will be shown in -v ... ok
----------------------------------------------------------------------
Ran 2 tests in 0.001s

OK

Corrections, improvements, suggestions welcome. Hopefully of use to someone :) (written since I tend to use existing tests as a crib)
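As an aside, unittest.main() isn't the only way to kick things off - a suite can also be built and run programmatically. A minimal sketch using only the standard unittest API (the class and test names here are made up for illustration):

```python
import unittest

class LifeCycleDemo(unittest.TestCase):
    # Hypothetical stand-in for one of the classes above
    def setUp(self):
        self.value = 2

    def test_fresh_value(self):
        "setUp gives every test a fresh self.value"
        self.assertEqual(self.value, 2)

# Build a suite by hand instead of letting unittest.main() collect everything
loader = unittest.TestLoader()
suite = loader.loadTestsFromTestCase(LifeCycleDemo)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

This is handy when you want to pick and mix suites from several modules rather than running whatever happens to be in the current file.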
Kamaelia Talks at Pycon UK, September 12-14th Birmingham
July 28, 2008 at 01:15 AM | categories: python, oldblog

Practical concurrent systems made simple using Kamaelia
This is a talk for beginners who want to primarily get started with using Kamaelia (probably to either make building systems more fun, or because they want to explore/use concurrency naturally). The full abstract can be found here:
This is the short version: this talk aims to teach you how to get started with Kamaelia, building a variety of systems, as well as walking through the design and implementation of some systems built in the last year. Systems built over the past year include tools for dealing with spam (greylisting), database modelling, video & image transcoding for a youtube/flickr type system, paint programs, webserving, XMPP, games, and a bunch of other things.
The other talk:
Sharing Data & Services Safely in Concurrent Systems using Kamaelia
Again, this covers stuff most people have found they've needed to know something about after using Kamaelia for non-trivial stuff for a few weeks (essentially how to use the CAT & STM code). The full abstract can be found here:
The short version for that: Whilst message passing and "shared nothing" systems like Kamaelia simplify many problems, sometimes you really do need to share data (eg a single pygame display!). Unconstrained concurrent access to data causes problems, so Kamaelia faces two problems: 1) How do you provide tools that enable access to shared data and services? 2) How do you do so without making people's heads explode? I'll be using the Speak N Write code to illustrate that.
[*] This won't be/shouldn't be a shock since I'm on the pycon UK committee, but I don't take things for granted :-)

There's also masses of great talks lined up, and the first batch of talks put up already that I'm interested in (scheduling allowing :) include:
- Stretching Pyglet's Wings
- How I used Python to Control my Central Heating System
- The Advantages And Disadvantages Of Python In Commercial Applications
- Getting Warmer with Python - Python's role in helping "solve" global warming.
- Python in Higher Education: One Year On
- PyPy's Python Interpreter - Status and Plans
- Cloud Computing and Amazon Web Services
- Distributed Serpents: Python, Peloton and highly available Services
- Open Source Testing Tools In Practice
- py.test - Rapid Testing with Minimal Effort
- ... and naturally lots more :)
I'm also particularly looking forward to the keynotes by Ted Leung (Sun, previously OSAF) & Mark Shuttleworth (Ubuntu, Canonical). I've not heard Ted speak before so that'll be interesting in itself, however I've heard Mark speak twice before and he's a great speaker.
There's also plans afoot for a BOF to discuss people's gripes with python's packaging systems, and the need for things like an easy_uninstall. More BOFs welcome of course.
If you've not signed up, go take a look at the talks list and see part of what you're missing :-)
(yeah, I'm excited, but why not? It's exciting :-) )
George Bernard Shaw was wrong
July 27, 2008 at 02:00 PM | categories: python, oldblog

"If you have an apple and I have an apple and we exchange apples then you and I will still each have one apple. But if you have an idea and I have an idea and we exchange these ideas, then each of us will have two ideas."

It's nice, and on the most basic of levels it's true. However it's utterly incomplete, as anyone who's worked on anything based on sharing ideas will know - be it brainstorming, collaborative working, open source or anything else. What you actually get is more of a combinatorial explosion of ideas: with just two "completely atomic" ideas you never have just 2 ideas, you always have at least 3 - A, B, meld(AB). In fact this sequence should give many people flashbacks to their maths:
The reason it's wrong is this:
- 2 ideas A, B -> A, B, meld(AB)
- 3 possibilities
- 3 ideas A, B, C -> A, B, C, meld(AB), meld(BC), meld(AC), meld(ABC)
- 7 possibilities
- 4 ideas A, B, C, D -> A, B, C, D, meld(AB), meld(AC), meld(AD), meld(BC), meld(BD), meld(CD), meld(ABC), meld(ABD), meld(ACD), meld(BCD), meld(ABCD)
- 15 possibilities
- More generally: 2**N - 1 possibilities for N ideas
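That count is easy to check mechanically - the melds are just the non-empty subsets of the set of ideas. A quick sketch:

```python
from itertools import combinations

def melds(ideas):
    """Every non-empty subset of a collection of atomic ideas."""
    result = []
    for size in range(1, len(ideas) + 1):
        result.extend(combinations(ideas, size))
    return result

for ideas in ("AB", "ABC", "ABCD"):
    print(len(ideas), "ideas ->", len(melds(ideas)), "possibilities")
# 2 ideas -> 3, 3 ideas -> 7, 4 ideas -> 15, i.e. 2**N - 1
```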
Concurrent software is not the problem - Intel talking about 1000s of cores
July 05, 2008 at 12:47 PM | categories: python, oldblog

Why? Like the erlang group, in Kamaelia the thing we've focussed on is making concurrency easy to work with, primarily by aiming to make concurrent software maintenance easier (for the average developer). In practical terms this has meant putting friendly metaphors (hopefully) on top of well established principles of message passing systems, as well as adding support for other forms of constrained shared data. (STM is a bit like version control for variables.)
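To unpack that parenthetical a little, here's a toy sketch of the "version control for variables" idea - this is not Kamaelia's actual STM API, just an illustration of the principle: you check out a value along with its version, and a commit is rejected if someone else committed in the meantime:

```python
class ConcurrentUpdate(Exception):
    pass

class Store:
    """Toy versioned store - illustrative only, not Kamaelia's real code."""
    def __init__(self):
        self.values = {}   # name -> (version, value)

    def checkout(self, name, default=None):
        return self.values.get(name, (0, default))

    def commit(self, name, version, value):
        current, _ = self.values.get(name, (0, None))
        if version != current:
            raise ConcurrentUpdate(name)  # someone committed before us
        self.values[name] = (version + 1, value)

store = Store()
version, count = store.checkout("counter", 0)
store.commit("counter", version, count + 1)      # fine: version still current
try:
    store.commit("counter", version, count + 9)  # stale version - rejected
except ConcurrentUpdate:
    print("conflict detected - re-checkout and retry")
```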
We've done this by using various application domains as the starting point, such as DVB, networking and use of audio/video etc, and used Python as the language of choice to do so. (Though we probably could've shouted about our application uses more/better - we're getting better I think :-) However the approaches apply to more or less any non-functional language - so there's a proof of concept of our miniaxon core in C++, Ruby, & Java as well. (The C++ & ruby ones use a deliberately simple/naive coding style :)
This does mean that now when we approach a problem - such as the desire to build a tool that assists a child learning to read and write - we end up with a piece of code that internally exhibits high levels of concurrency. For example, even the simple Speak And Write application is made of 37 components, which at present all run in the same process, but could easily be made to use 37 processes... (by prepending all Pipelines & Graphlines with the word "Process")
Despite this, we don't normally think in terms of the number of components or concurrent things, largely because you don't normally think of the number of functions you use in a piece of code - we just focus on the functionality we want from the system. I'm sure once upon a time people did, but I don't know anyone who counts the number of functions or methods they have. The code below, for example, builds the high level functionality of the system:
bgcolour = (255,255,180)

Backplane("SPEECH").activate()

Pipeline(
    SubscribeTo("SPEECH"),
    UnixProcess("while read word; do echo $word | espeak -w foo.wav --stdin ; aplay foo.wav ; done"),
).activate()

CANVAS = Canvas( position=(0,40), size=(800,320),
                 bgcolour = bgcolour ).activate()
CHALLENGE = TextDisplayer(size = (390, 200), position = (0,40),
                          bgcolour = bgcolour, text_height=48,
                          transparent = 1).activate()
TEXT = Textbox(size = (800, 100), position = (0,260), bgcolour = (255,180,255),
               text_height=48, transparent = 1).activate()

Graphline(
    CHALLENGER = Challenger(),
    CHALLENGE_SPLITTER = TwoWaySplitter(),
    CHALLENGE_CHECKER = Challenger_Checker(),
    SPEAKER = PublishTo("SPEECH"),
    CHALLENGE = CHALLENGE,
    TEXT = TEXT,
    CANVAS = CANVAS,
    PEN = Pen(bgcolour = bgcolour),
    STROKER = StrokeRecogniser(),
    OUTPUT = aggregator(),
    ANSWER_SPLITTER = TwoWaySplitter(),
    TEXTDISPLAY = TextDisplayer(size = (800, 100), position = (0,380),
                                bgcolour = (180,255,255), text_height=48 ),
    linkages = {
        ("CANVAS", "eventsOut")            : ("PEN", "inbox"),
        ("CHALLENGER", "outbox")           : ("CHALLENGE_SPLITTER", "inbox"),
        ("CHALLENGE_SPLITTER", "outbox")   : ("CHALLENGE", "inbox"),
        ("CHALLENGE_SPLITTER", "outbox2")  : ("SPEAKER", "inbox"),
        ("PEN", "outbox")                  : ("CANVAS", "inbox"),
        ("PEN", "points")                  : ("STROKER", "inbox"),
        ("STROKER", "outbox")              : ("OUTPUT", "inbox"),
        ("STROKER", "drawing")             : ("CANVAS", "inbox"),
        ("OUTPUT", "outbox")               : ("TEXT", "inbox"),
        ("TEXT", "outbox")                 : ("ANSWER_SPLITTER", "inbox"),
        ("ANSWER_SPLITTER", "outbox")      : ("TEXTDISPLAY", "inbox"),
        ("ANSWER_SPLITTER", "outbox2")     : ("CHALLENGE_CHECKER", "inbox"),
        ("CHALLENGE_CHECKER", "outbox")    : ("SPEAKER", "inbox"),
        ("CHALLENGE_CHECKER", "challengesignal") : ("CHALLENGER", "inbox"),
    },
).run()
However, what has this got to do with 1000s of cores? After all, even a larger application (like the Whiteboard) only really exhibits a hundred or two hundred degrees of concurrency... Now, clearly if every application you were using was written using the approach of simpler, friendlier component metaphors that Kamaelia currently uses, then it's likely that you would start using all those CPUs. I say "approach", because I'd really like to see people taking our proofs of concept and making native versions for C++, Ruby, Perl, etc - I don't believe in the view of one language to rule them all. I'd hope it was easier to maintain and more bug free, because that's a core aim, but the proof of the approach is in the coding really, not the talking.
However, when you get to 1000s of cores a completely different issue suddenly arises that you didn't have with concurrency at the levels of 1, 5, 10, 100 cores: that of software tolerance of hardware unreliability. That, not writing concurrent software, is the REAL problem.
It's been well noted that Google currently scale their applications across 1000s of machines using Map Reduce, which fundamentally is just another metaphor for writing code in a concurrent way. However, they are also well known to work on the assumption that they will have a number of servers fail every single day - which fundamentally means failing half way through doing something. Now with a web search, if something goes wrong, you can just redo the search, or just not aggregate that part of the results.
In a desktop application, what if the core that fails is handling audio output? Is it acceptable for the audio to just stop working? Or would you need some mechanism to back out from the error and retry? It was thinking about these issues early this morning that I realised that what you need is a way of capturing what was going to be running on that core before you execute it, and then launch it. In that scenario, if the CPU fails (assuming a detection mechanism) you can then restart the component on a fresh core.
The interesting thing is that ProcessPipeline can help us out here. The way ProcessPipeline works is as follows. Given the following system:
ProcessPipeline( Producer(), Transformer(), Consumer() ).run()
Such as:
ProcessPipeline( SimpleFileReader(), AudioDecoder(), AudioPlayer() ).run()
Then ProcessPipeline runs in the foreground process. For each of the components listed in the pipeline, it forks, and runs the component using the pprocess library, with data passing between components via the ProcessPipeline (based on the principle of the simplest thing that could possibly work). The interesting thing about this is this: ProcessPipeline therefore has a copy of each component from before it started executing. Fundamentally this allows ProcessPipeline (at some later point in time) to detect erroneous death of a component (somehow :) ), either due to bugs or hardware failure, and to restart the component - masking the error from the other components in the system.
Now, quite how that would actually work in practice, I'm not really sure - ProcessPipeline is after all experimental at present, with issues in it being explored by a Google Summer of Code project aimed at a multi-process paint program (by a first year CS student...). However, it gives me warm fuzzy feelings about both our approach and its potential longevity - since we do have a clear, reasonable answer as to how to deal with that (hardware) reliability issue.
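The restart idea itself can be sketched without any of the real machinery. The names below are made up, and real failure detection is the hard part that's waved away here - but the core move is keeping the recipe for a component rather than only the running instance:

```python
class FlakyComponent:
    """Hypothetical component whose 'core' dies on its first run."""
    failures_left = 1

    def run(self):
        if FlakyComponent.failures_left > 0:
            FlakyComponent.failures_left -= 1
            raise RuntimeError("core died mid-task")
        return "done"

def supervise(factory, retries=3):
    # Keep the factory (the recipe), not just the instance: on failure,
    # throw the broken instance away and start a pristine copy, masking
    # the error from the rest of the system.
    for attempt in range(retries):
        component = factory()
        try:
            return component.run()
        except RuntimeError:
            continue
    raise RuntimeError("gave up after %d attempts" % retries)

print(supervise(FlakyComponent))   # first run fails, the restart succeeds
```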
So, whilst Intel may have some "unwelcome advice", and people may be reacting thinking "how on earth do I even structure my code to work that way", the real problem is "how do I write application code that is resilient to, and works despite, hardware failure".
That's a much harder question, and the only solution to both that I can see is "break your code down into restartable, non-datasharing, message passing, replaceable components". I'm sure other solutions either exist or will come along though :-) After all, Kamaelia turns out to have similarities to Hugo Simpson's MASCOT (pdf, see also wikipedia link), which is over 30 years old but barely advertised, so I'm sure that other approaches exist.
Interesting post on requirements for project websites
July 04, 2008 at 09:06 AM | categories: python, oldblog

Semantics changing on whitespace?
July 03, 2008 at 10:41 AM | categories: python, oldblog

if True:
    print "Yay"
    print "Woo"

frag. 1

vs

if True:
    print "Yay"
print "Woo"

frag. 2

However, generally speaking this does actually mean what people intended it to mean. The common exception is where you might want to write this:
class Foo(object):
    def main(self):
        while True:
            print "Woo"
sys.stderr.write("DEBUG - UM, in loop\n")
            print "Yay"

frag. 3
Whereas of course python views the sys.stderr.write line as the end of the while, def, & class blocks. Often people do the above (in non-python languages) because they want to make it easier to find where they've inserted debug code, and lament the lack of it in python. As an aside, you can of course do the above in python, if you add an extra line in:
class Foo(object):
    def main(self):
        while True:
            print "Woo"
            \
sys.stderr.write("DEBUG - UM, in loop\n")
            print "Yay"

frag. 4

Since the continuation marker effectively causes the next line to be part of the same line, it's logically indented, even if not in reality. ("sys" is still at the start of the line in the source of course)
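For the avoidance of doubt, the continuation trick really does run. A self-contained version (with print written as a function call so it also runs on newer Pythons) might look like:

```python
import sys

def main():
    for _ in range(2):
        print("Woo")
        \
sys.stderr.write("DEBUG - UM, in loop\n")
        print("Yay")
        # The lone backslash above joins the sys.stderr.write line onto a
        # logical line at loop-body indentation, even though the physical
        # line starts in column 0.

main()   # stdout gets Woo / Yay twice; the DEBUG lines go to stderr
```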
However, I think that's as far as the change in semantics due to whitespace goes, *except* that it's also used as a delimiter between tokens. (This is kinda necessary after they found with fortran many years back that allowing whitespace in identifiers was a rather bad idea in practice)
Anyhow, this does tend to mean that the last line of this:
foo = 10
bar = 2
X[foo/bar]

frag. 5

means the same as all of these:

X [foo/bar]
X[foo /bar]
X [foo /bar]
X[foo / bar]
X [foo / bar]

frag. 6

Whereas apparently in ruby it wouldn't - based on some recent posts. In fact, I think only 2 of them do the same thing. That's actually pretty insane (but then I'm sure people think the same about python's whitespace rules), but clearly a consequence of allowing foo bar to mean something similar (if not identical?) to foo(bar). However, it also clearly breaches the rule of least surprise. Whilst the problem with the rule of least surprise is "who is surprised", I think it's reasonable for someone looking at code to assume that the following all do the same things:

X[foo/bar]
X [foo/bar]
X[foo /bar]
X [foo /bar]
X[foo / bar]
X [foo / bar]

frag. 7

And it's also reasonable to assume that the following are at least intended to be different:
if (1)
    printf("Yay\n");
    printf("Woo\n");

frag. 8

vs

if (1)
    printf("Yay\n");
printf("Woo\n");

frag. 9
But of course in C, they aren't. Now that's why most C programmers wouldn't do that, but it's made me wonder. C has this foible, which every C programmer knows about. Ruby has the above foible, which I'm guessing most if not all ruby programmers are aware of. But with python it's the whitespace semantics (which are intended to encourage good behaviour and fix the "problem" of frags 8 vs 9) that everyone knows about and that does put people off...
ie the biggest barrier (that I hear of) to adoption of python is the fact that frags 1 & 2 do mean different things. I'm not sure why it's such a huge barrier, but it does turn out to be the single factor that turns most people off the language (in my experience...). Whilst you do have something like pindent.py, which allows frags 1 & 2 to look like this:
if True:
    print "Yay"
    print "Woo"
#end if

frag. 10 - same as frag 1 after running through pindent.py

vs

if True:
    print "Yay"
#end if
print "Woo"

frag. 11 - same as frag 2 after running through pindent.py

And whilst hell would freeze over before the addition of a keyword 'end' to python, it strikes me that being able to write:
if True:
    print "Yay"
    print "Woo"
end

frag. 12 - same as frag 1, but with an 'end' keyword

and

if True:
    print "Yay"
end
print "Woo"

frag. 13 - same as frag 2, but with an 'end' keyword

and
class Foo(object):
    def main(self):
        while True:
            print "Woo"
sys.stderr.write("DEBUG - UM, in loop\n")
            print "Yay"
        end
    end
end

frag. 14 - same as frag 3, but with 'end' keywords

Wouldn't actually be the end of the world, and would actually simplify things for beginners, but also simplify things when people are embedding code and copying/pasting code from webpages, from archives of lists etc. It'd mean that web templating languages which allow python code inside templates (not a wise idea to mix code with templates really, but it does happen) would be able to use the same syntax, etc.
It would also do away with the one major criticism of python. To make it available, my personal preference is that it would have to be a command line switch, defaulting to off. However, as mentioned, hell would freeze over before it was added, so the question that springs to mind is "is it worth writing a pre-processor for"? I can see some benefits in doing so - for example it would mean that python was less fussy about things like frags 12 and 14, both of which have whitespace issues python would scream about. frag 12 has a common mistake, whereas frag 14 contains a common desire (at least for people who are used to that temporary debug style in many languages).
It'd also mean (perhaps) that resurrected kids' books teaching programming could use python happily, without people wondering whether they've counted the dedents correctly - since they'd be able to count up end keywords.
It'd also open the door to handwriting based coding in python... (since indenting 8 "spaces" when writing doesn't make much sense - and your indentation isn't going to be perfect then either)
So the question for me is: is it worth writing? I personally suspect it is, and the preprocessor needed would be quite simple to write, either from scratch or derived from pindent.py, but I wonder what other people's opinions are. How long did whitespace sensitivity in python stop you learning it? (It put me off for 5 years) Has it stopped you wanting to bother? Do you think such a pre-processor would be a really bad idea?
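As a rough indication of scale, the core of such a pre-processor - ignoring strings, comments and validation, which is where the real work would be - could start life as something like this (the function name and behaviour are my guesses at a design, not an existing tool):

```python
def strip_end_keywords(source):
    """Toy sketch: turn 'end'-delimited pseudo-python back into plain
    python by dropping lines that contain only the end keyword.
    (A real version would need to respect strings, comments, and check
    that each 'end' matches a dedent.)"""
    kept = [line for line in source.splitlines()
            if line.strip() != "end"]
    return "\n".join(kept)

frag13 = 'if True:\n    x = "Yay"\nend\nx = "Woo"'
print(strip_end_keywords(frag13))
```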
Is it just me...
July 02, 2008 at 05:20 PM | categories: python, oldblog

Drawing Kamaelia Systems?
June 28, 2008 at 12:29 AM | categories: python, oldblog

If you're curious about it and have python & pygame already installed, installation boils down to this:
~/tmp> tar zxvf Kamaelia-Scribble-N-Link-0.1.0.tar.gz
~/tmp> cd Kamaelia-Scribble-N-Link-0.1.0/
~/tmp> sudo python setup.py install

Right at this instant all that application does is this:
- It will recognise strokes drawn that look like an "o" or a "u" and assume you want to "draw"/add a new unconfigured component
- Drawing a joined up "x" (ie a curvy one, like a backward "c" going upwards, then a forward "c" going downwards) will delete the last component added
- Drawing a stroke from top left to bottom right will "switch on" the "makelink" mode.
  - The link will start from the next thing clicked - eg an outbox
  - The link will terminate with the following thing clicked - eg an inbox
However, it is an interesting start/proof of concept - it certainly is beginning to look like we will be able to literally draw software at some point in the future... Any feedback welcome. :-)
Twitter - Why?
June 25, 2008 at 04:33 PM | categories: python, oldblog