Ye Olde Clue

Random musings on random stuff.

Schedulers matter

Improving the scheduler. It's been something we've avoided for quite a while in Kamaelia, but Simon Wittber's recent post benchmarking Fibra vs Kamaelia vs stackless is interesting. His key showing that "Stackless is 7x faster than Fibra, and Fibra is 10x faster than Kamaelia" are cool for him :-) ... and naturally led me to think "why". Looking at the code, it struck me that he's doing something more interesting with the scheduler given these code forms:

scheduler.install(self.messageLoop())
# self.MessageLoop is a regular python generator
...
            yield self.incrementCounter()
            yield kickTo.messageQueue.push(self)
If you delve inside the fibra scheduler (which doesn't appear to be here unexpectedly) you see the following core:
    def tick(self):
        """Iterates the scheduler, running all tasks and calling all
        handlers.
        """
        active = False
        for handler in self.handlers:
            active = handler.is_waiting() or active
        active = (len(self.tasks) > 0) or active
        tasks = []
        while True:
            try:
                task, send_value = self.tasks.pop(0)
            except IndexError, e:
                break
            try:
                if isinstance(send_value, Exception):
                    r = task.throw(send_value)
                else:
                    r = task.send(send_value)
            except StopIteration, e:
                r = e
            if r is None:
                tasks.append((task, None))
            else:
                try:
                    handler = self.type_handlers[type(r)]
                except KeyError:
                    raise RuntimeError("Don't know what to do with yielded type: %s" % type(r))
                handler.handle(r, task)
        self.tasks[:] = tasks
        return active
The core of this appears to be "when I'm done, do this later" next. If you think that's familiar, it should be - its very similar (at least conceptually) to what twisted does with deferreds. It's not identical, but similar. So what does this mean for the hacksack demo? Well, if we look at self.tasks after each run, by changing:
    def run(self):
        while self.tick():
            pass
to:
    def run(self):
        while self.tick():
            print "TASKS", [x[0] for x in self.tasks]
It becomes apparent what's happening (though it's fairly obvious from above):
TASKS [<generator object at 0xb7b04fcc>]
TASKS []
TASKS [<generator object at 0xb780f94c>]
TASKS []
TASKS [<generator object at 0xb7a1b04c>]
TASKS []
TASKS [<generator object at 0xb776fcac>]
TASKS []
TASKS [<generator object at 0xb79327ac>]
TASKS []
TASKS [<generator object at 0xb7b0ff2c>]
TASKS []
Fundamentally, the reason it's quicker for two reasons:
These two points matter because Axon's scheduler (from jesse's post, hopefully people are aware that Axon is the relevant part of kamaelia here):
This leads me to the rather obvious solution here - can we rebuild a better Axon scheduler by reusing fibra and throw away our scheduler? If so that would be really neat - throwing away our scheduler for something faster and more intelligent would be rather fun :) If we can't then we've been pondering improving it - that is making it more intelligent - for a while. Fibra's benchmarks suggest that doing so would be well worth it. The question this raises though is whether doing this would help us in the general case. At present I'm unclear on that, but until you try, you don't know.

Beyond all that though, fibra looks neat :)

Reply to this post

Comments

Fibra looks neat #

In case it doesn't come across, Fibra looks neat. I considered that as an alternate title :)

-- Michael, 4 Mar 2009 at 10:57, Rating: 0 (Reply) (Moderated by: anon)

Back to front page