There are many questions related to Stackless Python. But none answering this my question, I think (correct me if wrong – please!). There’s some buzz about it all the time so I curious to know. What would I use Stackless for? How is it better than CPython?

Yes it has green threads (stackless) that allow quickly create many lightweight threads as long as no operations are blocking (something like Ruby’s threads?). What is this great for? What other features it has I want to use over CPython?

It allows you to work with massive amounts of concurrency. Nobody sane would create one hundred thousand system threads, but you can do this using stackless.

This article tests doing just that, creating one hundred thousand tasklets in both Python and Google Go (a new programming language): http://dalkescientific.com/writings/diary/archive/2009/11/15/100000_tasklets.html

Surprisingly, even if Google Go is compiled to native code, and they tout their co-routines implementation, Python still wins.

Stackless would be good for implementing a map/reduce algorithm, where you can have a very large number of reducers depending on your input data.

Stackless Python’s main benefit is the support for very lightweight coroutines. CPython doesn’t support coroutines natively (although I expect someone to post a generator-based hack in the comments) so Stackless is a clear improvement on CPython when you have a problem that benefits from coroutines.

I think the main area where they excel are when you have many concurrent tasks running within your program. Examples might be game entities that run a looping script for their AI, or a web server that is servicing many clients with pages that are slow to create.

You still have many of the typical problems with concurrency correctness however regarding shared data, but the deterministic task switching makes it easier to write safe code since you know exactly where control will be transferred and therefore know the exact points at which the shared state must be up to date.

Thirler already mentioned that stackless was used in Eve Online. Keep in mind, that:

(..) stackless adds a further twist to this by allowing tasks to be separated into smaller tasks, Tasklets, which can then be split off the main program to execute on their own. This can be used for fire-and-forget tasks, like sending off an email, or dispatching an event, or for IO operations, e.g. sending and receiving network packets. One tasklet waits for a packet from the network while others continue running the game loop.

It is in some ways like threads, but is non-preemptive and explicitly scheduled, so there are fewer issues with synchronization. Also, switching between tasklets is much faster than thread switching, and you can have a huge number of active tasklets whereas the number of threads is severely limited by the computer hardware.

(got this citation from here)

At PyCon 2009 there was given a very interesting talk, describing why and how Stackless is used at CCP Games.

Also, there is a very good introductory material, which describes why stackless is a good solution for Your applications. (it may be somewhat old, but I think that it is worth reading).

EVEOnline is largely programmed in Stackless Python. They have several dev blogs on the use of it. It seems it is very useful for high performance computing.

While I’ve not used Stackless itself, I have used Greenlet for implementing highly-concurrent network applications. Some of the use cases Linden Lab has put it towards are: high-performance smart proxies, a fast system for distributing commands over huge numbers of machines, and an application that does a ton of database writes and reads (at a ratio of about 1:2, which is very write-heavy, so it’s spending most of its time waiting for the database to return), and a web-crawler-type-thing for internal web data. Basically any app that’s expecting to have to do a lot of network I/O will benefit from being able to create a bajillion lightweight threads. 10,000 connected clients doesn’t seem like a huge deal to me.

Stackless or Greenlet aren’t really a complete solution, though. They are very low-level and you’re going to have to do a lot of monkeywork to build an application with them that uses them to their fullest. I know this because I maintain a library that provides a networking and scheduling layer on top of Greenlet, specifically because writing apps is so much easier with it. There are a bunch of these now; I maintain Eventlet, but also there is Concurrence, Chiral, and probably a few more that I don’t know about.

If the sort of app you want to write sounds like what I wrote about, consider one of these libraries. The choice of Stackless vs Greenlet is somewhat less important than deciding what library best suits the needs of what you want to do.

The basic usefulness for green threads, the way I see it, is to implement a system in which you have a large amount of objects that do high latency operations. A concrete example would be communicating with other machines:

def Run():
    # Do stuff
    request_information() # This call might block
    # Proceed doing more stuff

Threads let you write the above code naturally, but if the number of objects is large enough, threads just cannot perform adequately. But you can use green threads even for in really large amounts. The request_information() above could switch out to some scheduler where other work is waiting and return later. You get all the benefits of being able to call “blocking” functions as if they return immediately without using threads.

This is obviously very useful for any kind of distributed computing if you want to write code in a straightforward way.

It is also interesting for multiple cores to mitigate waiting for locks:

def Run():
    # Do some calculations
    green_lock(the_foo)
    # Do some more calculations

The green_lock function would basically attempt to acquire the lock and just switch out to a main scheduler if it fails due to other cores using the object.

Again, green threads are being used to mitigate blocking, allowing code to be written naturally and still perform well.