3: Implementing Simple Generator Coroutines in C++03
Coroutines: Basics, Py->C++20, Benchmarking
- Introduction
- What are coroutines?
- What coroutines can look like in C++
- Benchmark generators: C++03, C++20 & Python
This is a mini-series leading up to implementing coroutines in C++20 (“what I learnt on my holiday”). It also compares against other implementations, as part of my ModernCPP explorations. (first post)
In the previous post I showed some sample generators in C++ and noted that you could not know what the following did without knowing about the implementation:
class Fibonnaci: public Generator {
    int a, b, s;
public:
    Fibonnaci() : a(1), b(1), s(0) { }
    int next() {
        GENERATOR_CODE_START
        while ( true ) {
            YIELD(a);
            s = a + b;
            a = b;
            b = s;
        };
        GENERATOR_CODE_END
    };
};
As a result, this post expands on this to discuss the implementation.
Expanding the example
There’s no formal support for coroutines in C++03. So how does this work? Fundamentally, this uses preprocessor macros to implement the generator behaviour. This is what the code looks like when those macros are expanded:
class Fibonnaci: public Generator {
    int a, b, s;
public:
    Fibonnaci() : a(1), b(1), s(0) { }
    int next() {
        //GENERATOR_CODE_START
        if (__generator_state == -1) { throw StopIteration(); } switch(__generator_state) { default:
        while ( true ) {
            __generator_state = __LINE__; return (a); case __LINE__:
            s = a + b;
            a = b;
            b = s;
        };
        }; __generator_state = -1; throw StopIteration();
    };
};
It should be relatively clear from this that this is really just a bit of syntactic sugar around a switch-based state machine. The state is tracked using the __LINE__ macro, which the preprocessor expands to the line number where it is used, so each YIELD gets its own case label.
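To see the switch trick in isolation, here is a rough hand-written sketch that uses fixed case labels instead of __LINE__. The Counter class, its members and the bool-returning next() are illustrative names of my own, not part of the generator code in this post.

```cpp
// Hand-rolled switch-based state machine, without any macros.
// Counter, state, i and limit are hypothetical names for illustration only.
class Counter {
    int state;   // 0 = not started, 1 = suspended at the yield, -1 = finished
    int i;       // "local" kept in the object so it survives the return
    int limit;
public:
    explicit Counter(int n) : state(0), i(0), limit(n) { }

    // Writes the next value to out and returns true, or returns false when done.
    bool next(int &out) {
        switch (state) {
        case 0:                       // first call: start at the top of the body
            for (i = 0; i < limit; ++i) {
                state = 1;            // remember where to resume
                out = i;
                return true;          // the "yield"
        case 1:;                      // later calls jump straight back in here
            }
            state = -1;               // loop ended: mark as finished
        default:                      // a finished generator lands here
            break;
        }
        return false;
    }
};
```

Calling next() on Counter(3) produces 0, 1 and 2, then returns false on every later call. The macros do the same job, with __LINE__ standing in for the hand-picked label 1.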
Base class support
It’s in order to support this that we derive from a Generator base class. That base class looks like this:
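class Generator {
protected:
    int __generator_state;
public:
    // Start from a known state: 0 is neither -1 (finished) nor a YIELD line,
    // so the first call to next() enters at "default:".
    Generator() : __generator_state(0) { };
    ~Generator() { };
};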
Together with the Fibonnaci class above, this shows that the generator’s “local” variables (a, b and s) are stored as members of the generator object, not as locals in the coroutine body, so they survive each return. The protected __generator_state attribute captures the line number before returning, and a single switch statement wrapped around the body controls re-entry into the coroutine.
Support Macros
The preprocessor macros that are needed to enable this look like this:
// #define GENERATOR struct
#define GENERATOR_CODE_START  if (__generator_state == -1) \
                                  { throw StopIteration(); } \
                              switch(__generator_state) \
                                  { default:
#define YIELD(v)              __generator_state = __LINE__; \
                              return (v); \
                              case __LINE__:
#define GENERATOR_CODE_END    }; \
                              __generator_state = -1; throw StopIteration();
It should be clear that this code is primarily focussed on building a state machine driven by __LINE__.
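As a small illustration of how each YIELD becomes its own state, here is a sketch of a generator with three YIELDs in a row. OneTwoThree is a made-up class name, and the Generator constructor initialising the state to 0 is my assumption so that the first call is well defined. Because the case labels come from __LINE__, each YIELD needs to sit on its own source line; two on one line would produce duplicate case labels.

```cpp
class StopIteration { };

class Generator {
protected:
    int __generator_state;
public:
    Generator() : __generator_state(0) { }  // assumption: start from a known state
    ~Generator() { }
};

#define GENERATOR_CODE_START  if (__generator_state == -1) \
                                  { throw StopIteration(); } \
                              switch(__generator_state) \
                                  { default:
#define YIELD(v)              __generator_state = __LINE__; \
                              return (v); \
                              case __LINE__:
#define GENERATOR_CODE_END    }; \
                              __generator_state = -1; throw StopIteration();

// Hypothetical generator: three resume points, one per YIELD line.
class OneTwoThree : public Generator {
public:
    int next() {
        GENERATOR_CODE_START
        YIELD(1);   // state becomes this line's number
        YIELD(2);   // a different line, so a different case label
        YIELD(3);
        GENERATOR_CODE_END
    }
};
```

Three calls to next() return 1, 2 and 3. The fourth call falls out of the switch, sets the state to -1 and throws StopIteration, and every call after that is stopped by the guard at the top.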
Complete C++03 Generator code
This means that the whole code for the old-style generators is pretty short:
// NOTE: cpp03generators.h
class StopIteration { };

class Generator {
protected:
    int __generator_state;
public:
    // Start from a known state: 0 is neither -1 (finished) nor a YIELD line,
    // so the first call to next() enters at "default:".
    Generator() : __generator_state(0) { };
    ~Generator() { };
};

#define GENERATOR_CODE_START  if (__generator_state == -1) \
                                  { throw StopIteration(); } \
                              switch(__generator_state) \
                                  { default:
#define YIELD(v)              __generator_state = __LINE__; \
                              return (v); \
                              case __LINE__:
#define GENERATOR_CODE_END    }; \
                              __generator_state = -1; throw StopIteration();
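For completeness, here is a sketch of how the Fibonnaci generator from the start of the post might be driven. The header contents are repeated so the example stands alone. The take() helper is a hypothetical name of mine, and the state being initialised to 0 in the constructor is my assumption.

```cpp
#include <vector>

// --- contents of cpp03generators.h, repeated to keep this self-contained ---
class StopIteration { };

class Generator {
protected:
    int __generator_state;
public:
    Generator() : __generator_state(0) { }  // assumption: start from a known state
    ~Generator() { }
};

#define GENERATOR_CODE_START  if (__generator_state == -1) \
                                  { throw StopIteration(); } \
                              switch(__generator_state) \
                                  { default:
#define YIELD(v)              __generator_state = __LINE__; \
                              return (v); \
                              case __LINE__:
#define GENERATOR_CODE_END    }; \
                              __generator_state = -1; throw StopIteration();

// --- the generator from the start of the post ---
class Fibonnaci : public Generator {
    int a, b, s;
public:
    Fibonnaci() : a(1), b(1), s(0) { }
    int next() {
        GENERATOR_CODE_START
        while ( true ) {
            YIELD(a);
            s = a + b;
            a = b;
            b = s;
        }
        GENERATOR_CODE_END
    }
};

// Hypothetical helper: pull the next n values out of the generator.
std::vector<int> take(Fibonnaci &gen, int n) {
    std::vector<int> out;
    for (int i = 0; i < n; ++i) {
        out.push_back(gen.next());
    }
    return out;
}
```

Calling take() with n = 6 on a fresh Fibonnaci gives 1, 1, 2, 3, 5, 8. Since this generator never finishes, StopIteration is never thrown here; the caller decides when to stop asking.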
Where’s the code?
Update: 11/4/2023
Want to read and run code, not the blog? I know the feeling. I recently created a GitHub repo specifically for code from blog posts. Recognising that it might also have non-code stuff added at some point, I’ve called it blog-extras.
The blog-extras repo can be found here:
The code specific to this series of short posts can be found here:
That repo will get fleshed out with code/examples from other posts in this blog too.
Discussion
This is a fairly brief and clear implementation, which pretty much mirrors what every coroutine implementation has to do:
- Store local state outside the function’s stack frame (here, as members of the generator object)
- Keep track of the line to resume from on the next call
- Guard against restarting a stopped generator
- Wrap all the body code inside a switch statement to allow jumping back to where we left off.
It’s cheap and cheerful, and loosely based on the coroutines people build on Duff’s device, so it relies only on standard switch behaviour and should work with any conforming compiler (with a small caveat around one type of debug environment). It’s also an old approach, so it is very portable and compiles almost everywhere. I’ve used this approach in my Python to C++ compiler pyxie to implement the range() method, and it’s relatively clear and lightweight.
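To give a flavour of what a range()-style generator could look like with these macros, here is a minimal sketch. It is not pyxie’s actual code: the Range class, its members and the sum_all() helper are hypothetical, it assumes a positive step, and the state being initialised to 0 is my assumption.

```cpp
class StopIteration { };

class Generator {
protected:
    int __generator_state;
public:
    Generator() : __generator_state(0) { }  // assumption: start from a known state
    ~Generator() { }
};

#define GENERATOR_CODE_START  if (__generator_state == -1) \
                                  { throw StopIteration(); } \
                              switch(__generator_state) \
                                  { default:
#define YIELD(v)              __generator_state = __LINE__; \
                              return (v); \
                              case __LINE__:
#define GENERATOR_CODE_END    }; \
                              __generator_state = -1; throw StopIteration();

// Hypothetical range()-like generator; assumes step > 0.
class Range : public Generator {
    int start, stop, step;
    int i;   // loop variable lives in the object, not in next()
public:
    Range(int start_, int stop_, int step_ = 1)
        : start(start_), stop(stop_), step(step_), i(start_) { }
    int next() {
        GENERATOR_CODE_START
        for (i = start; i < stop; i += step) {
            YIELD(i);
        }
        GENERATOR_CODE_END
    }
};

// Hypothetical helper: drain the generator, using StopIteration as the end marker.
int sum_all(Range &r) {
    int total = 0;
    try {
        while (true) {
            total += r.next();
        }
    } catch (StopIteration &) {
        // generator exhausted
    }
    return total;
}
```

With Range r(0, 5), sum_all(r) adds up 0 + 1 + 2 + 3 + 4 and returns 10. Any further call to r.next() throws StopIteration straight away, which is the guard against restarting a stopped generator.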
Due to its lightweight nature, this sort of coroutine is also pretty useful in constrained environments.
If you wanted to serialise this and send it over the network to run elsewhere, it should be fairly obvious how you would do this - since this doesn’t rely on any internal / compiler specific functionality. It should be fairly easy to also see how to implement Modern C++ functionality on top of this - things like templating, move support, etc.
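As a rough sketch of the serialisation idea, the whole state of the Fibonnaci generator is four ints: a, b, s and the saved line number. The FibSnapshot struct and the save()/load() methods below are hypothetical additions of mine, not part of the original code. One assumption to be aware of: because the saved state is a source line number, both ends would need to be built from the same source layout for it to mean the same thing.

```cpp
class StopIteration { };

class Generator {
protected:
    int __generator_state;
public:
    Generator() : __generator_state(0) { }  // assumption: start from a known state
    ~Generator() { }
};

#define GENERATOR_CODE_START  if (__generator_state == -1) \
                                  { throw StopIteration(); } \
                              switch(__generator_state) \
                                  { default:
#define YIELD(v)              __generator_state = __LINE__; \
                              return (v); \
                              case __LINE__:
#define GENERATOR_CODE_END    }; \
                              __generator_state = -1; throw StopIteration();

// Hypothetical flat snapshot: everything needed to resume elsewhere.
struct FibSnapshot {
    int a, b, s;
    int state;   // the saved __LINE__ value (or 0 / -1)
};

class Fibonnaci : public Generator {
    int a, b, s;
public:
    Fibonnaci() : a(1), b(1), s(0) { }
    int next() {
        GENERATOR_CODE_START
        while ( true ) {
            YIELD(a);
            s = a + b;
            a = b;
            b = s;
        }
        GENERATOR_CODE_END
    }
    // Copy the generator's state out into plain data.
    FibSnapshot save() const {
        FibSnapshot snap = { a, b, s, __generator_state };
        return snap;
    }
    // Overwrite this generator's state from a snapshot.
    void load(const FibSnapshot &snap) {
        a = snap.a;
        b = snap.b;
        s = snap.s;
        __generator_state = snap.state;
    }
};
```

If one generator is advanced three times (yielding 1, 1, 2) and its snapshot is loaded into a fresh generator, both continue with 3 on their next call. Turning FibSnapshot into bytes for the network is left out here; any fixed-width integer encoding would do.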
One downside is that the syntax looks a little funky. It also doesn’t provide everything C++20 coroutines do, including things like the iterator/ranges protocol, sending values in, throwing exceptions in, and so on. (These could be added, though.)
NEXT POST: The next post in this short series looks at implementing simple generator style coroutines in C++20.
Updated: 2023/09/15 21:12:33