+++++++++++++++++++++++++++++++++++++++++++
 Building Hybrid Systems with Boost.Python
+++++++++++++++++++++++++++++++++++++++++++

:Author: David Abrahams
:Contact: dave@boost-consulting.com
:organization: `Boost Consulting`_

:Author: Ralf W. Grosse-Kunstleve

:copyright: Copyright David Abrahams and Ralf W. Grosse-Kunstleve 2003. All rights reserved

.. contents:: Table of Contents

.. _`Boost Consulting`: http://www.boost-consulting.com

==========
 Abstract
==========

Boost.Python is an open source C++ library which provides a concise
IDL-like interface for binding C++ classes and functions to
Python. Leveraging the full power of C++ compile-time introspection
and of recently developed metaprogramming techniques, this is achieved
entirely in pure C++, without introducing a new syntax.
Boost.Python's rich set of features and high-level interface make it
possible to engineer packages from the ground up as hybrid systems,
giving programmers easy and coherent access to both the efficient
compile-time polymorphism of C++ and the extremely convenient run-time
polymorphism of Python.

==============
 Introduction
==============

Python and C++ are in many ways as different as two languages could
be: while C++ is usually compiled to machine-code, Python is
interpreted. Python's dynamic type system is often cited as the
foundation of its flexibility, while in C++ static typing is the
cornerstone of its efficiency. C++ has an intricate and difficult
compile-time meta-language, while in Python, practically everything
happens at runtime.

Yet for many programmers, these very differences mean that Python and
C++ complement one another perfectly. Performance bottlenecks in
Python programs can be rewritten in C++ for maximal speed, and
authors of powerful C++ libraries choose Python as a middleware
language for its flexible system integration capabilities.
Furthermore, the surface differences mask some strong similarities:

* 'C'-family control structures (if, while, for...)

* Support for object-orientation, functional programming, and generic
  programming (these are both *multi-paradigm* programming languages.)

* Comprehensive operator overloading facilities, recognizing the
  importance of syntactic variability for readability and
  expressivity.

* High-level concepts such as collections and iterators.

* High-level encapsulation facilities (C++: namespaces, Python: modules)
  to support the design of re-usable libraries.

* Exception-handling for effective management of error conditions.

* C++ idioms in common use, such as handle/body classes and
  reference-counted smart pointers, mirror Python reference semantics.

Given Python's rich 'C' interoperability API, it should in principle
be possible to expose C++ type and function interfaces to Python with
an analogous interface to their C++ counterparts. However, the
facilities provided by Python alone for integration with C++ are
relatively meager. Compared to C++ and Python, 'C' has only very
rudimentary abstraction facilities, and support for exception-handling
is completely missing. 'C' extension module writers are required to
manually manage Python reference counts, which is both annoyingly
tedious and extremely error-prone. Traditional extension modules also
tend to contain a great deal of boilerplate code repetition which
makes them difficult to maintain, especially when wrapping an evolving
API.

These limitations have led to the development of a variety of wrapping
systems. SWIG_ is probably the most popular package for the
integration of C/C++ and Python. A more recent development is SIP_,
which was specifically designed for interfacing Python with the Qt_
graphical user interface library. Both SWIG and SIP introduce their
own specialized languages for customizing inter-language bindings.
This has certain advantages, but having to deal with three different
languages (Python, C/C++ and the interface language) also introduces
practical and mental difficulties. The CXX_ package demonstrates an
interesting alternative. It shows that at least some parts of
Python's 'C' API can be wrapped and presented through a much more
user-friendly C++ interface. However, unlike SWIG and SIP, CXX does
not include support for wrapping C++ classes as new Python types.

The features and goals of Boost.Python_ overlap significantly with
many of these other systems. That said, Boost.Python attempts to
maximize convenience and flexibility without introducing a separate
wrapping language. Instead, it presents the user with a high-level
C++ interface for wrapping C++ classes and functions, managing much of
the complexity behind-the-scenes with static metaprogramming.
Boost.Python also goes beyond the scope of earlier systems by
providing:

* Support for C++ virtual functions that can be overridden in Python.

* Comprehensive lifetime management facilities for low-level C++
  pointers and references.

* Support for organizing extensions as Python packages,
  with a central registry for inter-language type conversions.

* A safe and convenient mechanism for tying into Python's powerful
  serialization engine (pickle).

* Coherence with the rules for handling C++ lvalues and rvalues that
  can only come from a deep understanding of both the Python and C++
  languages.

The key insight that sparked the development of Boost.Python is that
much of the boilerplate code in traditional extension modules could be
eliminated using C++ compile-time introspection. Each argument of a
wrapped C++ function must be extracted from a Python object using a
procedure that depends on the argument type. Similarly, the function's
return type determines how the return value will be converted from C++
to Python. Of course, argument and return types are part of each
function's type, and this is exactly the source from which
Boost.Python deduces most of the information required.
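
The idea can be imitated at run time in pure Python. The sketch below is
only an analogy and not how Boost.Python is implemented: the helper
``wrap`` and the example function ``scale`` are hypothetical names, and
type annotations stand in for the C++ function type from which
Boost.Python deduces its conversions at compile time.

```python
import inspect


def wrap(func):
    """Build a wrapper whose conversions are derived from func's signature."""
    sig = inspect.signature(func)

    def wrapper(*args):
        if len(args) != len(sig.parameters):
            raise TypeError("wrong number of arguments")
        converted = []
        for value, param in zip(args, sig.parameters.values()):
            # The declared parameter type selects the conversion procedure.
            converted.append(param.annotation(value))
        result = func(*converted)
        # The declared return type selects the result conversion.
        return sig.return_annotation(result)

    return wrapper


def scale(x: float, n: int) -> str:
    return repr(x * n)


wrapped_scale = wrap(scale)
print(wrapped_scale("1.5", "2"))  # arguments arrive as strings; prints 3.0
```

Nothing about ``scale`` had to be repeated in the wrapper: everything
needed to convert arguments and results was read off the function's own
signature, which is the point the paragraph above makes about function
types.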

This approach leads to *user guided wrapping*: as much information is
extracted directly from the source code to be wrapped as is possible
within the framework of pure C++, and some additional information is
supplied explicitly by the user. Mostly the guidance is mechanical
and little real intervention is required. Because the interface
specification is written in the same full-featured language as the
code being exposed, the user has unprecedented power available when
she does need to take control.

.. _Python: http://www.python.org/
.. _SWIG: http://www.swig.org/
.. _SIP: http://www.riverbankcomputing.co.uk/sip/index.php
.. _Qt: http://www.trolltech.com/
.. _CXX: http://cxx.sourceforge.net/
.. _Boost.Python: http://www.boost.org/libs/python/doc

===========================
 Boost.Python Design Goals
===========================

The primary goal of Boost.Python is to allow users to expose C++
classes and functions to Python using nothing more than a C++
compiler. In broad strokes, the user experience should be one of
directly manipulating C++ objects from Python.

However, it's also important not to translate all interfaces *too*
literally: the idioms of each language must be respected. For
example, though C++ and Python both have an iterator concept, they are
expressed very differently. Boost.Python has to be able to bridge the
interface gap.

It must be possible to insulate Python users from crashes resulting
from trivial misuses of C++ interfaces, such as accessing
already-deleted objects. By the same token, the library should
insulate C++ users from the low-level Python 'C' API, replacing
error-prone 'C' interfaces like manual reference-count management and
raw ``PyObject`` pointers with more-robust alternatives.

Support for component-based development is crucial, so that C++ types
exposed in one extension module can be passed to functions exposed in
another without loss of crucial information like C++ inheritance
relationships.

Finally, all wrapping must be *non-intrusive*, without modifying or
even seeing the original C++ source code. Existing C++ libraries have
to be wrappable by third parties who only have access to header files
and binaries.

==========================
 Hello Boost.Python World
==========================

And now for a preview of Boost.Python, and how it improves on the raw
facilities offered by Python. Here's a function we might want to
expose::

    char const* greet(unsigned x)
    {
        static char const* const msgs[] = { "hello", "Boost.Python", "world!" };

        if (x > 2)
            throw std::range_error("greet: index out of range");

        return msgs[x];
    }

To wrap this function in standard C++ using the Python 'C' API, we'd
need something like this::

    extern "C" // all Python interactions use 'C' linkage and calling convention
    {
        // Wrapper to handle argument/result conversion and checking
        PyObject* greet_wrap(PyObject* self, PyObject* args)
        {
            int x;
            if (PyArg_ParseTuple(args, "i", &x))    // extract/check arguments
            {
                char const* result = greet(x);      // invoke wrapped function
                return PyString_FromString(result); // convert result to Python
            }
            return 0;                               // error occurred
        }

        // Table of wrapped functions to be exposed by the module
        static PyMethodDef methods[] = {
            { "greet", greet_wrap, METH_VARARGS, "return one of 3 parts of a greeting" }
            , { NULL, NULL, 0, NULL } // sentinel
        };

        // module initialization function; Python looks for init<module name>
        DL_EXPORT(void) inithello()
        {
            (void) Py_InitModule("hello", methods); // add the methods to the module
        }
    }

Now here's the wrapping code we'd use to expose it with Boost.Python::

    #include <boost/python.hpp>
    using namespace boost::python;
    BOOST_PYTHON_MODULE(hello)
    {
        def("greet", greet, "return one of 3 parts of a greeting");
    }

and here it is in action::

    >>> import hello
    >>> for x in range(3):
    ...     print hello.greet(x)
    ...
    hello
    Boost.Python
    world!

Aside from the fact that the 'C' API version is much more verbose,
it's worth noting a few things that it doesn't handle correctly:

* The original function accepts an unsigned integer, and the Python
  'C' API only gives us a way of extracting signed integers. The
  Boost.Python version will raise a Python exception if we try to pass
  a negative number to ``hello.greet``, but the other one will proceed
  to do whatever the C++ implementation does when converting a
  negative integer to unsigned (usually wrapping to some very large
  number), and pass the incorrect translation on to the wrapped
  function.

* That brings us to the second problem: if the C++ ``greet()``
  function is called with a number greater than 2, it will throw an
  exception. Typically, if a C++ exception propagates across the
  boundary with code generated by a 'C' compiler, it will cause a
  crash. As you can see in the first version, there's no C++
  scaffolding there to prevent this from happening. Functions wrapped
  by Boost.Python automatically include an exception-handling layer
  which protects Python users by translating unhandled C++ exceptions
  into a corresponding Python exception.

* A slightly more-subtle limitation is that the argument conversion
  used in the Python 'C' API case can only get that integer ``x`` in
  *one way*. ``PyArg_ParseTuple`` can't convert Python ``long`` objects
  (arbitrary-precision integers) which happen to fit in an ``unsigned
  int`` but not in a ``signed long``, nor will it ever handle a
  wrapped C++ class with a user-defined implicit ``operator unsigned
  int()`` conversion. Boost.Python's dynamic type conversion
  registry allows users to add arbitrary conversion methods.
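
The first point can be reproduced without any C++ at all. The following
Python sketch assumes a 32-bit ``unsigned int`` (an assumption; the
width is platform-dependent) and uses two hypothetical helper functions
to contrast an unchecked, C-style conversion with a range-checked one of
the kind described above.

```python
import ctypes

UINT_MAX = 2**32 - 1  # assumption: unsigned int is 32 bits wide


def to_unsigned_unchecked(value):
    # ctypes performs no overflow checking, so the value is reduced
    # modulo 2**32, just like a C conversion from int to unsigned.
    return ctypes.c_uint32(value).value


def to_unsigned_checked(value):
    # Refuse anything that cannot be represented instead of wrapping.
    if not 0 <= value <= UINT_MAX:
        raise OverflowError("value does not fit in an unsigned int")
    return value


print(to_unsigned_unchecked(-1))  # prints 4294967295
```

With the unchecked conversion, ``greet(-1)`` would be asked for message
number 4294967295; with the checked one, the caller gets an exception
before the wrapped function is ever invoked.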

==================
 Library Overview
==================

This section outlines some of the library's major features. Except as
necessary to avoid confusion, details of library implementation are
omitted.

------------------
 Exposing Classes
------------------

C++ classes and structs are exposed with a similarly-terse interface.
Given::

    struct World
    {
        void set(std::string msg) { this->msg = msg; }
        std::string greet() { return msg; }
        std::string msg;
    };

The following code will expose it in our extension module::

    #include <boost/python.hpp>
    using namespace boost::python;

    BOOST_PYTHON_MODULE(hello)
    {
        class_<World>("World")
            .def("greet", &World::greet)
            .def("set", &World::set)
        ;
    }

Although this code has a certain pythonic familiarity, people
sometimes find the syntax a bit confusing because it doesn't look like
most of the C++ code they're used to. All the same, this is just
standard C++. Because of their flexible syntax and operator
overloading, C++ and Python are great for defining domain-specific
(sub)languages
(DSLs), and that's what we've done in Boost.Python. To break it down::

    class_<World>("World")

constructs an unnamed object of type ``class_<World>`` and passes
``"World"`` to its constructor. This creates a new-style Python class
called ``World`` in the extension module, and associates it with the
C++ type ``World`` in the Boost.Python type conversion registry. We
might have also written::

    class_<World> w("World");

but that would've been more verbose, since we'd have to name ``w``
again to invoke its ``def()`` member function::

    w.def("greet", &World::greet)

There's nothing special about the location of the dot for member
access in the original example: C++ allows any amount of whitespace on
either side of a token, and placing the dot at the beginning of each
line allows us to chain as many successive calls to member functions
as we like with a uniform syntax. The other key fact that allows
chaining is that ``class_<>`` member functions all return a reference
to ``*this``.

So the example is equivalent to::

    class_<World> w("World");
    w.def("greet", &World::greet);
    w.def("set", &World::set);

It's occasionally useful to be able to break down the components of a
Boost.Python class wrapper in this way, but the rest of this article
will stick to the terse syntax.
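
The chaining idiom is not specific to C++. As a rough illustration, here
is a pure-Python sketch in which a hypothetical ``ClassBuilder`` (not
part of Boost.Python) plays the role of ``class_<>``: its ``def_``
method returns the builder itself, which is all that is needed for the
chained and the step-by-step forms to be equivalent.

```python
class ClassBuilder:
    """Hypothetical stand-in for a class_<> wrapper object."""

    def __init__(self, name):
        self.name = name
        self.methods = {}

    def def_(self, method_name, function):
        self.methods[method_name] = function
        return self  # returning the builder itself is what permits chaining

    def build(self):
        # Create a new Python class from the collected methods.
        return type(self.name, (object,), dict(self.methods))


def greet(self):
    return self.msg


def set_msg(self, msg):
    self.msg = msg


# Chained form: each call operates on the object returned by the previous one.
World = (
    ClassBuilder("World")
    .def_("greet", greet)
    .def_("set", set_msg)
    .build()
)

# Equivalent step-by-step form, naming the builder each time.
w = ClassBuilder("World")
w.def_("greet", greet)
w.def_("set", set_msg)
World2 = w.build()
```

Both ``World`` and ``World2`` end up with the same two methods; the
chained form simply avoids repeating the name of the builder.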

For completeness, here's the wrapped class in use::

    >>> import hello
    >>> planet = hello.World()
    >>> planet.set('howdy')
    >>> planet.greet()
    'howdy'

Constructors
============

Since our ``World`` class is just a plain ``struct``, it has an
implicit no-argument (nullary) constructor. Boost.Python exposes the
nullary constructor by default, which is why we were able to write::

    >>> planet = hello.World()

However, well-designed classes in any language may require constructor
arguments in order to establish their invariants. Unlike Python,
where ``__init__`` is just a specially-named method, in C++
constructors cannot be handled like ordinary member functions. In
particular, we can't take their address: ``&World::World`` is an
error. The library provides a different interface for specifying
constructors. Given::

    struct World
    {
        World(std::string msg); // added constructor
        ...

we can modify our wrapping code as follows::

    class_<World>("World", init<std::string>())
        ...

Of course, a C++ class may have additional constructors, and we can
expose those as well by passing more instances of ``init<...>`` to
``def()``::

    class_<World>("World", init<std::string>())
        .def(init<double, double>())
        ...

Boost.Python allows wrapped functions, member functions, and
constructors to be overloaded to mirror C++ overloading.

Data Members and Properties
===========================

Any publicly-accessible data members in a C++ class can be easily
exposed as either ``readonly`` or ``readwrite`` attributes::

    class_<World>("World", init<std::string>())
        .def_readonly("msg", &World::msg)
        ...

and can be used directly in Python::

    >>> planet = hello.World('howdy')
    >>> planet.msg
    'howdy'

This does *not* result in adding attributes to the ``World`` instance
``__dict__``, which can result in substantial memory savings when
wrapping large data structures. In fact, no instance ``__dict__``
will be created at all unless attributes are explicitly added from
Python. Boost.Python owes this capability to the new Python 2.2 type
system, in particular the descriptor interface and ``property`` type.
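
The descriptor mechanism can be observed in plain Python. The sketch
below is an analogy only: it uses ``__slots__`` to suppress the instance
``__dict__`` (Boost.Python instead creates one lazily, as described
above), and a hypothetical pure-Python ``World`` class whose ``msg``
attribute is served by a ``property`` stored on the type rather than on
each instance.

```python
class World:
    __slots__ = ("_msg",)  # no per-instance __dict__ is allocated

    def __init__(self, msg):
        self._msg = msg

    @property
    def msg(self):
        # Read-only attribute implemented as a descriptor on the type.
        return self._msg


planet = World("howdy")
print(planet.msg)                   # prints howdy
print(hasattr(planet, "__dict__"))  # prints False
```

Because the ``property`` object lives in the class, every instance gets
the ``msg`` attribute without paying for a dictionary of its own.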

In C++, publicly-accessible data members are considered a sign of poor
design because they break encapsulation, and style guides usually
dictate the use of "getter" and "setter" functions instead. In
Python, however, ``__getattr__``, ``__setattr__``, and since 2.2,
``property`` mean that attribute access is just one more
well-encapsulated syntactic tool at the programmer's disposal.
Boost.Python bridges this idiomatic gap by making Python ``property``
creation directly available to users. If ``msg`` were private, we
could still expose it as an attribute in Python as follows::

    class_<World>("World", init<std::string>())
        .add_property("msg", &World::greet, &World::set)
        ...

The example above mirrors the familiar usage of properties in Python
2.2+::

    >>> class World(object):
    ...     def __init__(self, msg):
    ...         self.__msg = msg
    ...     def greet(self):
    ...         return self.__msg
    ...     def set(self, msg):
    ...         self.__msg = msg
    ...     msg = property(greet, set)

Operator Overloading
====================

The ability to write arithmetic operators for user-defined types has
been a major factor in the success of both languages for numerical
computation, and the success of packages like NumPy_ attests to the
power of exposing operators in extension modules. Boost.Python
provides a concise mechanism for wrapping operator overloads. The
example below shows a fragment from a wrapper for the Boost rational
number library::

    class_<rational<int> >("rational_int")
      .def(init<int, int>()) // constructor, e.g. rational_int(3,4)
      .def("numerator", &rational<int>::numerator)
      .def("denominator", &rational<int>::denominator)
      .def(-self)        // __neg__ (unary minus)
      .def(self + self)  // __add__ (homogeneous)
      .def(self * self)  // __mul__
      .def(self + int()) // __add__ (heterogeneous)
      .def(int() + self) // __radd__
      ...

The magic is performed using a simplified application of "expression
templates" [VELD1995]_, a technique originally developed for
optimization of high-performance matrix algebra expressions. The
essence is that instead of performing the computation immediately,
operators are overloaded to construct a type *representing* the
computation. In matrix algebra, dramatic optimizations are often
available when the structure of an entire expression can be taken into
account, rather than evaluating each operation "greedily".
Boost.Python uses the same technique to build an appropriate Python
method object based on expressions involving ``self``.
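
The core of the trick, operators that build a description of the
computation instead of carrying it out, can be sketched at run time in
Python. This is an illustration under stated assumptions, not the
library's implementation: ``SelfPlaceholder``, ``Operation`` and
``self_`` are hypothetical names, and Boost.Python encodes the same
information in C++ types at compile time.

```python
class Operation:
    """Describes an operator expression instead of evaluating it."""

    def __init__(self, method_name, operand):
        self.method_name = method_name  # special method to generate
        self.operand = operand          # "self", a type name, or None


class SelfPlaceholder:
    """Hypothetical counterpart of the object written as self in the wrapper."""

    def _describe(self, other):
        return "self" if isinstance(other, SelfPlaceholder) else type(other).__name__

    def __neg__(self):
        return Operation("__neg__", None)

    def __add__(self, other):
        return Operation("__add__", self._describe(other))

    def __radd__(self, other):
        # Reached for int() + self_, after int.__add__ returns NotImplemented.
        return Operation("__radd__", self._describe(other))

    def __mul__(self, other):
        return Operation("__mul__", self._describe(other))


self_ = SelfPlaceholder()

for expr in (-self_, self_ + self_, self_ * self_, self_ + int(), int() + self_):
    print(expr.method_name, expr.operand)
```

Each expression yields a small record naming the special method to
expose and the type of the other operand, which is the information a
wrapper generator needs in order to register the right Python method.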

.. _NumPy: http://www.pfdubois.com/numpy/

Inheritance
===========

C++ inheritance relationships can be represented to Boost.Python by adding
an optional ``bases<...>`` argument to the ``class_<...>`` template
parameter list as follows::

    class_<Derived, bases<Base1,Base2> >("Derived")
        ...

This has two effects:

1. When the ``class_<...>`` is created, Python type objects
   corresponding to ``Base1`` and ``Base2`` are looked up in
   Boost.Python's registry, and are used as bases for the new Python
   ``Derived`` type object, so methods exposed for the Python ``Base1``
   and ``Base2`` types are automatically members of the ``Derived``
   type. Because the registry is global, this works correctly even if
   ``Derived`` is exposed in a different module from either of its
   bases.

2. C++ conversions from ``Derived`` to its bases are added to the
   Boost.Python registry. Thus wrapped C++ methods expecting (a
   pointer or reference to) an object of either base type can be
   called with an object wrapping a ``Derived`` instance. Wrapped
   member functions of class ``T`` are treated as though they have an
   implicit first argument of ``T&``, so these conversions are
   necessary to allow the base class methods to be called for derived
   objects.

Of course it's possible to derive new Python classes from wrapped C++
classes. Because Boost.Python uses the new-style class
system, that works very much as for the Python built-in types. There
is one significant detail in which it differs: the built-in types
generally establish their invariants in their ``__new__`` function, so
that derived classes do not need to call ``__init__`` on the base
class before invoking its methods::

    >>> class L(list):
    ...      def __init__(self):
    ...          pass
    ...
    >>> L().reverse()
    >>>

Because C++ object construction is a one-step operation, C++ instance
data cannot be constructed until the arguments are available, in the
``__init__`` function::

    >>> class D(SomeBoostPythonClass):
    ...      def __init__(self):
    ...          pass
    ...
    >>> D().some_boost_python_method()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: bad argument type for built-in operation

This happened because Boost.Python couldn't find instance data of type
``SomeBoostPythonClass`` within the ``D`` instance; ``D``'s ``__init__``
function masked construction of the base class. It could be corrected
by either removing ``D``'s ``__init__`` function or having it call
``SomeBoostPythonClass.__init__(...)`` explicitly.

Virtual Functions
=================

Deriving new types in Python from extension classes is not very
interesting unless they can be used polymorphically from C++. In
other words, Python method implementations should appear to override
the implementation of C++ virtual functions when called *through base
class pointers/references from C++*. Since the only way to alter the
behavior of a virtual function is to override it in a derived class,
the user must build a special derived class to dispatch a polymorphic
class' virtual functions::

    // interface to wrap:
    class Base
    {
     public:
        virtual int f(std::string x) { return 42; }
        virtual ~Base() {}
    };

    // takes a non-const reference because f is a non-const member
    int calls_f(Base& b, std::string x) { return b.f(x); }

    // Dispatcher class
    struct BaseWrap : Base
    {
        // Store a pointer to the Python object
        BaseWrap(PyObject* self_) : self(self_) {}
        PyObject* self;

        // Default implementation, for when f is not overridden
        int f_default(std::string x) { return this->Base::f(x); }
        // Dispatch implementation
        int f(std::string x) { return call_method<int>(self, "f", x); }
    };

    // Wrapping code, placed inside the module definition
    def("calls_f", calls_f);
    class_<Base, BaseWrap>("Base")
        .def("f", &Base::f, &BaseWrap::f_default)
        ;

Now here's some Python code which demonstrates::

    >>> class Derived(Base):
    ...     def f(self, s):
    ...          return len(s)
    ...
    >>> calls_f(Base(), 'foo')
    42
    >>> calls_f(Derived(), 'forty-two')
    9

Things to notice about the dispatcher class:

* The key element which allows overriding in Python is the
  ``call_method`` invocation, which uses the same global type
  conversion registry as the C++ function wrapping does to convert its
  arguments from C++ to Python and its return type from Python to C++.

* Any constructor signatures you wish to wrap must be replicated with
  an initial ``PyObject*`` argument.

* The dispatcher must store this argument so that it can be used to
  invoke ``call_method``.

* The ``f_default`` member function is needed when the function being
  exposed is not pure virtual; there's no other way ``Base::f`` can be
  called on an object of type ``BaseWrap``, since it overrides ``f``.
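
The dispatch step itself amounts to a by-name method lookup on the
Python object. The following pure-Python sketch shows only that lookup:
its ``call_method`` is a hypothetical stand-in that shares a name with
the Boost.Python function but performs none of its type conversions, and
``calls_f`` stands in for C++ code that knows only the ``Base``
interface.

```python
def call_method(obj, name, *args):
    # Look the method up by name on the Python object at call time,
    # so an override defined in a derived Python class is found first.
    return getattr(obj, name)(*args)


class Base:
    def f(self, x):
        return 42  # default behaviour, as in Base::f


def calls_f(b, x):
    # Stands in for C++ code calling f through a Base reference.
    return call_method(b, "f", x)


class Derived(Base):
    def f(self, s):
        return len(s)  # override supplied from Python


print(calls_f(Base(), "foo"))           # prints 42
print(calls_f(Derived(), "forty-two"))  # prints 9
```

In the real library the dispatcher class is what routes a C++ virtual
call into this kind of lookup; the sketch leaves out that C++ half
entirely.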

Deeper Reflection on the Horizon?
=================================

Admittedly, this formula is tedious to repeat, especially on a project
with many polymorphic classes. That it is necessary reflects some
limitations in C++'s compile-time introspection capabilities: there's
no way to enumerate the members of a class and find out which are
virtual functions. At least one very promising project has been
started to write a front-end which can generate these dispatchers (and
other wrapping code) automatically from C++ headers.

Pyste_ is being developed by Bruno da Silva de Oliveira. It builds on
GCC_XML_, which generates an XML version of GCC's internal program
representation. Since GCC is a highly-conformant C++ compiler, this
ensures correct handling of the most-sophisticated template code and
full access to the underlying type system. In keeping with the
Boost.Python philosophy, a Pyste interface description is neither
intrusive on the code being wrapped, nor expressed in some unfamiliar
language: instead it is a 100% pure Python script. If Pyste is
successful it will mark a move away from wrapping everything directly
in C++ for many of our users. It will also allow us the choice to
shift some of the metaprogram code from C++ to Python. We expect that
soon, not only our users but the Boost.Python developers themselves
will be "thinking hybrid" about their own code.

.. _`GCC_XML`: http://www.gccxml.org/HTML/Index.html
.. _`Pyste`: http://www.boost.org/libs/python/pyste

---------------
 Serialization
---------------

*Serialization* is the process of converting objects in memory to a
form that can be stored on disk or sent over a network connection. The
serialized object (most often a plain string) can be retrieved and
converted back to the original object. A good serialization system will
automatically convert entire object hierarchies. Python's standard
``pickle`` module is just such a system. It leverages the language's strong
runtime introspection facilities for serializing practically arbitrary
user-defined objects. With a few simple and unintrusive provisions this
powerful machinery can be extended to also work for wrapped C++ objects.
Here is an example::

    #include <string>

    struct World
    {
        World(std::string a_msg) : msg(a_msg) {}
        std::string greet() const { return msg; }
        std::string msg;
    };

    #include <boost/python.hpp>
    using namespace boost::python;

    struct World_picklers : pickle_suite
    {
      static tuple
      getinitargs(World const& w) { return make_tuple(w.greet()); }
    };

    BOOST_PYTHON_MODULE(hello)
    {
        class_<World>("World", init<std::string>())
            .def("greet", &World::greet)
            .def_pickle(World_picklers())
        ;
    }

Now let's create a ``World`` object and put it to rest on disk::

    >>> import hello
    >>> import pickle
    >>> a_world = hello.World("howdy")
    >>> pickle.dump(a_world, open("my_world", "w"))

In a potentially *different script* on a potentially *different
computer* with a potentially *different operating system*::

    >>> import pickle
    >>> resurrected_world = pickle.load(open("my_world", "r"))
    >>> resurrected_world.greet()
    'howdy'

Of course the ``cPickle`` module can also be used for faster
processing.

Boost.Python's ``pickle_suite`` fully supports the ``pickle`` protocol
defined in the standard Python documentation. Like a ``__getinitargs__``
function in Python, the ``pickle_suite``'s ``getinitargs()`` is responsible for
creating the argument tuple that will be used to reconstruct the pickled
object. The other elements of the Python pickling protocol,
``__getstate__`` and ``__setstate__``, can be optionally provided via C++
``getstate`` and ``setstate`` functions. C++'s static type system allows the
library to ensure at compile-time that nonsensical combinations of
functions (e.g. ``getstate`` without ``setstate``) are not used.
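
The role of the argument tuple can be seen in a pure-Python analogue.
This sketch is an assumption-laden illustration rather than what
``pickle_suite`` generates: it uses the standard ``__reduce__`` hook,
which current Python versions support for all classes, to return the
class together with the constructor arguments, and the ``World`` class
here is a hypothetical Python counterpart of the C++ one.

```python
import pickle


class World:
    def __init__(self, msg):
        self.msg = msg

    def greet(self):
        return self.msg

    def __reduce__(self):
        # (callable, args): unpickling calls World(self.msg), so the
        # tuple plays the same role as the one built by getinitargs().
        return (World, (self.msg,))


data = pickle.dumps(World("howdy"))
resurrected_world = pickle.loads(data)
print(resurrected_world.greet())  # prints howdy
```

Only the constructor arguments travel in the pickle; the object is
rebuilt by calling the constructor again on the receiving side.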

Enabling serialization of more complex C++ objects requires a little
more work than is shown in the example above. Fortunately the
``object`` interface (see next section) greatly helps in keeping the
code manageable.

------------------
 Object interface
------------------

Experienced 'C' language extension module authors will be familiar
with the ubiquitous ``PyObject*``, manual reference-counting, and the
need to remember which API calls return "new" (owned) references or
"borrowed" (raw) references. These constraints are not just
cumbersome but also a major source of errors, especially in the
presence of exceptions.

Boost.Python provides a class ``object`` which automates reference
counting and provides conversion to Python from C++ objects of
arbitrary type. This significantly reduces the learning effort for
prospective extension module writers.

Creating an ``object`` from any other type is extremely simple::

    object s("hello, world");  // s manages a Python string

``object`` has templated interactions with all other types, with
automatic to-python conversions. It happens so naturally that it's
easily overlooked::

    object ten_Os = 10 * s[4]; // -> "oooooooooo"

In the example above, ``4`` and ``10`` are converted to Python objects
before the indexing and multiplication operations are invoked.

The ``extract<T>`` class template can be used to convert Python objects
to C++ types::

    double x = extract<double>(o);

If a conversion in either direction cannot be performed, an
appropriate exception is thrown at runtime.

The ``object`` type is accompanied by a set of derived types
that mirror the Python built-in types such as ``list``, ``dict``,
``tuple``, etc. as much as possible. This enables convenient
manipulation of these high-level types from C++::

    dict d;
    d["lucky_number"] = 13;

This almost looks and works like regular Python code, but it is pure
C++. Of course we can wrap C++ functions which accept or return
``object`` instances.

=================
 Thinking hybrid
=================

Because of the practical and mental difficulties of combining
programming languages, it is common to settle on a single language at the
outset of any development effort. For many applications, performance
considerations dictate the use of a compiled language for the core
algorithms. Unfortunately, due to the complexity of the static type
system, the price we pay for runtime performance is often a
significant increase in development time. Experience shows that
writing maintainable C++ code usually takes longer and requires *far*
more hard-earned working experience than developing comparable Python
code. Even when developers are comfortable working exclusively in
compiled languages, they often augment their systems by some type of
ad hoc scripting layer for the benefit of their users without ever
availing themselves of the same advantages.

Boost.Python enables us to *think hybrid*. Python can be used for
rapidly prototyping a new application; its ease of use and the large
pool of standard libraries give us a head start on the way to a
working system. If necessary, the working code can be used to
discover rate-limiting hotspots. To maximize performance these can
be reimplemented in C++, together with the Boost.Python bindings
needed to tie them back into the existing higher-level procedure.
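
One way to find such hotspots is the profiler that ships with Python. A
minimal sketch, assuming a prototype whose work is concentrated in a
single function (``slow_inner_loop`` and ``application`` are
hypothetical names chosen for the example):

```python
import cProfile
import io
import pstats


def slow_inner_loop(n):
    # Candidate for a C++ rewrite: a tight numeric loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


def application():
    # Higher-level procedure that would stay in Python.
    results = []
    for _ in range(20):
        results.append(slow_inner_loop(20000))
    return sum(results)


profiler = cProfile.Profile()
profiler.enable()
result = application()
profiler.disable()

# Rank functions by the time spent in their own bodies.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("tottime").print_stats(3)
report = stream.getvalue()
print(report)
```

The function at the top of the report is the natural first candidate for
reimplementation in C++, while ``application`` can keep calling it
through the same interface.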

Of course, this *top-down* approach is less attractive if it is clear
from the start that many algorithms will eventually have to be
implemented in C++. Fortunately Boost.Python also enables us to
pursue a *bottom-up* approach. We have used this approach very
successfully in the development of a toolbox for scientific
applications. The toolbox started out mainly as a library of C++
classes with Boost.Python bindings, and for a while the growth was
mainly concentrated on the C++ parts. However, as the toolbox is
becoming more complete, more and more newly added functionality can be
implemented in Python.

.. image:: images/python_cpp_mix.png

This figure shows the estimated ratio of newly added C++ and Python
code over time as new algorithms are implemented. We expect this
ratio to level out near 70% Python. Being able to solve new problems
mostly in Python rather than a more difficult statically typed
language is the return on our investment in Boost.Python. The ability
to access all of our code from Python allows a broader group of
developers to use it in the rapid development of new applications.
821 =====================
823 =====================
825 The first version of Boost.Python was developed in 2000 by Dave
826 Abrahams at Dragon Systems, where he was privileged to have Tim Peters
827 as a guide to "The Zen of Python". One of Dave's jobs was to develop
828 a Python-based natural language processing system. Since it was
829 eventually going to be targeting embedded hardware, it was always
830 assumed that the compute-intensive core would be rewritten in C++ to
831 optimize speed and memory footprint [#proto]_. The project also wanted to
832 test all of its C++ code using Python test scripts [#test]_. The only
833 tool we knew of for binding C++ and Python was SWIG_, and at the time
834 its handling of C++ was weak. It would be false to claim any deep
835 insight into the possible advantages of Boost.Python's approach at
836 this point. Dave's interest and expertise in fancy C++ template
837 tricks had just reached the point where he could do some real damage,
838 and Boost.Python emerged as it did because it filled a need and
839 because it seemed like a cool thing to try.
This early version was aimed at many of the same basic goals we've
described in this paper, differing most noticeably in having a
slightly more cumbersome syntax and in lacking special support for
operator overloading, pickling, and component-based development.
These last three features were quickly added by Ullrich Koethe and
Ralf Grosse-Kunstleve [#feature]_, and other enthusiastic contributors
arrived on the scene with enhancements like support for nested
modules and static member functions.
By early 2001 development had stabilized and few new features were
being added. However, a disturbing new fact came to light: Ralf had
852 begun testing Boost.Python on pre-release versions of a compiler using
853 the EDG_ front-end, and the mechanism at the core of Boost.Python
854 responsible for handling conversions between Python and C++ types was
855 failing to compile. As it turned out, we had been exploiting a very
856 common bug in the implementation of all the C++ compilers we had
857 tested. We knew that as C++ compilers rapidly became more
858 standards-compliant, the library would begin failing on more
859 platforms. Unfortunately, because the mechanism was so central to the
860 functioning of the library, fixing the problem looked very difficult.
Fortunately, later that year Lawrence Berkeley National Laboratory, and
subsequently Lawrence Livermore National Laboratory, contracted with
`Boost Consulting`_ for support
864 and development of Boost.Python, and there was a new opportunity to
865 address fundamental issues and ensure a future for the library. A
redesign effort began with the low-level type-conversion architecture,
867 building in standards-compliance and support for component-based
868 development (in contrast to version 1 where conversions had to be
869 explicitly imported and exported across module boundaries). A new
870 analysis of the relationship between the Python and C++ objects was
done, resulting in more intuitive handling for C++ lvalues and
rvalues.
874 The emergence of a powerful new type system in Python 2.2 made the
875 choice of whether to maintain compatibility with Python 1.5.2 easy:
876 the opportunity to throw away a great deal of elaborate code for
877 emulating classic Python classes alone was too good to pass up. In
878 addition, Python iterators and descriptors provided crucial and
879 elegant tools for representing similar C++ constructs. The
880 development of the generalized ``object`` interface allowed us to
881 further shield C++ programmers from the dangers and syntactic burdens
of the Python 'C' API. A great number of other features, including C++
exception translation, improved support for overloaded functions, and,
most significantly, CallPolicies for handling pointers and
references, were added during this period.
887 In October 2002, version 2 of Boost.Python was released. Development
888 since then has concentrated on improved support for C++ runtime
889 polymorphism and smart pointers. Peter Dimov's ingenious
890 ``boost::shared_ptr`` design in particular has allowed us to give the
891 hybrid developer a consistent interface for moving objects back and
892 forth across the language barrier without loss of information. At
893 first, we were concerned that the sophistication and complexity of the
894 Boost.Python v2 implementation might discourage contributors, but the
895 emergence of Pyste_ and several other significant feature
896 contributions have laid those fears to rest. Daily questions on the
897 Python C++-sig and a backlog of desired improvements show that the
898 library is getting used. To us, the future looks bright.
900 .. _`EDG`: http://www.edg.com

=============
 Conclusions
=============

906 Boost.Python achieves seamless interoperability between two rich and
complementary language environments. Because it leverages template
metaprogramming to introspect types and functions, the user
909 never has to learn a third syntax: the interface definitions are
910 written in concise and maintainable C++. Also, the wrapping system
911 doesn't have to parse C++ headers or represent the type system: the
912 compiler does that work for us.
914 Computationally intensive tasks play to the strengths of C++ and are
915 often impossible to implement efficiently in pure Python, while jobs
916 like serialization that are trivial in Python can be very difficult in
917 pure C++. Given the luxury of building a hybrid software system from
918 the ground up, we can approach design with new confidence and power.
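
To make the serialization remark concrete, here is a minimal sketch using Python's standard ``pickle`` module. The ``experiment`` record and its field names are invented for illustration. The example only shows how little code a round trip needs on the Python side.

```python
import pickle

# Hypothetical nested record; the field names are invented for this sketch.
experiment = {
    "name": "run-1",
    "samples": [0.5, 1.25, 2.0],
    "settings": {"iterations": 10, "labels": ("a", "b")},
}

# One call serializes the whole object graph to bytes ...
payload = pickle.dumps(experiment)

# ... and one call reconstructs an equal but independent copy.
restored = pickle.loads(payload)

print(restored == experiment)   # True
print(restored is experiment)   # False
```

Achieving the same round trip in pure C++ generally means writing per-type serialization code by hand or adopting a dedicated library, which is the asymmetry the paragraph above refers to.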
924 .. [VELD1995] T. Veldhuizen, "Expression Templates," C++ Report,
925 Vol. 7 No. 5 June 1995, pp. 26-31.
926 http://osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.html
932 .. [#proto] In retrospect, it seems that "thinking hybrid" from the
933 ground up might have been better for the NLP system: the
natural component boundaries defined by the pure Python
935 prototype turned out to be inappropriate for getting the
936 desired performance and memory footprint out of the C++ core,
937 which eventually caused some redesign overhead on the Python
938 side when the core was moved to C++.
940 .. [#test] We also have some reservations about driving all C++
941 testing through a Python interface, unless that's the only way
942 it will be ultimately used. Any transition across language
943 boundaries with such different object models can inevitably
946 .. [#feature] These features were expressed very differently in v1 of