[library Boost.PolyCollection [quickbook 1.6] [authors [López Muñoz, Joaquín M]] [copyright 2016-2020 Joaquín M López Muñoz] [category containers] [id poly_collection] [dirname poly_collection] [purpose High-performance containers for polymorphic objects. ] [license Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at [@http://www.boost.org/LICENSE_1_0.txt]) ] ] [def _Boost.TypeErasure_ [@boost:/libs/type_erasure Boost.TypeErasure]] [def _Boost.TypeIndex_ [@boost:/libs/type_index Boost.TypeIndex]] [def _CopyConstructible_ [@http://en.cppreference.com/w/cpp/named_req/CopyConstructible [* `CopyConstructible`]]] [def _duck_typing_ [@https://en.wikipedia.org/wiki/Duck_typing ['duck typing]]] [def _EqualityComparable_ [@http://en.cppreference.com/w/cpp/named_req/EqualityComparable [* `EqualityComparable`]]] [def _ForwardIterator_ [@http://en.cppreference.com/w/cpp/named_req/ForwardIterator [* `ForwardIterator`]]] [def _MoveConstructible_ [@http://en.cppreference.com/w/cpp/named_req/MoveConstructible [* `MoveConstructible`]]] [def _MoveAssignable_ [@http://en.cppreference.com/w/cpp/named_req/MoveAssignable [* `MoveAssignable`]]] [def _std::for_each_ [@http://en.cppreference.com/w/cpp/algorithm/for_each `std::for_each`]] [def _std::function_ [@http://en.cppreference.com/w/cpp/utility/functional/function `std::function`]] [def _std::unordered_multiset_ [@http://en.cppreference.com/w/cpp/container/unordered_multiset `std::unordered_multiset`]] [def _RandomAccessIterator_ [@http://en.cppreference.com/w/cpp/named_req/RandomAccessIterator [* `RandomAccessIterator`]]] [template ticket[number]''''''#[number]''''''] [template github[module number]''''''#[number]''''''] [template github_pr[module number]''''''PR#[number]''''''] [section Introduction] Dynamic polymorphism in C++ requires that objects (such as instances of classes derived from an abstract base) be accessed through an indirection pointer because their actual /type/ and /size/ are not known at the point of usage. As a consequence, regular containers cannot store polymorphic objects directly: the usual workaround is to have containers of pointers to heap-allocated elements. In modern computer architectures this pattern incurs two types of inefficiency: * The lack of memory contiguity produced by heap allocation degrades CPU cache performance. * Executing virtual operations on a sequence of polymorphic objects whose actual types differ from one to the next results in failures in [@https://en.wikipedia.org/wiki/Branch_predictor branch prediction] and a lower execution speed. When the particular traversal order is not relevant to the user application, Boost.PolyCollection proposes an alternative data structure that restores memory contiguity and packs elements according to their concrete type. Three container class templates are provided: * `boost::base_collection` * `boost::function_collection` * `boost::any_collection` respectively dealing with three different types of dynamic polymorphism available in C++: * Classic base/derived or OOP polymorphism. * Function wrapping in the spirit of _std::function_. * So-called _duck_typing_ as implemented by _Boost.TypeErasure_. The interface of these containers closely follows that of standard containers. Additionally, the library provides versions of many of the standard library algorithms (including [@http://en.cppreference.com/w/cpp/algorithm/for_each `std::for_each`]) with improved performance and a special feature called /type restitution/ that allows user code to provide clues on the concrete types of the elements stored for further opportunities of increased efficiency related to inlining and [@http://hubicka.blogspot.com.es/2014/01/devirtualization-in-c-part-1.html ['devirtualization]]. [note Boost.PolyCollection is a header-only library. C++11 support is required. The library has been verified to work with Visual Studio 2015, GCC 4.8 and Clang 3.3.] [endsect] [section An efficient polymorphic data structure] Suppose we have a `base` abstract class with implementations `derived1`, `derived2` and `derived3`. The memory layout of a `std::vector` (or similar constructs such as `std::vector>` or [@boost:/libs/ptr_container/ `boost::ptr_vector`]) looks like the following: [$poly_collection/img/ptr_vector.png] Elements that are adjacent in the vector are not necessarily allocated contiguously, much less so if the vector has undergone mid insertions and deletions. A typical processing operation std::vector v; ... for(base* b: v){ ... // access base's virtual interface } is impacted negatively by two factors: * Scattering of elements throughout memory reduces CPU caching efficiency, which in general favor regular access loops to contiguous memory areas. * [@https://en.wikipedia.org/wiki/Branch_predictor Branch prediction] tries to minimize the effect of running conditional code (such as an `if`-`else` statement or the invocation of a `base` virtual function) by speculatively executing a given branch based on past history. This mechanism is rendered mostly useless when `derived1`, `derived2` and `derived3` elements are interspersed along the sequence without a definite pattern. These limitations are imposed by the very nature of dynamic polymorphism: as the exact types of the elements accessed through `base`'s interface are not known, an indirection through `base*` (a particular form of /type erasure/) is required. There is however a critical observation: even though derived types are not known when traversing a `std::vector`, the information is typically available /at compile time/ at the point of insertion in the vector: std::vector v; ... v.insert(new derived2{...}); // the type derived2 is "forgotten" by v A suitably designed container can take advantage of this information to arrange elements contiguously according to their exact type, which results in an internal data structure (a map of pointers to [@http://en.cppreference.com/w/cpp/types/type_info `std::type_info`] objects, actually) pointing to as many vectors or /segments/ as there are derived classes: [$poly_collection/img/segment_map.png] Traversing such a structure reduces to looping over all the segments one after another: this is extremely efficient both in terms of caching and branch prediction. In the process we have however lost the free-order capability of a `std::vector` (free order can only be retained at the segment level), but if this is not relevant to the user application the potential performance gains of switching to this structure are large. The discussion has focused on base/derived programming, also known as OOP, but it also applies to other forms of dynamic polymorphism: * _std::function_ abstracts callable entities with the same given signature under a common interface. Internally, pointer indirections and virtual-like function calls are used. Memory fragmentation is expected to be lower than with OOP, though, as implementations usually feature the so-called [@http://talesofcpp.fusionfenix.com/post-16/episode-nine-erasing-the-concrete#a-note-on-small-buffer-optimization [' small buffer optimization]] to avoid heap allocation in some situations. * The case of `std::function` can be seen as a particular example of a more general form of polymorphism called _duck_typing_, where unrelated types are treated uniformly if they conform to the same given /interface/ (a specified set of member functions and/or operations). Duck typing provides the power of OOP while allowing for greater flexibility as polymorphic types need not derive from a preexisting base class or in general be designed with any particular interface in mind --in fact, the same object can be duck-typed to different interfaces. Among other libraries, _Boost.TypeErasure_ provides duck typing for C++. Under the hood, duck typing requires pointer indirection and virtual call implementation techniques analogous to those of OOP, and so there are the same opportunities for efficient container data structures as we have described. Boost.PolyCollection provides three different container class templates dealing with OOP, `std::function`-like polymorphism and duck typing as implemented by Boost.TypeErasure: * `boost::base_collection` * `boost::function_collection` * `boost::any_collection` The interfaces of these containers are mostly the same and follow the usual conventions of standard library containers. [endsect] [section Tutorial] [section Basics] [import ../example/rolegame.hpp] [section `boost::base_collection`] [import ../example/basic_base.cpp] (Code samples from [@boost:/libs/poly_collection/example/basic_base.cpp `basic_base.cpp`].) Imagine we are developing a role playing game in C++ where sprites are rendered on screen; for the purposes of illustration we can model rendering simply as outputting some information to a `std::ostream`: [rolegame_1] The game features warriors, juggernauts (a special type of warrior) and goblins, each represented by its own class derived (directly or indirectly) from `sprite`: [rolegame_2] Let us populate a polymorphic collection with an assortment of sprites: [basic_base_1] There are two aspects to notice here: * `boost::base_collection` does not have a `push_back` member function like, say, `std::vector`, as the order in which elements are stored cannot be freely chosen by the user code \u2014we will see more about this later. The insertion mechanisms are rather those of containers like _std::unordered_multiset_, namely `insert` and `emplace` with or without a position /hint/. * Elements are not created with `new` but constructed on the stack and passed directly much like one would do with a standard non-polymorphic container. [important Elements inserted into a `boost::base_collection` (or the other containers of Boost.PolyCollection) must be copyable and assignable; strictly speaking, they must at least model _MoveConstructible_ and either be _MoveAssignable_ or not throw on move construction. This might force you to revisit your code as it is customary to explicitly forbid copying at the base level of a virtual hierarchy to avoid [@https://en.wikipedia.org/wiki/Object_slicing ['slicing]].] Rendering can now be implemented with a simple `for` loop over `c`: [basic_base_2] The output being: [pre juggernaut 0,juggernaut 4,juggernaut 7,goblin 1,goblin 3,goblin 5,warrior 2,warrior 6 ] As we have forewarned, the traversal order does not correspond to that of insertion. Instead, the elements are grouped in /segments/ according to their concrete type. Here we see that juggernauts come first, followed by goblins and warriors. In general, no assumptions should be made about segment ordering, which may be different for this very example in your environment. On the other hand, elements inserted into an already existing segment always come at the end (except if a hint is provided). For instance, after inserting a latecomer goblin with `id==8`: [basic_base_3] the rendering loop outputs (new element in red): [pre juggernaut 0,juggernaut 4,juggernaut 7,goblin 1,goblin 3,goblin 5,[role red goblin 8],warrior 2,warrior 6 ] Deletion can be done in the usual way: [basic_base_4] Now rendering emits: [pre juggernaut 0,juggernaut 4,goblin 1,goblin 3,goblin 5,goblin 8,warrior 2,warrior 6 ] [endsect] [section `boost::function_collection`] [import ../example/basic_function.cpp] (Code samples from [@boost:/libs/poly_collection/example/basic_function.cpp `basic_function.cpp`]. C++14 `std::make_unique` is used.) Well into the development of the game, the product manager requests that two new types of entities be added to the rendering loop: * Overlaid messages, such as scores, modelled as `std::string`s. * Pop-up windows coming from a third party library that, unfortunately, do not derive from `sprite` and use their own `display` member functon for rendering: [rolegame_3] We then decide to refactor the code so that sprites, messsages and windows are stored in dedicated containers: [basic_function_1] and define our rendering container as a `boost::function_collection` of /callable entities/ \u2014function pointers or objects with a function call `operator()`\u2014 accepting a `std::ostream&` as a parameter [basic_function_2] which we fill with suitable adaptors for `sprite`s, `std::string`s and `window`s, respectively. Lambda functions allow for a particularly terse code. [basic_function_3] The rendering loop now looks like this: [basic_function_4] and produces the following for a particular scenario of sprites, messages and windows: [pre juggernaut 0,goblin 1,warrior 2,goblin 3,"stamina: 10,000","game over",'''[pop-up 1]''','''[pop-up 2]''' ] The container we have just created works in many respects like a `std::vector>`, the main difference being that with `boost::function_collection` callable entities of the same type are packed together in memory [footnote Note that all `sprite`s come into one segment: this is why goblins #1 and #3 are not adjacent. Exercise for the reader: change the code of the example so that sprites are further segmented according to their concrete type.], thus increasing performance (which is the whole point of using Boost.PolyCollection), while a vector of `std::function`s results in an individual allocation for each entity stored [footnote Except when small buffer optimization applies.]. In fact, the `value_type` elements held by a `boost::function_collection` are actually /not/ `std::function`s, although they behave analogously and can be converted to `std::function` if needed: [basic_function_5] There is a reason for this: elements of a polymorphic collection cannot be freely assigned to by the user as this could end up trying to insert an object into a segment of a different type. So, unlike with `std::function`, this will not work: [basic_function_6] [endsect] [section `boost::any_collection`] [import ../example/basic_any.cpp] (Code samples from [@boost:/libs/poly_collection/example/basic_any.cpp `basic_any.cpp`].) [note Here we just touch on the bare essentials of _Boost.TypeErasure_ needed to present `boost::any_collection`. The reader is advised to read Boost.TypeErasure documentation for further information.] After measuring the performance of the latest changes, we find that rendering is too slow and decide to refactor once again: if we could store all the entities --sprites, messages and windows-- into one single container, that would eliminate a level of indirection. The problem is that these types are totally unrelated to each other. _Boost.TypeErasure_ provides a class template `boost::type_erasure::any` able to hold an object of whatever type as long as it conforms to the interface specified by `Concept`. In our case, we find it particularly easy to implement a common interface for rendering by providing overloads of `operator<<` [basic_any_1] so that we can use it to specify the interface all entities have to adhere to: [basic_any_2] The collection just created happily accepts insertion of heterogeneous entities (since interface conformity is silently checked at compile time) [basic_any_3] and rendering becomes [basic_any_4] with output [pre '''[pop-up 1]''','''[pop-up 2]''',juggernaut 0,goblin 1,goblin 3,warrior 2,"stamina: 10,000","game over" ] As was the case with `boost::function_collection`, this container is similar to a `std::vector>` but has better performance due to packing of same-type elements. Also, the `value_type` of a `boost::any_collection` is /not/ `boost::type_erasure::any`, but a similarly behaving entity [footnote Actually, it is `boost::type_erasure::any` for some internally defined `Concept2` that extends `Concept`.]. In any case, we are not accessing sprites through an abstract `sprite&` anymore, so we could as well dismantle the virtual hierarchy and implement each type autonomously: this is left as an exercise to the reader. [endsect] [endsect] [section Deeper into the segmented nature of Boost.PolyCollection] [import ../example/segmented_structure.cpp] (Code samples from [@boost:/libs/poly_collection/example/segmented_structure.cpp `segmented_structure.cpp`]. C++14 `std::make_unique` is used.) [section Type registration] Getting back to our [link poly_collection.tutorial.basics.boost_base_collection `boost::base_collection`] example, suppose we want to refactor the populating code as follows: [segmented_structure_1] Unexpectedly, this piece of code throws an exception of type [link poly_collection.reference.header_boost_poly_collection_exc.class_unregistered_type `boost::poly_collection::unregistered_type`]. What has changed from our original code? Suppose a `warrior` has been created by `make_sprite`. The statement `c.insert(*make_sprite())` is passing the object as a `sprite&`: even though `boost::base_collection` is smart enough to know that the object is actually derived from `sprite` (by using [@http://en.cppreference.com/w/cpp/language/typeid `typeid()`]) and [@https://en.wikipedia.org/wiki/Object_slicing slicing] is to be avoided, there is no way that a segment for it can be created without accessing the type `warrior` /at compile time/ for the proper internal class templates to be instantiated [footnote If this is conceptually difficult to grasp, consider the potentially more obvious case where `warrior` is defined in a dynamic module linked to the main program: the code of `boost::base_collection`, which has been compiled before linking, cannot even know the size of this as-of-yet unseen class, so hardly can it allocate a segment for the received object.]. This did not happen in the pre-refactoring code because objects were passed as references to their true types. A type is said to be /registered/ into a polymorphic collection if there is a (potentially empty) segment created for it. This can be checked with: [segmented_structure_2] Registration happens automatically when the object is passed as a reference to its true type or [link poly_collection.tutorial.insertion_and_emplacement.emplacement `emplace`]'d, and explicitly with `register_types`: [segmented_structure_3] Once `T` has been registered into a polymorphic collection, it remains so regardless of whether objects of type `T` are stored or not, except if the collection is moved from, assigned to, or swapped. As slicing is mainly an issue with OOP, `unregistered_type` exceptions are expected to be much less frequent with `boost::function_collection` and `boost::any_collection`. Contrived examples can be found, though: [segmented_structure_4] [endsect] [section Segment-specific interface] For most of the interface of a polymorphic collection, it is possible to only refer to the elements of a given segment by providing either their type or `typeid()`. For instance: [segmented_structure_5] Note that any of these (except [link poly_collection.tutorial.deeper_into_the_segmented_nature.reserve `reserve`]) will throw `boost::poly_collection::unregistered_type` if the type is not registered. Segment-specific addressability also includes traversal: [endsect] [section Local iterators] The following runs only over the `warrior`s of the collection: [segmented_structure_6] Output: [pre warrior 2,warrior 6 ] `begin|end(typeid(T))` return objects of type `local_base_iterator` associated to the segment for `T`. These iterators dereference to the same value as regular iterators (in the case of `boost::base_collection`, `base&`) but can only be used to traverse a given segment (for instance, `local_base_iterator`'s from different segments cannot be compared between them). In exchange, `local_base_iterator` is a _RandomAccessIterator_, whereas whole-collection iterators only model _ForwardIterator_. A terser range-based `for` loop can be used with the convenience `segment` member function: [segmented_structure_7] Even more powerful than `local_base_iterator` is `local_iterator`: [segmented_structure_8] This appends a `"super"` prefix to the `rank` data member of existing warriors: [pre superwarrior 2,superwarrior 6 ] The observant reader will have noticed that in order to access `rank`, which is a member of `warrior` rather than its base class `sprite`, `first` (or `x` in the range `for` loop version) has to refer to a `warrior&`, and this is precisely the difference between `local_iterator` (the type returned by `begin()`) and `local_base_iterator`. `local_iterator` is also a _RandomAccessIterator_: for most respects, \[`begin()`, `end()`) can be regarded as a range over an array of `T` objects. `local_iterator`s can be explicitly converted to `local_base_iterator`s. Conversely, if a `local_base_iterator` is associated to a segment for `T`, it can then be explictly converted to a `local_iterator` (otherwise the conversion is undefined behavior). The figure shows the action scopes of all the iterators associated to a polymorphic collection (`const_` versions not included): [$ poly_collection/img/poly_collection_iterators.png] We have yet to describe the bottom part of the diagram. Whereas `segment(typeid(T))` is used to range over the /elements/ of a particular segment, `segment_traversal()` returns an object for ranging over /segments/, so that the whole collection can be processed with a nested segment-element `for` loop like the following: [segmented_structure_9] Segment decomposition of traversal loops forms the basis of [link poly_collection.tutorial.algorithms improved-performance algorithms]. [endsect] [section Reserve] Much like `std::vector`, segments can be made to reserve memory in advance to minimize reallocations: [segmented_structure_10] If there is no segment for the indicated type (here, `goblin`), one is automatically created. This is in contrast with the rest of segment-specific member functions, which throw [link poly_collection.tutorial.deeper_into_the_segmented_nature.type_registration `boost::poly_collection::unregistered_type`]. `reserve` can deal with all segments at once. The following [segmented_structure_11] instructs every /existing/ segment to reserve 1,000 elements. If a segment is created later (through element insertion or with [link poly_collection.tutorial.deeper_into_the_segmented_nature.type_registration type registration]), its capacity will be different. [note Unlike standard containers, collection-level `capacity()` and `max_size()` are not provided because their usual semantics cannot be applied to Boost.PolyCollection: for instance, `capacity()` is typically used to check how many elements can be inserted without reallocation, but in a segmented structure that depends on the exact types of the elements and whether they are registered or not. ] [endsect] [endsect] [section Insertion and emplacement] [import ../example/insertion_emplacement.cpp] (Code samples from [@boost:/libs/poly_collection/example/insertion_emplacement.cpp `insertion_emplacement.cpp`].) We already know that `insert(x)`, without further positional information, stores `x` at the end of its associated segment. When a regular iterator `it` is provided, insertion happens at the position indicated if both `it` and `x` belong in the same segment; otherwise, `it` is ignored. For instance, if our sprite collection has the following entities: [pre juggernaut 0,juggernaut 4,juggernaut 7,goblin 1,goblin 3,goblin 5,warrior 2,warrior 6 ] then this code: [insertion_emplacement_1] puts the new `juggernaut` at the beginning: [pre [role red juggernaut 8],juggernaut 0,juggernaut 4,juggernaut 7,goblin 1,... ] whereas the position hint at [insertion_emplacement_2] is ignored and the new `goblin` gets inserted at the end of its segment: [pre juggernaut 8,...,juggernaut 7,goblin 1,goblin 3,goblin 5,[role red goblin 9],warrior 2,... ] [link poly_collection.tutorial.deeper_into_the_segmented_nature.local_iterators Local iterators] can also be used to indicate the insertion position: [insertion_emplacement_3] [pre juggernaut 8,juggernaut 0,[role red juggernaut 10],juggernaut 4,juggernaut 7,goblin 1,... ] There is a caveat, though: when using a local iterator, the element inserted /must belong to the indicated segment/: [insertion_emplacement_4] Member functions are provided for range insertion, with and without a position hint: [insertion_emplacement_5] As [link poly_collection.tutorial.deeper_into_the_segmented_nature.type_registration already explained], care must be taken that types be pre-registered into the collection if they are not passed as references to their actual type. So, why did not this line [insertion_emplacement_6] throw `boost::poly_collection::unregistered_type`? As it happens, in the special case where the inserted range belongs to a polymorphic collection of the same type, registration is done automatically [footnote That is, Boost.PolyCollection has enough static information to do type registration without further assistance from the user.]. [#poly_collection.tutorial.insertion_and_emplacement.emplacement] Emplacement is slightly different for Boost.PolyCollection than with standard containers. Consider this attempt at emplacing a `goblin`: [insertion_emplacement_7] If examined carefully, it is only natural that the code above not compile: Boost.PolyCollection cannot possibly know that it is precisely a `goblin` that we want to emplace and not some other type constructible from an `int` (like `warrior`, `juggernaut` or an unrelated class). So, the type of the emplaced element has to be specified explicitly: [insertion_emplacement_8] As with `insert`, a position can be indicated for emplacing: [insertion_emplacement_9] Note the naming here: `emplace_hint` is used when the position is indicated with a regular iterator, and for local iterators it is `emplace_pos`. [endsect] [section Exceptions] [import ../example/exceptions.cpp] (Code samples from [@boost:/libs/poly_collection/example/exceptions.cpp `exceptions.cpp`].) Besides the usual exceptions like [@http://en.cppreference.com/w/cpp/memory/new/bad_alloc `std::bad_alloc`] and those generated by user-provided types, polymorphic collections can throw the following: * `boost::poly_collection::unregistered_type` * `boost::poly_collection::not_copy_constructible` * `boost::poly_collection::not_equality_comparable` The situations in which the first is raised have been [link poly_collection.tutorial.deeper_into_the_segmented_nature.type_registration already discussed]; let us focus on the other two. We have a new type of sprite in our game defined by the non-copyable class `elf`: [rolegame_4] and we use it without problems until we write this: [exceptions_1] The first insertion works because the `elf` object passed is /temporary/ and can be /moved/ into the container, but the second statement actually needs to /copy/ the `elf` elements in `c` to `c2`, hence the exception. The potentially surprising aspect of this behavior is that standard containers signal this kind of problems by /failing at compile time/. Here we cannot afford this luxury because the exact types contained in a polymorphic collection are not known until run time (for instance, if `elf` elements had been erased before copying `c` to `c2` everything would have worked): basically, the deferral of errors from compile time to run time is an intrinsic feature of dynamic polymorphism. In the same vein, equality comparison requires that elements themselves be equality comparable: [exceptions_2] The above is unremarkable once we notice we have not defined `operator==` for any `sprite`. The problem may go unnoticed for quite some time, however, because determining that two polymorphic collections are equal (i.e. all their non-empty segments are equal) can return `false` without comparing anything at all (for instance, if segment sizes differ), in which case no exception is thrown. [note Operators for `<`, `<=`, `>` and `>=` comparison are not provided because /segment/ order is not fixed and may vary across otherwise identical collections. The situation is similar to that of standard [@http://en.cppreference.com/w/cpp/named_req/UnorderedAssociativeContainer unordered associative containers.] ] These three are all the intrinsic exceptions thrown by Boost.PolyCollection. The implication is that if a type is _CopyConstructible_, _MoveAssignable_ (or move construction does not throw) and _EqualityComparable_, then the entire interface of Boost.PolyCollection is unrestrictedly available for it [footnote Provided, of course, that the type has the /right/ to be in the collection, that is, it is derived from the specified base, or callable with the specified signature, etc.]. [endsect] [section Algorithms] [import ../example/algorithms.cpp] (Code samples from [@boost:/libs/poly_collection/example/algorithms.cpp `algorithms.cpp`]. C++14 generic lambda expressions are used.) The ultimate purpose of Boost.PolyCollection is to speed up the processing of large quantities of polymorphic entities, in particular for those operations that involve linear traversal as implemented with a `for`-loop or using the quintessential _std::for_each_ algorithm. [algorithms_1] Replacing the container used in the program from classic alternatives to Boost.PolyCollection: * `std::vector` (or similar) → `boost::base_collection` * `std::vector>` → `boost::function_collection` * `std::vector>` → `boost::any_collection` is expected to increase performance due to better caching and branch prediction behavior. But there is still room for improvement. Consider this transformation of the code above using a double segment-element loop (based on the [link poly_collection.tutorial.deeper_into_the_segmented_nature.local_iterators local iterator] capabilities of Boost.PolyCollection): [algorithms_2] Although not obvious at first sight, this version of the same basic operation is actually /faster/ than the first one: for a segmented structure such as used by Boost.PolyCollection, each iteration with the non-local iterator passed to `std::for_each` involves: # hopping to next position in the segment, # checking for /end of segment/ (always), # if at end (infrequent), selecting the next segment, # checking for /end of range/ (always), whereas in the second version, iteration on the inner loop, where most processing happens, is a simple increment-and-check operation, that is, there is one less check (which happens at the much shorter outer loop). When the workload of the algorithm (the actually useful stuff done with the elements themselves) is relatively light, the overhead of looping can be very significant. To make it easier for the user to take advantage of faster segment-element looping, Boost.PolyCollection provides its own version of `for_each` based on that technique: [algorithms_3] `boost::poly_collection::for_each` has the same interface and behavior as `std::for_each` except for the fact that it only works for (non-local) iterators of a polymorphic container [footnote For any other type of iterator, it is guaranteed not to compile.]. Versions of other standard algorithms are provided as well: [algorithms_4] In fact, variants are given of most (though not all) of the algorithms in [@http://en.cppreference.com/w/cpp/algorithm ``] among those that are compatible with polymorphic collections [footnote For example, algorithms requiring bidirectional iterators or a higher category are not provided because polymorphic collections have forward-only iterators.]. Check the [link poly_collection.reference.header_boost_poly_collection_alg reference] for details. [section Type restitution] By /type restitution/ we mean the generic process of getting a concrete entity from an abstract one by providing missing type information: [algorithms_5] Type restitution has the potential to increase performance. Consider for instance the following: [algorithms_6] Behaviorwise this code is equivalent to simply executing `std::cout< ''' ] [role aligncenter [*[name], [environment]]] [endsect] ] [import ../example/perf.cpp] (Testing program at [@boost:/libs/poly_collection/example/perf.cpp `perf.cpp`].) We ran tests to measure the performance of the containers of Boost.PolyCollection in two scenarios: * Insertion of elements. * Linear traversal and processing with `std::for_each` and `boost::poly_collection::for_each` (with and without [link poly_collection.tutorial.algorithms.type_restitution type restitution]). As a comparison baseline we used containers and facilities from the standard library and Boost (details below). Tests were run in the following environments: * *_vs2015_x86_*: Visual C++ 2015 in 32-bit (x86) release mode, Windows 7 64-bit, Intel Core i5-2520M @2.5GHz * *_vs2015_x64_*: Visual C++ 2015 in 64-bit (x64) release mode, same machine * *_gcc63_x64_*: GCC 6.3 release mode, Xubuntu 17.04 x64, Intel Core i7-5820 @3.3GHz * *_clang40_x64_*: Clang 4.0 release mode, same machine [section Container definitions] [section `boost::base_collection`] * Baseline container: `ptr_vector` = `boost::ptr_vector` * Polymorphic collection: `base_collection` = `boost::base_collection` * Element types: `T1` = `derived1`, `T2` = `derived2`, `T3` = `derived3` [perf_base_types] [endsect] [section `boost::function_collection`] * Baseline container: `func_vector` = `std::vector>` * Polymorphic collection: `function_collection` = `boost::function_collection` * Element types: `T1` = `concrete1`, `T2` = `concrete2`, `T3` = `concrete3` [perf_function_types] [endsect] [section `boost::any_collection`] * Baseline container: `any_vector` = `std::vector>` * Polymorphic collection: `any_collection` = `boost::any_collection` * Element types: `T1` = `int`, `T2` = `double`, `T3` = `char` [perf_any_types] [endsect] [endsect] [section Insertion tests] Tests measure the time taken to insert /n/ elements (/n/ between 10'''2''' and 10'''7''') from a source of values with types following the cyclic sequence `T1` `T1` `T2` `T2` `T3` `T1` `T1` `T2` `T2` `T3` ··· No `reserve` operation is done before insertion. The figures show resulting times in nanoseconds/element. The horizontal axis is logarithmic. [section Results for `boost::base_collection`] [perf_fig Insertion.._vs2015_x86_..insert_base_vs2015_x86] [perf_fig Insertion.._vs2015_x64_..insert_base_vs2015_x64] [perf_fig Insertion.._gcc63_x64_..insert_base_gcc63_x64] [perf_fig Insertion.._clang40_x64_..insert_base_clang40_x64] [endsect] [section Results for `boost::function_collection`] [perf_fig Insertion.._vs2015_x86_..insert_function_vs2015_x86] [perf_fig Insertion.._vs2015_x64_..insert_function_vs2015_x64] [perf_fig Insertion.._gcc63_x64_..insert_function_gcc63_x64] [perf_fig Insertion.._clang40_x64_..insert_function_clang40_x64] [endsect] [section Results for `boost::any_collection`] [perf_fig Insertion.._vs2015_x86_..insert_any_vs2015_x86] [perf_fig Insertion.._vs2015_x64_..insert_any_vs2015_x64] [perf_fig Insertion.._gcc63_x64_..insert_any_gcc63_x64] [perf_fig Insertion.._clang40_x64_..insert_any_clang40_x64] [endsect] [endsect] [section Processing tests] Tests measure the time taken to traverse a container of size /n/ (/n/ between 10'''2''' and 10'''7''') and execute an operation on each of its elements. The operation for `boost::base_collection` and `boost::function_collection` (and the associated baseline containers) is defined as [perf_for_each_callable] whereas for `boost::any_collection` we use [perf_for_each_incrementable] The baseline container is tested with three different setups: * Directly as initialized by the process described for the [link poly_collection.performance.insertion_tests insertion tests]. The sequence of types is complex enough that CPU's branch prediction mechanisms are not able to fully anticipate it [footnote This has been verified empirically: simpler cycles did indeed yield better execution times.]. As elements are ordered according to their construction time, certain degree of memory contiguity is expected. * With an extra post-insertion stage by which elements are sorted according to their `typeid()`. This increases branch prediction efficiency at the expense of having worse cache friendliness. * With an extra post-insertion stage that randomly shuffles all the elements in the container. This is the worst possible scenario both in terms of caching and branch prediction. As for the polymorphic collection, three variations are measured: * With `std::for_each` (the same as the baseline container). * Using `boost::poly_collection::for_each`. * Using `boost::poly_collection::for_each` with [link poly_collection.tutorial.algorithms.type_restitution /type restitution/] of `T1`, `T2` and `T3`. The figures show resulting times in nanoseconds/element. The horizontal axis is logarithmic. [section Results for `boost::base_collection`] [perf_fig Processing.._vs2015_x86_..for_each_base_vs2015_x86] [perf_fig Processing.._vs2015_x64_..for_each_base_vs2015_x64] [perf_fig Processing.._gcc63_x64_..for_each_base_gcc63_x64] [perf_fig Processing.._clang40_x64_..for_each_base_clang40_x64] [endsect] [section Results for `boost::function_collection`] [perf_fig Processing.._vs2015_x86_..for_each_function_vs2015_x86] [perf_fig Processing.._vs2015_x64_..for_each_function_vs2015_x64] [perf_fig Processing.._gcc63_x64_..for_each_function_gcc63_x64] [perf_fig Processing.._clang40_x64_..for_each_function_clang40_x64] [endsect] [section Results for `boost::any_collection`] [perf_fig Processing.._vs2015_x86_..for_each_any_vs2015_x86] [perf_fig Processing.._vs2015_x64_..for_each_any_vs2015_x64] [perf_fig Processing.._gcc63_x64_..for_each_any_gcc63_x64] [perf_fig Processing.._clang40_x64_..for_each_any_clang40_x64] [endsect] [endsect] [endsect] [include reference.qbk] [section Future work] A number of features asked by reviewers and users of Boost.PolyCollection are considered for inclusion into future versions of the library. [section Alternative RTTI systems] Boost.PolyCollection can be extended to use _Boost.TypeIndex_ in RTTI-challenged scenarios. Taking this idea further, it is not unusual that some environments (game engines, for instance) provide their own RTTI framework: an even more ambitious extension to Boost.PolyCollection would then be to make it configurable for user-provided RTTI through some sort of traits class specifying replacements for [@http://en.cppreference.com/w/cpp/types/type_info `std::type_info`] and [@http://en.cppreference.com/w/cpp/language/typeid `typeid`]. [endsect] [section Copy traits] `boost::base_collection` requires that stored objects be _MoveConstructible_ and _MoveAssignable_; unfortunately, it is customary to restrict copying in OOP hierarchies to avoid slicing, which would force users to revisit their class definitions in order to use Boost.PolyCollection. This can be alleviated by offering a configurable traits class where copy and assignment can be defined externally `` template struct copy_traits { void construct(void*,T&&); void assign(T&,T&&); }; `` with default implementations resorting to regular placement `new` and `T::operator=`. [endsect] [section Parallel algorithms] C++17 introduces [@http://en.cppreference.com/w/cpp/experimental/parallelism parallel algorithms], like for instance a parallel version of _std::for_each_; it is only natural then to provide the corresponding [link poly_collection.tutorial.algorithms Boost.PolyCollection-specific algorithms]. The segmented nature of polymorphic collections makes them particularly amenable to parallel processing. [endsect] [section `variant_collection`] /Closed polymorphism/ is a kind of dynamic polymorphism where the set of implementation types is fixed at definition time: the prime example of this paradigm in C++ is [@http://en.cppreference.com/w/cpp/utility/variant `std::variant`]. Although `boost::any_collection>` can act as a sort of replacement for `std::vector>`, this is in fact more similar to a `std::vector<`[@http://en.cppreference.com/w/cpp/utility/any `std::any`]`>`, and a collection class `boost::variant_collection` could be designed to better model closed polymorphism and take further advantage of the fact that implementation types are fixed (for instance, internal virtual calls can be completely eliminated). From a conceptual point of view, this would require introducing a new [* `ClosedPolymorphicCollection`] notion and renaming the current [link poly_collection.reference.polymorphic_containers.polymorphic_collections [* `PolymorphicCollection`]] model to [* `OpenPolymorphicCollection`]. [endsect] [section Ordered polymorphic collections] Users have expressed interest in polymorphic collections where elements are kept ordered within their segment and optionally duplicates are excluded, much like [@http://boost.org/doc/html/container/non_standard_containers.html#container.non_standard_containers.flat_xxx `boost::flat_set`/`boost::flat_multiset`] do over their internal data vector. The open question remains of whether these collections should also guarantee some order between segments (current ones don't) to allow for the definition of container-level `operator<` and related operators. [endsect] [endsect] [section Release notes] [section Boost 1.73] * Suppressed a potential redundant move warning in `boost::poly_collection::for_each`. * Fixed a bug by which elements were copied rather than moved in allocator-extended move construction and move assigment between collections with non-propagating, unequal allocators. * Allocator-extended move construction no longer decays to allocator-extended copy construction for the legacy version of libstdc++-v3 shipped with GCC 4.8 (which can also be used by Clang). [endsect] [section Boost 1.72] * Maintenance work. [endsect] [section Boost 1.71] * Maintenance work. [endsect] [section Boost 1.70] * Improved handling of stateful allocators and allocator propagation traits, after an error reported by Billy O'Neal ([github_pr poly_collection 9]). * Fixed a potentially serious bug with an internal cache structure. [endsect] [section Boost 1.69] * Added Boost.PolyCollection-specific versions of algorithms `std::for_each_n` and `std::sample`. [endsect] [section Boost 1.67] * Maintenance fixes. [endsect] [section Boost 1.66] * Boost.PolyCollection has been backported to GCC 4.8 to 4.9 and Clang 3.3 to 3.6. The version of libstdc++-v3 shipped with GCC 4.8 (which can also be used by Clang) has deficiencies that result in the following limitations when using Boost.PolyCollection: * Stateful allocators are not properly supported. * Allocator-extended move construction decays to allocator-extended copy construction. * Copy construction crashes if an exception is thrown during element copying. * Maintenance fixes. [endsect] [section Boost 1.65] * Initial release. [endsect] [endsect] [section Acknowledgments] The library uses [@http://www.pdimov.com/ Peter Dimov]'s [@http://www.pdimov.com/cpp2/simple_cxx11_metaprogramming.html implementation of `std::make_index_sequence`]. [@http://manu343726.github.io/ Manu Sánchez] aided immensely with CI setup and performance testing. [@http://stackoverflow.com/users/85371/sehe Sehe] contributed performance results for GCC 5.2, and [@https://www.linkedin.com/in/francisco-jose-tapia-iba%C3%B1ez-4239a07a Francisco José Tapia] for Clang 3.9. The Boost acceptance review took place between the 3rd and 17th of May, 2017. Thank you to Ion Gaztañaga for smoothly managing the process. The following people participated with full reviews or valuable comments: Pete Bartlett, Hans Dembinski, Dominique Devienne, Edward Diener, Vinnie Falco, Ion Gaztañaga, Andrzej Krzemienski, Brook Milligan, Thorsten Ottosen, Steven Watanabe, Adam Wulkiewicz. Many thanks to all of them. Steven Watanabe gave crucial help in solving some hard problems related to the usage of _Boost.TypeErasure_. Boost.PolyCollection was designed and written between rainy [@https://es.wikipedia.org/wiki/Viav%C3%A9lez Viavélez], noisy Madrid and beautiful [@https://en.wikipedia.org/wiki/C%C3%A1ceres,_Spain Cáceres], August-November, 2016. Most of the after-review work in preparation for the official Boost release was done in the quiet town of [@https://es.wikipedia.org/wiki/Oropesa_(Toledo) Oropesa] during the spring of 2017. In memory of Joaquín López Borrella (1939-2015), in memory of Héctor (2004-2017): may your ghosts keep us company. [endsect]