Iterators: The Devil's Abstraction
So, I'm writing a C++ application framework called
Crucible. Among other things, Crucible
provides a "supplemental standard library" for C++: code that is intended to add to, improve upon,
or straight up replace parts of the STL. Today I'd like to talk a bit about the iterator module,
which is both one of the most useful and flexible components, but also one of the most complex. That
complexity arises from two places: iterators inherently having non-trivial ownership semantics, and
having to provide an adaptor for the the awful STL iterator API.
Iterator ownership
There are three broad kinds of iterators:
- Copy-only iterators, which do not change the original sequence
- Move-only iterators, which always consume the original sequence
- Borrowing iterators, which allow reading or changing the original sequence in-place
Copy-only and move-only iterators are actually pretty simple, because they don't require any references.
STL Iterators
A note about range-based for
loops
So, let's get something out of the way: a lot of what I'm about to say doesn't matter if you only
care about supporting range-based for
loops, which only require:
- A
begin()
method that returns some kind of iterator-like object
- An
end()
method that returns some kind of sentinel object that is equality-comparable with the
object returned by begin()
- An
operator++()
that can return whatever it wants
- A possibly-
const
operator*()
that returns the item that the iterator is currently pointing at,
by value or by reference
The range-based for
loop is purely syntactic sugar; it was added in C++11, whereas iterator
concepts were added in C++20.
I think that modeling the iterator concepts properly is worthwhile, because however overcomplicated
they are, they do a good job of communicating the expectations that the STL is able to make for
different types of algorithms. But if you want to make "something that works with for
loops" and
never intend to pass it to something like std::ranges::find()
, you can ignore the rest of this
section.
Iterator concepts
Here is the full interface you need to provide to satisfy std::input_or_output_iterator
, the most basic
iterator concept provided by the STL:
- A member type called
difference_type
that is some kind of signed integer
- An
operator++()
that returns a T &
- An
operator++(int)
that can return whatever
- An
operator*()
that returns something other than void
That's not too bad. Maybe a bit verbose, but if it's just for an adaptor class, it's doable. But wait!
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
# Iterators: The Devil's Abstraction
So, I'm writing a C⁠+⁠+ application framework called
[Crucible](https://git.echowritescode.dev/repository/crucible/head). Among other things, Crucible
provides a "supplemental standard library" for C⁠+⁠+: code that is intended to add to, improve upon,
or straight up replace parts of the STL. Today I'd like to talk a bit about the iterator module,
which is both one of the most useful and flexible components, but also one of the most complex. That
complexity arises from two places: iterators inherently having non-trivial ownership semantics, and
having to provide an adaptor for the the awful STL iterator API.
## Iterator ownership
There are three broad kinds of iterators:
- Copy-only iterators, which do not change the original sequence
- Move-only iterators, which always consume the original sequence
- Borrowing iterators, which allow reading or changing the original sequence in-place
Copy-only and move-only iterators are actually pretty simple, because they don't require any references.
## STL Iterators
### A note about range-based `for` loops
So, let's get something out of the way: a lot of what I'm about to say doesn't matter if you only
care about supporting range-based `for` loops, which only require:
- A `begin()` method that returns some kind of iterator-like object
- An `end()` method that returns some kind of sentinel object that is equality-comparable with the
object returned by `begin()`
- An `operator++()` that can return whatever it wants
- A possibly-`const` `operator*()` that returns the item that the iterator is currently pointing at,
by value or by reference
The range-based `for` loop is purely syntactic sugar; it was added in C⁠+⁠+⁠11, whereas iterator
concepts were added in C⁠+⁠+⁠20.
I think that modeling the iterator concepts properly is worthwhile, because however overcomplicated
they are, they do a good job of communicating the expectations that the STL is able to make for
different types of algorithms. But if you want to make "something that works with `for` loops" and
never intend to pass it to something like `std::ranges::find()`, you can ignore the rest of this
section.
### Iterator concepts
Here is the full interface you need to provide to satisfy `std::input_or_output_iterator`, the most basic
iterator concept provided by the STL:
- A member type called `difference_type` that is some kind of signed integer
- An `operator++()` that returns a `T &`
- An `operator++(int)` that can return whatever
- An `operator*()` that returns something other than `void`
That's not too bad. Maybe a bit verbose, but if it's just for an adaptor class, it's doable. But wait!