This post is part of the Surprises in C++ series.
C++ has a reputation for being one of the fastest languages out there, but this is only the case when it’s written reasonably well. One of the biggest offenders from a performance point of view is unnecessary copying, which can quickly make C++ code extremely slow. One of the main solutions to this problem is to use references and pointers in parameter and return types.
If you want to see just how much of a difference this can make, consider the following simple code:
std::string longestString(
    std::vector<std::vector<std::string>> data) {
  std::string longest = "";
  for (auto subvec : data) {
    for (auto curstring : subvec) {
      if (curstring.size() > longest.size()) {
        longest = curstring;
      }
    }
  }
  return longest;
}
This has a couple of crucial mistakes: in our loops, subvec and curstring are declared by value. This means that the outer loop copies all of the subvectors and their associated strings and then the inner loop copies all of the strings again. Additionally, the data parameter is taken by value, meaning that a caller passing an existing vector must also copy the entire nested vector structure. All of this copying amounts to a huge amount of unnecessary work!
Fixing this example is, thankfully, simple. All we need to do is make the data parameter and loop range parameters const references:
std::string longestString(
    const std::vector<std::vector<std::string>>&
        data) {
  std::string longest = "";
  for (const auto& subvec : data) {
    for (const auto& curstring : subvec) {
      if (curstring.size() > longest.size()) {
        longest = curstring;
      }
    }
  }
  return longest;
}
In my measurements, with a tiny input, the version with unnecessary copying was 15 times slower than the version without, which shows just how huge this cost can be. In this example, while the performance penalty was high, the unnecessary copy was, thankfully, not too hard to spot as the offending variables are explicitly not references. Unfortunately, however, it is not always quite so clear.
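The extra work can also be made visible without a benchmark. The following is a minimal sketch, not the code used for the measurement above: CountedString, Table, countCopies, makeSample and checkCopyCounts are hypothetical names invented for this illustration, and the two functions return only the longest length so that every counted copy comes from the parameter or the loops.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original code): a string wrapper
// that counts how many times it is copy-constructed.
struct CountedString {
    static int copies;
    std::string text;

    explicit CountedString(std::string t) : text(std::move(t)) {}
    CountedString(const CountedString& other) : text(other.text) { ++copies; }
    CountedString(CountedString&&) noexcept = default;
};
int CountedString::copies = 0;

using Table = std::vector<std::vector<CountedString>>;

// Same shape as the first longestString: parameter and loop variables by value.
std::size_t longestByValue(Table data) {
    std::size_t longest = 0;
    for (auto subvec : data) {
        for (auto cur : subvec) {
            if (cur.text.size() > longest) longest = cur.text.size();
        }
    }
    return longest;
}

// Same shape as the fixed version: const references throughout.
std::size_t longestByRef(const Table& data) {
    std::size_t longest = 0;
    for (const auto& subvec : data) {
        for (const auto& cur : subvec) {
            if (cur.text.size() > longest) longest = cur.text.size();
        }
    }
    return longest;
}

// Runs fn on data and reports how many copy constructions it caused.
template <typename Fn>
int countCopies(Fn fn, const Table& data) {
    CountedString::copies = 0;
    fn(data);
    return CountedString::copies;
}

// Two sub-vectors of two strings each: four strings in total.
Table makeSample() {
    Table t(2);
    t[0].emplace_back("a");
    t[0].emplace_back("bbb");
    t[1].emplace_back("cc");
    t[1].emplace_back("d");
    return t;
}

// Self-check: call it from main() or a test.
void checkCopyCounts() {
    const Table sample = makeSample();
    // 4 copies for the parameter + 4 in the outer loop + 4 in the inner loop.
    assert(countCopies(longestByValue, sample) == 12);
    assert(countCopies(longestByRef, sample) == 0);
}
```

With only four strings in the input, the by-value version already copies each of them three times, which is consistent with the explanation of the three separate sources of copying.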
Lifetime Extension
There are often occasions in which we write functions that return a const reference. This might be to allow (logically) read-only access to a protected member variable of an object or a protected global variable (e.g. a function-local static). Another example would be ‘search’-like functions, including a more optimised version of our longestString function above.
To avoid the copy, the caller of this function must also specify they want a reference:
const std::string& getRefToGlobalVariable();
void foo() {
  // Allowed and safe, but makes an unnecessary copy.
  std::string aVal = getRefToGlobalVariable();
  // Better, no copy!
  const std::string& aRef
      = getRefToGlobalVariable();
}
An interesting thing happens, however, if we try the same thing with a return by value function:
std::string getStringByValue();
void foo() {
  // The obvious way, no problem this time
  std::string aVal = getStringByValue();
  // Interesting...
  const std::string& aRef = getStringByValue();
}
At first glance, binding a reference to a temporary like this seems like it shouldn’t work. In the first case, getStringByValue will produce a temporary object which is then moved or copied into aVal as appropriate (ignoring copy elision, which you can read about in part 2 of the ‘C++ Lifetime Quiz’). The temporary is then immediately destroyed.
In the second case, we might expect getStringByValue to produce a temporary object whose reference is saved to aRef only for the temporary to be immediately destroyed. If this happened, aRef would be left as a dangling reference making any usage of it undefined behaviour!
However, that isn’t what happens at all. The C++ standard specifies that this is a special case and must be handled differently: when a temporary is bound directly to a const reference, the lifetime of the temporary is extended to match that of the reference. In effect, a local unnamed variable is created to store the result of getStringByValue by value and aRef becomes a reference to that unnamed local variable. As a result, aRef isn’t a dangling reference at all and is completely safe to use!
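To see the extended lifetime in action, here is a small sketch. Noisy, makeNoisy and lifetimeDemo are hypothetical names made up for this illustration; Noisy simply records when it is constructed and destroyed so the order of events can be inspected.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: appends to a log when it is created and destroyed.
struct Noisy {
    std::vector<std::string>* log;
    explicit Noisy(std::vector<std::string>* l) : log(l) {
        log->push_back("constructed");
    }
    ~Noisy() { log->push_back("destroyed"); }
};

// Returns by value, like getStringByValue in the text.
Noisy makeNoisy(std::vector<std::string>* log) { return Noisy(log); }

std::vector<std::string> lifetimeDemo() {
    std::vector<std::string> log;
    {
        // The temporary returned by makeNoisy is bound to a const
        // reference, so it lives for as long as ref does.
        const Noisy& ref = makeNoisy(&log);
        assert(ref.log == &log);  // ref is safe to use here
        log.push_back("still in scope");
    }  // ref goes out of scope; the extended temporary is destroyed here
    log.push_back("after scope");
    return log;
}
```

Assuming a C++17 compiler (or one that elides the copy in makeNoisy), the log shows the destructor running only after "still in scope", at the end of the block that declares ref.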
However, we’re sort of getting the worst of both worlds here. We still have to take the result of the function call by value so no copies have been avoided. On top of that, we don’t get any of the benefits of a ‘value’ variable such as named return value optimisation (NRVO), an optimisation that can sometimes avoid copies when returning a named local variable.
The Ternary Operator
The ternary operator is a commonly used alternative to short if statements, as seen in the simple strMax function:
const std::string& strMax(
    const std::string& a,
    const std::string& b) {
  return a > b ? a : b;
}
Thankfully, the ternary operator ‘passes through’ references as we would expect and so strMax doesn’t require any copies!
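One quick way to convince yourself of this is to compare addresses. The sketch below repeats strMax so that it is self-contained; checkStrMax is a hypothetical helper added for this illustration.

```cpp
#include <cassert>
#include <string>

const std::string& strMax(const std::string& a, const std::string& b) {
    return a > b ? a : b;
}

// Hypothetical self-check: the result is one of the arguments itself,
// not a copy of it.
void checkStrMax() {
    const std::string a = "apple";
    const std::string b = "pear";
    const std::string& result = strMax(a, b);
    assert(&result == &b);  // "pear" compares greater, and result is b itself
}
```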
Now, let’s consider the case of loading a value from a map (by reference, to avoid unnecessary copies) but with a default value in the case that our key isn’t in the map:
const ValueType& makeDefault();
void foo(
    const std::map<std::string, ValueType>& map,
    const std::string& key) {
  auto it = map.find(key);
  const ValueType& myValue =
    it != map.end()
      ? it->second
      : makeDefault();
  // Do something with myValue here
}
As in our strMax example, this works exactly as we would want. Our result in myValue doesn’t require any copies and the (possibly expensive) makeDefault is only called if the key is not in the map.
However, what if makeDefault returns by value instead? This may be the case if it is derived from some configuration we don’t want to cache, for example. We might hope that we get a normal reference in the ‘found in map’ case and lifetime extension of the default in the ‘not in map’ case but this is unfortunately not what happens.
Instead, we get lifetime extension in the ‘not in map’ case but a lifetime extended copy in the ‘found in map’ case, which is not at all what we want! The reason this happens is that the ternary operator must have a consistent type across both its branches and, crucially, reference and value types are different.
The operator is forced to choose its type as either const ValueType& or ValueType. If it chose const ValueType& then it would have to convert the ValueType temporary to const ValueType&. The lifetime extension we discussed earlier cannot be used here; it only applies when a temporary is bound directly to a reference variable. This means lifetime extension can’t be used ‘mid-ternary’ and so the conversion to const ValueType& would always result in a dangling reference and is therefore forbidden.
This forces the compiler to choose the type to be ValueType, which requires a copy in the ‘found in map’ case but avoids dangling references. When assigning the result of this ternary to myValue, lifetime extension is applied. This nasty interaction between ternary operators having uniform result types and lifetime extension means this compiles and works correctly, but with an unnecessary copy subtly and silently eating away at our performance.
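The difference between the two cases can be checked directly. The following is a sketch under assumptions: Value is a hypothetical stand-in for ValueType that counts copy constructions, and makeDefaultByRef, makeDefaultByValue and copiesWhenFound are names invented for this illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for ValueType that counts copy constructions.
struct Value {
    static int copies;
    int payload;
    explicit Value(int p) : payload(p) {}
    Value(const Value& other) : payload(other.payload) { ++copies; }
    Value(Value&&) noexcept = default;
};
int Value::copies = 0;

Value makeDefaultByValue() { return Value(-1); }
const Value& makeDefaultByRef() {
    static const Value fallback(-1);
    return fallback;
}

// Two lvalues of the same type: the ternary is itself an lvalue.
static_assert(
    std::is_same<decltype(true ? std::declval<const Value&>()
                               : makeDefaultByRef()),
                 const Value&>::value,
    "the reference passes through the ternary");

// An lvalue and a prvalue: the ternary is a prvalue, not a reference.
static_assert(
    !std::is_reference<decltype(true ? std::declval<const Value&>()
                                     : makeDefaultByValue())>::value,
    "the result is a value, so the lvalue operand has to be copied");

// Counts the copies made in the 'found' case, where stored is selected.
int copiesWhenFound(const Value& stored, bool defaultByValue) {
    Value::copies = 0;
    bool found = true;
    if (defaultByValue) {
        const Value& v = found ? stored : makeDefaultByValue();
        assert(v.payload == stored.payload);  // same contents...
    } else {
        const Value& v = found ? stored : makeDefaultByRef();
        assert(&v == &stored);                // ...versus the same object
    }
    return Value::copies;
}
```

With a by-value default, selecting stored costs one copy; with a by-reference default it costs none, even though the two declarations of v look almost identical.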
Resolving this is not very neat, but a possible alternative implementation would be:
ValueType makeDefault();
void foo(
    const std::map<std::string, ValueType>& map,
    const std::string& key) {
  auto it = map.find(key);
  // Load the default, but only if we need it.
  // Wrapping the call in std::optional<ValueType>
  // gives both branches a common type.
  std::optional<ValueType> m_default =
    it != map.end()
      ? std::nullopt
      : std::optional<ValueType>(makeDefault());
  // Both it->second and *m_default are lvalues,
  // so there is no unnecessary copy
  const ValueType& myValue =
    it != map.end()
      ? it->second
      : *m_default;
  // Do something with myValue here
}
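Another possible shape for the fix is to avoid the ternary altogether and pass the chosen value to a callback by const reference. This is only a sketch under assumptions: Value is a hypothetical copy-counting stand-in for ValueType, and withValueOrDefault, payloadFor and checkNoCopies are names invented here, not part of any library.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for ValueType that counts copy constructions.
struct Value {
    static int copies;
    int payload;
    explicit Value(int p) : payload(p) {}
    Value(const Value& other) : payload(other.payload) { ++copies; }
    Value(Value&&) noexcept = default;
};
int Value::copies = 0;

Value makeDefault() { return Value(-1); }

// Hypothetical helper: calls use(const Value&) with either the mapped
// value or a freshly made default. Each branch binds its own reference,
// so no common type has to be found between an lvalue and a temporary.
template <typename Use>
auto withValueOrDefault(const std::map<std::string, Value>& map,
                        const std::string& key, Use use) {
    auto it = map.find(key);
    if (it != map.end()) {
        return use(it->second);  // binds to the element inside the map
    }
    return use(makeDefault());   // the temporary outlives the call to use
}

int payloadFor(const std::map<std::string, Value>& map,
               const std::string& key) {
    return withValueOrDefault(
        map, key, [](const Value& v) { return v.payload; });
}

// Self-check: neither the 'found' nor the 'not found' path copies.
void checkNoCopies() {
    std::map<std::string, Value> map;
    map.emplace("answer", Value(42));  // moved in, not copied
    Value::copies = 0;
    assert(payloadFor(map, "answer") == 42);
    assert(payloadFor(map, "missing") == -1);
    assert(Value::copies == 0);
}
```

The trade-off is that the code using the value has to move into a lambda or function, which may or may not be neater than the std::optional version depending on how much code follows.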
The only question now is how we can identify this case. At the time of writing, none of the tested compilers (Clang, GCC, and MSVC) generate any warnings for this case, even with all warnings enabled. I’m also not aware of any major static analysis tools that detect this.
One option is to delete the copy constructor of ValueType; all of your unnecessary (and, unfortunately, necessary) copies then become compiler errors! This has a rather high false positive rate though and may not be possible at all if ValueType is part of code you do not own. Another option is to temporarily change the type of myValue to a non-const reference; a non-const lvalue reference cannot bind to a temporary, so the hidden copy also results in a compiler error. This only works for lines of code you’re already suspicious of, and it will also fail to compile in the ‘fixed’ code because the map is const.
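As a rough illustration of the first option, consider the sketch below. NoCopyValue, pick, defaultPayload and checkPick are hypothetical names for this example; the offending ternary is left commented out because it is expected to be rejected at compile time.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical move-only stand-in for ValueType.
struct NoCopyValue {
    int payload;
    explicit NoCopyValue(int p) : payload(p) {}
    NoCopyValue(const NoCopyValue&) = delete;
    NoCopyValue(NoCopyValue&&) noexcept = default;
};

NoCopyValue makeDefault() { return NoCopyValue(-1); }

static_assert(!std::is_copy_constructible<NoCopyValue>::value,
              "any copy of this type is a compile error");

// Fine: both operands are lvalues, so no copy is requested.
const NoCopyValue& pick(bool found, const NoCopyValue& stored,
                        const NoCopyValue& fallback) {
    return found ? stored : fallback;
}

// Fine: plain lifetime extension of a temporary does not need a copy.
int defaultPayload() {
    const NoCopyValue& v = makeDefault();
    return v.payload;
}

// Expected to be rejected if uncommented, because selecting `stored`
// would require copying it into the ternary's result:
//   const NoCopyValue& v = found ? stored : makeDefault();

// Self-check: pick returns its arguments themselves.
void checkPick() {
    const NoCopyValue stored(1);
    const NoCopyValue fallback(2);
    assert(&pick(true, stored, fallback) == &stored);
    assert(&pick(false, stored, fallback) == &fallback);
}
```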
If the copy does end up being a significant cost, however, then it is likely to show up in profiling results which is actually how I first discovered this behaviour. At least when it shows up there, you’ll be able to quickly spot the root cause of the problem!