C++ Effective Move Semantics

Introduction

C++ move semantics can make programs considerably more efficient by eliminating unnecessary copies of large blocks of memory. However, in some scenarios, move semantics is not effective.

In this blog post, I would like to discuss how to make move semantics effective.

Profile Environment

We will profile the example programs using the gcc:9.4.0 Docker image. The test platform uses an Intel Core i9-9900K CPU.

$ docker pull gcc:9.4.0
$ docker run -it --rm -v $(pwd):/mnt -w /mnt gcc:9.4.0

Effective Move Semantics

Move semantics is not helpful if:

  • The copy source is an lvalue.
  • The type offers no move support.
  • Moving is no cheaper than copying.
  • The move operations are unusable (for example, they are not declared noexcept).

In these cases, either copy semantics is used instead of move semantics, or moving brings no advantage over copying.

Copy Source Is lvalue

To invoke move semantics (the move constructor or move assignment) on an lvalue, the code has to apply std::move to it. Otherwise, copy semantics (the copy constructor or copy assignment) is invoked.

Note that std::move returns an rvalue reference, which is what the parameters of the move constructor and move assignment expect. std::move itself moves nothing; it is only a cast.

Type Offers No Move Support

When a type has no move operations (move constructor and move assignment), move semantics cannot be invoked even if std::move is used; copy semantics is invoked instead.

There are many scenarios in which a type offers no move support. For example, built-in primitive types such as int have no move operations; the default move constructor and move assignment can be implicitly deleted or not declared under some circumstances; and an object declared const cannot be moved from.

Notice that std::move can be applied to any lvalue, even if its type has no move support.

no_move_support.cpp
#include <utility>

int main()
{
    int a = 10;
    int b = std::move(a);
}

In the above example, std::move(a) is an rvalue of type int&&. Because int has no move support, the initialization of b simply copies the value, exactly as int b = a; would, and a still holds 10 afterwards.

Move No Cheaper Than Copying

A move is supposed to be faster than a copy. However, there are scenarios in which copying is no slower than moving, such as strings that use SSO (small string optimization).

sso.cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

std::string create_string(size_t n)
{
    std::string str{""};
    for (size_t i = 0; i < n; ++i)
    {
        str += "a";
    }
    return str;
}

int main(int argc, char** argv)
{
    // Short string length
    size_t len_ss = 100;
    if (argc > 1)
    {
        len_ss = atoi(argv[1]);
    }
    // Test size
    const size_t num_ss = 100000;
    // Create a vector of short strings
    std::vector<std::string> vec_ss_source(num_ss);
    std::vector<std::string> vec_ss_copy_target(num_ss);
    std::vector<std::string> vec_ss_move_target(num_ss);
    for (size_t i = 0; i < num_ss; ++i)
    {
        vec_ss_source[i] = create_string(len_ss);
    }

    std::chrono::steady_clock::time_point t_copy_begin =
        std::chrono::steady_clock::now();
    for (size_t i = 0; i < num_ss; ++i)
    {
        vec_ss_copy_target[i] = vec_ss_source[i];
    }
    std::chrono::steady_clock::time_point t_copy_end =
        std::chrono::steady_clock::now();

    std::chrono::steady_clock::time_point t_move_begin =
        std::chrono::steady_clock::now();
    for (size_t i = 0; i < num_ss; ++i)
    {
        vec_ss_move_target[i] = std::move(vec_ss_source[i]);
    }
    std::chrono::steady_clock::time_point t_move_end =
        std::chrono::steady_clock::now();

    std::cout << "String length: " << len_ss << std::endl;
    std::cout << "String copy assignment average time: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(
                     t_copy_end - t_copy_begin)
                         .count() /
                     static_cast<float>(num_ss)
              << "[ns]" << std::endl;
    std::cout << "String move assignment average time: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(
                     t_move_end - t_move_begin)
                         .count() /
                     static_cast<float>(num_ss)
              << "[ns]" << std::endl;
}

Let’s profile the copy and move of strings of different lengths.

$ g++ sso.cpp -o sso -std=c++14
$ ./sso 7
String length: 7
String copy assignment average time: 14.0527[ns]
String move assignment average time: 13.7724[ns]
$ ./sso 15
String length: 15
String copy assignment average time: 13.895[ns]
String move assignment average time: 13.3158[ns]
$ ./sso 50
String length: 50
String copy assignment average time: 48.3205[ns]
String move assignment average time: 10.7449[ns]
$ ./sso 100
String length: 100
String copy assignment average time: 60.3347[ns]
String move assignment average time: 10.4812[ns]

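Probably because GCC's standard library applies SSO to short std::string values, move assignment for the short strings (lengths 7 and 15) performs almost the same as copy assignment, whereas for the longer strings (lengths 50 and 100) move assignment is several times faster than copy assignment. Interestingly, move assignment for the short strings (about 13 ns) is even slightly slower than move assignment for the long strings (about 10 ns).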

Move Unusable

Let’s investigate the consequence of not declaring the move constructor (and move assignment) noexcept.

noexcept_move.cpp implements CustomString with a move constructor and move assignment that are declared noexcept. We then push_back and emplace_back CustomString instances into a std::vector.

noexcept_move.cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

std::string create_string(size_t n)
{
    std::string str{""};
    for (size_t i = 0; i < n; ++i)
    {
        str += "a";
    }
    return str;
}

struct CustomString
{
public:
    explicit CustomString(std::string n = "") : name{n}
    {
        // std::cout << "Explicit constructor being called." << std::endl;
    }
    CustomString(const CustomString& rhs) : name{rhs.name}, value{rhs.value}
    {
        // std::cout << "Copy constructor being called." << std::endl;
    }
    CustomString& operator=(const CustomString& rhs)
    {
        // std::cout << "Copy assignment being called." << std::endl;
        name = rhs.name;
        value = rhs.value;
        return *this;
    }
    CustomString(CustomString&& rhs) noexcept
    {
        // std::cout << "Move constructor being called." << std::endl;
        name = std::move(rhs.name);
        value = rhs.value;
        rhs.value = 0;
    }
    CustomString& operator=(CustomString&& rhs) noexcept
    {
        // std::cout << "Move assignment being called." << std::endl;
        name = std::move(rhs.name);
        value = rhs.value;
        rhs.value = 0;
        return *this;
    }

private:
    std::string name;
    // Initialized so that copies never read an indeterminate value.
    long long value{0};
};

int main(int argc, char** argv)
{
    // String length
    size_t len_string = 100;
    if (argc > 1)
    {
        len_string = atoi(argv[1]);
    }
    const std::string std_string{create_string(len_string)};
    const CustomString custom_string{std_string};
    const size_t num_strings = 10000;
    std::vector<CustomString> vec_strings_source_1;
    std::vector<CustomString> vec_strings_source_2;

    std::chrono::steady_clock::time_point t_push_back_begin =
        std::chrono::steady_clock::now();
    for (size_t i = 0; i < num_strings; ++i)
    {
        vec_strings_source_1.push_back(custom_string);
    }
    std::chrono::steady_clock::time_point t_push_back_end =
        std::chrono::steady_clock::now();
    std::chrono::steady_clock::time_point t_emplace_back_begin =
        std::chrono::steady_clock::now();
    for (size_t i = 0; i < num_strings; ++i)
    {
        vec_strings_source_2.emplace_back(std_string);
    }
    std::chrono::steady_clock::time_point t_emplace_back_end =
        std::chrono::steady_clock::now();
    std::cout << "String length: " << len_string << std::endl;
    std::cout << "Custom string push back average time: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(
                     t_push_back_end - t_push_back_begin)
                         .count() /
                     static_cast<float>(num_strings)
              << "[ns]" << std::endl;
    std::cout << "Custom string emplace back average time: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(
                     t_emplace_back_end - t_emplace_back_begin)
                         .count() /
                     static_cast<float>(num_strings)
              << "[ns]" << std::endl;
}

We can see that, on average, push_back takes about 154 ns per element and emplace_back takes about 168 ns per element when building a std::vector of 10000 CustomString elements.

$ g++ noexcept_move.cpp -o noexcept_move -std=c++14
$ ./noexcept_move
String length: 100
Custom string push back average time: 153.895[ns]
Custom string emplace back average time: 168.498[ns]

except_move.cpp implements CustomString with a move constructor and move assignment that are not declared noexcept. Apart from the missing noexcept, except_move.cpp is identical to noexcept_move.cpp.

except_move.cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

std::string create_string(size_t n)
{
    std::string str{""};
    for (size_t i = 0; i < n; ++i)
    {
        str += "a";
    }
    return str;
}

struct CustomString
{
public:
    explicit CustomString(std::string n = "") : name{n}
    {
        // std::cout << "Explicit constructor being called." << std::endl;
    }
    CustomString(const CustomString& rhs) : name{rhs.name}, value{rhs.value}
    {
        // std::cout << "Copy constructor being called." << std::endl;
    }
    CustomString& operator=(const CustomString& rhs)
    {
        // std::cout << "Copy assignment being called." << std::endl;
        name = rhs.name;
        value = rhs.value;
        return *this;
    }
    // Deliberately not noexcept.
    CustomString(CustomString&& rhs)
    {
        // std::cout << "Move constructor being called." << std::endl;
        name = std::move(rhs.name);
        value = rhs.value;
        rhs.value = 0;
    }
    // Deliberately not noexcept.
    CustomString& operator=(CustomString&& rhs)
    {
        // std::cout << "Move assignment being called." << std::endl;
        name = std::move(rhs.name);
        value = rhs.value;
        rhs.value = 0;
        return *this;
    }

private:
    std::string name;
    // Initialized so that copies never read an indeterminate value.
    long long value{0};
};

int main(int argc, char** argv)
{
    // String length
    size_t len_string = 100;
    if (argc > 1)
    {
        len_string = atoi(argv[1]);
    }
    const std::string std_string{create_string(len_string)};
    const CustomString custom_string{std_string};
    const size_t num_strings = 10000;
    std::vector<CustomString> vec_strings_source_1;
    std::vector<CustomString> vec_strings_source_2;

    std::chrono::steady_clock::time_point t_push_back_begin =
        std::chrono::steady_clock::now();
    for (size_t i = 0; i < num_strings; ++i)
    {
        vec_strings_source_1.push_back(custom_string);
    }
    std::chrono::steady_clock::time_point t_push_back_end =
        std::chrono::steady_clock::now();
    std::chrono::steady_clock::time_point t_emplace_back_begin =
        std::chrono::steady_clock::now();
    for (size_t i = 0; i < num_strings; ++i)
    {
        vec_strings_source_2.emplace_back(std_string);
    }
    std::chrono::steady_clock::time_point t_emplace_back_end =
        std::chrono::steady_clock::now();
    std::cout << "String length: " << len_string << std::endl;
    std::cout << "Custom string push back average time: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(
                     t_push_back_end - t_push_back_begin)
                         .count() /
                     static_cast<float>(num_strings)
              << "[ns]" << std::endl;
    std::cout << "Custom string emplace back average time: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(
                     t_emplace_back_end - t_emplace_back_begin)
                         .count() /
                     static_cast<float>(num_strings)
              << "[ns]" << std::endl;
}

This time, on average, push_back takes about 188 ns per element and emplace_back takes about 211 ns per element when building a std::vector of 10000 CustomString elements.

$ g++ except_move.cpp -o except_move -std=c++14
$ ./except_move
String length: 100
Custom string push back average time: 187.617[ns]
Custom string emplace back average time: 211.423[ns]

Both push_back and emplace_back in except_move.cpp are noticeably slower than those in noexcept_move.cpp, by roughly 22% and 25% respectively.

Let’s try to see what’s happening here by printing out the constructors being called and reducing the number of test iterations.

Now we only insert two CustomString instances into std::vector in noexcept_move.cpp.

noexcept_move.cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

std::string create_string(size_t n)
{
    std::string str{""};
    for (size_t i = 0; i < n; ++i)
    {
        str += "a";
    }
    return str;
}

struct CustomString
{
public:
    explicit CustomString(std::string n = "") : name{n}
    {
        std::cout << "Explicit constructor being called." << std::endl;
    }
    CustomString(const CustomString& rhs) : name{rhs.name}, value{rhs.value}
    {
        std::cout << "Copy constructor being called." << std::endl;
    }
    CustomString& operator=(const CustomString& rhs)
    {
        std::cout << "Copy assignment being called." << std::endl;
        name = rhs.name;
        value = rhs.value;
        return *this;
    }
    CustomString(CustomString&& rhs) noexcept
    {
        std::cout << "Move constructor being called." << std::endl;
        name = std::move(rhs.name);
        value = rhs.value;
        rhs.value = 0;
    }
    CustomString& operator=(CustomString&& rhs) noexcept
    {
        std::cout << "Move assignment being called." << std::endl;
        name = std::move(rhs.name);
        value = rhs.value;
        rhs.value = 0;
        return *this;
    }

private:
    std::string name;
    // Initialized so that copies never read an indeterminate value.
    long long value{0};
};

int main(int argc, char** argv)
{
    // String length
    size_t len_string = 100;
    if (argc > 1)
    {
        len_string = atoi(argv[1]);
    }
    const std::string std_string{create_string(len_string)};
    const CustomString custom_string{std_string};
    const size_t num_strings = 2;
    std::vector<CustomString> vec_strings_source_1;
    std::vector<CustomString> vec_strings_source_2;

    std::chrono::steady_clock::time_point t_push_back_begin =
        std::chrono::steady_clock::now();
    for (size_t i = 0; i < num_strings; ++i)
    {
        vec_strings_source_1.push_back(custom_string);
    }
    std::chrono::steady_clock::time_point t_push_back_end =
        std::chrono::steady_clock::now();
    std::chrono::steady_clock::time_point t_emplace_back_begin =
        std::chrono::steady_clock::now();
    for (size_t i = 0; i < num_strings; ++i)
    {
        vec_strings_source_2.emplace_back(std_string);
    }
    std::chrono::steady_clock::time_point t_emplace_back_end =
        std::chrono::steady_clock::now();
    std::cout << "String length: " << len_string << std::endl;
    std::cout << "Custom string push back average time: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(
                     t_push_back_end - t_push_back_begin)
                         .count() /
                     static_cast<float>(num_strings)
              << "[ns]" << std::endl;
    std::cout << "Custom string emplace back average time: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(
                     t_emplace_back_end - t_emplace_back_begin)
                         .count() /
                     static_cast<float>(num_strings)
              << "[ns]" << std::endl;
}

$ g++ noexcept_move.cpp -o noexcept_move -std=c++14
$ ./noexcept_move
Explicit constructor being called.
Copy constructor being called.
Copy constructor being called.
Move constructor being called.
Explicit constructor being called.
Explicit constructor being called.
Move constructor being called.
String length: 100
Custom string push back average time: 3255[ns]
Custom string emplace back average time: 10184.5[ns]

Similarly, we only insert two CustomString instances into std::vector in except_move.cpp.

except_move.cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

std::string create_string(size_t n)
{
    std::string str{""};
    for (size_t i = 0; i < n; ++i)
    {
        str += "a";
    }
    return str;
}

struct CustomString
{
public:
    explicit CustomString(std::string n = "") : name{n}
    {
        std::cout << "Explicit constructor being called." << std::endl;
    }
    CustomString(const CustomString& rhs) : name{rhs.name}, value{rhs.value}
    {
        std::cout << "Copy constructor being called." << std::endl;
    }
    CustomString& operator=(const CustomString& rhs)
    {
        std::cout << "Copy assignment being called." << std::endl;
        name = rhs.name;
        value = rhs.value;
        return *this;
    }
    // Deliberately not noexcept.
    CustomString(CustomString&& rhs)
    {
        std::cout << "Move constructor being called." << std::endl;
        name = std::move(rhs.name);
        value = rhs.value;
        rhs.value = 0;
    }
    // Deliberately not noexcept.
    CustomString& operator=(CustomString&& rhs)
    {
        std::cout << "Move assignment being called." << std::endl;
        name = std::move(rhs.name);
        value = rhs.value;
        rhs.value = 0;
        return *this;
    }

private:
    std::string name;
    // Initialized so that copies never read an indeterminate value.
    long long value{0};
};

int main(int argc, char** argv)
{
    // String length
    size_t len_string = 100;
    if (argc > 1)
    {
        len_string = atoi(argv[1]);
    }
    const std::string std_string{create_string(len_string)};
    const CustomString custom_string{std_string};
    const size_t num_strings = 2;
    std::vector<CustomString> vec_strings_source_1;
    std::vector<CustomString> vec_strings_source_2;

    std::chrono::steady_clock::time_point t_push_back_begin =
        std::chrono::steady_clock::now();
    for (size_t i = 0; i < num_strings; ++i)
    {
        vec_strings_source_1.push_back(custom_string);
    }
    std::chrono::steady_clock::time_point t_push_back_end =
        std::chrono::steady_clock::now();
    std::chrono::steady_clock::time_point t_emplace_back_begin =
        std::chrono::steady_clock::now();
    for (size_t i = 0; i < num_strings; ++i)
    {
        vec_strings_source_2.emplace_back(std_string);
    }
    std::chrono::steady_clock::time_point t_emplace_back_end =
        std::chrono::steady_clock::now();
    std::cout << "String length: " << len_string << std::endl;
    std::cout << "Custom string push back average time: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(
                     t_push_back_end - t_push_back_begin)
                         .count() /
                     static_cast<float>(num_strings)
              << "[ns]" << std::endl;
    std::cout << "Custom string emplace back average time: "
              << std::chrono::duration_cast<std::chrono::nanoseconds>(
                     t_emplace_back_end - t_emplace_back_begin)
                         .count() /
                     static_cast<float>(num_strings)
              << "[ns]" << std::endl;
}

$ g++ except_move.cpp -o except_move -std=c++14
$ ./except_move
Explicit constructor being called.
Copy constructor being called.
Copy constructor being called.
Copy constructor being called.
Explicit constructor being called.
Explicit constructor being called.
Copy constructor being called.
String length: 100
Custom string push back average time: 2088[ns]
Custom string emplace back average time: 1439[ns]

The difference appears when, during push_back or emplace_back, the std::vector runs out of capacity: it has to allocate a larger buffer, and all existing elements have to be migrated from the old, smaller buffer to the new one. With noexcept on the move constructor, the move constructor is called for the migration, whereas without noexcept, the copy constructor is called instead.

The main reason for this behavior is that the C++ standard library provides the strong exception safety guarantee for std::vector push_back and emplace_back: if an exception is thrown, the vector is left unchanged. During the migration from the old buffer to the new buffer, each element has to be either copied using the copy constructor or moved using the move constructor. If the move constructor could throw partway through the migration, some elements would already have been moved out of the old buffer, and there would be no reliable way to restore them. Copying does not have this problem, because the source elements stay intact even if an exception is thrown during a copy. Therefore, when the move constructor is not declared noexcept (and the type is copyable), the copy constructor is used for the migration. The move constructor is used only when it is declared noexcept.

Conclusions

To get better performance from your C++ programs, make sure that move semantics is actually effective, in particular by declaring the move constructor and move assignment noexcept for types that are stored in C++ standard library containers.

Author

Lei Mao

Posted on

08-09-2022

Updated on

08-09-2022
