C++ ABI Breaking Change

Introduction

In computer software, an application binary interface (ABI) is an interface between two binary program modules. Often, one of these modules is a library or operating system facility, and the other is an application or another library that is being used by a user.

If a library update introduces an ABI breaking change, any dependent application or library that is not rebuilt against the new version can exhibit undefined behavior.

In this blog post, I would like to discuss C++ ABI breaking changes and how to address the problems they cause.

ABI Breaking Change

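In C++, the ABI covers many things, including name mangling, object layout, calling conventions, and the layout of the virtual function table (vtable). This post focuses on the vtable, which is a common source of ABI breaks. We will go through an example to understand exactly what happens when there is an ABI breaking change and how to fix it.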

Repository Layout

The ABI Breaking Change Demo is hosted on GitHub. The repository layout is as follows.

.
├── application
│ ├── app.cpp
│ ├── app_header_v1.cpp.001l.class
│ └── app_header_v2.cpp.001l.class
├── library_v1
│ ├── liblibrary.so-library.cpp.001l.class
│ ├── library.cpp
│ └── library.h
├── library_v2
│ ├── liblibrary.so-library.cpp.001l.class
│ ├── library.cpp
│ └── library.h
├── LICENSE
└── README.md

Library V1

This is the first version of the library. We will create Rectangle instances in our application and compute their areas using the overriding Rectangle::area() function.

library.h
#ifndef LIBRARY_H
#define LIBRARY_H

class Shape
{
public:
    virtual double area() const = 0;
};

class Rectangle : public Shape
{
private:
    double m_width;
    double m_height;

public:
    Rectangle(double w, double h);
    double area() const override;
};

#endif // LIBRARY_H
library.cpp
#include "library.h"

Rectangle::Rectangle(double w, double h) : m_width(w), m_height(h) {}

double Rectangle::area() const { return m_width * m_height; }

We build the library as a shared library.

# Build library V1.
$ cd library_v1
$ g++ -g -fdump-lang-class -shared -fPIC library.cpp -I. -o liblibrary.so
$ cd ..
liblibrary.so-library.cpp.001l.class
Vtable for Shape
Shape::_ZTV5Shape: 3 entries
0 (int (*)(...))0
8 (int (*)(...))(& _ZTI5Shape)
16 (int (*)(...))__cxa_pure_virtual

Class Shape
size=8 align=8
base size=8 base align=8
Shape (0x0x7fbff136d420) 0 nearly-empty
vptr=((& Shape::_ZTV5Shape) + 16)

Vtable for Rectangle
Rectangle::_ZTV9Rectangle: 3 entries
0 (int (*)(...))0
8 (int (*)(...))(& _ZTI9Rectangle)
16 (int (*)(...))Rectangle::area

Class Rectangle
size=24 align=8
base size=24 base align=8
Rectangle (0x0x7fbff120e340) 0
vptr=((& Rectangle::_ZTV9Rectangle) + 16)
Shape (0x0x7fbff136d540) 0 nearly-empty
primary-for Rectangle (0x0x7fbff120e340)

Library V2

This is the second version of the library, which has an ABI change compared to the first version. Notice that Shape gains a new virtual function perimeter(), which is what changes the ABI. However, the existing API is unchanged: Rectangle::area() has the same signature, so users who only construct Rectangle instances and call area() do not have to modify their application source code when they move to the second version of the library.

As we will later see, depending on the location of the virtual function perimeter(), the ABI change may or may not be ABI breaking.

library.h
#ifndef LIBRARY_H
#define LIBRARY_H

class Shape
{
public:
    // Adding the new method here breaks the ABI.
    virtual double perimeter() const = 0;
    virtual double area() const = 0;
    // Adding the new method here does not break the ABI.
    // virtual double perimeter() const = 0;
};

class Rectangle : public Shape
{
private:
    double m_width;
    double m_height;

public:
    Rectangle(double w, double h);
    double perimeter() const override;
    double area() const override;
};

#endif // LIBRARY_H
library.cpp
#include "library.h"

Rectangle::Rectangle(double w, double h) : m_width(w), m_height(h) {}

double Rectangle::area() const { return m_width * m_height; }

double Rectangle::perimeter() const { return 2 * (m_width + m_height); }

Again, we build the library as a shared library.

# Build library V2.
$ cd library_v2
$ g++ -g -fdump-lang-class -shared -fPIC library.cpp -I. -o liblibrary.so
$ cd ..
liblibrary.so-library.cpp.001l.class
Vtable for Shape
Shape::_ZTV5Shape: 4 entries
0 (int (*)(...))0
8 (int (*)(...))(& _ZTI5Shape)
16 (int (*)(...))__cxa_pure_virtual
24 (int (*)(...))__cxa_pure_virtual

Class Shape
size=8 align=8
base size=8 base align=8
Shape (0x0x7f78cfb6d420) 0 nearly-empty
vptr=((& Shape::_ZTV5Shape) + 16)

Vtable for Rectangle
Rectangle::_ZTV9Rectangle: 4 entries
0 (int (*)(...))0
8 (int (*)(...))(& _ZTI9Rectangle)
16 (int (*)(...))Rectangle::perimeter
24 (int (*)(...))Rectangle::area

Class Rectangle
size=24 align=8
base size=24 base align=8
Rectangle (0x0x7f78cfa0e340) 0
vptr=((& Rectangle::_ZTV9Rectangle) + 16)
Shape (0x0x7f78cfb6d540) 0 nearly-empty
primary-for Rectangle (0x0x7f78cfa0e340)

Application

This is the user's application, which uses the library we created. It creates a Rectangle instance and computes its area using the overriding Rectangle::area() function.

app.cpp
#include "library.h"
#include <iostream>
#include <memory>
#include <stdexcept>

int main()
{
    double const width{5.0};
    double const height{3.0};
    std::unique_ptr<Shape> rect{std::make_unique<Rectangle>(width, height)};
    double const area{rect->area()};
    std::cout << "Area: " << area << std::endl;
    if (area != width * height)
    {
        throw std::runtime_error{"Area is not correct."};
    }
    return 0;
}

We will separately build the application using the header files from different versions of the libraries.

# Build application.
$ cd application
# Build application using the header from library V1.
$ g++ -g -fdump-lang-class -o app_header_v1.o -c app.cpp -I../library_v1
# Build application using the header from library V2.
$ g++ -g -fdump-lang-class -o app_header_v2.o -c app.cpp -I../library_v2
app_header_v1.cpp.001l.class
Vtable for Shape
Shape::_ZTV5Shape: 3 entries
0 (int (*)(...))0
8 (int (*)(...))(& _ZTI5Shape)
16 (int (*)(...))__cxa_pure_virtual

Class Shape
size=8 align=8
base size=8 base align=8
Shape (0x0x7fcdee1d7c00) 0 nearly-empty
vptr=((& Shape::_ZTV5Shape) + 16)

Vtable for Rectangle
Rectangle::_ZTV9Rectangle: 3 entries
0 (int (*)(...))0
8 (int (*)(...))(& _ZTI9Rectangle)
16 (int (*)(...))Rectangle::area

Class Rectangle
size=24 align=8
base size=24 base align=8
Rectangle (0x0x7fcdedd015b0) 0
vptr=((& Rectangle::_ZTV9Rectangle) + 16)
Shape (0x0x7fcdee1d7d20) 0 nearly-empty
primary-for Rectangle (0x0x7fcdedd015b0)
app_header_v2.cpp.001l.class
Vtable for Shape
Shape::_ZTV5Shape: 4 entries
0 (int (*)(...))0
8 (int (*)(...))(& _ZTI5Shape)
16 (int (*)(...))__cxa_pure_virtual
24 (int (*)(...))__cxa_pure_virtual

Class Shape
size=8 align=8
base size=8 base align=8
Shape (0x0x7fe8be9d8c00) 0 nearly-empty
vptr=((& Shape::_ZTV5Shape) + 16)

Vtable for Rectangle
Rectangle::_ZTV9Rectangle: 4 entries
0 (int (*)(...))0
8 (int (*)(...))(& _ZTI9Rectangle)
16 (int (*)(...))Rectangle::perimeter
24 (int (*)(...))Rectangle::area

Class Rectangle
size=24 align=8
base size=24 base align=8
Rectangle (0x0x7fe8be6215b0) 0
vptr=((& Rectangle::_ZTV9Rectangle) + 16)
Shape (0x0x7fe8be9d8d20) 0 nearly-empty
primary-for Rectangle (0x0x7fe8be6215b0)

Because we have two libraries, we will get two application object files, app_header_v1.o and app_header_v2.o.

To link the application object files to the libraries, each application object file has two options, library V1 and V2.

# Link application to libraries.
# There are no linker errors in any of the four cases, which shows that the
# symbols app.cpp references (such as the Rectangle constructor) are found
# in both versions of the library.
# Link the application compiled with the library V1 header to library V1 (matching).
g++ -o app_header_v1_library_v1.app app_header_v1.o -L../library_v1 -llibrary
# Link the application compiled with the library V2 header to library V2 (matching).
g++ -o app_header_v2_library_v2.app app_header_v2.o -L../library_v2 -llibrary
# Link the application compiled with the library V1 header to library V2 (mismatched).
# ABI breaking change (vtable conflict)
g++ -o app_header_v1_library_v2.app app_header_v1.o -L../library_v2 -llibrary
# Link the application compiled with the library V2 header to library V1 (mismatched).
# ABI breaking change (vtable conflict)
g++ -o app_header_v2_library_v1.app app_header_v2.o -L../library_v1 -llibrary

Taken together, we get four application executables, namely app_header_v1_library_v1.app, app_header_v2_library_v2.app, app_header_v1_library_v2.app, and app_header_v2_library_v1.app.

After running the four executables, we found that only app_header_v1_library_v1.app and app_header_v2_library_v2.app behaved as expected, whereas app_header_v1_library_v2.app and app_header_v2_library_v1.app did not.

$ LD_LIBRARY_PATH=../library_v1 ./app_header_v1_library_v1.app
Area: 15
$ LD_LIBRARY_PATH=../library_v1 ./app_header_v2_library_v1.app
Segmentation fault (core dumped)
$ LD_LIBRARY_PATH=../library_v2 ./app_header_v2_library_v2.app
Area: 15
$ LD_LIBRARY_PATH=../library_v2 ./app_header_v1_library_v2.app
Area: 16
terminate called after throwing an instance of 'std::runtime_error'
what(): Area is not correct.
Aborted (core dumped)

This is because the library V2 introduced an ABI breaking change to the library V1. Therefore, the application object file compiled with the header file from the library V1 becomes incompatible with the library V2, and the application object file compiled with the header file from the library V2 becomes incompatible with the library V1.

What Happened?

The vtable for Rectangle in library V1, which is also the layout assumed by the application object file app_header_v1.o compiled with the library V1 header, is as follows.

Vtable for Rectangle
Rectangle::_ZTV9Rectangle: 3 entries
0 (int (*)(...))0
8 (int (*)(...))(& _ZTI9Rectangle)
16 (int (*)(...))Rectangle::area

The vtable for Rectangle in library V2, which is also the layout assumed by the application object file app_header_v2.o compiled with the library V2 header, is as follows.

Vtable for Rectangle
Rectangle::_ZTV9Rectangle: 4 entries
0 (int (*)(...))0
8 (int (*)(...))(& _ZTI9Rectangle)
16 (int (*)(...))Rectangle::perimeter
24 (int (*)(...))Rectangle::area

With my GCC compiler, when app.cpp is compiled into the object file app_header_v1.o against the library V1 header, the call rect->area() is compiled to load the vtable pointer from the object and call the function stored in the 3rd entry of the vtable static array, at (& Rectangle::_ZTV9Rectangle) + 16. In library V1, that entry is (int (*)(...))Rectangle::area. The vtable itself is provided by the library (where it is emitted is compiler implementation dependent), and the Rectangle constructor defined in the library is what installs the vtable pointer. If that vtable comes from library V2, the 3rd entry at (& Rectangle::_ZTV9Rectangle) + 16 is actually (int (*)(...))Rectangle::perimeter. That is why rect->area() in app_header_v1_library_v2.app computed the perimeter instead of the area.

Similarly, when app.cpp is compiled into the object file app_header_v2.o against the library V2 header, rect->area() calls the 4th entry of the vtable static array, at (& Rectangle::_ZTV9Rectangle) + 24. If the vtable comes from library V1, which has only 3 entries, the 4th entry is out of bounds of the vtable static array, so the call reads whatever happens to follow the vtable in memory and jumps to it. That is why rect->area() in app_header_v2_library_v1.app resulted in a segmentation fault.

Fundamentally, this ABI breaking change happened because the vtable slot of the virtual function area shifted when the new virtual function perimeter was added before it.

Avoid ABI Breaking Change

To avoid the ABI breaking change in our case, the library developer can declare the new virtual function perimeter after the existing virtual function area. The location of area in the vtable then remains the same in library V2, and app_header_v1_library_v2.app and app_header_v2_library_v1.app both run as expected. Note that this works here only because the application never calls perimeter() and does not define its own classes derived from Shape; appending a virtual function is not safe in general.

From the user’s perspective, whenever we get an upgraded version of the library, we should, if we can, recompile the dependent application or library against the new headers, so that we always run a matching combination such as app_header_v1_library_v1.app or app_header_v2_library_v2.app. This eliminates any ABI incompatibility between the two.

Real-World Implications

In our example, the ABI breaking change arises because the vtable layout of the updated library becomes incompatible with the vtable layout that the dependent library or application was compiled against. This has real-world implications.

Suppose there is a library, say the OpenCV library libopencv-dev which depends on a third-party library libgtk-3-dev. In the libopencv-dev0.1 release, it was built against libgtk-3-dev0.2. When the users install libopencv-dev0.1 on Linux, they will also have to install libgtk-3-dev0.2. This is fine.

Later, when the users upgraded their systems, libgtk-3-dev got upgraded to libgtk-3-dev0.3 but libopencv-dev remained at libopencv-dev0.1. If there is no ABI breaking change, the users can still use libopencv-dev0.1 normally. However, if there is an ABI breaking change, libopencv-dev0.1 can no longer be used safely. The users would have to recompile libopencv-dev against the upgraded library libgtk-3-dev0.3, which is sometimes difficult and time-consuming. In addition, as we have seen in our example, an ABI breaking change can even silently change the behavior of programs without crashing them. That’s why, to make the users’ lives easier, software developers should not break the ABI unless it is absolutely unavoidable.

Conclusions

Ideally, the best practice for the user is to always rebuild the application whenever the library gets updated, so that ABI changes do not affect the application behavior. The best practice for the library developer is to avoid ABI breaking changes as much as possible.

Author

Lei Mao

Posted on

08-07-2023

Updated on

08-07-2023
