C++ runtime polymorphism is achieved by calling virtual functions through a base class pointer that points to a derived class object: the overridden functions implemented in the derived class are found and called via the vtable pointer of the base class subobject. C++ pointer adjustment ensures that the vtable pointer of the correct base class subobject is always found.
In this blog post, I would like to quickly discuss C++ pointer adjustment using an example and compiler analysis tools.
C++ Pointer Adjustment
Example
This is a quick example that shows C++ runtime polymorphism and how the base class pointer addresses are adjusted.
```cpp
    Base1* base1_ptr = derived_ptr;
    // Derived::foo() is called.
    base1_ptr->foo();

    Base2* base2_ptr = derived_ptr;
    // Derived::bar() is called.
    base2_ptr->bar();

    std::cout << "Base1 Ptr Address: " << base1_ptr << std::endl;
    std::cout << "Base2 Ptr Address: " << base2_ptr << std::endl;
    std::cout << "Derived Ptr Address: " << derived_ptr << std::endl;

    // This will not work and should not be used.
    // std::cout << static_cast<Base2*>(base1_ptr) << std::endl;

    // In practice, need to check if dynamic cast is successful at runtime.
    std::cout << "Base2 Ptr Dynamically Casted from Base1 Ptr Address: "
              << dynamic_cast<Base2*>(base1_ptr) << std::endl;
    std::cout << "Base1 Ptr Dynamically Casted from Base2 Ptr Address: "
              << dynamic_cast<Base1*>(base2_ptr) << std::endl;

    // C-casts are dangerous and incorrect in these use cases.
    std::cout << "Base2 Ptr C-Casted from Base1 Ptr Address: "
              << (Base2*)(base1_ptr) << std::endl;
    std::cout << "Base1 Ptr C-Casted from Base2 Ptr Address: "
              << (Base1*)(base2_ptr) << std::endl;

    return 0;
}
```
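The listing above starts partway through main(), so the class definitions are not shown. The following is a minimal sketch of what they might look like. The member types (double, char, int) are assumptions chosen only to match the member sizes discussed later in this post, and the helper call_bar_via_base1 is a hypothetical name that illustrates the checked dynamic_cast that the listing's comment recommends.

```cpp
#include <iostream>

// ASSUMED definitions: the member types are guesses that match the sizes
// discussed in this post (8 + 1 bytes in Base1, 4 bytes in Base2 and Derived).
class Base1
{
public:
    virtual ~Base1() = default;
    virtual void foo() { std::cout << "Base1::foo() is called." << std::endl; }
    double m_v1{0.0};
    char m_v2{'a'};
};

class Base2
{
public:
    virtual ~Base2() = default;
    virtual void bar() { std::cout << "Base2::bar() is called." << std::endl; }
    int m_v1{0};
};

class Derived : public Base1, public Base2
{
public:
    void foo() override { std::cout << "Derived::foo() is called." << std::endl; }
    void bar() override { std::cout << "Derived::bar() is called." << std::endl; }
    int m_v1{0};
};

// Hypothetical helper: cross-casts a Base1 pointer to Base2 and checks the
// result before using it. Returns false if the object that base1_ptr points
// to has no Base2 subobject (dynamic_cast then yields nullptr).
bool call_bar_via_base1(Base1* base1_ptr)
{
    if (Base2* base2_ptr = dynamic_cast<Base2*>(base1_ptr))
    {
        base2_ptr->bar();
        return true;
    }
    return false;
}
```

Passing the address of a Derived object to call_bar_via_base1 dispatches to Derived::bar(), whereas passing the address of a plain Base1 object makes the cast fail and the helper return false.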
Build and Run Example
We build the example with debug information (the -g flag) for later analysis. Running the example shows that the pointer addresses are adjusted when the correct conversions are used, and are left unchanged by the C-style casts.
```
$ g++ -g pointer_adjustment.cpp -o pointer_adjustment
$ ./pointer_adjustment
Derived::foo() is called.
Derived::bar() is called.
Base1 Ptr Address: 0x7ffd5d192940
Base2 Ptr Address: 0x7ffd5d192958
Derived Ptr Address: 0x7ffd5d192940
Base2 Ptr Dynamically Casted from Base1 Ptr Address: 0x7ffd5d192958
Base1 Ptr Dynamically Casted from Base2 Ptr Address: 0x7ffd5d192940
Base2 Ptr C-Casted from Base1 Ptr Address: 0x7ffd5d192940
Base1 Ptr C-Casted from Base2 Ptr Address: 0x7ffd5d192958
```
Analyze Example
We analyze the example, built with debug information, using pahole. On Debian-based Linux distributions, it can be installed via sudo apt install pahole. The layout that pahole reports is described below.
The first element of a Base1 object is the vtable pointer _vptr.Base1, which takes 8 bytes. The unpadded size of a Base1 object is sizeof(Base1::_vptr.Base1) + sizeof(Base1::m_v1) + sizeof(Base1::m_v2) = 8 + 8 + 1 = 17 bytes, which is padded to 24 bytes because the class has an 8-byte alignment requirement.
Similarly, the first element of a Base2 object is the vtable pointer _vptr.Base2, which takes 8 bytes. The unpadded size of a Base2 object is sizeof(Base2::_vptr.Base2) + sizeof(Base2::m_v1) = 8 + 4 = 12 bytes, which is padded to 16 bytes because the class has an 8-byte alignment requirement.
The memory layout of a Derived object is a Base1 subobject, which takes 24 bytes, followed by a Base2 subobject, followed by the Derived data members. Naively, the unpadded size of a Derived object would be sizeof(Base1) + sizeof(Base2) + sizeof(Derived::m_v1) = 24 + 16 + 4 = 44 bytes, padded to 48 bytes because of the 8-byte alignment requirement. However, the actual size of a Derived object is just 40 bytes. The reason is that the layout rules GCC follows (the Itanium C++ ABI) allow the tail padding of a non-POD base class subobject to be reused. A Base2 object ends with 4 bytes of padding, which is exactly enough room for Derived::m_v1 of type int, so Derived::m_v1 is placed in those bytes. The pahole analysis result confirms that the offset of Derived::m_v1 is 36 instead of 40.
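The sizes and offsets above can also be checked from inside a program. The following is a minimal sketch, assuming hypothetical class definitions whose member types (double, char, int) match the member sizes described in this post. LayoutReport, byte_offset, and inspect_layout are illustrative names, and the values in the comments are what the layout described above implies for GCC on 64-bit Linux; other ABIs may differ.

```cpp
#include <cstddef>

// ASSUMED definitions with member sizes matching the discussion above.
struct Base1
{
    virtual void foo() {}
    double m_v1{};
    char m_v2{};
};

struct Base2
{
    virtual void bar() {}
    int m_v1{};
};

struct Derived : Base1, Base2
{
    void foo() override {}
    void bar() override {}
    int m_v1{};
};

// Byte distance from the start of `object` to `part`.
std::ptrdiff_t byte_offset(const void* object, const void* part)
{
    return static_cast<const char*>(part) - static_cast<const char*>(object);
}

struct LayoutReport
{
    std::size_t base1_size;             // 24 with GCC on 64-bit Linux
    std::size_t base2_size;             // 16
    std::size_t derived_size;           // 40, not 48
    std::ptrdiff_t base2_offset;        // 24: where the Base2 subobject starts
    std::ptrdiff_t derived_m_v1_offset; // 36: inside Base2's tail padding
};

LayoutReport inspect_layout()
{
    Derived derived;
    const Base2* base2_part = &derived; // implicit conversion adjusts the address
    return {sizeof(Base1), sizeof(Base2), sizeof(Derived),
            byte_offset(&derived, base2_part),
            byte_offset(&derived, &derived.m_v1)};
}
```

Measuring the offsets with pointer differences on a live object avoids offsetof, which is only conditionally supported for classes that are not standard-layout.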
In any case, we expect a Base1 pointer to the Derived object to be equal to a Derived pointer to the same object, and a Base2 pointer to the Derived object to be 24 bytes higher than the Base1 pointer. The program output confirms this: 0x7ffd5d192958 - 0x7ffd5d192940 = 0x18, which is 24 in decimal.
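The adjustment works in both directions, which can be sketched as follows. This assumes hypothetical class definitions with the member sizes described in this post; up_to_base2, down_from_base2, and adjustment_in_bytes are illustrative helper names. The static_assert only re-checks the hexadecimal subtraction of the two printed addresses.

```cpp
#include <cstddef>

// ASSUMED definitions with member sizes matching the discussion above.
struct Base1
{
    virtual void foo() {}
    double m_v1{};
    char m_v2{};
};

struct Base2
{
    virtual void bar() {}
    int m_v1{};
};

struct Derived : Base1, Base2
{
    void foo() override {}
    void bar() override {}
    int m_v1{};
};

// The two addresses printed by the example differ by 0x18 = 24 bytes.
static_assert(0x7ffd5d192958 - 0x7ffd5d192940 == 24, "0x18 is 24 in decimal");

// Derived* -> Base2*: the compiler adds the offset of the Base2 subobject.
// A null pointer stays null and is not adjusted.
Base2* up_to_base2(Derived* derived_ptr)
{
    return derived_ptr;
}

// Base2* -> Derived*: static_cast subtracts the same offset. This is only
// valid if base2_ptr really points into a Derived object.
Derived* down_from_base2(Base2* base2_ptr)
{
    return static_cast<Derived*>(base2_ptr);
}

// Number of bytes added when converting this Derived* to Base2*.
std::ptrdiff_t adjustment_in_bytes(Derived* derived_ptr)
{
    return reinterpret_cast<char*>(up_to_base2(derived_ptr)) -
           reinterpret_cast<char*>(derived_ptr);
}
```

Converting up and then down returns the original Derived address, which is why static_cast is safe for these conversions while a C-style cast between the two unrelated base pointer types is not.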
In our case, the two vtables from the two base classes are distinct. Derived::foo() is not in the Base2 vtable because Base2 does not declare foo(), and Derived::bar() is not in the Base1 vtable because Base1 does not declare bar(). When an overridden function is called via a Base1 pointer to the Derived object, the vtable pointer Base1::_vptr.Base1 is used to find the function pointer. When an overridden function is called via a Base2 pointer to the Derived object, the vtable pointer Base2::_vptr.Base2 is used instead.
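That the two base class subobjects carry different vtable pointers can be observed directly, although only in an implementation-specific way. The sketch below assumes that the vtable pointer is stored in the first pointer-sized bytes of each subobject, as the layout described above shows for GCC. The class definitions, read_vptr, and subobjects_use_distinct_vtables are hypothetical names used for illustration.

```cpp
#include <cstdint>
#include <cstring>

// ASSUMED definitions with member sizes matching the discussion above.
struct Base1
{
    virtual void foo() {}
    double m_v1{};
    char m_v2{};
};

struct Base2
{
    virtual void bar() {}
    int m_v1{};
};

struct Derived : Base1, Base2
{
    void foo() override {}
    void bar() override {}
    int m_v1{};
};

// Copies the first pointer-sized bytes of a subobject.
// ASSUMPTION: the vtable pointer is stored there. This is not guaranteed by
// the C++ standard and is meant for inspection only.
std::uintptr_t read_vptr(const void* subobject)
{
    std::uintptr_t vptr = 0;
    std::memcpy(&vptr, subobject, sizeof(vptr));
    return vptr;
}

// Returns true if the Base1 and Base2 subobjects of `derived` store
// different vtable pointers.
bool subobjects_use_distinct_vtables(const Derived& derived)
{
    const Base1* base1_part = &derived; // same address as the Derived object
    const Base2* base2_part = &derived; // adjusted to the Base2 subobject
    return read_vptr(base1_part) != read_vptr(base2_part);
}
```

With GCC, the function is expected to return true for any Derived object, because each subobject's vtable pointer refers to its own part of the vtable for Derived.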
Without pointer adjustment, a Base1 pointer and a Base2 pointer would both point to the beginning of the Derived object, so the Base1::_vptr.Base1 vtable pointer would be used for every virtual call. A call to an overridden function through the Base2 pointer would then look up the wrong vtable, which results in undefined behavior. This is exactly what the C-style casts in the example do: they keep the address unchanged instead of adjusting it.
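The adjustment also has to be undone when an overridden function is entered through a Base2 pointer, because the override expects its this pointer to refer to the whole Derived object. GCC typically handles this with a small thunk in the Base2 part of the vtable. The sketch below makes the effect visible; the class definitions and the names bar_this and override_sees_derived_address are hypothetical and exist only for this illustration.

```cpp
// ASSUMED definitions with member sizes matching the discussion above.
struct Base1
{
    virtual void foo() {}
    double m_v1{};
    char m_v2{};
};

struct Base2
{
    // Returns the `this` pointer the function actually receives.
    virtual const void* bar_this() { return this; }
    int m_v1{};
};

struct Derived : Base1, Base2
{
    void foo() override {}
    // Inside the override, `this` has type Derived*.
    const void* bar_this() override { return this; }
    int m_v1{};
};

// Returns true if a call through an adjusted Base2 pointer still reaches the
// override with the address of the whole Derived object.
bool override_sees_derived_address(Derived& derived)
{
    Base2* base2_ptr = &derived;            // adjusted to the Base2 subobject
    const void* base2_address = base2_ptr;
    const void* derived_address = &derived;
    return base2_address != derived_address &&
           base2_ptr->bar_this() == derived_address;
}
```

The Base2 pointer holds a different address than the Derived pointer, yet the override observes the Derived address, so both adjustments have taken place.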
Conclusion
Pointer adjustment ensures that each base class pointer points to the start of its own base class subobject, and therefore to that subobject's members, including its vtable pointer. This is critical for runtime polymorphism, because virtual calls are dispatched correctly only through the right vtable pointer.