AddressSanitizer

Introduction

AddressSanitizer (ASan) is a fast memory error detector that can identify various memory errors, such as accessing memory that is not addressable, use-after-free, and memory leaks. It is a compile-time instrumentation tool: the compiler inserts additional checks around the memory accesses of the program, so the source code itself does not have to be changed.

In this blog post, I would like to discuss the AddressSanitizer algorithm and its usages.

AddressSanitizer Algorithm

The AddressSanitizer algorithm has been described in the AddressSanitizer GitHub Wiki. I am going to elaborate on some more details that I found during my study of the algorithm.

Application Memory and Shadow Memory Mapping

Fundamentally, AddressSanitizer maps 8 bytes of the application memory into 1 byte of the shadow memory.

There are only 9 different values for any aligned 8 bytes of the application memory:

  • All 8 bytes in qword are unpoisoned (i.e. addressable). The shadow value is 0.
  • All 8 bytes in qword are poisoned (i.e. not addressable). The shadow value is negative.
  • First k bytes are unpoisoned, the remaining 8-k are poisoned. The shadow value is k.
    This is guaranteed by the fact that malloc returns 8-byte aligned chunks of memory.
    The only case where different bytes of an aligned qword have different states is the tail of a malloc-ed region. For example, if we call malloc(13), we will have one fully unpoisoned qword and one qword where the first 5 bytes are unpoisoned.

Here, addressable means that the memory at the given address lies in a valid, allocated region and can be accessed. This is different from a conceptual out-of-bound (OOB) memory access in some cases. For example, suppose we have a malloc-ed region of 16 bytes and create a container that uses the first 12 bytes, leaving the last 4 bytes unused and uninitialized. Accessing the last 4 bytes of the malloc-ed region is then addressable but out-of-bound for the container. Such an OOB access might cause undefined behavior in the program but cannot be detected by AddressSanitizer by default.
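The distinction between addressable and in-bounds can be sketched with a toy model. The `ToyBuffer` type and both helper functions are hypothetical names made up for this illustration; they are not part of AddressSanitizer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical container: it owns `capacity` bytes obtained from malloc,
// but only the first `size` bytes are logically part of the container.
struct ToyBuffer
{
    std::size_t capacity;
    std::size_t size;
};

// What a shadow-memory check can reason about: is the byte inside the
// malloc-ed region at all?
bool is_addressable(ToyBuffer const& buffer, std::size_t offset)
{
    return offset < buffer.capacity;
}

// What the container itself considers a valid access.
bool is_in_bounds(ToyBuffer const& buffer, std::size_t offset)
{
    assert(buffer.size <= buffer.capacity);
    return offset < buffer.size;
}
```

For a `ToyBuffer{16, 12}`, offset 13 is addressable but not in bounds, which is exactly the kind of access that slips through a check based only on addressability.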

Having understood the difference between addressable and OOB memory access, we can see that for any aligned 8 bytes there can only be $8 + 1 = 9$ poison states, because within an aligned qword of memory allocated by malloc, not addressable bytes are never followed by addressable bytes. More generally, for any aligned $N$ bytes, there can only be $N + 1$ poison states. 1 byte of shadow memory has $2^8 = 256$ possible values. Because the encoding above uses the negative values for fully poisoned memory, the non-negative values $0$ to $127$ remain for the other states, which means 1 byte of shadow memory can theoretically represent up to 128 bytes of application memory.
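To make the encoding concrete, here is a minimal sketch that computes the shadow bytes for a malloc-ed region. The function name `shadow_for_allocation` and the `granularity` parameter are assumptions for illustration; the sketch only models the addressable part of the region and ignores the poisoned memory around it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Shadow bytes for a region of `size` bytes that starts at an address
// aligned to `granularity`. One shadow byte covers `granularity` bytes:
// 0 means fully addressable, k > 0 means only the first k bytes are.
std::vector<std::int8_t> shadow_for_allocation(std::size_t size,
                                               std::size_t granularity = 8)
{
    // A partial count of at most granularity - 1 must fit into the
    // non-negative range of a signed byte, hence the limit of 128.
    assert(granularity >= 1 && granularity <= 128);
    std::vector<std::int8_t> shadow(size / granularity, 0);
    std::size_t const tail = size % granularity;
    if (tail != 0)
    {
        shadow.push_back(static_cast<std::int8_t>(tail));
    }
    return shadow;
}
```

With the default granularity of 8, `shadow_for_allocation(13)` yields the two shadow bytes 0 and 5, matching the malloc(13) example above.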

The AddressSanitizer instrumentation for detecting memory errors looks like this:

// Given the address the user provided, get the corresponding shadow memory address, which holds the shadow value of the aligned 8-byte qword containing that address.
byte *shadow_address = MemToShadow(address);
// Get the shadow value of the qword containing the address.
byte shadow_value = *shadow_address;
// If the shadow value is non-zero, the qword contains bytes that are not addressable, so we may have a memory error. Further checking with the access size information is required.
if (shadow_value) {
    if (SlowPathCheck(shadow_value, address, kAccessSize)) {
        ReportError(address, kAccessSize, kIsWrite);
    }
}

Here we could see a kind of memory error that this check, taken literally, cannot detect. Suppose the user allocates only 8 bytes of memory using malloc(8) and then accesses 16 bytes starting from the returned address using some vectorized access instructions. The shadow value for the first 8 bytes is 0, so the check passes, even though accessing 16 bytes from the address returned by malloc(8) is clearly illegal. One possible fix is to examine every shadow byte covered by the access. Another is to use 1 byte of shadow memory to represent more than 8 bytes of application memory, for example 16 bytes, as discussed earlier. Either option may affect the performance of AddressSanitizer, which requires further analysis and experiments.

The further checking using the access size information is done in the SlowPathCheck function. The implementation of SlowPathCheck is like this:

// Check the cases where we access first k bytes of the qword
// and these k bytes are unpoisoned.
bool SlowPathCheck(shadow_value, address, kAccessSize) {
    last_accessed_byte = (address & 7) + kAccessSize - 1;
    return (last_accessed_byte >= shadow_value);
}

In this case, address & 7 gives the offset of the byte pointed to by address within its aligned 8-byte qword, which is equivalent to address % 8. If last_accessed_byte >= shadow_value, the access definitely touches some bytes that are not addressable, and we should report it as a memory error.
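The two snippets above can be combined into a small runnable sketch. The `ToyShadow` vector is an assumption made for illustration: it stands in for the real shadow memory, so that the shadow byte of an address is simply found at index address / 8. The value -1 is used as a placeholder for any negative (fully poisoned) shadow value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Typed version of SlowPathCheck: returns true when the access reaches
// a poisoned byte of the qword described by shadow_value.
bool slow_path_check(std::int8_t shadow_value, std::uintptr_t address,
                     std::size_t access_size)
{
    auto const last_accessed_byte =
        static_cast<std::int64_t>((address & 7) + access_size - 1);
    return last_accessed_byte >= shadow_value;
}

// Toy shadow memory: shadow[i] describes application bytes [8i, 8i + 8).
using ToyShadow = std::vector<std::int8_t>;

// The instrumented check from the text: look at one shadow byte first,
// and run the slow path only if that byte is non-zero.
bool check_reports_error(ToyShadow const& shadow, std::uintptr_t address,
                         std::size_t access_size)
{
    std::int8_t const shadow_value = shadow.at(address / 8);
    return shadow_value != 0 &&
           slow_path_check(shadow_value, address, access_size);
}
```

For a malloc(13) region modeled as the shadow bytes {0, 5, -1}, a 4-byte access at offset 8 passes, whereas a 4-byte access at offset 10 is reported because its last byte is byte 5 of the second qword. For a malloc(8) region modeled as {0, -1}, a 16-byte access at offset 0 is not reported by this single-shadow-byte check, which reproduces the limitation discussed earlier.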

Shadow Memory Layout

On 64-bit Linux (x86-64), the memory address can be mapped to the shadow memory address using the following formula.

Shadow = (Mem >> 3) + 0x7fff8000;

where Mem >> 3 is just Mem / 8 since AddressSanitizer maps 8 bytes of application memory to 1 byte of shadow memory and 0x7fff8000 is the offset to map the application memory to the shadow memory region.
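The mapping can be written as a small function and checked at compile time. This is only a sketch of the formula above; the names `mem_to_shadow`, `kShadowScale`, and `kShadowOffset` are chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kShadowScale = 3;                   // 2^3 = 8 application bytes per shadow byte
constexpr std::uint64_t kShadowOffset = 0x7fff8000ULL; // offset from the formula above

constexpr std::uint64_t mem_to_shadow(std::uint64_t mem)
{
    return (mem >> kShadowScale) + kShadowOffset;
}

// All 8 bytes of one aligned qword share a single shadow byte.
static_assert(mem_to_shadow(0x1000) == mem_to_shadow(0x1007),
              "same qword, same shadow byte");
// The next qword maps to the next shadow byte.
static_assert(mem_to_shadow(0x1008) == mem_to_shadow(0x1000) + 1,
              "next qword, next shadow byte");
// Address 0 maps to the start of the shadow memory.
static_assert(mem_to_shadow(0x0) == 0x7fff8000ULL,
              "address 0 maps to the shadow offset");
```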

[0x10007fff8000, 0x7fffffffffff] HighMem
[0x02008fff7000, 0x10007fff7fff] HighShadow
[0x00008fff7000, 0x02008fff6fff] ShadowGap
[0x00007fff8000, 0x00008fff6fff] LowShadow
[0x000000000000, 0x00007fff7fff] LowMem

On x86-64 Linux, the virtual address space is typically 48 bits wide, which means addresses range from 0x000000000000 to 0xffffffffffff, where 0xffffffffffff is $2^{48} - 1$. However, the upper half of the address space is reserved for kernel use only, so valid user-space addresses range from 0x000000000000 to 0x7fffffffffff, where 0x7fffffffffff is $2^{47} - 1$. The shadow memory must therefore cover $\frac{2^{47}}{8} = 2^{44}$ bytes. Because the shadow memory lives in the same address space, it has a shadow region of its own, which shall never be accessed. This region is called ShadowGap, and its size must be $\frac{2^{44}}{8} = 2^{41}$ bytes.

For some reason that I am not completely sure about, AddressSanitizer uses the shadow memory to divide the application memory into two parts, HighMem and LowMem. This introduces the shadow memory offset, which is the seemingly arbitrary constant 0x7fff8000 in the formula above. The shadow memory starts at this offset, so its end address must be 0x7fff8000 + 2^44 - 1 = 0x00007fff8000 + 0x100000000000 - 0x000000000001 = 0x10007fff8000 - 0x000000000001 = 0x10007fff7fff. The shadow gap is the shadow of the shadow memory itself. Applying the mapping formula to the first and the last shadow addresses, we find that the shadow gap starts at (0x00007fff8000 >> 3) + 0x7fff8000 = 0x00000ffff000 + 0x00007fff8000 = 0x00008fff7000 and ends at (0x10007fff7fff >> 3) + 0x7fff8000 = 0x02000fffefff + 0x00007fff8000 = 0x02008fff6fff. The shadow gap in turn divides the shadow memory into two parts, HighShadow and LowShadow.
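The boundaries in the layout table can be re-derived mechanically from the mapping formula. The following is a sketch under the assumptions stated in the text (47-bit user address space, shadow scale 3, offset 0x7fff8000); all constant names, the `Region` enum, and `region_of` are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kShadowOffset = 0x7fff8000ULL;

constexpr std::uint64_t mem_to_shadow(std::uint64_t mem)
{
    return (mem >> 3) + kShadowOffset;
}

// User-space addresses span [0, 2^47 - 1]; the shadow starts at the offset.
constexpr std::uint64_t kHighMemEnd = (1ULL << 47) - 1;
constexpr std::uint64_t kLowShadowBegin = mem_to_shadow(0);
constexpr std::uint64_t kLowMemEnd = kLowShadowBegin - 1;
constexpr std::uint64_t kLowShadowEnd = mem_to_shadow(kLowMemEnd);
constexpr std::uint64_t kHighShadowEnd = mem_to_shadow(kHighMemEnd);
constexpr std::uint64_t kHighMemBegin = kHighShadowEnd + 1;
constexpr std::uint64_t kHighShadowBegin = mem_to_shadow(kHighMemBegin);
// Everything between LowShadow and HighShadow is the shadow gap.
constexpr std::uint64_t kShadowGapBegin = kLowShadowEnd + 1;
constexpr std::uint64_t kShadowGapEnd = kHighShadowBegin - 1;

// The gap is exactly the shadow of the shadow memory.
static_assert(kShadowGapBegin == mem_to_shadow(kLowShadowBegin), "gap begin");
static_assert(kShadowGapEnd == mem_to_shadow(kHighShadowEnd), "gap end");
static_assert(kShadowGapBegin == 0x00008fff7000ULL, "matches the table");
static_assert(kShadowGapEnd == 0x02008fff6fffULL, "matches the table");

enum class Region { LowMem, LowShadow, ShadowGap, HighShadow, HighMem, Invalid };

// Classify a user-space address according to the layout table.
constexpr Region region_of(std::uint64_t address)
{
    return address <= kLowMemEnd     ? Region::LowMem
         : address <= kLowShadowEnd  ? Region::LowShadow
         : address <= kShadowGapEnd  ? Region::ShadowGap
         : address <= kHighShadowEnd ? Region::HighShadow
         : address <= kHighMemEnd    ? Region::HighMem
                                     : Region::Invalid;
}
```

As a usage example, `region_of(0x50c0000000c0ULL)`, the heap address that appears in the AddressSanitizer report later in this post, evaluates to `Region::HighMem`.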

These derivations justify the shadow memory layout shown in the table above. One thing that concerned me a little at first is that LowMem ends just below 0x00007fff8000 ≈ 2^31, so it is only about 2 GB, whereas HighMem starts at 0x10007fff8000 ≈ 2^44, which is far beyond the physical memory size of a contemporary conventional computer. However, these are virtual addresses rather than physical ones, so using a high address does not require that much physical memory. For example, the heap address 0x50c0000000c0 and the stack pointer 0x7ffce16d5860 in the AddressSanitizer report later in this post both fall in HighMem. The shadow memory offset therefore does not have to be adjusted to the physical memory size of the computer for the application to use more memory.

Stack Memory Instrumentation

AddressSanitizer replaces the malloc and free functions to poison and unpoison the memory allocated from the heap. Because stack memory does not use malloc and free, AddressSanitizer has to instrument the stack memory accesses in a different way. The approach is to add red zones around the buffer on the stack, which are poisoned.

For example, we might have an array on the stack like the following.

void foo() {
    char a[8];
    ...
    return;
}

After AddressSanitizer instrumentation, it becomes like the following.

void foo() {
    char redzone1[32]; // 32-byte aligned
    char a[8];         // 32-byte aligned
    char redzone2[24];
    char redzone3[32]; // 32-byte aligned
    int *shadow_base = MemToShadow(redzone1);
    // Here we poison or unpoison 32 bytes at a time because shadow_base points to a 4-byte integer.
    shadow_base[0] = 0xffffffff; // poison redzone1
    shadow_base[1] = 0xffffff00; // poison redzone2, unpoison 'a'
    shadow_base[2] = 0xffffffff; // poison redzone3
    ...
    shadow_base[0] = shadow_base[1] = shadow_base[2] = 0; // unpoison all
    return;
}

Unlike heap memory, where the replaced malloc controls which memory is poisoned, stack memory needs red zones inserted by the compiler to identify accesses to not addressable memory. Because AddressSanitizer identifies accesses to not addressable memory rather than OOB accesses, and the memory before and after a buffer on the stack is almost always addressable, AddressSanitizer would not be able to detect illegal accesses around a stack buffer without red zones.

Also note that because the instrumented red zones have a limited size, AddressSanitizer might not be able to detect all illegal stack buffer accesses. In the example above, an access to a[128] might not be detected because a[128] lies beyond all the red zones of this frame.
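The effect of the red zones can be illustrated with a toy model of the instrumented frame of foo. The frame layout follows the snippet above; the names `make_frame_shadow`, `Verdict`, and `classify_access` are hypothetical, and the model says nothing about what lies outside this one frame.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Frame layout: redzone1[32], a[8], redzone2[24], redzone3[32].
constexpr std::size_t kFrameSize = 96;
constexpr std::size_t kOffsetOfA = 32;

// One shadow byte per 8 frame bytes; 0xff marks a poisoned qword.
std::array<std::uint8_t, kFrameSize / 8> make_frame_shadow()
{
    return {{0xff, 0xff, 0xff, 0xff,   // redzone1
             0x00, 0xff, 0xff, 0xff,   // a, then redzone2
             0xff, 0xff, 0xff, 0xff}}; // redzone3
}

enum class Verdict { Addressable, Poisoned, OutsideFrame };

// Classify the byte a[index]; index may be negative.
Verdict classify_access(std::ptrdiff_t index)
{
    auto const shadow = make_frame_shadow();
    std::ptrdiff_t const offset = static_cast<std::ptrdiff_t>(kOffsetOfA) + index;
    if (offset < 0 || offset >= static_cast<std::ptrdiff_t>(kFrameSize))
    {
        // The red zones of this frame say nothing about this byte.
        return Verdict::OutsideFrame;
    }
    return shadow[static_cast<std::size_t>(offset) / 8] == 0 ? Verdict::Addressable
                                                             : Verdict::Poisoned;
}
```

In this model, a[8] and a[-1] land in red zones and would be reported, whereas a[128] falls outside the frame entirely, so whether it is caught depends on whatever memory happens to be there.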

AddressSanitizer Usages

AddressSanitizer is supported by both GCC and Clang. It can also be enabled using high-level build tools such as CMake.

GCC/Clang

We have taken the OOB example from the previous blog post “Illegal Memory Access and Segmentation Fault” and used AddressSanitizer to detect the OOB memory access.

oob.cpp
#include <vector>
#include <iostream>

void access_out_of_bounds_element(std::vector<int> const& v, size_t oob_offset)
{
    // Accessing out-of-bounds element
    std::cout << "Accessing Out-Of-Bounds Element At Index: "
              << v.size() - 1 + oob_offset << std::endl;
    // This line has higher chance of causing a segmentation fault if oob_offset is large.
    int out_of_bounds_element = v[v.size() - 1 + oob_offset];
}

int main()
{
    constexpr size_t vector_size{32U};
    std::vector<int> const v(vector_size, 0);

    std::cout << "Vector Size: " << v.size() << std::endl;
    access_out_of_bounds_element(v, 1);
    access_out_of_bounds_element(v, 10);
    access_out_of_bounds_element(v, 100);
    access_out_of_bounds_element(v, 1000);
    access_out_of_bounds_element(v, 10000);
    access_out_of_bounds_element(v, 100000);
}
$ g++ -fsanitize=address -g oob.cpp -o oob
$ ./oob
Vector Size: 32
Accessing Out-Of-Bounds Element At Index: 32
=================================================================
==631273==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x50c0000000c0 at pc 0x6146786d848e bp 0x7ffce16d5860 sp 0x7ffce16d5850
READ of size 4 at 0x50c0000000c0 thread T0
#0 0x6146786d848d in access_out_of_bounds_element(std::vector<int, std::allocator<int> > const&, unsigned long) /home/leimao/Workspace/asan_test_2/oob.cpp:10
#1 0x6146786d862f in main /home/leimao/Workspace/asan_test_2/oob.cpp:19
#2 0x76aba9e2a1c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#3 0x76aba9e2a28a in __libc_start_main_impl ../csu/libc-start.c:360
#4 0x6146786d8304 in _start (/home/leimao/Workspace/asan_test_2/oob+0x2304) (BuildId: 2836d209262b57c3d9524362c64face028118252)

0x50c0000000c0 is located 0 bytes after 128-byte region [0x50c000000040,0x50c0000000c0)
allocated by thread T0 here:
#0 0x76abaa6fe548 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:95
#1 0x6146786d933d in std::__new_allocator<int>::allocate(unsigned long, void const*) /usr/include/c++/13/bits/new_allocator.h:151
#2 0x6146786d9212 in std::allocator_traits<std::allocator<int> >::allocate(std::allocator<int>&, unsigned long) /usr/include/c++/13/bits/alloc_traits.h:482
#3 0x6146786d9212 in std::_Vector_base<int, std::allocator<int> >::_M_allocate(unsigned long) /usr/include/c++/13/bits/stl_vector.h:381
#4 0x6146786d8f68 in std::_Vector_base<int, std::allocator<int> >::_M_create_storage(unsigned long) /usr/include/c++/13/bits/stl_vector.h:398
#5 0x6146786d8bc4 in std::_Vector_base<int, std::allocator<int> >::_Vector_base(unsigned long, std::allocator<int> const&) /usr/include/c++/13/bits/stl_vector.h:335
#6 0x6146786d88de in std::vector<int, std::allocator<int> >::vector(unsigned long, int const&, std::allocator<int> const&) /usr/include/c++/13/bits/stl_vector.h:571
#7 0x6146786d85aa in main /home/leimao/Workspace/asan_test_2/oob.cpp:16
#8 0x76aba9e2a1c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#9 0x76aba9e2a28a in __libc_start_main_impl ../csu/libc-start.c:360
#10 0x6146786d8304 in _start (/home/leimao/Workspace/asan_test_2/oob+0x2304) (BuildId: 2836d209262b57c3d9524362c64face028118252)

SUMMARY: AddressSanitizer: heap-buffer-overflow /home/leimao/Workspace/asan_test_2/oob.cpp:10 in access_out_of_bounds_element(std::vector<int, std::allocator<int> > const&, unsigned long)
Shadow bytes around the buggy address:
0x50bffffffe00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x50bffffffe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x50bfffffff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x50bfffffff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x50c000000000: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
=>0x50c000000080: 00 00 00 00 00 00 00 00[fa]fa fa fa fa fa fa fa
0x50c000000100: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x50c000000180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x50c000000200: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x50c000000280: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x50c000000300: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==631273==ABORTING

CMake

The following CMake build file could be used to build the OOB example above.

CMakeLists.txt
cmake_minimum_required(VERSION 3.10)
project(OOB)

set(CMAKE_CXX_STANDARD 14)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

add_executable(oob oob.cpp)

To build the project using AddressSanitizer, we could run the following commands. AddressSanitizer could be simply enabled by adding the compiler flag -fsanitize=address and the linker flag -fsanitize=address.

$ cmake -DCMAKE_CXX_FLAGS="-fsanitize=address -g" -DCMAKE_EXE_LINKER_FLAGS="-fsanitize=address" -B build
$ cmake --build build --parallel
$ ./build/oob
Vector Size: 32
Accessing Out-Of-Bounds Element At Index: 32
=================================================================
==351==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x50c0000000c0 at pc 0x584a376e748e bp 0x7ffcd889df00 sp 0x7ffcd889def0
READ of size 4 at 0x50c0000000c0 thread T0
#0 0x584a376e748d in access_out_of_bounds_element(std::vector<int, std::allocator<int> > const&, unsigned long) /mnt/oob.cpp:10
#1 0x584a376e762f in main /mnt/oob.cpp:19
#2 0x7749df7a81c9 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 42c84c92e6f98126b3e2230ebfdead22c235b667)
#3 0x7749df7a828a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 42c84c92e6f98126b3e2230ebfdead22c235b667)
#4 0x584a376e7304 in _start (/mnt/build/oob+0x2304) (BuildId: fe1d15fae14bf13c1abc06669083bc55a9d17fc9)

0x50c0000000c0 is located 0 bytes after 128-byte region [0x50c000000040,0x50c0000000c0)
allocated by thread T0 here:
#0 0x7749dfe23548 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:95
#1 0x584a376e833f in std::__new_allocator<int>::allocate(unsigned long, void const*) /usr/include/c++/13/bits/new_allocator.h:151
#2 0x584a376e8212 in std::allocator_traits<std::allocator<int> >::allocate(std::allocator<int>&, unsigned long) /usr/include/c++/13/bits/alloc_traits.h:482
#3 0x584a376e8212 in std::_Vector_base<int, std::allocator<int> >::_M_allocate(unsigned long) /usr/include/c++/13/bits/stl_vector.h:381
#4 0x584a376e7f68 in std::_Vector_base<int, std::allocator<int> >::_M_create_storage(unsigned long) /usr/include/c++/13/bits/stl_vector.h:398
#5 0x584a376e7bc4 in std::_Vector_base<int, std::allocator<int> >::_Vector_base(unsigned long, std::allocator<int> const&) /usr/include/c++/13/bits/stl_vector.h:335
#6 0x584a376e78de in std::vector<int, std::allocator<int> >::vector(unsigned long, int const&, std::allocator<int> const&) /usr/include/c++/13/bits/stl_vector.h:571
#7 0x584a376e75aa in main /mnt/oob.cpp:16
#8 0x7749df7a81c9 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 42c84c92e6f98126b3e2230ebfdead22c235b667)
#9 0x7749df7a828a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 42c84c92e6f98126b3e2230ebfdead22c235b667)
#10 0x584a376e7304 in _start (/mnt/build/oob+0x2304) (BuildId: fe1d15fae14bf13c1abc06669083bc55a9d17fc9)

SUMMARY: AddressSanitizer: heap-buffer-overflow /mnt/oob.cpp:10 in access_out_of_bounds_element(std::vector<int, std::allocator<int> > const&, unsigned long)
Shadow bytes around the buggy address:
0x50bffffffe00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x50bffffffe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x50bfffffff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x50bfffffff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x50c000000000: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
=>0x50c000000080: 00 00 00 00 00 00 00 00[fa]fa fa fa fa fa fa fa
0x50c000000100: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x50c000000180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x50c000000200: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x50c000000280: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x50c000000300: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==351==ABORTING

To build the project without AddressSanitizer, we could run the following commands.

# Remove the build directory if it already exists and run CMake build without AddressSanitizer flags.
$ rm -rf build
$ cmake -B build
$ cmake --build build --parallel
$ ./build/oob
Vector Size: 32
Accessing Out-Of-Bounds Element At Index: 32
Accessing Out-Of-Bounds Element At Index: 41
Accessing Out-Of-Bounds Element At Index: 131
Accessing Out-Of-Bounds Element At Index: 1031
Accessing Out-Of-Bounds Element At Index: 10031
Accessing Out-Of-Bounds Element At Index: 100031
Segmentation fault (core dumped)

AddressSanitizer VS Valgrind

A comparison of AddressSanitizer with other memory error checking tools, such as Valgrind, can be found in the AddressSanitizer GitHub Wiki.

The major advantage of AddressSanitizer over Valgrind is its much higher performance, because it instruments the code at compile time and does not rely on a virtual machine to run the program as Valgrind does. The major disadvantage of AddressSanitizer compared with Valgrind is that the program must be recompiled with AddressSanitizer enabled, which means the source code must be available, whereas Valgrind can be used to check memory errors in any binary executable.

Conclusions

If AddressSanitizer reports an error, it is a true error and requires fixing. However, AddressSanitizer may not catch all the memory errors because of its design limitations.

Author

Lei Mao

Posted on

09-27-2025

Updated on

09-27-2025
