Large Array Safety Issue

Introduction

When developing a safety system, we should pay attention not only to dynamic memory allocation on the heap, but also to static arrays on the stack. Either of them, if not handled appropriately, can crash the system and lead to dire consequences. Unlike dynamic memory allocation on the heap, static arrays on the stack are something that many safety system developers overlook.

In this blog post, I would like to discuss the safety issues related to large arrays in a safety system.

Stack Overflow

Most programmers are familiar with “stack overflow”, which is usually encountered when a recursive function is called without a stop or exit condition.

The stack “overflows” because it is usually small. Each time the recursive function calls itself, the frame of the caller stays on the stack and a new frame for the callee is pushed on top of it. If there are too many nested recursive calls, the stack memory is used up and the program behaves abnormally, typically by crashing.

C Memory Model

The “stack overflow” can be better understood and visualized using the memory model of the C programming language. The memory model of other programming languages might not be exactly the same as the C memory model, but usually they are close.

Here is a copy of the introduction to the C memory model from my previous article “Inline Specifier Compilation in C/C++”.

The memory model for modern computer programs has a stack. When a function is called during runtime, some necessary information, such as the return address, local variables, and function arguments, will be allocated and pushed to the stack.

Consider computing the factorial of $n$ using a recursive function fact. For each recursive call, the information for one more fact call is pushed onto the stack.

Figure: C Memory Model
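As a minimal sketch of the recursion described above, here is one possible fact function. The 64-bit return type and the n <= 20 precondition are assumptions made for this illustration (20! is the largest factorial that fits in an unsigned 64-bit integer); they are not taken from the previous article.

```cpp
#include <cassert>
#include <cstdint>

// Recursive factorial. Every call that has not returned yet keeps its own
// frame (return address, argument, temporaries) on the stack, so fact(n)
// has about n live frames at its deepest point.
std::uint64_t fact(std::uint64_t n)
{
    // Assumed precondition: 21! no longer fits in 64 unsigned bits.
    assert(n <= 20);
    if (n <= 1)
    {
        // Exit condition. Without it the recursion would never stop and
        // the stack would eventually overflow.
        return 1;
    }
    return n * fact(n - 1);
}
```

Calling fact(5) pushes frames for fact(5), fact(4), fact(3), fact(2) and fact(1) before any of them returns.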

Memory Allocation in Safety System

In C and C++, the size of an object is known at compile time, whereas the size of a memory buffer allocated dynamically on the heap is only known at runtime. Some objects manage a memory buffer on the heap and some do not.

In a safety system, memory management differs somewhat from that in an ordinary system. Because dynamic memory allocation on the heap might fail, for example when malloc returns a null pointer, all dynamic memory allocation on the heap must happen during the system initialization stage, and no dynamic memory allocation on the heap is allowed during the system execution stage. If a dynamic memory allocation fails during the system initialization stage, the system either just crashes or handles the error right there, so that catastrophic consequences will not happen.
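The two-stage rule can be sketched as follows. SystemState, system_init, system_step and system_shutdown are hypothetical names invented for this illustration. The sketch uses calloc rather than malloc so that the buffer is zero-filled and the size multiplication is overflow-checked by the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical system state holding one heap buffer.
struct SystemState
{
    std::int32_t* samples{nullptr};
    std::size_t capacity{0};
};

// Initialization stage: the only place where heap memory is requested.
// Returns false on failure so the caller can abort the start-up cleanly.
bool system_init(SystemState& state, std::size_t capacity)
{
    assert(state.samples == nullptr); // init is expected to run once
    if (capacity == 0)
    {
        return false;
    }
    void* raw{std::calloc(capacity, sizeof(std::int32_t))};
    if (raw == nullptr)
    {
        return false; // allocation failed, detected before execution starts
    }
    state.samples = static_cast<std::int32_t*>(raw);
    state.capacity = capacity;
    return true;
}

// Execution stage: only uses memory that already exists, never allocates.
bool system_step(SystemState& state, std::size_t index, std::int32_t value)
{
    if (state.samples == nullptr || index >= state.capacity)
    {
        return false;
    }
    state.samples[index] = value;
    return true;
}

// Shutdown: releases the buffer obtained during initialization.
void system_shutdown(SystemState& state)
{
    std::free(state.samples);
    state.samples = nullptr;
    state.capacity = 0;
}
```

If system_init returns false, the failure is visible before the execution stage ever begins, which is the whole point of the rule.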

To prevent safety system developers from accidentally allocating dynamic memory on the heap during the system execution stage, an object that manages a memory buffer is usually designed so that its size in memory cannot be adjusted after its construction or initialization. Typical memory buffer containers, such as std::vector, are usually not used in a safety system, because they have methods, such as resize, that can be called to adjust their size after construction or initialization. Notice that technically this cannot prevent all dynamic memory allocation on the heap during the system execution stage, because developers can always use new to create objects on the heap.

Although we pay enough attention to the objects that manage a memory buffer on the heap in a safety system, we often overlook the objects that do not manage a memory buffer on the heap, which can also crash the system during the system execution stage and lead to catastrophic consequences.

In most cases, an object that does not manage a memory buffer on the heap is very small, such as int32_t, int8_t, std::pair and std::tuple. However, some such objects can easily become very large. Static arrays whose sizes are known at compile time, such as C arrays and C++ std::array (although where to store the array buffer is an implementation detail for std::array in C++), are created entirely on the stack when they are declared as local variables. Since they live on the stack and can be very large, stack overflow becomes a real possibility and system safety becomes a concern.
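A container with this property could look roughly like the sketch below. FixedBuffer is a hypothetical class written for this post, not a type from any library. It allocates once in its constructor, exposes no resize or push_back, and cannot be copied or moved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Hypothetical fixed-size heap buffer. The size is chosen at construction
// (during the initialization stage) and can never change afterwards.
class FixedBuffer
{
public:
    explicit FixedBuffer(std::size_t size)
        : data_{new (std::nothrow) std::int32_t[size]()},
          size_{data_ ? size : 0}
    {
    }

    // Deleting the copy operations also suppresses the implicit moves.
    FixedBuffer(const FixedBuffer&) = delete;
    FixedBuffer& operator=(const FixedBuffer&) = delete;

    // False if the allocation failed. Must be checked during initialization.
    bool valid() const { return data_ != nullptr; }
    std::size_t size() const { return size_; }

    std::int32_t& operator[](std::size_t index)
    {
        assert(index < size_);
        return data_[index];
    }

    const std::int32_t& operator[](std::size_t index) const
    {
        assert(index < size_);
        return data_[index];
    }

private:
    std::unique_ptr<std::int32_t[]> data_; // declared first, initialized first
    const std::size_t size_;
};
```

Because the interface offers no way to grow the buffer, code in the execution stage cannot trigger a reallocation through this object by accident.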
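Since the size of such an object is known at compile time, it can also be checked at compile time. The following is a sketch of that idea. kMaxStackObjectBytes and fits_stack_budget are hypothetical names, and the 4096-byte budget is an arbitrary value picked for illustration, not a recommendation.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical per-object stack budget, chosen only for this example.
constexpr std::size_t kMaxStackObjectBytes{4096};

// True if an object of type T is small enough for the assumed budget.
template <typename T>
constexpr bool fits_stack_budget()
{
    return sizeof(T) <= kMaxStackObjectBytes;
}

using SmallArray = std::array<std::int32_t, 4>;
using LargeArray = std::array<std::int32_t, 1048576>;

// The elements are part of the object itself, so sizeof covers all of them.
static_assert(sizeof(SmallArray) >= 4 * sizeof(std::int32_t),
              "SmallArray holds its 4 elements inline");
static_assert(sizeof(LargeArray) >= 1048576 * sizeof(std::int32_t),
              "LargeArray holds its elements inline, at least 4 MiB");

static_assert(fits_stack_budget<SmallArray>(),
              "SmallArray is fine as a local variable");
static_assert(!fits_stack_budget<LargeArray>(),
              "LargeArray should not be a local variable");
```

A static_assert like the third one could be placed next to a local variable declaration, so that growing the array past the budget breaks the build instead of the running system.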

Large Array on Stack

Let’s first visually verify how objects are created on stack using the C memory model. Specifically, in our function work_on_some_arrays, two small arrays are created.

small_array.cpp
#include <cstdint>
#include <array>
#include <iostream>

void work_on_some_arrays()
{
    std::array<int32_t, 4> a{};
    a[0] = 1;
    std::array<int32_t, 4> b{};
    b[0] = 1;
}

int main()
{
    int v{0};
    std::cout << "Hello Underworld!" << std::endl;
    work_on_some_arrays();
}

The program builds and runs fine.

$ g++ small_array.cpp -o small_array -std=c++14
$ ./small_array
Hello Underworld!

Using a C++ program visualizer, we can see that when work_on_some_arrays is called during runtime, the two small arrays are created on the stack. Concretely, because each array stores 4 int32_t values, $4 \times 4 \times 2 = 32$ bytes are used on the stack.

Let’s then check how large the stack is allowed to be on our operating system. On my Linux system specifically, the default stack size limit is 8192 KB.

$ ulimit -a | grep "stack size"
stack size (kbytes, -s) 8192
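The same limit can also be queried from inside a program. The sketch below assumes a POSIX system, where getrlimit and RLIMIT_STACK are available from <sys/resource.h>. The helper names stack_limit_bytes and may_fit_on_stack are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/resource.h> // POSIX only

// Returns the soft stack size limit in bytes, 0 if the query fails,
// or UINT64_MAX if the limit is reported as unlimited.
std::uint64_t stack_limit_bytes()
{
    struct rlimit limit{};
    if (getrlimit(RLIMIT_STACK, &limit) != 0)
    {
        return 0;
    }
    if (limit.rlim_cur == RLIM_INFINITY)
    {
        return UINT64_MAX;
    }
    return static_cast<std::uint64_t>(limit.rlim_cur);
}

// Rough check only: the limit covers the whole stack, including frames
// that are already in use, so "true" is not a guarantee.
bool may_fit_on_stack(std::uint64_t requested_bytes)
{
    assert(requested_bytes > 0); // a zero-byte request is a caller bug
    const std::uint64_t limit{stack_limit_bytes()};
    return limit != 0 && requested_bytes < limit;
}
```

On a system configured like the one above, stack_limit_bytes() would be expected to return 8192 × 1024 = 8388608.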

Knowing this, it is very easy to orchestrate a crash caused by stack overflow at some point during the runtime of our program without even having to use recursive functions.

Because the stack size limit is 8192 KB, the stack cannot hold more than $\frac{8192 \times 1024}{4} = 2097152$ int32_t values. So we create two arrays of int32_t values in the function work_on_some_arrays, each holding 1048576 values, for 2097152 values in total. These two arrays alone fill the entire 8192 KB, and the frames of main and work_on_some_arrays need some stack space as well, so we expect the program to crash when work_on_some_arrays is called.

large_array.cpp
#include <cstdint>
#include <array>
#include <iostream>

void work_on_some_arrays()
{
    // 8192 x 1024 / 4 = 2097152
    // 2097152 / 2 = 1048576
    std::array<int32_t, 1048576> a{};
    a[0] = 1;
    std::array<int32_t, 1048576> b{};
    b[0] = 1;
}

int main()
{
    int v{0};
    std::cout << "Hello Underworld!" << std::endl;
    work_on_some_arrays();
}

The program builds fine, but it crashes as soon as work_on_some_arrays is called because of stack overflow, which exactly matches our expectation.

$ g++ large_array.cpp -o large_array -std=c++14
$ ./large_array
Hello Underworld!
Segmentation fault (core dumped)

Conclusions

We should always be very careful when creating large objects, such as large arrays, in a safety system. Ideally, if we ever want to create a large object, we should create it on the heap rather than on the stack, so that at least the success of the dynamic memory allocation can be checked. In the worst case, we can create a large object on the stack only during the system initialization stage, so that the system crashes “safely” if there is a stack overflow, but this is still not recommended.
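As one possible way to follow this advice, the large arrays from large_array.cpp could be moved to the heap as sketched below. make_large_array is a hypothetical helper, and work_on_some_arrays has been changed to take the arrays as arguments, which is a design choice of this sketch rather than part of the original program. In a safety system the allocation would belong to the initialization stage.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

constexpr std::size_t kNumValues{1048576};
using LargeArray = std::array<std::int32_t, kNumValues>;

// Initialization stage: allocates one zero-filled array on the heap.
// Returns an empty pointer instead of throwing if the allocation fails.
std::unique_ptr<LargeArray> make_large_array()
{
    return std::unique_ptr<LargeArray>{new (std::nothrow) LargeArray()};
}

// Execution stage: works on arrays that already exist. Only two pointers
// are placed on the stack, not 8192 KB of array data.
void work_on_some_arrays(LargeArray* a, LargeArray* b)
{
    assert(a != nullptr && b != nullptr); // caller must check allocation
    (*a)[0] = 1;
    (*b)[0] = 1;
}
```

The caller checks both pointers returned by make_large_array before the execution stage starts, so a failed allocation is reported in a controlled way instead of surfacing as a segmentation fault.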

Author

Lei Mao

Posted on

10-21-2022

Updated on

10-21-2022
