Lei Mao

Introduction

In C and C++ programming, arrays are used to store objects of the same type. Unlike the STL container std::vector in C++, which reports its length through size(), a C array does not make it straightforward to determine either the number of bytes it occupies or the number of elements it holds.

In this blog post, I would like to discuss some of the concepts behind C arrays and how to determine the size of a C array when that is possible.

Create Arrays

There are two kinds of arrays. For one kind, the size has to be known at compile time; for the other kind, the size only needs to be known at runtime. The former is typically created on the stack, and the latter is created on the heap.

/*
 * array.cpp
 */
#include <iostream>
#include <vector>
#include <cstdlib>

int main()
{
    // n1 has to be a compile-time constant so that arr1 is an ordinary fixed-size array.
    const size_t n1 = 10;
    // n2 and n3 are not const, so they stand in for sizes that are only known at runtime.
    size_t n2 = 10;
    size_t n3 = 10;
    // C99 started to allow variable length arrays (VLAs) on the stack.
    // Standard C++ does not have VLAs, but g++ accepts them as an extension.
    // Disable this extension by adding -Werror=vla to the compilation command.
    // g++ array.cpp -o array --std=c++11 -Werror=vla
    // https://stackoverflow.com/questions/737240/array-size-at-run-time-without-dynamic-allocation-is-allowed
    // For example, int arr[n2]; would not compile if -Werror=vla has been used.
    // Array on stack.
    int arr1[n1];
    // Array on heap, allocated with new[].
    int* ptrArr2 = new int[n2];
    // Array on heap, allocated with malloc.
    // malloc returns void*, which has to be cast explicitly in C++.
    int* ptrArr3 = static_cast<int*>(malloc(n3 * sizeof(int)));

    delete[] ptrArr2;
    free(ptrArr3);
}

To compile the program, please run the following command in the terminal.

$ g++ array.cpp -o array --std=c++11 -Werror=vla

C/C++ sizeof Operator

In C/C++, sizeof is used to determine the size, in bytes, of a variable or a type. Although sizeof expressions, such as sizeof(variable) and sizeof(type), are usually written with parentheses, sizeof is a compile-time operator rather than a function. This means that the compiler knows the exact value of sizeof(variable) and sizeof(type) at compile time. The one exception is a C99 variable length array, whose size is computed at runtime.

Determine Array Size

The correct way of determining the size of an array on the stack is presented below. The size of an array on the heap cannot be determined from the pointer alone.

/*
 * arraySize.cpp
 */
#include <iostream>
#include <vector>
#include <cstdlib>

int main()
{
    const size_t n1 = 10;
    size_t n2 = 10;
    size_t n3 = 10;
    int arr1[n1];
    int* ptrArr1 = arr1;
    int* ptrArr2 = new int[n2];
    int* ptrArr3 = static_cast<int*>(malloc(n3 * sizeof(int)));

    std::cout << sizeof(arr1) << std::endl; // Number of bytes for array on the stack
    std::cout << sizeof(ptrArr1) << std::endl; // Number of bytes for pointer to int
    std::cout << sizeof(ptrArr2) << std::endl; // Number of bytes for pointer to int
    std::cout << sizeof(ptrArr3) << std::endl; // Number of bytes for pointer to int
    std::cout << sizeof(arr1[0]) << std::endl; // Number of bytes for int
    std::cout << sizeof(ptrArr2[0]) << std::endl; // Number of bytes for int
    std::cout << sizeof(ptrArr3[0]) << std::endl; // Number of bytes for int

    std::cout << "----------------------" << std::endl;

    std::cout << sizeof(arr1)/sizeof(arr1[0]) << std::endl; // Correct number of elements in the array on the stack
    std::cout << sizeof(ptrArr1)/sizeof(ptrArr1[0]) << std::endl; // Incorrect number of elements in the array on the stack
    std::cout << sizeof(ptrArr2)/sizeof(ptrArr2[0]) << std::endl; // Incorrect number of elements in the array on the heap
    std::cout << sizeof(ptrArr3)/sizeof(ptrArr3[0]) << std::endl; // Incorrect number of elements in the array on the heap

    delete[] ptrArr2;
    free(ptrArr3);
}

To compile the program, please run the following command in the terminal.

$ g++ arraySize.cpp -o arraySize --std=c++11 -Werror=vla

I got the following output from the program. Some of the values, such as the size of a pointer, might be different depending on the platform.

$ ./arraySize
40
8
8
8
4
4
4
----------------------
10
2
2
2

The question is why sizeof(arr1) and sizeof(ptrArr1) result in different values. The compiler knows that arr1 is neither an int nor a pointer to int, but an array of 10 ints. Therefore, sizeof(arr1) returns the number of bytes in the whole array arr1. This is one of the few cases where an array does not decay to a pointer. ptrArr1, however, is just an ordinary pointer, produced when arr1 decayed in the initialization int* ptrArr1 = arr1, and the array length is no longer part of its type. Therefore, sizeof(ptrArr1) returns the number of bytes for a single pointer.

FAQs

What are the differences between malloc and new?

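One major difference is that malloc only allocates raw bytes and returns void*. It does not know the type of the objects we want to write to the memory, so it never calls a constructor. new, in contrast, is given the type, so it allocates the memory and then constructs the object in it, calling a constructor for class types.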

How do delete[] and free know the size of an array?

It is a little bit weird that we cannot determine the size of an array on the heap, yet free and delete[] know the number of bytes for the array so that they can recycle the memory accordingly. The answer is that it really depends on the implementation.

The following diagram, which I borrowed from StackOverflow, shows one possible implementation for arrays on the heap.

 ____ The allocated block ____
/                             \
+--------+--------------------+
| Header | Your data area ... |
+--------+--------------------+
          ^
          |
          +-- The address you are given

With this layout, free and delete[] can always look at the header stored just before the pointer to determine the size of the array, because they assume that every pointer given to them points to memory on the heap that came from the matching allocation function. sizeof, however, makes no such assumption. In addition, as mentioned above, sizeof is a compile-time operator, so it cannot look at a header that only exists at runtime.

References