slashbinbash.de / C++ Programming

This page is a work in progress

When I started learning C/C++ programming, I only had a very vague idea of what a computer was doing when it executed a program. In languages like Java, JavaScript, or Python, you don't necessarily care about what your lines of code mean in terms of low-level program execution, memory, or performance. In C++ however, you are playing a dangerous guessing-game if you don't know what the program is doing.

In this article, I will explain the concepts of C++ on the implementation level. I will give examples, show memory dumps, and assembly code, to illustrate my points. This will give you an understanding of what is going on under the hood, and maybe why the C++ language is designed as it is.

This article will not teach you C++ language features or the basics of programming. There are much better tutorials for that.

Do you really need to know all the things in this article? From the perspective of a hobby programmer, probably not. From the perspective of a software engineer, you should be very familiar with the language and the generated assembly code of your compiler, including its implications on performance and memory. Otherwise there is no point in using a language like C++.

Please consider that the article only scratches the surface. I don't cover compiler, operating system, and CPU design, which play a major role in how your code executes. Knowing what virtual memory is, how data is cached, or what out-of-order execution is, will give you further insight into the inner workings of your computer.

Please check out the Further Reading section for related topics that are not covered in this article.

Overview

Memory

Memory can be addressed in byte sized increments. To help you visualize it, open the memory view of your IDE while debugging your application. You will see something like this:

        00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
44C3D0  2b 2b 5c 77 69 6e 33 32 5c 63 70 70 74 65 73 74  ++\win32\cpptest
44C3E0  5c 62 75 69 6c 64 5c 52 65 6c 57 69 74 68 44 65  \build\RelWithDe
44C3F0  62 49 6e 66 6f 5c 63 70 70 74 65 73 74 2e 65 78  bInfo\cpptest.ex
44C400  65 00 00 00 00 00 00 00 72 35 16 22 66 fc 00 10  e.......r5."fü..

On the far left side is the memory address of the first byte in the row. It is followed by 16 bytes of data. Right next to it is the interpretation of the data as ASCII characters. You can switch it to different representations to see what integers or floating-point numbers are written in memory.

Memory can be interpreted in different ways. The same bits and bytes can mean different things. In C++ you decide how the memory is interpreted via the type-system.

To extract single bits of information, bit-operations like shift, AND, OR are used.
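
As a minimal sketch of these bit operations, the helpers below shift the wanted bits down to position 0 and mask off the rest. The names extractBits and setBit are made up for this illustration and are not part of any library.

```cpp
#include <cstdint>

// Hypothetical helper: returns `count` bits of `value`, starting at bit `offset`.
// Assumes 0 < count < 32 and offset + count <= 32.
std::uint32_t extractBits(std::uint32_t value, unsigned offset, unsigned count) {
    const std::uint32_t mask = (std::uint32_t{1} << count) - 1; // `count` ones
    return (value >> offset) & mask;                            // shift, then AND
}

// Hypothetical helper: sets a single bit with OR.
std::uint32_t setBit(std::uint32_t value, unsigned bit) {
    return value | (std::uint32_t{1} << bit);
}
```

For example, extractBits(0xABCD1234, 8, 8) yields 0x12, the second-lowest byte of the value.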

The memory is not initialized by default. It can consist of random data. You have to decide if and when to initialize the memory.

There are instances in which you don't have to initialize the memory to a default value. For instance, it would be a waste of CPU cycles to initialize the memory if it is completely overwritten anyway. Initializing memory becomes very important when dealing with structures or classes. Reading data from a partially initialized struct can lead to undefined behavior in your application.

Stack

The stack is a data structure with two operations: push and pop. To push an element to the stack means that it is put on top of the other elements. To pop an element from the stack means that the top element is removed. Think of it as a stack of plates. The implementation of a stack typically has an index or pointer that points to the top element of the stack. This pointer is modified whenever push or pop is called.

How does this apply to C++ programming? The processor has push and pop operations, as well as a stack pointer. The stack pointer is a memory address that points to the block of memory that is used by the processor as stack memory. When push is executed by the processor, the stack pointer is decremented, and data is written to the stack. When pop is executed by the processor, data is read from the stack, and the stack pointer is incremented. Pop does not actually remove the data from memory. The data stays written in memory until it is overwritten, either by subsequent push operations or direct memory access.

As a C++ programmer you cannot call push or pop directly on the processor, unless your compiler supports inline assembly. Instead, you can influence how and when data is pushed and popped with the tools of the language. There is nothing that keeps you from manipulating the stack memory directly.

The compiler uses the stack memory as a call stack. Whenever a function is called, its context is written to the stack in a so-called stack frame. It contains register values, parameters, a return address, local variables, etc. When returning from a function, the memory of the stack frame becomes invalid.

In some cases, your application can run out of stack memory and potentially overwrite data in other parts of the memory. This is also known as a stack overflow.

Heap

The heap is memory that you can perform read and write operations on. You have to explicitly allocate the memory that you want to use in your program, and deallocate the memory when you are done using it. If you don't free the memory that you have allocated, you can potentially run out of memory.

As with the stack, there is nothing that keeps you from reading or writing outside of the bounds of the allocated memory.

Static

Numbers, string literals, and even brace-enclosed lists of values that are used to initialize arrays or structs, are all stored in the executable in some way or another.

Numbers become part of CPU instructions:

;int a = 42;
mov dword ptr [a],2Ah

A string like this:

const char* str = "Hello World!";

Is written to the executable file, like this:

000000000000BBB0  48 65 6c 6c 6f 20 57 6f 72 6c 64 21 00 00 00 00  Hello World!....

When the executable is loaded into memory, the string is always at the same memory address. Whenever you write the string literal, the address of the string in memory is used to refer to the string. Since the string is loaded into read-only memory, you have to copy the string to modify it.
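
A small sketch of the copy-before-modify rule, using std::string as the writable copy. The function name makeGreeting is an assumption made for this example.

```cpp
#include <string>

// Copies the literal into writable memory owned by std::string,
// then modifies only the copy. The literal itself stays untouched.
std::string makeGreeting() {
    const char* literal = "Hello World!"; // points into read-only data
    std::string copy = literal;           // the characters are copied
    copy[0] = 'J';                        // changes the copy, not the literal
    return copy;
}
```

Writing through the original pointer instead (after casting away const) would be undefined behavior.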

The same goes for initializer lists.

Registers

Registers are the memory that the CPU uses for calculations. The number of registers varies from CPU to CPU. The size of a register determines the range of memory that can be addressed. For instance, a 16-bit register can only address 65536 bytes of memory.[1]

The state of all registers during program execution can give you a lot of useful information. You will find integer values that are used for calculation, memory addresses of objects, the stack pointer, or the instruction pointer. CPUs that support 32bit/64bit floating-point arithmetic might have special registers and instructions.

Registers only play a secondary role when programming in C++. You as a programmer have very little influence over how the registers are utilized, unless your compiler supports inline assembly.

Pointers

A pointer is a data type which holds a memory address.

Take a look at the following code:

struct Test {
    int a;
    int b;
    int* ptr; // pointer to an integer
    int c;
};

Test test;

test.a = 1;
test.b = 2;
test.ptr = &test.b;
test.c = *test.ptr;

By writing Test test, the object is stored in stack memory. We write two integer values, a pointer to an integer value, and a third integer to the memory.

We obtain the memory address of a variable by using the ampersand character (&). We access the value that is stored at a given memory address by using the asterisk character (*).

A look at the memory reveals our structure:

000000000026FB28  01 00 00 00  ....
000000000026FB2C  02 00 00 00  ....
000000000026FB30  2c fb 26 00  ....
000000000026FB34  00 00 00 00  ....
000000000026FB38  02 00 00 00  ....

At address 0x26FB28 and 0x26FB2C there are two 4 byte long numbers, our integers. At 0x26FB30 there is what looks like an address. If we read it correctly, it is the memory address 0x26FB2C. This is our pointer. It points to the second integer in memory. On a 64bit machine the pointer is 8 bytes long, which explains the 4 bytes of zeros at 0x26FB34.

At address 0x26FB38 there is another number, our third integer. By writing test.c = *test.ptr, we tell the compiler to copy the value that is located at address 0x26FB2C to address 0x26FB38. Lucky for us, both values are 4 byte integers - no conversion is needed.

Notice how the structure is represented value by value in memory. Since the compiler knows the layout of our structure, it can use offsets in memory to locate each of its values.
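
The offsets the compiler uses can be made visible with the standard offsetof macro. The sketch below reads a member by adding its offset to the address of the object. The helper names readIntAt and readC are invented for this example, and the actual offset values depend on your platform.

```cpp
#include <cstddef>
#include <cstring>

struct Test {
    int a;
    int b;
    int* ptr;
    int c;
};

// Hypothetical helper: reads an int located `offset` bytes after the start of the object.
int readIntAt(const Test& object, std::size_t offset) {
    const unsigned char* base = reinterpret_cast<const unsigned char*>(&object);
    int value;
    std::memcpy(&value, base + offset, sizeof value); // copy the raw bytes
    return value;
}

// Uses the offset that the compiler computed for member c.
int readC(const Test& object) {
    return readIntAt(object, offsetof(Test, c));
}
```

This is roughly what an instruction like mov eax,dword ptr [rsp+38h] does: base address plus a constant offset.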

To illustrate this mechanism, here is the generated code from the compiler:

;test.a = 1;
mov dword ptr [test],1

;test.b = 2;
mov dword ptr [rsp+2Ch],2

;test.ptr = &test.b;
lea rax,[rsp+2Ch]               ;RAX=0x26FB2C (&test.b)
mov qword ptr [rsp+30h],rax     ;test.ptr=RAX

;test.c = *test.ptr;
mov rax,qword ptr [rsp+30h]     ;RAX=0x26FB2C (test.ptr)
mov eax,dword ptr [rax]         ;EAX=2        (*)
mov dword ptr [rsp+38h],eax     ;test.c=EAX   (test.c=)

You can recognize stack memory access by the use of the stack pointer (RSP in this listing).

Pointer types

I showed that the compiler writes pointers as plain memory addresses to memory. But how does the compiler know the type of the object, that the address points to?

If we write int*, the compiler knows that the pointer points to an integer value. This means that the compiler will interpret the bytes at the address that the pointer points to as integers.

The type of the pointer serves as a schematic of how to interpret the memory.

If you change the type of a pointer, the memory will be interpreted differently.

Take this code for example:

int a = 8;
int* ptrToInt = &a;
float* ptrToFloat = (float*)ptrToInt;
float b = *ptrToFloat;

This is the stack memory:

000000000023F894  08 00 00 00  ....
000000000023F898  94 f8 23 00  ....
000000000023F89C  00 00 00 00  ....
000000000023F8A0  94 f8 23 00  ....
000000000023F8A4  00 00 00 00  ....
000000000023F8A8  08 00 00 00  ....

The pointers at 0x23F898 and 0x23F8A0 point to the same address, which is 0x23F894. The values at 0x23F894 and 0x23F8A8 appear to be equal because the memory contents are exactly the same. Yet, to the compiler, they are two totally different things.

The variable a (0x23F894) holds the integer value 8. The variable b (0x23F8A8) holds the floating-point value 1.121e-44#DEN. If you try to do calculations with these two numbers, the result will be very different.

This applies to all other types in C++, including structures and classes. You can cast any pointer type to another pointer type and access the memory as if the object was created with that type. That said, this behavior is undefined in C++ except for a few cases that are outlined in the strict aliasing rules.[2]
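
If you need to reinterpret the bytes of one type as another without running into the aliasing rules, copying the bytes with std::memcpy is a well-defined alternative. The following is a sketch under the assumption that float is a 32-bit IEEE 754 value. The name bitsToFloat is made up for this example.

```cpp
#include <cstdint>
#include <cstring>

// Assumption of this sketch: float has the same size as a 32-bit integer.
static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Reinterprets the bits of an integer as a float by copying the bytes,
// instead of reading the integer through a float pointer.
float bitsToFloat(std::uint32_t bits) {
    float result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}
```

On an IEEE 754 platform, bitsToFloat(0x3F800000) is 1.0f, and bitsToFloat(8) is the tiny denormal number from the example above.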

The only pointer that is a pure memory address is the void pointer (void*).

Pointer arithmetic

I showed how pointers are stored in memory, and how the compiler knows what type of object the pointer points to. But what happens when we do calculations with memory addresses?

The key to understanding pointer arithmetic is that the result of the calculation depends entirely on the pointer type. If your type is a 4 byte integer, then the integer pointer will change in 4 byte increments.

For instance:

int a = 4;
int* ptrA = &a;
int* ptrB = ptrA + 1;

A look at the memory reveals the following picture:

000000000015FCB8  a4 fc 15 00 00 00 00 00  ¤ü......
000000000015FCC0  a8 fc 15 00 00 00 00 00  ¨ü......

ptrA is stored at 0x15FCB8. ptrB is stored at 0x15FCC0.

Although we have incremented ptrA by 1, the difference between ptrA (0x15FCA4) and ptrB (0x15FCA8) is exactly 4 bytes.

The exact same rule applies to any other type. If a struct or class has a size of 16 bytes, incrementing a pointer to that structure by 1 will result in an offset of 16 bytes. If you want to step through memory in 1 byte increments, you have to choose the appropriate type, like a char.
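
To check this rule for arbitrary types, the sketch below measures the distance in bytes between a pointer and the same pointer incremented by 1. Vec4 and strideInBytes are hypothetical names used only for this illustration.

```cpp
#include <cstddef>

// Hypothetical example type, 16 bytes on platforms with a 4 byte float.
struct Vec4 { float x, y, z, w; };

// Returns the distance in bytes between p and p + 1.
template <typename T>
std::ptrdiff_t strideInBytes(const T* p) {
    const char* first = reinterpret_cast<const char*>(p);
    const char* next  = reinterpret_cast<const char*>(p + 1); // typed increment
    return next - first;                                      // byte-wise difference
}
```

For an int pointer the result equals sizeof(int), for a Vec4 pointer it equals sizeof(Vec4), and for a char pointer it is 1.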

There are situations in which you can work with memory and memory addresses without knowing what the contents are. You don't need to know that you are copying floating-point values, as long as you know how many bytes you are copying. It is possible to rewrite algorithms to be more generic by writing them in terms of memory operations, instead of operations on types. Alternatively, you can use templates if code readability and maintainability are a priority, and executable size is not an issue.

Subscript operator

In languages like Java, the subscript operator ([]) is only used for arrays. In C++ however, there is no difference between a pointer and an array, in terms of how the data is accessed.

The following code will get the third element of the array:

int* array = ...;
int value = array[2];

Why the third and not the second? The number 2 is an offset, relative to the address of the array. The code increments the integer pointer by 2. It returns the value at the memory address by implicitly dereferencing the pointer. This is equivalent to writing:

int* array = ...;
int value = *(array + 2);

Adding 2 to the integer pointer gives us an offset of 8 bytes (sizeof(int) * 2) in memory. We have to explicitly dereference the pointer to get the value.

You can see the equivalence when comparing the generated assembly code:

;int val0 = arr0[2];
mov  eax,4
imul rax,rax,2                ;RAX=8
mov  rcx,qword ptr [arr0]
mov  eax,dword ptr [rcx+rax]
mov  dword ptr [val0],eax

;int val1 = *(arr1 + 2);
mov  rax,qword ptr [arr1]
mov  eax,dword ptr [rax+8]
mov  dword ptr [val1],eax

This is also the reason why array[0] is the first value of the array. The offset is 0. You can write *array instead.
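
The equivalence can also be checked in C++ itself. In this sketch, the function names are invented for the example. The third form, index[array], looks strange but is legal, because a[b] is defined as *(a + b) for pointers.

```cpp
// Returns true if all three spellings read the same element.
bool subscriptMatchesPointerArithmetic(const int* array, int index) {
    const int viaSubscript = array[index];
    const int viaPointer   = *(array + index);
    const int viaSwapped   = index[array]; // same as *(index + array)
    return viaSubscript == viaPointer && viaPointer == viaSwapped;
}

// Offset 2 relative to the start of the array is the third element.
int thirdElement(const int* array) {
    return array[2];
}
```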

Function pointers

Many programming languages have a mechanism that allows you to treat a function like a data type. We can create objects of it, and assign it to variables. This means that we can pass it to other functions that can in turn call that function.

In assembly, a function call is not much more than setting the instruction pointer of the CPU to a different memory address. The CPU will then continue to read instructions from that address and execute them. If the memory happens to contain random data, it can lead to undefined behavior in your program.

If you define a function in C++, the compiler generates code for the instructions that are inside the function body. It associates the memory address of the generated code with a unique name. This name is typically a combination of the namespace the function is defined in, the function name, and the parameters of the function (see name mangling). The memory address to the function body can be copied as any other pointer, which makes it a function pointer.

The type system allows us to use the function call operator (()) on function pointers.

;foo()
call foo (013F4C1294h)

;bar(foo)
lea  rcx,[foo (013F4C1294h)]
call bar (013F4C128Fh)

;void bar(void(*fn)(void)) {
mov  qword ptr [rsp],rcx
;fn()
call qword ptr [rsp]
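
The C++ side of such a call could look like the following sketch. The names IntFunction, twice, square, and apply are assumptions made for this example. They are not the foo and bar from the listing above.

```cpp
// Alias for "pointer to a function that takes an int and returns an int".
using IntFunction = int (*)(int);

int twice(int x)  { return x * 2; }
int square(int x) { return x * x; }

// Calls whatever function the pointer refers to.
// The compiler cannot hard-code the target here, it has to call through the pointer.
int apply(IntFunction fn, int argument) {
    return fn(argument);
}
```

Calling apply(twice, 21) returns 42, while apply(square, 5) returns 25. The same call site runs different code depending on the address that is passed in.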

Structs

Before I get to object-oriented programming, I want to touch on some properties of C++ structs. This will be essential to understanding how inheritance and polymorphism are implemented.

Conceptually, classes and structs are the same in C++. The main difference is that members of structs are public by default. Members of classes are private by default.

The following struct definition:

struct A {
    int a = 0x10;
    int b = 0x20;
    int c = 0x30;
};

Will present itself in memory like this:

000000000021FE28  10 00 00 00  ....
000000000021FE2C  20 00 00 00  ....
000000000021FE30  30 00 00 00  ....

The structure has a size of 12 bytes.

There are cases in which we want to put a structure inside another structure. When I started learning C++, coming from Java, I would go for this approach:

struct B {
    int b = 0x20;
    int c = 0x30;
};

struct A {
    int a = 0x10;
    B* b = nullptr;
};

The intention was to include struct B into struct A by using a pointer. It presents itself in memory like this:

000000000025FD10  20 00 00 00   ...
000000000025FD14  30 00 00 00  0...
000000000025FD18  10 00 00 00  ....
000000000025FD1C  00 00 00 00  ....
000000000025FD20  10 fd 25 00  .ý%.
000000000025FD24  00 00 00 00  ....

You can already see that there is something odd about this. The size of struct A (0x25FD18) is 16 bytes, but we only store an integer of 4 bytes and a pointer of 8 bytes, which is a total of 12 bytes. Since I'm compiling this code on a 64bit machine, the compiler has chosen to pad the structure with 4 bytes after the integer, so that the pointer starts at an address that is a multiple of 8. This is known as data structure alignment. It makes it easier for the CPU to read and process the data. Otherwise the processor might have to read the memory multiple times and merge the data.
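
You can ask the compiler for these numbers with sizeof, alignof, and offsetof. The sketch below computes how many bytes of struct A do not belong to any member. The constant names are made up for this example, and the concrete values depend on your platform.

```cpp
#include <cstddef>

struct B {
    int b;
    int c;
};

struct A {
    int a; // 4 bytes on typical platforms
    B*  b; // 8 bytes on typical 64bit platforms
};

// Bytes in A that are not occupied by any member (the padding).
constexpr std::size_t paddingInA = sizeof(A) - sizeof(int) - sizeof(B*);

// Position of the pointer inside A, always a multiple of its alignment.
constexpr std::size_t pointerOffset = offsetof(A, b);
```

On a typical 64bit platform, paddingInA is 4 and pointerOffset is 8, which matches the memory dump above.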

But there is also another way to define struct B as being part of struct A:

struct B {
    int b = 0x20;
    int c = 0x30;
};

struct A {
    int a = 0x10;
    B b;
};

A look at the memory reveals:

000000000017FDD8  10 00 00 00  ....
000000000017FDDC  20 00 00 00   ...
000000000017FDE0  30 00 00 00  0...

We told the compiler that struct B is part of struct A and it complies. The size of struct A is 12 bytes.

Both definitions carry certain implications when it comes to memory consumption, memory access, and object lifetime. I will discuss this in a later section.

Member fields and functions

The member fields and member functions of structs and classes are only abstract concepts.

Take this definition for instance:

struct A {
    int value;

    void setValue(int value) { this->value = value; }
};

A a;
a.setValue(0x10);

In assembly, you will see the following access pattern:

;a.setValue(0x10);
mov  edx,10h  
lea  rcx,[a]  
call A::setValue (013F07125Dh) 

In C, where you cannot define member functions, you would write something like this:

void setValue(struct A *this, int value);

struct A a;
setValue(&a, 0x10);

You always have to pass a pointer of the object that you want to modify to the function. Calling a member function in C++ does this implicitly.
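
Both styles can be written side by side in C++. This is only a sketch: setValueFree and the parameter name self are made up, since this is a reserved keyword in C++.

```cpp
struct A {
    int value;

    // Member function: the object pointer is passed implicitly as `this`.
    void setValue(int newValue) { this->value = newValue; }
};

// Free function: the object pointer is passed explicitly.
void setValueFree(A* self, int newValue) {
    self->value = newValue;
}
```

Calling a.setValue(0x10) and setValueFree(&a, 0x10) have the same effect on the object. In both cases the function receives the address of a and the new value.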

[TODO show assembly of code inside member function]

Object-Oriented Programming

There are different ways in which object-oriented programming and its main concepts are taught. One of the commonly used examples is that of animals. A cat is an animal. A dog is an animal. As animals, both have certain properties, some of which they commonly share. Both can walk, both can emit some kind of sound, both hunt prey, both have fur, and so on. That said, they exhibit these properties in very different ways that are specific to their species. This is to illustrate inheritance and polymorphism based on how humans categorize abstract concepts in a hierarchical structure, with is-a relations. A Cat is-a Mammal, a Mammal is-a Animal, etc.

In practice however, this example is very misleading. The main advantage of inheritance and polymorphism can be illustrated much better when you think about interfaces and abstractions. For instance, if you have a console, a file, a printer, and an internet connection, each of them requires very specific I/O code. This code is based on the hardware, the operating system, and communication protocols.

As a programmer, you find yourself in situations in which you don't care what the implementation is. You just want to read and write data. Thus, one possible abstraction to all the four types of I/O devices is a data stream. The stream has read and write, as well as open and close operations. What actually happens is hidden behind the abstraction. You don't have to know how TCP/IP works in order to use a data stream.

Inheritance is also a tool for extending the functionality of a class, by adding new functions and properties. On a low level, it is implemented exactly like an extension.

Inheritance

If you have read the section about structures, you will immediately understand how inheritance is implemented. Let's take the following example:

struct A {
    int a = 0x10;

    void test();
    void testA();
};

struct B : public A {
    int b = 0x20;
    int c = 0x30;

    void test();
};

B b;

Which is represented in memory as:

00000000002AFAF8  10 00 00 00  ....
00000000002AFAFC  20 00 00 00   ...
00000000002AFB00  30 00 00 00  0...

Look familiar? It's no coincidence! You are telling the compiler that you are extending the definition of struct A by the definition of struct B, and it complies.

Since the memory of struct A is contained in the memory of struct B, you can call functions defined in struct A with an object of type struct B. For instance:

B b;
b.test();
b.testA();

This is the generated assembly code:

;B b;
lea  rcx,[b]
call B::B (013F1612DFh)

;b.test();
lea  rcx,[b]
call B::test (013F1612EEh)

;b.testA();
lea  rcx,[b]
call A::testA (013F1612F3h)

Notice how each time the memory address of b is written to the register RCX.

Since struct B defines a function named B::test, it is called instead of A::test. [TODO why? give source]

As you can see in the assembly code, when calling b.testA(), it calls the implementation of A::testA with a pointer to object b. This is possible because struct B contains struct A at the beginning of the memory - the offsets to its values are the same. The implementation of A::testA simply assumes that there is a struct A in memory. It doesn't know about struct B.
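
The following sketch is a variation of the example in which the functions return values, so that you can tell which implementation was called. It also checks that the A part of a B object starts at the same address as the object itself. That holds on common compilers for simple single inheritance, but the standard does not guarantee it for this type. The helper baseStartsAtSameAddress is a made-up name.

```cpp
struct A {
    int a = 0x10;

    int test()  { return 1; }
    int testA() { return a; } // only knows about the layout of A
};

struct B : public A {
    int b = 0x20;

    int test() { return 2; } // hides A::test, since it is not virtual
};

// True if the A part of the object begins at the address of the B object.
bool baseStartsAtSameAddress(B& object) {
    A* asBase = &object; // implicit derived-to-base conversion
    return static_cast<void*>(asBase) == static_cast<void*>(&object);
}
```

Which test function is called depends on the static type that the compiler sees: b.test() picks B::test, while b.A::test() or a call through an A* picks A::test.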

Polymorphism

Going back to the example of the data stream: if the file implementation and the printer implementation have a function called write, how do we know what implementation to call when we only have the abstract data stream to work with?

Let's look at the following example:

struct Stream {
    virtual int read() = 0; // pure virtual function
    virtual int write() = 0;
};

struct File : public Stream {
    int v = 0x10;

    int read() override {
        return v;
    }

    int write() override {
        return v;
    }
};

struct Printer : public Stream {
    int v = 0x20;

    int read() override {
        return v;
    }

    int write() override {
        return v;
    }
};

void operate(Stream* stream) {
    std::cout << stream->write() << std::endl;
}

File file;
Printer printer;

operate(&file);
operate(&printer);

If we look at how struct File is represented in memory, we notice something odd:

000000000024F8B8  78 9d dd 3f  x.Ý?
000000000024F8BC  01 00 00 00  ....
000000000024F8C0  10 00 00 00  ....

struct Stream doesn't contain any members, so why are there 8 bytes before our 0x10?

As you can see from the source code, the compiler allows us to pass a pointer to an object of type struct File to a function that takes a pointer to a struct Stream. When Stream::write is called, the only way for the processor to know what implementation to call, is by looking at the so-called virtual function table (VFT).

The VFT is an array of function pointers. We give the compiler a hint what functions to put into the VFT by using the keyword virtual.

In general, virtual function calls cannot be resolved by the compiler to a fixed jump address. The lookup has to happen at run-time.

The first element of the memory listing is in fact a pointer to the VFT of the type struct File. All instances of struct File share the same VFT, which happens to be located at 0x13FDD9D78. Let's see what we can find there:

000000013FDD9D78  4f 11 dd 3f 01 00 00 00  O..?....
000000013FDD9D80  62 12 dd 3f 01 00 00 00  b..?....

Since struct Stream has two virtual functions, and we override both in struct File, there are two function pointers in the VFT. One pointing to the implementation of File::read at 0x13FDD114F, and the other one to the implementation of File::write at 0x13FDD1262.

The assembly code will shed some light on the virtual function call mechanism:

;void operate(Stream* stream) {
mov  rax,qword ptr [stream]  ;RAX=0x24F8B8
mov  rax,qword ptr [rax]     ;RAX=0x13FDD9D78
mov  rcx,qword ptr [stream]  ;RCX=0x24F8B8
call qword ptr [rax+8]       ;call 0x13FDD1262

As you will see by the console output, the processor manages to call the correct implementation. For the file object, the number 0x10 is printed out. For the printer object, the number 0x20 is printed out.
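
To make the mechanism more tangible, here is a hand-written imitation of a virtual function table without the virtual keyword. This is a sketch of the idea, not what any particular compiler generates. All names (StreamVTable, fileWrite, makeFile, and so on) are invented for this example.

```cpp
struct Stream;

// The table: one function pointer per "virtual" function.
struct StreamVTable {
    int (*read)(Stream*);
    int (*write)(Stream*);
};

struct Stream {
    const StreamVTable* vtable; // plays the role of the hidden pointer in the dump
};

struct File {
    Stream base; // first member, so a File* and its Stream* share one address
    int v;
};

struct Printer {
    Stream base;
    int v;
};

int fileRead(Stream* self)     { return reinterpret_cast<File*>(self)->v; }
int fileWrite(Stream* self)    { return reinterpret_cast<File*>(self)->v; }
int printerRead(Stream* self)  { return reinterpret_cast<Printer*>(self)->v; }
int printerWrite(Stream* self) { return reinterpret_cast<Printer*>(self)->v; }

// One table per type, shared by all objects of that type.
const StreamVTable fileVTable    = { fileRead, fileWrite };
const StreamVTable printerVTable = { printerRead, printerWrite };

File    makeFile()    { return File{ Stream{ &fileVTable }, 0x10 }; }
Printer makePrinter() { return Printer{ Stream{ &printerVTable }, 0x20 }; }

// Equivalent of stream->write(): load the table, then call through its second slot.
int operate(Stream* stream) {
    return stream->vtable->write(stream);
}
```

Calling operate with the base member of a File returns 0x10, and with the base member of a Printer it returns 0x20, just like the virtual calls in the original example.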

Memory Management

Copying

When reading and writing C++ programs, you should develop a sense of when memory is being copied and, when in doubt, check the implementation.

Take this code for example:

struct A {
    int a;
    int b;
    int c;
};

A createStruct();
A* createStructPtr();

A a = { 1, 2, 3 };
A b = a;
A c = createStruct();
A* d = createStructPtr();
A e = *d;

When you create a class or structure, the compiler can create the following constructors and operators for you by default: the copy constructor, the copy assignment operator, the move constructor, and the move assignment operator. They control how objects are moved or copied in memory.

By defining them yourself, you can specify exactly what needs to happen when an object is copied. For instance:

struct Array {
    Array(int size) : size(size) { data = new int[size]; }
    ~Array() { delete[] data; } // memory from new[] must be released with delete[]

    const int* data;
    const int size;
};

Array a(0x20);
Array b = a;

The first struct Array object is created on the stack. The constructor allocates enough memory on the heap to store 32 integer values. The pointer to this memory is written to our object. So far so good, but what happens when we assign array a to array b?

The assembly code holds the answer:

;Array a(0x20);
mov      edx,20h  
lea      rcx,[a]  
call     Array::Array (013FE71294h)

;Array b = a;
lea      rax,[b]  
lea      rcx,[a]  
mov      rdi,rax  
mov      rsi,rcx  
mov      ecx,10h  
rep movs byte ptr [rdi],byte ptr [rsi]  ;memcpy

The constructor that we wrote is not called for b. Instead, the contents of a are simply copied to b, byte by byte. As a result, b contains the pointer to the same memory as a. The data on the heap is not copied.

This is very risky. The destructor of struct Array deallocates the buffer on the heap. If the other array tries to access it, it will result in undefined behavior. If the other array object is destructed, the code will attempt to delete the memory for a second time, which is also undefined behavior.

If you define your own copy constructor and copy assignment operator, you can tell the compiler to allocate new memory and make a copy of the data on the heap. Alternatively, you can delete the copy constructor and copy assignment operator. You will then get a compile error whenever you try to copy the array.
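
A deep-copying version could look like the sketch below. It is one possible way to write it, not the only one. Compared to the struct above, the const qualifiers on the members are dropped so that assignment is possible, and the size is a std::size_t.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

struct Array {
    explicit Array(std::size_t size) : data(new int[size]()), size(size) {}

    // Copy constructor: allocate a new buffer and copy the elements.
    Array(const Array& other) : data(new int[other.size]), size(other.size) {
        std::copy(other.data, other.data + other.size, data);
    }

    // Copy assignment: make a copy first, then swap buffers with it.
    // The old buffer is released when `temp` goes out of scope.
    Array& operator=(const Array& other) {
        if (this != &other) {
            Array temp(other);
            std::swap(data, temp.data);
            std::swap(size, temp.size);
        }
        return *this;
    }

    ~Array() { delete[] data; }

    int* data;
    std::size_t size;
};
```

With these definitions, Array b = a gives b its own buffer on the heap, so each destructor releases a different block of memory.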

This is one of many examples in which it is not clear what the assignment operator actually does. If you are in doubt, always check the implementation. Otherwise, you could be copying a lot of data without even knowing it.

Scope

Every variable you declare between two curly braces ({}) is only valid inside that scope. Once the program exits the scope, the variable becomes invalid.

This mechanism is strongly tied to the stack. If you create an object on the stack, its lifetime is defined by the scope in which it was created. Once the program exits the scope, the destructor of the object is called.

Consider the following code:

Object* createInt() {
    Object a;

    return &a;
}

If you create an object on the stack, inside a function scope, and decide to return a pointer to that object, you are returning a pointer to a memory address that will be invalidated and probably overwritten. This can result in undefined behavior if you try to read from this pointer at a later time.

If you want a value to survive the lifetime of a scope, you have to either copy it or allocate it on the heap.

Consider the following code:

{
    int* arrA = new int[1000];

    if (conditionAIsFalse) {
        delete[] arrA;
        return -1;
    }

    int* arrB = new int[2000];

    if (conditionBIsFalse) {
        delete[] arrB;
        delete[] arrA;
        return -2;
    }

    delete[] arrB;
    delete[] arrA;

    return 0;
}

We use dynamic memory allocation to allocate memory on the heap. For the program to run correctly, and not leak memory, we have to delete the allocated memory when we exit the function. Since we have multiple exits, it becomes difficult to track what needs to be deleted and what not.

An alternative is to only have one exit point and nest the if-statements. You have to be very careful to set the return value correctly. The readability suffers the more nested if-statements you have.

{
    int returnValue = 0;

    int* arrA = new int[1000];

    if (conditionAIsTrue) {
        int* arrB = new int[2000];

        if (conditionBIsTrue) {
            returnValue = 0;
        } else {
            returnValue = -2;
        }

        delete[] arrB;
    } else {
        returnValue = -1;
    }

    delete[] arrA;

    return returnValue;
}

Or if you feel very adventurous, you can even use the goto statement to jump to the clean-up code whenever an error occurs.

In any case, you have to be very careful about deallocating memory. If you forget, your program will leak memory. If you delete memory more than once, you end up with undefined behavior.

In C++, we can hide some of the memory management details by using the scope of variables to our advantage, and write the following code:

struct Array {
    Array(int size) { data = new int[size]; }
    ~Array() { delete[] data; } // matches the new[] in the constructor

    int* data;
};

{
    Array arrA(0x10);

    if (conditionAIsFalse)
        return -1;

    Array arrB(0x20);

    if (conditionBIsFalse)
        return -2;

    return 0;
}

Notice how we create arrA and arrB on the stack. They only contain a pointer to a buffer. The actual integer buffers are created dynamically on the heap when the constructor is called. The moment the program leaves the scope, the destructor of struct Array is called. The compiler figures out when and where to do this for us. We can simply use the array and not worry about the memory management.

Here is an excerpt from the generated assembly code:

;Array arrB(0x20);
mov   edx,20h
lea   rcx,[arrB]
call  Array::Array (013FA910D7h)

;if (conditionBIsFalse)
movzx eax,byte ptr [conditionBIsFalse]
test  eax,eax
je    main+0A6h (013FA918E6h)

    ;return -2;
mov   dword ptr [rsp+68h],0FFFFFFFEh
lea   rcx,[arrB]
call  Array::~Array (013FA91127h)
nop
lea   rcx,[arrA]
call  Array::~Array (013FA91127h)
mov   eax,dword ptr [rsp+68h]
jmp   main+0C7h (013FA91907h)

Before the struct Array constructor is called, the array size is stored in EDX, and the memory address of arrB is stored in RCX.

Next the value of the bool is checked. If it is false, we jump forward to the code that continues our function. If it is true, we continue executing the return code, which stores the value -2 in stack memory for later reference, and calls the struct Array destructor first on arrB and then on arrA. Last but not least, the return value is copied to EAX, and the program jumps to some function wrap up code.

Note how arrA and arrB are destructed in the reverse order of their construction. This adheres to the Last-In-First-Out behavior of the stack data structure.

Using the scoped lifetime of objects in order to allocate and deallocate resources automatically is known as RAII. The same principle applies to using std::vector and other data structures of the standard library. It is also used to close files and streams, release locks, etc.

Note: Please use std::unique_ptr in combination with std::make_unique. Using std::unique_ptr will ensure that the allocated memory is cleaned up when the pointer object goes out of scope.
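
As a sketch of this advice, here is the function with multiple exits again, this time using std::unique_ptr with std::make_unique (available since C++14). The function name process and its two parameters are assumptions made for this example.

```cpp
#include <memory>

// Hypothetical function with several exits. No delete is needed on any path.
int process(bool conditionA, bool conditionB) {
    std::unique_ptr<int[]> arrA = std::make_unique<int[]>(1000);

    if (!conditionA)
        return -1; // arrA releases its buffer here

    std::unique_ptr<int[]> arrB = std::make_unique<int[]>(2000);

    if (!conditionB)
        return -2; // arrB is released first, then arrA

    arrA[0] = 1; // the buffers are used like plain arrays
    arrB[0] = 2;

    return 0;
}
```

The array form std::unique_ptr<int[]> calls delete[] in its destructor, so the mismatch between new[] and delete cannot happen.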

Discussion

Why read assembly code?

I hope that I could illustrate in this article that checking the generated assembly code is a good way to verify that the instructions you write do what you expect them to do.

Assembly code never lies! ;)

This will not only give you more insight into the language and its implementation, but also stronger arguments for or against the use of certain language features or framework functions, when you are discussing your particular case with colleagues.

There are also cases in which a program will crash due to unsupported instructions, e.g. when executing AVX instructions on a CPU that does not support the AVX instruction set. There you have no other choice than to look at the assembly, otherwise you won't know what is causing the error. The source code alone can't tell you.

The downside to this is that every compiler generates different assembly for the same C++ code, and the factors that influence this are plentiful.

If you want to quickly check the generated assembly for a piece of code with different compilers, Godbolt's Compiler Explorer has you covered.

That said, you will always have to profile your application because the generated assembly code is not the only factor that determines the behavior of your program during execution.

Should I use stack, heap, or static memory?

You should base your decision on the properties of each memory type.

  1. Lifetime
  2. Memory requirements

Further Reading


  1. There are CPU architectures that allow you to combine registers in order to address more memory.
  2. The strict aliasing rules allow the compiler to do memory access optimizations that are based on the assumption that you are not reading from memory through one type and writing to the same memory through another type. The consequence of ignoring the rules is that your read operations might return unexpected values.