Aussie AI

Chapter 31. Preventive Memory Safety

Book Excerpt from "Advanced C++ Memory Techniques: Efficiency and Safety"

by David Spuler, Ph.D.

Prevention Versus Detection

This chapter examines which DIY memory safety techniques can prevent a memory error from occurring, or prevent a security exploit from succeeding. Many other techniques “detect” memory errors; these are valuable, but they do not directly prevent a memory glitch in production. Instead, they improve quality indirectly by finding bugs, which can then be fixed.

The list of memory errors to consider for prevention includes:

Uninitialized memory usage (heap and stack)
Null pointer dereference
Buffer overflows (reads and writes)
Buffer underflows (reads and writes)
Use-after-free
Double-deallocation
Mismatched allocation and deallocation
Standard library container memory issues
Standard library function problems

Some of the standard library issues include:

Unsafe string functions — e.g., strcpy, strcat, sprintf.
Detecting when the “safe” string functions truncate the text (e.g., snprintf, strcpy_s).
strncpy is a special problematic case that is easily fixed by a wrapper.
File pointer problems and file operation sequence errors (e.g., null file pointers, double-fclose).
Removing an object from a container in the middle of an iterator.
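
As a rough illustration of the strncpy and truncation items in the list above, here is a minimal sketch of the kind of wrapper that could be used. The names strncpy_checked and snprintf_truncated are invented for this example, and treating a null source as an empty string is just one possible policy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: copies like strncpy, but always null-terminates
// and reports whether the source text was truncated.
// Returns true if the whole string fit, false on truncation or bad arguments.
bool strncpy_checked(char* dest, const char* src, std::size_t destsize)
{
    if (dest == nullptr || destsize == 0) {
        return false;             // nowhere to write
    }
    if (src == nullptr) {
        dest[0] = '\0';           // treat a null source as empty, but flag it
        return false;
    }
    std::size_t len = std::strlen(src);
    std::size_t ncopy = (len < destsize) ? len : destsize - 1;
    std::memcpy(dest, src, ncopy);
    dest[ncopy] = '\0';           // strncpy omits this when src is too long
    return len < destsize;
}

// Hypothetical helper: snprintf returns the length it wanted to write,
// so truncation shows up as a return value >= the buffer size.
bool snprintf_truncated(int ret, std::size_t bufsize)
{
    return ret < 0 || static_cast<std::size_t>(ret) >= bufsize;
}
```

A caller would check the boolean result of strncpy_checked, or pass the return value of snprintf to snprintf_truncated along with the buffer size, and log or handle the truncation rather than silently continuing with a clipped string.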

The DIY memory techniques that we can consider include:

Memory sanitizer tools
Macro intercepts (e.g., malloc and free)
Linker intercepts (e.g., new and delete)
Initialization methods
Canary values
Redzone memory regions
Memory poisoning
Delayed-deallocation
Safe wrapper functions
Smart wrapper classes

Memory Sanitizer Tools

The most obvious method of prevention of memory problems is to use runtime memory checkers and sanitizers. Examples include:

Valgrind (Linux)
AddressSanitizer (GCC, Clang, MSVC)
compute-sanitizer (CUDA C++)

These tools will detect and prevent a vast range of memory errors in the stack and heap. Examples include uninitialized memory usage, array bounds overflows, and use-after-free errors.

But these tools are simply too slow to use in production. They are valuable in terms of indirectly improving memory safety because glitches are detected early and fixed by programmers. But they really don’t solve the prevention problem.

Preventing Memory Initialization Errors

One of the simplest DIY fixes is to avoid uninitialized memory errors in C++ by initializing memory ourselves. To do this, we need to use these techniques:

Intercept malloc with macros (or linking) and replace with a wrapper that uses calloc (or uses memset to zero).
Intercept other heap allocation primitives (e.g., strdup, realloc).
Link-time intercept new and change to calloc (also requires matching linker intercepts of delete to change to free).
Intercept alloca dynamic stack memory function (and use memset to zero memory).
Use smart buffer wrapper classes to initialize local buffer variables on the stack (i.e., function local variables).

A whole class of memory errors disappears!

Most of the above techniques require minimal code changes to existing code, such as to add a header file for macro intercepts. Note that C++ already zeroes all memory for global variables and local static variables, without needing any special changes.
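
The first technique in the list might be sketched as follows. The name zeroing_malloc is hypothetical, and this sketch uses a function-like macro so that uses of the bare name malloc (without a call) are left alone; code that calls std::malloc with the namespace prefix would need separate handling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical wrapper: every block comes back zero-filled via calloc.
void* zeroing_malloc(std::size_t sz)
{
    if (sz == 0) {
        sz = 1;   // one possible policy: never request a zero-byte block
    }
    return std::calloc(1, sz);
}

// Macro intercept, placed after <cstdlib> and after the wrapper itself,
// so only later calls written as malloc(...) are redirected.
#define malloc(sz) zeroing_malloc(sz)
```

Any later call such as malloc(64) in the same translation unit now returns zeroed memory, which is still released with free as usual because calloc and free are a matching pair.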

The most invasive of the above methods is adding safety class wrappers for stack buffers, but there’s not really any intercepts possible in C++ for stack memory. Other possible solutions for stack buffers would involve changes to the code itself, such as to use heap memory instead, or changing to dynamic alloca stack memory (which can be macro-intercepted).

Overall, there’s only a few exceptions to what memory we can initialize with DIY techniques, in that compiler changes are probably needed for:

Full stack frame initialization to zero on function entry.
Initialization of small local variables on the stack (without extra class wrapper variables).
Register variable initialization (also related to local variables).

Mismatched Allocation and Deallocation

Mismatches between the various types of allocation and deallocation cause undefined behavior, and can even crash. In some cases, they won’t crash, but will fail to run the correct constructors or destructors. The correct matches are:

malloc, calloc, strdup — free
new — delete
new[] — delete[]

Any crossover between any of the three categories is technically a failure. However, these are easily resolved by DIY memory primitive wrappers. By using link-time intercepting of the four new and delete primitives, everything can be converted to malloc/calloc and free. In this way, there won’t be any crashes anymore, even if this error occurs. However, note that many of these failures are still higher-level errors even if they don’t crash, because they won’t correctly run all the destructors if non-scalar objects are being deallocated.

Why Use Wrapper Functions?

The idea of debug wrapper functions is to fill a small gap in the self-checking available in the C++ ecosystem. There are two types of self-testing that happen when you run C++ programs:

Self-tests such as error return checks, assertions, and wrappers in the main C++ code.
valgrind or sanitizer detection of numerous run-time errors.

Both of these methods are highly capable and will catch a lot of bugs. To optimize your use of these capabilities in debugging, you should:

Test all error return codes (e.g., a fancy macro method), and
Run valgrind and/or other sanitizers on lots of unit tests and regression tests in your CI/CD approval process, or, when that gets too slow, at least in the nightly builds.

But this is not perfection! There are two main reasons that some bugs will be missed:

Self-testing doesn’t detect all the bugs.
You have to remember to run sanitizers on your code.

Okay, so I’m joking about “remembering” to run the debug tests, because you’ve probably got them running automatically in your build. But there’s some real cases where the application won’t ever be run in debug mode:

Many internal failures trigger no visible symptoms for users (silent failures).
Customers cannot run valgrind on their premises (unless you ask nicely).
Your website “customers” also cannot run it on the website backends.
Some applications are too costly to re-run just to debug an obscure error (I’m looking at you, AI training).

Hence, in the first case, there’s bugs missed in total silence, never to be fixed. And in the latter cases, there’s a complex level of indirection between the failure occurring and the C++ programmer trying to reproduce it in the test lab. It’s much easier if your application self-diagnoses the error!

Fast Debug Wrapper Code

But it’s too slow, I hear you say. Running the code with valgrind or other runtime memory checkers is much slower than running without them. We can’t ship an executable with so much debug instrumentation that it runs that much slower.

You’re not wrong, and it’s the age-old quandary about whether to ship testing code. Fortunately, there are a few solutions:

Use fast self-testing tricks like magic numbers in memory.
Have a command-line flag or config option that turns debug tests on and off at runtime.
Have “fast” and “debug” versions of your executable (e.g., ship both to beta customers).

At the very least, you could have a lot of your internal C++ code development and QA testing done on the debug wrapper version that self-detects and reports internal errors.
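
The runtime on/off option could look something like the sketch below. Everything here is an assumption made for illustration: the global flag, the environment variable name MYAPP_DEBUG_CHECKS, and the memcpy_checked wrapper are all invented names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// All names below are hypothetical.
std::atomic<bool> g_debug_checks{false};   // off by default: the fast path
std::atomic<int>  g_debug_error_count{0};

// Call once at startup; MYAPP_DEBUG_CHECKS is an invented variable name.
void init_debug_checks_from_env()
{
    const char* v = std::getenv("MYAPP_DEBUG_CHECKS");
    g_debug_checks = (v != nullptr && std::strcmp(v, "1") == 0);
}

void* memcpy_checked(void* dest, const void* src, std::size_t n)
{
    // One relaxed atomic load is the only cost when checks are disabled.
    if (g_debug_checks.load(std::memory_order_relaxed)) {
        if (dest == nullptr || src == nullptr) {
            std::fprintf(stderr, "memcpy_checked: null argument\n");
            ++g_debug_error_count;
            return dest;   // report and keep going, without copying
        }
    }
    return std::memcpy(dest, src, n);
}
```

The same flag could equally be set from a command-line option or a config file; the point is only that the expensive checks sit behind a single cheap test.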

As the first point states, there are “layers” of debugging wrappers (also ogres, like Shrek). You can build very fast or very slow types of self-checking into debug wrapper code. These self-tests can be as simple as parameter null tests or as complex as detecting memory stomp overwrites with your own custom code. In approximate order of time cost, here are some ideas:

Parameter basic validation (e.g., null pointer tests).
Magic values added to the initial bytes of uninitialized and freed memory blocks.
Magic values stored in every byte of these blocks.
Tracking 1 or 2 (or 3) of the most recently allocated/freed addresses.
Hash tables to track addresses of every allocated or freed memory block.

I’ve actually done all of the above for a debug library in standard C++. Make sure you check the Aussie AI website to see when it gets released.
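
To make the magic-value idea concrete, here is a small sketch (not the library mentioned above) that combines a magic number in a block header, poisoning of freed bytes, and a short delayed-deallocation queue. All names and constants are invented. The delay matters because once memory is really returned to the allocator, the header can no longer be safely inspected; this sketch also assumes every pointer passed to magic_free came from magic_malloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// All names and magic constants here are hypothetical.
constexpr std::uint32_t MAGIC_LIVE  = 0xA11C0DE5u;  // block is allocated
constexpr std::uint32_t MAGIC_FREED = 0xDEADBEEFu;  // block was freed

struct alignas(std::max_align_t) BlockHeader {
    std::uint32_t magic;
    std::size_t size;
};

enum class FreeResult { Ok, NullPointer, DoubleFree, NotOurBlock };

// Freed blocks wait here before the real free (delayed deallocation).
constexpr std::size_t QUARANTINE_SLOTS = 8;
static BlockHeader* g_quarantine[QUARANTINE_SLOTS] = {};
static std::size_t g_next_slot = 0;

void* magic_malloc(std::size_t sz)
{
    auto* h = static_cast<BlockHeader*>(std::malloc(sizeof(BlockHeader) + sz));
    if (h == nullptr) return nullptr;
    h->magic = MAGIC_LIVE;
    h->size = sz;
    return h + 1;   // user memory starts just after the header
}

FreeResult magic_free(void* p)
{
    if (p == nullptr) return FreeResult::NullPointer;
    BlockHeader* h = static_cast<BlockHeader*>(p) - 1;
    if (h->magic == MAGIC_FREED) return FreeResult::DoubleFree;
    if (h->magic != MAGIC_LIVE) return FreeResult::NotOurBlock;
    h->magic = MAGIC_FREED;
    std::memset(p, 0xDD, h->size);          // poison the user bytes
    // Really free only the oldest quarantined block (free(nullptr) is a no-op).
    std::free(g_quarantine[g_next_slot]);
    g_quarantine[g_next_slot] = h;
    g_next_slot = (g_next_slot + 1) % QUARANTINE_SLOTS;
    return FreeResult::Ok;
}
```

A double free is only caught while the block is still in the quarantine, so the queue length trades memory for detection window. Writing the magic value to just the header is the cheap layer; poisoning every byte is the more expensive one.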

Standard C++ Debug Wrapper Functions

It can be helpful during debugging to wrap several standard C++ library function calls with your own versions, so as to add additional parameter validation and self-checking code. Some of the functions which you might consider wrapping include:

malloc
calloc
memset
memcpy
memcmp

If you’re doing string operations in your code, you might consider wrapping these:

strdup
strcmp
strcpy
sprintf

Note that you can wrap the C++ “new” and “delete” operators at the linker level by defining your own versions, but not as macro intercepts. You can also intercept the “new[]” and “delete[]” array allocation versions at link-time.

Example: Wrapping malloc

You can use macros to intercept various standard C++ functions. For example, here’s a simple interception of malloc:

    // intercept malloc (include <stdlib.h> before this point)
    #undef malloc
    #define malloc aussie_malloc
    void *aussie_malloc(int sz);

Once intercepted, the wrapper code can perform simple validation tests of the various parameters. Here’s a simple wrapper for the malloc function in a debug library for C++ that I’m working on:

    void *aussie_malloc(int sz)
    {
        // Debug wrapper version: malloc() 
        AUSSIE_DEBUGLIB_TRACE("malloc called");
        AUSSIE_DEBUG_PRINTF("%s: == ENTRY malloc === sz=%d\n", 
             __func__, sz);

        g_aussie_malloc_count++;
        AUSSIE_CHECK(sz != 0, "AUS007", "malloc size is zero");
        AUSSIE_CHECK(sz >= 0, "AUS008", "malloc size is negative");

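        // Call the real malloc. NOTE: this source file must be compiled
        // without the "#define malloc aussie_malloc" intercept active
        // (e.g., #undef malloc), or this call would recurse into the wrapper.
        void *new_v = malloc(sz);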
        if (new_v == NULL) {
                AUSSIE_ERROR("AUS200", "ERROR: malloc failure");
                // Try to keep going?
        } 
        return new_v;
    }

This actually has multiple levels of tests:

Validation of called parameter values.
Detection of memory allocation failure.
Builtin debug tracing macros that can be enabled.

A more advanced version could also attempt to check pointer addresses are valid and have not been previously freed, and a variety of other memory errors. Coming soon!

Example: memset Wrapper Self-Checks

Here’s an example of what you can do in a wrapper function called “memset_wrapper” from one of the Aussie AI projects:

    void *memset_wrapper(void *dest, int val, int sz)  // Wrap memset
    {
        if (dest == NULL) {
                aussie_assert2(dest != NULL, "memset null dest");
                return NULL;
        }
        if (sz < 0) {
                // Why we have "int sz" not "size_t sz" above
                aussie_assert2(sz >= 0, "memset size negative");
                return dest;  // fail
        }
        if (sz == 0) {
                aussie_assert2(sz != 0, "memset zero size (reorder params?)");
                return dest;
        }
        if ((size_t)sz <= sizeof(void*)) {
                // Suspiciously small size (cast avoids a signed/unsigned comparison)
                aussie_assert2((size_t)sz > sizeof(void*), "memset with sizeof array parameter?");
                // Allow it, keep going
        }
        if (val >= 256) {
                aussie_assert2(val < 256, "memset value not char");
                return dest; // fail
        }
        void* sret = ::memset(dest, val, sz);  // Call real one!
        return sret;
    }

It’s a judgement call whether or not to leave the debug wrappers in place, in the vein of speed versus safety. Do you prefer sprinting to make your flight, or arriving two hours early? Here’s one way to remove the wrapper functions completely with the preprocessor if you’ve been manually changing them to the wrapper names:

    #if DEBUG
        // Debug mode, leave wrappers..
    #else // Production (remove them all)
        #define memset_wrapper memset
        //... others
    #endif

Compile-time self-testing macro wrappers

Here’s an idea for combining the runtime debug wrapper function idea with some additional compile-time tests using static_assert.

    #define memset_wrapper(addr,ch,n) ( \
        static_assert(n != 0), \
        static_assert(ch == 0), \
        memset_wrapper((addr),(ch),(n),__FILE__,__LINE__,__func__))

The idea is interesting, but it doesn’t really work, for two reasons. First, static_assert is a declaration, not an expression, so it cannot appear as an operand of the comma operator inside a macro like this. Second, not all calls to the memset wrapper will have constant arguments for the character or the number of bytes, so the static_assert checks would fail to compile in those cases. You could use standard assertions, but this adds runtime cost. Note that it’s a self-referential macro, but the C++ preprocessor guarantees it only gets expanded once (i.e., there’s no infinite recursion of preprocessor macros).
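
One alternative that does compile, at the price of a different call syntax, is to pass the size as a template argument so that it is always a compile-time constant. This is only a sketch, and memset_fixed and memset_array are invented names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper: the byte count is a template argument, so it is
// always a constant expression and static_assert is legal here.
template <std::size_t N>
void* memset_fixed(void* dest, int ch)
{
    static_assert(N != 0, "memset of zero bytes (arguments reordered?)");
    if (dest == nullptr) {
        return nullptr;               // runtime check is still needed
    }
    return std::memset(dest, ch, N);
}

// Array overload: the size is deduced from the array type, which avoids
// the classic mistake of passing sizeof(pointer) instead of the array size.
template <typename T, std::size_t N>
void* memset_array(T (&arr)[N], int ch)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memset is only safe on trivially copyable types");
    return memset_fixed<sizeof(T) * N>(arr, ch);
}
```

A call such as memset_fixed<0>(buf, 0) now fails at compile time, while calls with a runtime size would still go through the ordinary runtime wrapper.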

Preventing Null Pointer Dereferences

A huge number of null pointer dereferences can be prevented and detected by wrapping the many standard library functions. Here’s a simple example of the intercept:

    #define strcmp strcmp_safe

And here’s the wrapper function with parameter validation checks that prevent null pointer crashes:

    int strcmp_safe(const char* s1, const char* s2)
    {
        // NOTE: compile this file without the strcmp macro intercept
        // (e.g., #undef strcmp), or the final call below would recurse.
        if (!s1 && s2) {
            AUSSIE_ASSERT(s1);
            return -1;
        }
        else if (s1 && !s2) {
            AUSSIE_ASSERT(s2);
            return 1;
        }
        else if (!s1 && !s2) {
            AUSSIE_ASSERT(s1);
            AUSSIE_ASSERT(s2);
            return 0;  // Equal-ish
        }
        else {
            // Both non-null
            return strcmp(s1, s2);
        }
        // NOTREACHED
    }

Unfortunately, detecting null pointer usage requires compiler changes for direct pointer or array operations, such as:

    *ptr = 0;
    ptr->value = 0;
    arr[0] = 0;

Generalized Self-Testing Debug Wrappers

The technique of debug wrappers can be extended to offer a variety of self-testing and debug capabilities. The types of messages that can be emitted by debug wrappers include:

Input parameter validation failures (e.g., non-null)
Failure returns (e.g., allocation failures)
Common error usages
Informational tracing messages
Statistical tracking (e.g., call counts)

Personally, I’ve built some quite extensive debug wrapping layers over the years. It always surprises me that this can be beneficial, because it would be easier if it were done fully by the standard libraries of compiler vendors. The level of debugging checks has been increasing significantly (e.g., in GCC), but I still find value in adding my own wrappers.

There are several major areas where you can really self-check for a lot of problems with runtime debug wrappers:

File operations
Memory allocation
String operations

Wrapping Math Functions

It might seem that it’s not worth wrapping the mathematical functions, as their failures are rare. However, these are some things you can check:

errno is already set on entry.
errno is set afterwards (if not already set).
Function returns NaN.
Function returns negative zero.

Most of these can be implemented as a single integer test (e.g., errno) or as a bitwise trick on the underlying floating-point representation (e.g., convert float to an unsigned). There are also builtin library functions to detect floating-point categories such as NaN.

In this way, a set of math wrapper functions automates a lot of your detection of common issues. These aren’t as common as memory issues, but it’s yet another way to move towards a safe C++ implementation.
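
Here is a sketch of how the four checks listed above might fit into one wrapper, using sqrt as the example. The MathCheck struct and sqrt_checked are hypothetical names, and whether errno is actually set by a math function depends on the platform and compiler flags, so the errno results should be treated as hints only.

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>

// Hypothetical result record for one wrapped math call.
struct MathCheck {
    bool errno_set_on_entry = false;       // a stale error from earlier code
    bool errno_set_by_call = false;        // this call reported an error
    bool result_is_nan = false;
    bool result_is_negative_zero = false;
};

double sqrt_checked(double x, MathCheck& out)
{
    out = MathCheck{};
    int saved = errno;
    out.errno_set_on_entry = (saved != 0);
    errno = 0;                             // so a new error is distinguishable
    double r = std::sqrt(x);
    out.errno_set_by_call = (errno != 0);
    if (errno == 0) {
        errno = saved;                     // do not hide the earlier error
    }
    out.result_is_nan = std::isnan(r);
    out.result_is_negative_zero = (r == 0.0 && std::signbit(r));
    return r;
}
```

In a real wrapper layer, each flag would probably feed a log message or counter rather than being returned to the caller, so that the wrapped function keeps the same signature as the original.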

Wrapping File Operations

Many of the file operations are done via function calls, and are a good candidate for debug wrapper functions. Examples of standard C and POSIX functions that you could intercept include:

fopen, fread, fwrite, fseek, fclose
open, read, write, creat, close

Note that intercepting fstream operations in this way is not workable. They don’t use a function-like syntax for file operations.

Using the approach of wrapping file operations can add error detection, error prevention, and tracing capabilities to these operations. Undefined situations and errors that can be auto-detected include:

File did not open (i.e., trace this).
Read or write failed or was truncated.
Read and write without intervening seek operation.
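
As a minimal sketch of the file-wrapper idea, the following tracks open FILE pointers in a set so that a failed open, a null file pointer, and a double fclose can all be reported instead of causing undefined behavior. The function names and the global set are assumptions made for this example, and the code is not thread-safe.

```cpp
#include <cassert>
#include <cstdio>
#include <unordered_set>

// All names below are hypothetical; not thread-safe as written.
static std::unordered_set<std::FILE*> g_open_files;
static int g_file_errors = 0;

// Record a stream that was opened elsewhere (e.g., by tmpfile).
void track_open_file(std::FILE* fp)
{
    if (fp != nullptr) g_open_files.insert(fp);
}

std::FILE* fopen_checked(const char* path, const char* mode)
{
    if (path == nullptr || mode == nullptr) {
        std::fprintf(stderr, "fopen_checked: null argument\n");
        ++g_file_errors;
        return nullptr;
    }
    std::FILE* fp = std::fopen(path, mode);
    if (fp == nullptr) {
        std::fprintf(stderr, "fopen_checked: cannot open '%s'\n", path);
        ++g_file_errors;
    }
    track_open_file(fp);
    return fp;
}

int fclose_checked(std::FILE* fp)
{
    if (fp == nullptr) {
        std::fprintf(stderr, "fclose_checked: null file pointer\n");
        ++g_file_errors;
        return EOF;
    }
    if (g_open_files.erase(fp) == 0) {
        // Not currently open: a double fclose, or a stream we never saw.
        std::fprintf(stderr, "fclose_checked: file not open (double close?)\n");
        ++g_file_errors;
        return EOF;   // prevented: the real fclose is never called
    }
    return std::fclose(fp);
}
```

The same table could be extended to remember the last operation on each stream, which is one way to flag a read followed directly by a write without an intervening seek.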

Link-Time Interception: new and delete

Macro interception works for standard C++ library functions like malloc and free, but you can’t macro-intercept the new and delete operators, because they don’t use function-like syntax. Fortunately, you can use link-time interception of these operators instead, simply by defining your own versions. This is a standard feature of C++ that has been long supported.

Note that defining class-level versions of the new and delete operators is a well-known optimization for a class to manage its own memory allocation pool, but this isn’t what we’re doing here. Instead, this link-time interception requires defining four operators at global scope:

new
new[]
delete
delete[]

You cannot use the real new and delete inside these link-time wrappers. They would get intercepted again, and you’d have infinite stack recursion.

However, you can call malloc and free instead, assuming they aren’t also macro-intercepted in this code. Here’s the simplest versions:

    void * operator new(size_t n)
    {
        return malloc(n);        
    }

    void* operator new[](size_t n)
    {
        return malloc(n);        
    }

    void operator delete(void* v)
    {
        free(v);
    }

    void operator delete[](void* v)
    {
        free(v);
    }

This method of link-time interception has been an officially sanctioned standard C++ language feature since the 1990s. Be careful, though, that the return types and parameter types are precise, using size_t and void*, as you cannot use int or char*. Also, declaring these functions as inline gets a compilation warning, and is presumably ignored by the compiler, as this requires link-time interception. Note that these simplest versions return NULL if malloc fails, whereas the C++ standard expects a replacement operator new to throw std::bad_alloc rather than return a null pointer.

Here’s an example of some ideas of some basic possible checks:

    #define AUSSIE_ERROR(mesg, ...) \
        ( printf((mesg) __VA_OPT__(,) __VA_ARGS__ ) )

    void * operator new(size_t n)
    {
        if (n == 0) {
            AUSSIE_ERROR("new operator size is zero\n");
        }
        void *v = malloc(n);        
        if (v == NULL) {
            AUSSIE_ERROR("new operator: allocation failure\n");
        }        
        return v;
    }

Note that you can’t use __FILE__ or __LINE__ as these are link-time intercepts, not macros. Maybe you could use std::stacktrace (C++23) instead, but I have my doubts.
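
Link-time intercepts are also a convenient place for statistical tracking such as call counts. The sketch below is one possible shape, with invented counter names; unlike the simplest versions, it throws std::bad_alloc on failure, which is what the standard expects of a replacement operator new, although a project with a strict no-exceptions policy would need a different choice here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical global statistics, updated by the intercepts below.
std::atomic<std::size_t> g_new_calls{0};
std::atomic<std::size_t> g_delete_calls{0};

static void* counted_alloc(std::size_t n)
{
    ++g_new_calls;
    if (n == 0) n = 1;                 // always hand back a distinct block
    void* v = std::malloc(n);          // malloc, not new: avoids recursion
    if (v == nullptr) throw std::bad_alloc();
    return v;
}

static void counted_free(void* v) noexcept
{
    if (v != nullptr) ++g_delete_calls;   // deleting null is legal; skip it
    std::free(v);
}

// The four global replacements, resolved at link time.
void* operator new(std::size_t n)   { return counted_alloc(n); }
void* operator new[](std::size_t n) { return counted_alloc(n); }
void operator delete(void* v) noexcept   { counted_free(v); }
void operator delete[](void* v) noexcept { counted_free(v); }
```

Comparing the two counters at program exit gives a crude leak indicator, bearing in mind that the C++ runtime and standard library also allocate through these operators.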

Destructor Problems with Debug Wrappers

The use of a debug wrapper library can be very valuable. However, there are a few problematic areas:

Destructors should not throw an exception.
Destructors should not call exit or abort.
Destructor issues with assert.

Any of these happenstances can trigger an infinite loop situation. Exception handlers can trigger destructors, which in turn trigger exceptions again. Exiting or aborting in a destructor may trigger global variable destruction, which calls the same destructor, which tries to exit or abort again. Be careful of the system assert macro inside destructors, because it’s a hidden call to abort if it fails.

Although these infinite-looping problems are serious, it would seem that these are minor issues to add to your coding standards: don’t do these things inside a destructor. However, we’re talking about debug wrapper libraries, rather than explicit calls, and destructors often have need to:

De-allocate memory
Close files

Both of these tasks are often intercepted by debug wrapper libraries, whether macro-intercepted or at link-time. Hence, the issue we have is that any failure detected by the debug wrapper code may trigger one of the above disallowed calls, depending on our policy for handling a detected failure.

Unfortunately, I’m not aware of an API that checks if “I’m running a destructor” in C++. Hence, it’s hard for the debug library to address this issue itself. There are a few mitigations you can use in coding destructors:

Recursive re-entry detection inside destructors using a static local variable.
Modify the debug library’s error handling flags on entry and exit of a destructor.
Have global flags called “I’m exiting” or “I’m failing” that are checked by all your destructors, in which case it should probably do nothing.

Alternatively, you could manage your own global flag “I’m in a destructor” in every destructor function. More accurately, this is not a flag, but a counter of destructor depth. This flag or counter is then checked by the debug library to check if it’s in a destructor before it throws an exception, exits, or aborts.
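
The destructor-depth counter could be maintained by a small guard object, roughly as sketched here. All of the names (g_destructor_depth, DestructorScope, debug_report_failure, Buffer) are made up for illustration, and the counter is thread-local on the assumption that each thread unwinds independently.

```cpp
#include <cassert>
#include <cstdio>

// All names below are hypothetical.
thread_local int g_destructor_depth = 0;   // a counter, not a flag
static int g_deferred_errors = 0;

// Guard object: declare one as the first statement of each destructor.
struct DestructorScope {
    DestructorScope() noexcept { ++g_destructor_depth; }
    ~DestructorScope() noexcept { --g_destructor_depth; }
    DestructorScope(const DestructorScope&) = delete;
    DestructorScope& operator=(const DestructorScope&) = delete;
};

// What a debug library might call on a detected failure.
// Returns true if the caller may escalate (throw, exit, abort),
// or false if the failure was only logged because a destructor is running.
bool debug_report_failure(const char* msg)
{
    std::fprintf(stderr, "DEBUG LIB: %s\n", msg);
    if (g_destructor_depth > 0) {
        ++g_deferred_errors;
        return false;
    }
    return true;
}

class Buffer {
public:
    ~Buffer() {
        DestructorScope scope;
        // Stand-in for an intercepted free or fclose that detects a problem.
        debug_report_failure("simulated failure during cleanup");
    }
};
```

One limitation of this sketch is that the guard goes out of scope at the end of the destructor body, before the destructors of data members run, so member cleanup is only covered if those classes use the guard as well.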

But I’m not sure what the debug library should do instead. Maybe it can itself set a global flag saying “I want to exit soon” and then it will later detect this flag is set on the next intercepted call to the debug library, provided that it’s not still inside a destructor. Perhaps your application’s main processing loop could regularly check with the debug library whether it wants to quit, by just checking that global variable often.

Ugh! None of that sounds workable.

A better plan is probably that your debugging library wrapper functions should never throw an exception, exit, abort, or use the builtin system assert function, because it can’t ever be sure it’s not inside a destructor. Instead, report errors and log errors in another way, but try to keep going, which is a good idea anyway.
