Aussie AI Blog

DIY Preventive C++ Memory Safety

  • November 2nd, 2024
  • by David Spuler, Ph.D.

Prevention Versus Detection

This article examines which DIY memory safety techniques can prevent a memory error from occurring, or prevent it from being used in a security exploit. Many other techniques "detect" memory errors; these are valuable, but they do not directly prevent a memory glitch in production. Instead, they improve quality indirectly by finding bugs, which can then be fixed.

The list of memory errors to consider for prevention includes:

  • Uninitialized memory usage (heap and stack)
  • Null pointer dereference
  • Buffer overflows (reads and writes)
  • Buffer underflows (reads and writes)
  • Use-after-free
  • Double-deallocation
  • Mismatched allocation and deallocation
  • Standard library container memory issues
  • Standard library function problems

Some of the standard library issues include:

  • Unsafe string functions — e.g., strcpy, strcat, sprintf.
  • Detecting when the "safe" string functions truncate the text (e.g., snprintf, strcpy_s).
  • strncpy is a special problematic case (it leaves the destination without a null terminator when the source is too long), but it is easily fixed by a wrapper.
  • File pointer problems and file operation sequence errors (e.g., null file pointers, double-fclose).
  • Removing an object from a container in the middle of an iteration (iterator invalidation).
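
As a minimal sketch of the strncpy and truncation items above, the following shows one possible wrapper that always null-terminates and reports truncation, plus a helper that interprets the return value of snprintf. The names strncpy_safe and snprintf_truncated are hypothetical, chosen for illustration only, and the policy of treating a null source as an empty string is an assumption.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical wrapper: copies at most destsize-1 characters and always
// null-terminates. Returns true only if the whole source string fitted.
bool strncpy_safe(char* dest, const char* src, std::size_t destsize)
{
    if (!dest || destsize == 0) return false;  // Nowhere to write
    if (!src) {                                // Assumed policy: null means empty
        dest[0] = '\0';
        return false;
    }
    std::size_t len = std::strlen(src);
    std::size_t ncopy = (len < destsize) ? len : destsize - 1;
    std::memcpy(dest, src, ncopy);
    dest[ncopy] = '\0';      // strncpy omits this when src is too long
    return len < destsize;   // false means the text was truncated
}

// Hypothetical helper: snprintf returns the length the full output would
// have needed, so a value >= destsize (or negative) signals a problem.
bool snprintf_truncated(int ret, std::size_t destsize)
{
    return ret < 0 || static_cast<std::size_t>(ret) >= destsize;
}
```

A caller would typically check the boolean result and log or assert on truncation, rather than silently continuing with a shortened string.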

The DIY memory techniques that we can consider include:

  • Memory sanitizer tools
  • Macro intercepts (e.g., malloc and free)
  • Linker intercepts (e.g., new and delete)
  • Initialization methods
  • Canary values
  • Redzone memory regions
  • Memory poisoning
  • Delayed-deallocation
  • Safe wrapper functions
  • Smart wrapper classes

Memory Sanitizer Tools

The most obvious method of prevention of memory problems is to use runtime memory checkers and sanitizers. Examples include:

  • Valgrind (Linux)
  • AddressSanitizer (GCC, Clang, MSVC)
  • compute-sanitizer (CUDA C++)

These tools will detect a vast range of memory errors in the stack and heap while the program is running under test. Examples include uninitialized memory usage, array bounds overflows, and use-after-free errors.

But these tools are simply too slow to use in production. They are valuable in terms of indirectly improving memory safety because glitches are detected early and fixed by programmers. But they really don't solve the prevention problem.

Preventing Null Pointer Dereferences

A huge number of null pointer dereferences can be prevented and detected by wrapping the many standard library functions. Here's a simple example of the intercept:

    #define strcmp strcmp_safe

And here's the wrapper function with parameter validation checks that prevent null pointer crashes:

    // NOTE: Compile this wrapper without the strcmp macro intercept in effect
    // (e.g., #undef strcmp first), or the call to strcmp below will recurse.
    int strcmp_safe(const char* s1, const char* s2)
    {
        if (!s1 && s2) {
            AUSSIE_ASSERT(s1);
            return -1;
        }
        else if (s1 && !s2) {
            AUSSIE_ASSERT(s2);
            return 1;
        }
        else if (!s1 && !s2) {
            AUSSIE_ASSERT(s1);
            AUSSIE_ASSERT(s2);
            return 0;  // Equal-ish
        }
        else {
            // Both non-null
            return strcmp(s1, s2);
        }
        // NOTREACHED
    }

Unfortunately, detecting null pointer usage requires compiler changes for direct pointer or array operations, such as:

    *ptr = 0;
    ptr->value = 0;
    arr[0] = 0;
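
Short of compiler changes, one partial DIY workaround is a smart wrapper class that checks for null on every dereference. The sketch below is only an illustration: the class name CheckedPtr is hypothetical, and throwing an exception on failure is an assumed policy (an assertion macro or a logging call would work equally well). It also requires changing the declared type of the pointer, so it is more invasive than a macro intercept.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical non-owning wrapper: validates the pointer before each use.
template <typename T>
class CheckedPtr {
public:
    explicit CheckedPtr(T* p = nullptr) : ptr_(p) {}

    T& operator*() const { return *check(); }
    T* operator->() const { return check(); }
    T& operator[](std::size_t i) const { return check()[i]; }
    explicit operator bool() const { return ptr_ != nullptr; }

private:
    // Assumed failure policy: throw instead of crashing on a null pointer.
    T* check() const
    {
        if (!ptr_) throw std::runtime_error("null pointer dereference");
        return ptr_;
    }
    T* ptr_;
};
```

With this wrapper, the three operations above become checked operations on a CheckedPtr object instead of raw pointer accesses.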

Preventing Memory Initialization Errors

One of the simplest DIY fixes is to avoid uninitialized memory errors in C++ by initializing memory ourselves. To do this, we need to use these techniques:

  • Intercept malloc with macros (or linking) and replace with a wrapper that uses calloc (or uses memset to zero).
  • Intercept other heap allocation primitives (e.g., strdup, realloc).
  • Link-time intercept new and change to calloc (also requires matching linker intercepts of delete to change to free).
  • Intercept alloca dynamic stack memory function (and use memset to zero memory).
  • Use smart buffer wrapper classes to initialize local buffer variables on the stack (i.e., function local variables).

A whole class of memory errors disappears!
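
As a rough sketch of the first technique in the list, a macro intercept of malloc might look like the following. The wrapper name malloc_zeroed is hypothetical, and the sketch assumes the macro lives in a project header that is included after all standard library headers.

```cpp
#include <stdlib.h>

// Hypothetical wrapper: same interface as malloc, but returns zeroed memory.
inline void* malloc_zeroed(size_t size)
{
    return calloc(1, size);  // calloc zero-fills the block
}

// Macro intercept: later calls to malloc(n) in this translation unit
// are rewritten to malloc_zeroed(n). Memory is still released with free.
#define malloc(n) malloc_zeroed(n)
```

One limitation of this function-like macro is that qualified calls such as std::malloc(n) would be rewritten to std::malloc_zeroed(n) and fail to compile, so those call sites would need editing or a link-time intercept instead.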

Most of the above techniques require minimal code changes to existing code, such as to add a header file for macro intercepts. Note that C++ already zeroes all memory for global variables and local static variables, without needing any special changes.

The most invasive of the above methods is adding safety class wrappers for stack buffers, but there aren't really any intercepts possible in C++ for stack memory. Other possible solutions for stack buffers would involve changes to the code itself, such as using heap memory instead, or changing to dynamic alloca stack memory (which can be macro-intercepted).
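
A smart buffer wrapper class for stack buffers could be as small as the following sketch. The class name SafeBuffer is hypothetical, and the interface shown here (data and size) is an assumption about what such a wrapper would need.

```cpp
#include <cstddef>

// Hypothetical wrapper: a fixed-size char buffer that is always zeroed
// when it is created, wherever it is declared (including on the stack).
template <std::size_t N>
class SafeBuffer {
public:
    char* data() { return data_; }
    const char* data() const { return data_; }
    static constexpr std::size_t size() { return N; }

private:
    char data_[N] = {};  // Default member initializer zero-fills the array
};
```

A declaration such as "char buf[100];" would then be changed to "SafeBuffer<100> buf;", with buf.data() and buf.size() passed to functions that previously took the raw array and its size.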

Overall, there are only a few exceptions to what memory we can initialize with DIY techniques, in that compiler changes are probably needed for:

  • Full stack frame initialization to zero on function entry.
  • Initialization of small local variables on the stack (without extra class wrapper variables).
  • Register variable initialization (also related to local variables).

Mismatched Allocation and Deallocation

Mismatches between the various types of allocation and deallocation cause undefined behavior, and can even crash. In some cases, they won't crash, but will fail to run the correct constructors or destructors. The correct matches are:

  • malloc, calloc, strdup → free
  • new → delete
  • new[] → delete[]

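Any crossover between any of the three categories is technically a failure. However, most of these are easily resolved by DIY memory primitive wrappers. By link-time intercepting the four new and delete primitives (new, new[], delete, and delete[]), everything can be converted to malloc/calloc and free. In this way, a mismatch no longer confuses the allocator, because every block comes from and returns to the same heap. Constructors and destructors can still be skipped, however, and mixing delete with delete[] on arrays of objects that have destructors may still fail, because compilers typically store an element count in front of such arrays.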
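
The link-time intercept described above can be sketched by defining replacement versions of the global allocation operators, which C++ permits a program to supply. This is a minimal illustration rather than a complete implementation: it routes everything through calloc and free, and it leaves the aligned and nothrow overloads at their defaults.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Replacement global operators: all allocations come from calloc (so the
// memory is also zeroed) and all deallocations go to free.
void* operator new(std::size_t size)
{
    if (size == 0) size = 1;         // calloc(1, 0) may return null
    void* p = std::calloc(1, size);
    if (!p) throw std::bad_alloc();  // Keep the standard failure behavior
    return p;
}

void* operator new[](std::size_t size) { return operator new(size); }

void operator delete(void* p) noexcept { std::free(p); }
void operator delete[](void* p) noexcept { std::free(p); }

// Sized deallocation overloads (C++14) forward to the same free call.
void operator delete(void* p, std::size_t) noexcept { std::free(p); }
void operator delete[](void* p, std::size_t) noexcept { std::free(p); }
```

Because these definitions replace the defaults for the whole program, they only need to appear in one source file that is linked into the executable.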

