Aussie AI
Chapter 31. Preventive Memory Safety
-
Book Excerpt from "Advanced C++ Memory Techniques: Efficiency and Safety"
-
by David Spuler, Ph.D.
Prevention Versus Detection
This chapter examines which DIY memory safety techniques can prevent an error from occurring, or prevent a security exploit from being used. There are many other techniques to “detect” a memory error; these are valuable, but they do not directly prevent a memory glitch in production. They improve quality indirectly by finding bugs, which can then be fixed.
The list of memory errors to consider for prevention includes:
- Uninitialized memory usage (heap and stack)
- Null pointer dereference
- Buffer overflows (reads and writes)
- Buffer underflows (reads and writes)
- Use-after-free
- Double-deallocation
- Mismatched allocation and deallocation
- Standard library container memory issues
- Standard library function problems
Some of the standard library issues include:
- Unsafe string functions (e.g., strcpy, strcat, sprintf).
- Detecting when the “safe” string functions truncate the text (e.g., snprintf, strcpy_s).
- strncpy is a special problematic case that is easily fixed by a wrapper.
- File pointer problems and file operation sequence errors (e.g., null file pointers, double-fclose).
- Removing an object from a container in the middle of an iterator.
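As a rough illustration of the strncpy point, here is a minimal sketch of such a wrapper. The name strncpy_safe and its boolean return convention are assumptions made for this example, not part of any standard or of the author's library. It always null-terminates the destination and reports truncation, which plain strncpy does not do.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper (name and return convention are illustrative only).
// Copies src into dest, always null-terminating dest.
// Returns true if the whole string fit; false on truncation or bad arguments.
bool strncpy_safe(char* dest, const char* src, std::size_t destsize)
{
    if (dest == nullptr || destsize == 0) {
        return false;               // nowhere to write
    }
    if (src == nullptr) {
        dest[0] = '\0';             // leave dest as a valid empty string
        return false;
    }
    std::size_t len = std::strlen(src);
    std::size_t ncopy = (len < destsize) ? len : destsize - 1;
    std::memcpy(dest, src, ncopy);
    dest[ncopy] = '\0';             // strncpy would skip this on truncation
    return len < destsize;
}
```

Unlike strncpy, this sketch does not zero-pad the remainder of the buffer; that is a design choice for speed and could be changed with a memset.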
The DIY memory techniques that we can consider include:
- Memory sanitizer tools
- Macro intercepts (e.g., malloc and free)
- Linker intercepts (e.g., new and delete)
- Initialization methods
- Canary values
- Redzone memory regions
- Memory poisoning
- Delayed-deallocation
- Safe wrapper functions
- Smart wrapper classes
Memory Sanitizer Tools
The most obvious method of prevention of memory problems is to use runtime memory checkers and sanitizers. Examples include:
- Valgrind (Linux)
- AddressSanitizer (GCC and Clang)
- compute-sanitizer (CUDA C++)
These tools will detect and prevent a vast range of memory errors in the stack and heap. Examples include uninitialized memory usage, array bounds overflows, and use-after-free errors.
But these tools are simply too slow to use in production. They are valuable in terms of indirectly improving memory safety because glitches are detected early and fixed by programmers. But they really don’t solve the prevention problem.
Preventing Memory Initialization Errors
One of the simplest DIY fixes is to avoid uninitialized memory errors in C++ by initializing memory ourselves. To do this, we need to use these techniques:
- Intercept malloc with macros (or linking) and replace with a wrapper that uses calloc (or uses memset to zero).
- Intercept other heap allocation primitives (e.g., strdup, realloc).
- Link-time intercept new and change to calloc (also requires matching linker intercepts of delete to change to free).
- Intercept alloca dynamic stack memory function (and use memset to zero memory).
- Use smart buffer wrapper classes to initialize local buffer variables on the stack (i.e., function local variables).
A whole class of memory errors disappears!
Most of the above techniques require minimal code changes to existing code, such as to
add a header file for macro intercepts.
Note that C++ already zeroes all memory for global variables and local static variables,
without needing any special changes.
The most invasive of the above methods is adding safety class wrappers for stack buffers,
but there’s not really any intercepts possible in C++ for stack memory.
Other possible solutions for stack buffers would involve changes to the code itself,
such as to use heap memory instead,
or changing to dynamic alloca stack memory (which can be macro-intercepted).
Overall, there’s only a few exceptions to what memory we can initialize with DIY techniques, in that compiler changes are probably needed for:
- Full stack frame initialization to zero on function entry.
- Initialization of small local variables on the stack (without extra class wrapper variables).
- Register variable initialization (also related to local variables).
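Two of the techniques from this section, a zeroing replacement for malloc and a self-initializing stack buffer class, might look something like the following sketch. The names zeroing_malloc and ZeroedBuffer are hypothetical, chosen only for this illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical zeroing allocator: same interface as malloc,
// but the returned block is always zero-filled (via calloc).
void* zeroing_malloc(std::size_t sz)
{
    return std::calloc(1, sz);
}

// A macro intercept would normally live in a project header that is
// included AFTER the standard headers, for example:
//     #define malloc zeroing_malloc
// It is left as a comment here because an object-like macro would also
// rewrite qualified calls such as std::malloc.

// Hypothetical smart buffer wrapper for stack buffers:
// the constructor zeroes the storage, so it is never uninitialized.
template <std::size_t N>
struct ZeroedBuffer {
    char data[N];
    ZeroedBuffer() { std::memset(data, 0, N); }
    static constexpr std::size_t size() { return N; }
};
```

A local declaration such as `ZeroedBuffer<256> buf;` then replaces `char buf[256];`, with `buf.data` passed wherever the raw array was used.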
Mismatched Allocation and Deallocation
Mismatches between the various types of allocation and deallocation cause undefined behavior, and can even crash. In some cases, they won’t crash, but will fail to run the correct constructors or destructors. The correct matches are:
- malloc, calloc, strdup: release with free
- new: release with delete
- new[]: release with delete[]
Any crossover between any of the three categories is technically a failure.
However, these are easily resolved by DIY memory primitive wrappers.
By using link-time intercepting of the four new and delete primitives,
everything can be converted to malloc/calloc and free.
In this way, there won’t be any crashes anymore,
even if this error occurs.
However, note that many of these failures are still higher-level errors even if they don’t crash, because they
won’t correctly run all the destructors if non-scalar objects are being deallocated.
Why Use Wrapper Functions?
The idea of debug wrapper functions is to fill a small gap in the self-checking available in the C++ ecosystem. There are two types of self-testing that happen when you run C++ programs:
- Self-tests such as error return checks, assertions, and wrappers in the main C++ code.
- valgrind or sanitizer detection of numerous run-time errors.
Both of these methods are highly capable and will catch a lot of bugs. To optimize your use of these capabilities in debugging, you should:
- Test all error return codes (e.g., a fancy macro method), and
- Run valgrind and/or other sanitizers on lots of unit tests and regression tests in your CI/CD approval process, or, when that gets too slow, at least in the nightly builds.
But this is not perfection! There are two main reasons that some bugs will be missed:
- Self-testing doesn’t detect all the bugs.
- You have to remember to run sanitizers on your code.
Okay, so I’m joking about “remembering” to run the debug tests, because you’ve probably got them running automatically in your build. But there’s some real cases where the application won’t ever be run in debug mode:
- Many internal failures trigger no visible symptoms for users (silent failures).
- Customers cannot run valgrind on their premises (unless you ask nicely).
- Your website “customers” also cannot run it on the website backends.
- Some applications are too costly to re-run just to debug an obscure error (I’m looking at you, AI training).
Hence, in the first case, there’s bugs missed in total silence, never to be fixed. And in the latter cases, there’s a complex level of indirection between the failure occurring and the C++ programmer trying to reproduce it in the test lab. It’s much easier if your application self-diagnoses the error!
Fast Debug Wrapper Code
But it’s too slow, I hear you say.
Running the code with valgrind or other runtime memory checkers is much slower than without.
We can’t ship an executable with so much debug instrumentation
that it runs that much slower for customers.
You’re not wrong, and it’s the age-old quandary about whether to ship testing code. Fortunately, there are a few solutions:
- Use fast self-testing tricks like magic numbers in memory.
- Have a command-line flag or config option that turns debug tests on and off at runtime.
- Have “fast” and “debug” versions of your executable (e.g., ship both to beta customers).
At the very least, you could have a lot of your internal C++ code development and QA testing done on the debug wrapper version that self-detects and reports internal errors.
As the first point states, there are “layers” of debugging wrappers (also ogres, like Shrek). You can define very fast or very slow types of self-checking code into debug wrapper code. These self-tests can be as simple as parameter null tests or as complex as detecting memory stomp overwrites with your own custom code. In approximate order of time cost, here are some ideas:
- Parameter basic validation (e.g., null pointer tests).
- Magic values added to the initial bytes of uninitialized and freed memory blocks.
- Magic values stored in every byte of these blocks.
- Tracking 1 or 2 (or 3) of the most recently allocated/freed addresses.
- Hash tables to track addresses of every allocated or freed memory block.
I’ve actually done all of the above for a debug library in standard C++. Make sure you check the Aussie AI website to see when it gets released.
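To make one of the cheaper layers concrete, here is a minimal sketch of tracking the three most recently freed addresses to catch a double-free. All names here (tracked_malloc, tracked_free, and the globals) are hypothetical and are not taken from the library mentioned above. The sketch is single-threaded and only catches a repeat free that happens soon after the first one.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// Hypothetical names throughout; a sketch, not a real debug library.
static const int RECENT_FREED_MAX = 3;
static std::uintptr_t g_recent_freed[RECENT_FREED_MAX] = { 0, 0, 0 };
static int g_recent_next = 0;   // next slot to overwrite (ring buffer)

void* tracked_malloc(std::size_t sz)
{
    void* p = std::malloc(sz);
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    // The allocator may legitimately hand back a recently freed address,
    // so remove it from the list to avoid a false double-free report.
    for (int i = 0; i < RECENT_FREED_MAX; i++) {
        if (p != nullptr && g_recent_freed[i] == addr) g_recent_freed[i] = 0;
    }
    return p;
}

// Returns false (and does NOT call free) if a double-free is detected.
bool tracked_free(void* p)
{
    if (p == nullptr) return true;   // free(NULL) is harmless
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    for (int i = 0; i < RECENT_FREED_MAX; i++) {
        if (g_recent_freed[i] == addr) {
            std::fprintf(stderr, "double free detected: %#llx\n",
                         static_cast<unsigned long long>(addr));
            return false;            // prevention: skip the second free
        }
    }
    g_recent_freed[g_recent_next] = addr;
    g_recent_next = (g_recent_next + 1) % RECENT_FREED_MAX;
    std::free(p);
    return true;
}
```

The cost per call is a three-element scan, which is why this sits well above the hash-table approach in speed but catches far fewer cases.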
Standard C++ Debug Wrapper Functions
It can be helpful during debugging to wrap several standard C++ library function calls with your own versions, so as to add additional parameter validation and self-checking code. Some of the functions which you might consider wrapping include:
- malloc
- calloc
- memset
- memcpy
- memcmp
If you’re doing string operations in your code, you might consider wrapping these:
- strdup
- strcmp
- strcpy
- sprintf
Note that you can wrap the C++ “new” and “delete” operators
at the linker level
by defining your own versions, but not as macro intercepts.
You can also intercept the “new[]” and “delete[]” array allocation versions
at link-time.
Example: Wrapping malloc
You can use macros to intercept various standard C++ functions.
For example, here’s a simple interception of malloc:
// intercept malloc
#undef malloc
#define malloc aussie_malloc
void*aussie_malloc(int sz);
Once intercepted, the wrapper code can perform simple validation tests of the various parameters.
Here’s a simple wrapper for the malloc function in a debug library
for C++ that I’m working on:
#undef malloc  // This source file must see the real malloc, not the macro intercept

void *aussie_malloc(int sz)
{
    // Debug wrapper version: malloc()
    AUSSIE_DEBUGLIB_TRACE("malloc called");
    AUSSIE_DEBUG_PRINTF("%s: == ENTRY malloc === sz=%d\n",
        __func__, sz);
    g_aussie_malloc_count++;
    AUSSIE_CHECK(sz != 0, "AUS007", "malloc size is zero");
    AUSSIE_CHECK(sz >= 0, "AUS008", "malloc size is negative");
    // Call the real malloc (would recurse forever if the macro were still active here)
    void *new_v = NULL;
    new_v = malloc(sz);
    if (new_v == NULL) {
        AUSSIE_ERROR("AUS200", "ERROR: malloc failure");
        // Try to keep going?
    }
    return new_v;
}
This actually has multiple levels of tests:
- Validation of called parameter values.
- Detection of memory allocation failure.
- Builtin debug tracing macros that can be enabled.
A more advanced version could also attempt to check pointer addresses are valid and have not been previously freed, and a variety of other memory errors. Coming soon!
Example: memset Wrapper Self-Checks
Here’s an example of what you can do in a wrapper function
called “memset_wrapper”
from one of the Aussie AI projects:
void *memset_wrapper(void *dest, int val, int sz) // Wrap memset
{
    if (dest == NULL) {
        aussie_assert2(dest != NULL, "memset null dest");
        return NULL;
    }
    if (sz < 0) {
        // Why we have "int sz" not "size_t sz" above
        aussie_assert2(sz >= 0, "memset size negative");
        return dest; // fail
    }
    if (sz == 0) {
        aussie_assert2(sz != 0, "memset zero size (reorder params?)");
        return dest;
    }
    if ((size_t)sz <= sizeof(void*)) {  // cast is safe: sz > 0 here
        // Suspiciously small size
        aussie_assert2((size_t)sz > sizeof(void*), "memset with sizeof array parameter?");
        // Allow it, keep going
    }
    if (val >= 256) {
        aussie_assert2(val < 256, "memset value not char");
        return dest; // fail
    }
    void* sret = ::memset(dest, val, sz); // Call real one!
    return sret;
}
It’s a judgement call whether or not to leave the debug wrappers in place, in the vein of speed versus safety. Do you prefer sprinting to make your flight, or arriving two hours early? Here’s one way to remove the wrapper functions completely with the preprocessor if you’ve been manually changing them to the wrapper names:
#if DEBUG
// Debug mode, leave wrappers..
#else // Production (remove them all)
#define memset_wrapper memset
//... others
#endif
Compile-time self-testing macro wrappers
Here’s an idea for combining the runtime debug wrapper function idea
with some additional compile-time tests using static_assert.
#define memset_wrapper(addr,ch,n) ( \
static_assert(n != 0), \
static_assert(ch == 0), \
memset_wrapper((addr),(ch),(n),__FILE__,__LINE__,__func__))
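The idea is interesting, but it doesn’t really work. For one thing, static_assert is a declaration
rather than an expression, so it cannot legally appear inside a comma expression like this.
For another, not all calls to the memset wrapper will have constant arguments for the character
or the number of bytes, so the static_assert commands will fail in that case.
You could use standard assertions, but this adds runtime cost.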
Note that it’s a self-referential macro, but that C++ guarantees
it only gets expanded once (i.e., there’s no infinite recursion of preprocessor macros).
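One way to keep compile-time checking for those calls that really do have constant arguments is to pass the constants as template parameters instead of going through a macro. The following is only a sketch under that assumption. The name memset_checked is hypothetical, and the byte-range check shown here is a different test from the `ch == 0` test in the macro above.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical compile-time-checked variant of memset.
// Only usable when the fill byte and the size are compile-time constants.
template <int Ch, std::size_t N>
inline void* memset_checked(void* dest)
{
    static_assert(N != 0, "memset size is zero (reordered parameters?)");
    static_assert(Ch >= 0 && Ch < 256, "memset value is not a byte");
    return std::memset(dest, Ch, N);   // no runtime cost added by the checks
}
```

A call then looks like `memset_checked<0, sizeof(buf)>(buf);`, while calls with runtime-variable sizes would still need the ordinary runtime wrapper.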
Preventing Null Pointer Dereferences
A huge number of null pointer dereferences can be prevented and detected by wrapping the many standard library functions. Here’s a simple example of the intercept:
#define strcmp strcmp_safe
And here’s the wrapper function with parameter validation checks that prevent null pointer crashes:
#undef strcmp  // In the wrapper's own source file, so the real strcmp is called below

int strcmp_safe(const char* s1, const char* s2)
{
    if (!s1 && s2) {
        AUSSIE_ASSERT(s1);
        return -1;
    }
    else if (s1 && !s2) {
        AUSSIE_ASSERT(s2);
        return 1;
    }
    else if (!s1 && !s2) {
        AUSSIE_ASSERT(s1);
        AUSSIE_ASSERT(s2);
        return 0; // Equal-ish
    }
    else {
        // Both non-null
        return strcmp(s1, s2);
    }
    // NOTREACHED
}
Unfortunately, detecting null pointer usage requires compiler changes for direct pointer or array operations, such as:
*ptr = 0;
ptr->value = 0;
arr[0] = 0;
Generalized Self-Testing Debug Wrappers
The technique of debug wrappers can be extended to offer a variety of self-testing and debug capabilities. The types of messages that can be emitted by debug wrappers include:
- Input parameter validation failures (e.g., non-null)
- Failure returns (e.g., allocation failures)
- Common error usages
- Informational tracing messages
- Statistical tracking (e.g., call counts)
Personally, I’ve built some quite extensive debug wrapping layers over the years. It always surprises me that this can be beneficial, because it would be easier if it were done fully by the standard libraries of compiler vendors. The level of debugging checks has been increasing significantly (e.g., in GCC), but I still find value in adding my own wrappers.
There are several major areas where you can really self-check for a lot of problems with runtime debug wrappers:
- File operations
- Memory allocation
- String operations
Wrapping Math Functions
It might seem that it’s not worth wrapping the mathematical functions, as their failures are rare. However, these are some things you can check:
- errno is already set on entry.
- errno is set afterwards (if not already set).
- Function returns NaN.
- Function returns negative zero.
Most of these can be implemented as a single integer test (e.g., errno)
or as a bitwise trick on the underlying floating-point representation (e.g., convert float
to an unsigned).
There are also builtin library functions to detect floating-point categories such as NaN.
In this way, a set of math wrapper functions has automated a lot of your detection of common issues. These aren’t as common as memory issues, but it’s yet another way to move towards a safe C++ implementation.
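As a rough sketch of those four checks, here is what a wrapper around sqrt might look like. The names sqrt_wrapper and g_math_warning_count are hypothetical, and whether the library sets errno for a domain error depends on the platform's math error handling, so the NaN test is the more portable of the two.

```cpp
#include <cerrno>
#include <cmath>
#include <cstdio>

static int g_math_warning_count = 0;   // hypothetical statistics counter

// Hypothetical debug wrapper for sqrt with the checks listed above.
double sqrt_wrapper(double x)
{
    if (errno != 0) {
        std::fprintf(stderr, "sqrt: errno already set on entry (%d)\n", errno);
        g_math_warning_count++;
        errno = 0;   // design choice: clear it so the check below is meaningful
    }
    double r = std::sqrt(x);
    if (errno != 0) {
        std::fprintf(stderr, "sqrt: errno set by call (%d)\n", errno);
        g_math_warning_count++;
    }
    if (std::isnan(r)) {
        std::fprintf(stderr, "sqrt: returned NaN\n");
        g_math_warning_count++;
    }
    if (r == 0.0 && std::signbit(r)) {
        std::fprintf(stderr, "sqrt: returned negative zero\n");
        g_math_warning_count++;
    }
    return r;
}
```

This version uses the library classification functions rather than bitwise tricks, trading a little speed for readability.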
Wrapping File Operations
Many of the file operations are done via function calls, and are a good candidate for debug wrapper functions. Examples of standard C++ functions that you could intercept include:
- fopen, fread, fwrite, fseek, fclose
- open, read, write, creat, close
Note that intercepting fstream operations in this way is not workable.
They don’t use a function-like syntax for file operations.
Using the approach of wrapping file operations can add error detection, error prevention, and tracing capabilities to these operations. Undefined situations and errors that can be auto-detected include:
- File did not open (i.e., trace this).
- Read or write failed or was truncated.
- Read and write without intervening seek operation.
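A minimal sketch of the first of these checks, plus null file pointer prevention on close, could look like this. The names fopen_wrapper, fclose_wrapper, and g_file_error_count are hypothetical.

```cpp
#include <cstdio>

static int g_file_error_count = 0;   // hypothetical statistics counter

// Hypothetical wrapper: validates arguments and traces open failures.
FILE* fopen_wrapper(const char* path, const char* mode)
{
    if (path == nullptr || mode == nullptr) {
        std::fprintf(stderr, "fopen: null argument\n");
        g_file_error_count++;
        return nullptr;              // prevent the crash inside fopen
    }
    FILE* fp = std::fopen(path, mode);
    if (fp == nullptr) {
        std::fprintf(stderr, "fopen: could not open '%s' (mode '%s')\n",
                     path, mode);
        g_file_error_count++;
    }
    return fp;
}

// Hypothetical wrapper: fclose on a null pointer is undefined behavior,
// so report it and return the normal failure code instead.
int fclose_wrapper(FILE* fp)
{
    if (fp == nullptr) {
        std::fprintf(stderr, "fclose: null file pointer\n");
        g_file_error_count++;
        return EOF;
    }
    return std::fclose(fp);
}
```

Detecting a double-fclose or a read-then-write sequence error would need per-file state (for example, a small table keyed by the FILE pointer), which this sketch leaves out.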
Link-Time Interception: new and delete
Macro interception works for standard C++ functions like malloc and free,
but you can’t macro-intercept the new and delete operators,
because they don’t use function-like syntax.
Fortunately, you can use link-time interception of these operators instead,
simply by defining your own versions.
This is a standard feature of C++ that has been long supported.
Note that defining class-level
versions of the new and delete operators is a well-known optimization
for a class to manage its own memory allocation pool,
but this isn’t what we’re doing here.
Instead, this link-time interception
requires defining four operators at global scope:
- new
- new[]
- delete
- delete[]
You cannot use the real
new and delete inside these link-time wrappers.
They would get intercepted again,
and you’d have infinite stack recursion.
However, you can call malloc and free instead,
assuming they aren’t also macro-intercepted in this code.
Here’s the simplest versions:
#include <stdlib.h>  // malloc, free, size_t

void * operator new(size_t n)
{
    return malloc(n);
}
void* operator new[](size_t n)
{
    return malloc(n);
}
void operator delete(void* v)
{
    free(v);
}
void operator delete[](void* v)
{
    free(v);
}
This method of link-time interception
is an officially sanctioned standard C++ language feature since the 1990s.
Be careful, though, that the return types and
parameter types are precise, using size_t and void*,
as you cannot use int or char*.
Also, declaring these functions as inline gets a compilation warning,
and is presumably ignored by the compiler, as this requires
link-time interception.
Here’s an example of some ideas of some basic possible checks:
#include <stdio.h>   // printf
#include <stdlib.h>  // malloc

// Note: __VA_OPT__ requires C++20
#define AUSSIE_ERROR(mesg, ...) \
    ( printf((mesg) __VA_OPT__(,) __VA_ARGS__ ) )

void * operator new(size_t n)
{
    if (n == 0) {
        AUSSIE_ERROR("new operator size is zero\n");
    }
    void *v = malloc(n);
    if (v == NULL) {
        AUSSIE_ERROR("new operator: allocation failure\n");
    }
    return v;
}
Note that you can’t use __FILE__ or __LINE__ as these are link-time intercepts, not macros.
Maybe you could use std::backtrace instead, but I have my doubts.
Destructor Problems with Debug Wrappers
The use of a debug wrapper library can be very valuable. However, there are a few problematic areas:
- Destructors should not throw an exception.
- Destructors should not call exit or abort.
- Destructor issues with assert.
Any of these happenstances can trigger an infinite loop situation.
Exception handlers can trigger destructors, which in turn trigger exceptions again.
Exiting or aborting in a destructor may trigger global variable destruction, which calls
the same destructor, which tries to exit or abort again.
Be careful of the system assert macro inside destructors,
because it’s a hidden call to abort if it fails.
Although these infinite-looping problems are serious, it would seem that these are minor issues to add to your coding standards: don’t do these things inside a destructor. However, we’re talking about debug wrapper libraries, rather than explicit calls, and destructors often have need to:
- De-allocate memory
- Close files
Both of these tasks are often intercepted by debug wrapper libraries, whether macro-intercepted or at link-time. Hence, the issue we have is that any failure detected by the debug wrapper code may trigger one of the above disallowed calls, depending on our policy for handling a detected failure.
Unfortunately, I’m not aware of an API that checks if “I’m running a destructor” in C++. Hence, it’s hard for the debug library to address this issue itself. There are a few mitigations you can use in coding destructors:
- Recursive re-entry detection inside destructors using a static local variable.
- Modify the debug library’s error handling flags on entry and exit of a destructor.
- Have global flags called “I’m exiting” or “I’m failing” that are checked by all your destructors, in which case it should probably do nothing.
Alternatively, you could manage your own global flag “I’m in a destructor” in every destructor function. More accurately, this is not a flag, but a counter of destructor depth. This flag or counter is then checked by the debug library to check if it’s in a destructor before it throws an exception, exits, or aborts.
But I’m not sure what the debug library should do instead? Maybe it can itself set a global flag saying “I want to exit soon” and then it will later detect this flag is set on the next intercepted call to the debug library, provided that it’s not still inside a destructor. Perhaps your application’s main processing loop could regularly check with the debug library whether it wants to quit, by just checking that global variable often.
Ugh! None of that sounds workable.
A better plan is probably that your debugging library wrapper functions should never throw an exception,
exit, abort, or use the builtin system assert function,
because it can’t ever be sure it’s not inside a destructor.
Instead, report errors and log errors in another way, but
try to keep going,
which is a good idea anyway.
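For completeness, the destructor-depth counter mentioned earlier could be sketched as below, combined with the "report but keep going" policy. Every name here (DestructorScope, debuglib_in_destructor, debuglib_report, Widget) is hypothetical. The guard object must be added by hand to each destructor, which is exactly the maintenance burden that makes this approach unattractive.

```cpp
#include <cstdio>

// Hypothetical names; a sketch, not an API from any real debug library.
thread_local int g_destructor_depth = 0;   // a counter, not a flag

// RAII guard: declare one at the top of every destructor body.
struct DestructorScope {
    DestructorScope()  { ++g_destructor_depth; }
    ~DestructorScope() { --g_destructor_depth; }
};

bool debuglib_in_destructor() { return g_destructor_depth > 0; }

// What a debug wrapper might do on a detected failure: always log,
// and tell the caller whether escalating (throw/exit/abort) would be safe.
bool debuglib_report(const char* mesg)
{
    bool in_dtor = debuglib_in_destructor();
    std::fprintf(stderr, "DEBUGLIB: %s%s\n", mesg,
                 in_dtor ? " (inside destructor, continuing)" : "");
    return !in_dtor;
}

// Example class whose destructor hits a wrapper-detected error.
struct Widget {
    bool* escalate_ok;   // records what the debug library decided
    explicit Widget(bool* out) : escalate_ok(out) {}
    ~Widget() {
        DestructorScope scope;
        *escalate_ok = debuglib_report("simulated failure in cleanup");
    }
};
```

Because the counter is incremented and decremented in pairs, nested destructors (a member's destructor running inside its owner's) are handled without extra code.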