Aussie AI

Chapter 10. Safe Builds

  • Book Excerpt from "Safe C++: Fixing Memory Safety Issues"
  • by David Spuler

Build Management

Proper build management can be an important part of C++ safety initiatives. The aspects of builds related to C++ safety in the development environment include:

  • Warnings analysis from compiler output.
  • Automated unit tests and regression testing.
  • Integrated testing with the nightly builds.
  • CI/CD approval processes (e.g., run unit tests).

With regard to external management of builds and releases, there are also opportunities to improve overall quality:

  • Tracking builds and releases (the basics)
  • Keeping executables for all builds
  • Matching debug versions of executables (for postmortem debugging purposes).
  • Maintaining hash signatures for executable security.
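As a minimal sketch of the hash-signature idea, the following records a SHA-256 hash for each executable and verifies it later. It assumes the GNU coreutils sha256sum tool is available; the function names and file names are hypothetical and not taken from any particular project.

```shell
#!/bin/sh
# Sketch: record and verify SHA-256 hashes of build outputs.
# Assumes the "sha256sum" tool; all names here are hypothetical.

# Append the hash of each given executable to a hash list file.
record_hashes() {
    hashfile="$1"
    shift
    sha256sum "$@" >> "$hashfile"
}

# Re-check every file named in the hash list.
# Returns non-zero if any file no longer matches its recorded hash.
verify_hashes() {
    sha256sum -c "$1"
}
```

A build step might call `record_hashes hashes.txt build/myapp` after linking, and a release step might call `verify_hashes hashes.txt` before shipping. The hash list only helps if it is stored somewhere an attacker cannot also modify.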

Some of the pitfalls in build management include:

  • Inadvertent disclosure of security credentials used in testing.
  • Weak security tracking that lets hackers add viruses to your builds.

Leveraging More Builds

Instead of thinking about how to get the product built, let's think about ways to leverage builds for extra quality. The basic method is simply "nightly builds" whereby:

  • Unit tests automatically run.
  • Full regression test suite automatically run.
  • Failures are detected.
  • Notification via email to the developer team about any failures.

This is a very efficient system. Once it's set up, there's very little to maintain. But we can level it up:

  • Run unit tests with valgrind and/or sanitizers.
  • Run the full regression test suite with valgrind and/or other memory checkers and sanitizers (if it takes more than a day, don't run it every night).
  • Automate analysis of compiler warnings (e.g., remove unimportant ones).
  • Add multiple runs of unit tests under different build conditions (e.g., with and without debug code enabled, with different optimizer levels, with different compilers).
  • Add linters and static analysis tool pathways.
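The idea of repeating the test run under different build conditions can be sketched as a small driver script. This is only an illustration under stated assumptions: it assumes the Makefile has "clean" and "test" targets and uses CXXFLAGS for both compiling and linking, and the configuration names and log file names are made up for the example.

```shell
#!/bin/sh
# Sketch of a nightly driver that repeats the test run under several build configurations.
# Assumptions (not from the original text): the Makefile has "clean" and "test"
# targets and uses CXXFLAGS for both compiling and linking.

# Build and test one configuration, keeping its output in a per-configuration log.
run_config() {
    name="$1"
    flags="$2"
    log="nightly-$name.txt"
    if make clean >/dev/null 2>&1 && make test CXXFLAGS="$flags" >"$log" 2>&1; then
        echo "PASS $name"
    else
        echo "FAIL $name (see $log)"
        return 1
    fi
}

# Run every configuration, even if an earlier one fails, and count the failures.
nightly() {
    failures=0
    run_config debug   "-O0 -g"                              || failures=$((failures + 1))
    run_config release "-O2 -DNDEBUG"                        || failures=$((failures + 1))
    run_config asan    "-O1 -g -fsanitize=address,undefined" || failures=$((failures + 1))
    echo "Configurations failed: $failures"
    [ "$failures" -eq 0 ]
}
```

Calling `nightly` from a scheduled job prints one short PASS or FAIL line per configuration, with the details left in the per-configuration logs. A valgrind or second-compiler pathway could be added as further `run_config` lines in the same way.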

The incremental cost of setting up more builds is relatively low. Hence, if you really want to finesse things:

  • Build and run tests on different hardware platforms (e.g., with local hardware or via remote virtual machines).
  • Run multiple sanitizers, and/or use your own home-grown memory debug library (e.g., as in this book).
  • Use multiple pathways for compiler warnings (e.g., the basic build and one with many optional compiler warnings enabled).
  • Use multiple linter pathways (i.e., one for bug-focused warnings, and one with more pedantic settings for style issues).

One final point about all these builds: don't just email the output. A huge ream of informational messages and compiler warnings causes immediate overload. Instead, someone needs to take the time to grep out the unimportant messages. Otherwise, anything major detected by unit tests or compiler/linter warnings gets lost in the snow.

Maybe you shouldn't take your build engineer for granted. They're probably less likely to be replaced by AI than you!

Warning-Free Build

Don’t ignore compiler warnings! A very good goal for C++ software quality is to get to a warning-free compile. You should think of compiler warnings as doing “static analysis” of your code. To maximize this idea, turn on more warning options, since the warnings are rarely wrong in modern compilers, although some are about harmless things.

Harmless doesn’t mean unimportant. And anyway, the so-called “harmless” warnings aren’t actually harmless, because if there are too many of them in the compilation output, then the bad bugs won’t get seen. Hence, make the effort to fix the minor issues in C++ code that are causing warnings. For example, fix the “unused variable” warnings or “mixing float and double” type warnings, even though they’re rarely a real bug. And yet, sometimes they are! This is why it’s powerful to have a warning-free compile.

Tracking compilation warnings. One way to take warning-free compilation to the next level is to actually store and analyze the compiler output. It’s like log file analysis in DevOps, only it’s not for systems management, but for debugging. On Linux, I typically use this idea:

    make build |& tee makebuild.txt

Here’s an actual example from a Makefile in an Aussie AI project on Linux:

    build:
        -@make build2 |& tee makebuild.txt
        -@echo 'See output in makebuild.txt'

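The Makefile uses the “-” and “@” prefixes on each command: “@” means that make doesn’t echo the command to output, and “-” means that make doesn’t stop if the step triggers an error. Note that “|&” is bash shorthand for “2>&1 |”, so this recipe assumes that make runs its commands with bash.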

When the build has finished, then we have a text file “makebuild.txt” which can be viewed for warning messages. To go further, I usually use grep to remove some of the common informational messages, to leave only warning messages. Typically, my Linux command looks like:

    make warnings

Here’s an example of the “warnings” target in a Makefile for one of my Aussie AI projects:

    warnings:
        -@cat makebuild.txt | grep -v '^r -' \
        | grep -v '^g++ ' | grep -v '^Compiling' \
        | grep -v '^Making' | grep -v '^ar ' \
        | grep -v '^make\[' | grep -v '^ranlib' \
        | grep -v '^INFO:' | grep -v 'Regressions failed: 0' \
        | grep -v 'Assertions failed: 0' | grep -v SUCCESS \
        | more

Note that this uses grep to remove the informational messages from g++, ar, ranlib, and make. And it also removes the unit testing success messages if all tests pass (but not if they fail!). The idea is to show only the bad stuff because log outputs with too many lines get boring far too quickly and then nobody’s watching.
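Another way to analyze the stored compiler output is to count the warnings by category, so the team can see which kinds dominate. The sketch below assumes GCC or Clang style diagnostics, which end each warning with the option name in square brackets (e.g., "[-Wunused-variable]"); the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: count compiler warnings per category in a saved build log.
# Assumes GCC/Clang-style diagnostics that end with the option name, e.g.
#   foo.cpp:12:9: warning: unused variable 'x' [-Wunused-variable]

warning_summary() {
    # Extract each "[-W...]" tag, then count occurrences, most frequent first.
    grep -o '\[-W[^]]*]' "$1" | sort | uniq -c | sort -rn
}
```

Running `warning_summary makebuild.txt` prints one line per warning category with its count. Tracking those counts from night to night shows whether the code is moving towards a warning-free build.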

One annoying thing about using grep with make is that you get this kind of error message:

    make: [annoying] Error 1 (ignored)

Here’s a way to fix them in a Makefile on Linux:

       -@grep tmpnam *.cu *.cpp || true

The “true” command is a shell command that always succeeds. Note that this line uses the double-pipe “||” shell logical-or operator, so it only runs “true” if grep fails (grep returns a non-zero status when it finds no matches). But don’t accidentally use a single “|” pipe operator, which would actually be a silent bug, because the pipe would swallow grep’s output! With “|| true”, the line calling grep always returns a zero status, and then make is silent.
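The difference between the three forms can be seen in a small shell sketch. The function names are hypothetical, and each one wraps the same grep for "tmpnam" so the exit status and output can be compared directly.

```shell
#!/bin/sh
# Sketch: how "|| true" changes the exit status that make sees from grep.
# grep exits with 0 if it finds a match and 1 if it finds nothing.

# Plain grep: status 1 when nothing matches, which make reports as an error.
plain_search() { grep tmpnam "$@"; }

# With "|| true": matches are still printed, but the status is always 0.
quiet_search() { grep tmpnam "$@" || true; }

# The accidental single pipe: status is 0, but the matches are thrown away,
# because grep's output is piped into "true", which ignores its input.
piped_search() { grep tmpnam "$@" | true; }
```

Only the `quiet_search` form keeps both properties that the Makefile line needs: the matching lines still appear in the output, and make never sees a failure status.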

Finally, your warning-free tracking method should ideally be part of your “nightly builds” that do more extensive analysis than the basic CI/CD acceptance testing. You should email those warnings to the whole team, at about 2am ideally, because C++ programmers don’t deserve any sleep.

Advanced Build Issues

There are various other aspects of build management that can improve overall quality. These include:

  • Security issues
  • CI/CD/CT integration issues
  • Documentation generation issues
  • Release management

Build security. Security issues with builds are both internal and external. There are three main issues:

  • Accidental release of internal security credentials.
  • Protection against security issues in third-party libraries.
  • Avoiding malicious contamination of your releases (don't be part of a "supply chain attack").

It's common for internal security credentials to get added into the source code control system. This is mostly problematic if your build is releasing an open source package, whereas if it's building software executables, these credentials probably won't be in the release. However, if credentials are hard-coded into the source code for testing purposes, these will still be disclosed publicly as part of an executable. Don't underestimate the power of hackers to disassemble binaries, or the simple capabilities of the strings Linux tool.
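One cheap safeguard is to scan the source tree for lines that look like hard-coded credentials before a build or release. The sketch below is illustrative only: the function name is hypothetical, and the patterns are assumptions that would need tuning for a real project.

```shell
#!/bin/sh
# Sketch: flag lines that look like hard-coded credentials before they reach a build.
# The patterns are illustrative assumptions only; real projects need a tuned list.

scan_for_secrets() {
    # -r: recurse, -n: show line numbers, -i: ignore case, -E: extended regex.
    # Exit status is 0 if any suspicious line was found, non-zero otherwise.
    grep -rniE '(password|passwd|secret|api[_-]?key|token)[[:space:]]*[:=]' "$@"
}
```

A build step could run `scan_for_secrets src/` and treat a zero exit status as a reason to stop and review. The same kind of pattern check can be applied to the output of the strings tool on a finished executable, which is roughly what an attacker would do.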

External security issues arise in terms of the third-party libraries that you are using in your application (i.e., dependency management). Alternatively, you can be a direct victim of a hacking attempt, which may damage your business. Even more insidious are the cases where hackers have embedded payloads into software that is distributed to other customers, which are known as "supply chain attacks." You don't want to be the source of a virus distributed to all your customers!

Release management. Supportability can be greatly improved by good build management. The release process needs to carefully manage which executables go out in which build release. Some of the issues include:

  • Mapping customer releases to internal build numbers and versions.
  • Tracking which versions of third-party libraries were used in which builds.
  • Storing a permanent copy of any executable that went out.
  • Keeping a correlated "debug" copy of the executable (for use in postmortem debugging any customer core dumps).
  • Tagging the source code to mark the release numbers and builds.
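The first few items in this list can be sketched as a small archiving function. Everything here is hypothetical (the function name, the "releases" directory layout, and the manifest format), so treat it as one possible shape rather than a recommended tool.

```shell
#!/bin/sh
# Sketch: archive a shipped executable and its debug twin under the release number,
# and log which internal build it came from. All names and paths are hypothetical.

archive_release() {
    release="$1"      # customer-facing release, e.g. "2.1.0"
    build="$2"        # internal build number, e.g. "4711"
    exe="$3"          # the executable that ships
    debug_exe="$4"    # matching debug build, for postmortem debugging
    dest="releases/$release"

    # A release archive is meant to be permanent, so never overwrite one.
    if [ -e "$dest" ]; then
        echo "release $release is already archived" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1
    cp "$exe" "$dest/" || return 1
    cp "$debug_exe" "$dest/$(basename "$exe").debug" || return 1

    # One manifest line per release: release number, build number, executable name.
    echo "$release build=$build $(basename "$exe")" >> releases/MANIFEST.txt
}
```

For example, `archive_release 2.1.0 4711 build/myapp build/myapp_debug` would leave both files under releases/2.1.0 and add one line to the manifest. When a customer core dump arrives for release 2.1.0, the matching debug executable and build number can then be looked up directly.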

The build management aspects of software are less heralded than the exciting algorithms in the latest AI engines. But a good, solid foundation in your build management is critical for high-quality software.

 
