Chapter 20. Quality
Book Excerpt from "Safe C++: Fixing Memory Safety Issues"
by David Spuler
What is Software Quality?
Quality is an overarching goal in software design. The terms “software quality” and “code quality” are not the same thing. Software quality is more about product quality from the user or company perspective, which has a more outward-looking feel, with issues such as functionality and usability. Code quality is what software developers work on every day. Having quality coding practices is a prerequisite for software quality, so there's much overlap.
How do we improve both types of “quality”?
First, let's acknowledge the subjectivity. Some groups of people are more focused on “software quality” than “code quality” as a goal. Salespeople want the product to have the hot features. Marketing wants a nice UI and a “positioning” in the market (what is so great about the letter P?). Support wants nobody to call.
For those working on the internals of software, everybody has a different view of code quality. Quality engineers want everything to be perfect before it ships. Project managers want to hit the date by time-boxing features out. Developers want, well, who knows, because every developer has a different but deeply-held belief about this topic.
Second, let's examine the metrics for quality code. It's runtime things like: has cool features, doesn't crash or spin, and is performant. And it's static things like: readability, modularity, and so on. And there are future-looking metrics such as: maintainability, extensibility, etc. There are various techniques to enhance these types of metrics, which we examine in the following chapters.
Third, let's take a top-down look. What does “software quality” or “code quality” mean on the executive floor? Probably it means any software that has “AI” features, so the CEO can say that buzzword in the earnings call about a hundred times. I heard on TikTok that McKinsey research proved that stocks appreciate by sqrt(pi/8) percentage points for every mention.
Finally, let's take a bottom-up look, which is really most of this chapter and the following chapters. We are talking about C++ coding, after all. There are a lot of practical techniques that can be used to improve the delivery of quality software through improvements to C++ code quality and other areas.
Advanced Software Quality
If you want to write the best C++ software for enterprise purposes in terms of “quality,” you need to consider a lot of “abilities”:
- Testability
- Debuggability
- Scalability
- Usability
- Installability
- Supportability
- Availability
- Reliability
- Maintainability
- Portability
- Extensibility
- Interoperability
- Reusability
Take a breath. Keep going. Some more:
- Deployability
- Manageability
- Readability
- Upgradability
- Marketability
- Monetizability
- Quality-ability (whatever that means!)
- Security protection (hackability)
- Internationalization (translatability)
- Fault tolerance and resilience (keep-going-ability)
- Modularity (separatability)
- Stability
Oh, and I almost forgot one coding quality issue:
- Adding new features that customers want.
Before we get too wrapped up in all those inward-looking “abilities,” let us remind ourselves that the customer only cares about a few of them: installability, usability, stability. For a B2C product, think about the “grandma test”; could your grandma use this software? (After she's called you and made you set up her WiFi, I mean.) For B2B customers, the main thing the users actually care about is “ability-ability,” which is whether your software has the capability to help users do whatever bizarre things businesses want to do with your code.
Sellability
Oops, I've forgotten about sales yet again, which isn't surprising because all of us in R&D aren't allowed to talk to the reps. I guess they have cooties or they'll stop selling the currently shipped version or they'll blame us for not winning a deal with the currently shipped version. We have drills to practice hiding under our chairs if we see a rep.
Anyway, to get back on topic, marketability and sellability are actually the highest level of quality. If nobody buys it, who cares how beautiful the architecture is? Consider broadening the definition of “quality” beyond the C++ code to the “software quality” of the entire product from the perspective of the company.
Sellability is quality!
Most of the “code quality” practices in software engineering are inward-focused work, rather than looking “outwards” at the customer. If your company goal is actually the financial success of your C++ software product in the B2B market, here's my suggestion of an alternative set of C++ “sellability” processes to consider:
1. Ask your sales reps what new features will close their current deal.
2. Code that in C++.
3. Run your 24-hour or 48-hour automated test suite.
4. Give the executables to your sales reps on a zippy.
Note that I only said to “consider” this method. Nobody in R&D is actually going to do it, I'm sure. I only wrote that so all the sales reps would buy a programming book.
Software Engineering Methodologies
Below is a list of various software engineering paradigms and architectural practices. Let me hereby emphatically state that one of these methods is clearly and by far the absolute best one, far superior to all the rest, and I will defend it to the hilt over a brew any day of the week.
Oh, but I'm not going to tell you which one. Feel free to argue amongst yourselves. Here's the list:
- Agile development
- Pair programming
- AI copilot programming
- Waterfall method
- DevOps for everyone
- Test-driven development
- Feature-driven development
- Agile scrum
- Lean coding
- GMB
- Don't Repeat Yourself (DRY)
- Structured Design Methodology
- Designated Object Architecture (DOA)
- UML
- Rapid Application Development (RAD)
- eXtreme Programming
- Object Oriented Design (OOD)
- SQA
- Rogue coder model
- Pick Your Favorite Acronym (PYFA)
- Intentional coding
- Joint Application Development Process
- Move fast & break stuff
- Behavior-Driven Development
- SOLID
- Domain-Driven Design
- Product Market Fit (PMF)
- ISO something
- Fingers and toes crossed
- Spiral Model
- TQM or six-sigma or Jack Welch stuff
- Code myself a new minivan
- YAGNI
- Rational Unified Coding
- Product-Led Growth (PLG)
What a fun list! I'm going to make a poster to put on the wall above my “jump to conclusions” mat.
Software Engineering Process Group
The idea of a Software Engineering Process Group (SEPG) is a team of people in your company who aim to help software engineers write better code. It's people helping people, so what could be better than that?
What this SEPG team does is buy everyone in the company a copy of this book, including the valet parking attendants and catering staff, who are integral to your software development strategy, if you ask me (you didn't). After that, it's feet up on the desk and read the newspaper for the rest of the day on the SEPG floor, because it's all sorted in this book.
I really like the idea of the SEPG, but I've also seen it ineffective when product groups simply ignored their advice. I don't know what to say about that. I guess if I were running an SEPG, I'd say try to focus on pragmatic and incremental ways to improve software processes. Some of the ways that an SEPG can add tremendous value across an entire software development organization include:
- Educating engineers on best practices.
- Reviewing coding tools that might be useful.
- Vetting common libraries of low-level functionality (reusability!).
- Documenting and sharing successful methods and ideas.
- Coding up horizontal libraries like debug wrappers.
Oh, yeah, and a coding standards document, because who doesn't love a great one of those.
Coding Standards
I cannot pretend that I am a big fan of having coding style standards. But most large companies tend to have them, and there is certainly a benefit to doing so. You can find Google's on the Internet, and I read it to my toddler to put him to sleep (easier than putting him into a child seat and doing a hundred blockies at 3am while wearing pyjamas; who doesn't love parenting?).
The advantage of a coding policy is a standardization of various activities and processes company-wide, which is something they really like in head office. The disadvantages include things like: (a) a focus on “busy work” coding rather than adding new user features, and (b) practical difficulties merging two different development procedures if you acquire another big company. Newly acquired startups will expend a fair amount of effort to conform to your standards, but they probably need to do similar activities to fix technical debt, anyway.
My preference is for a company to have a specific organizational group focused on software engineering excellence, with a focus on practicality, rather than dictating the “one true way” of programming. Coding standards are only one of the many issues for such a cross-company team to address. This is the idea of having an SEPG in your organization, which is kind of like a SEP field, if you know what I mean. So, it is a matter of tone and focus in terms of how high or how low to go in devising the coding standard for your project or organization.
Some high-level issues that could be addressed:
- Which programming language. (C++, of course!)
- Code libraries allowed
- Tech stack: database, app layer, UI, etc.
- Tools: source code control, bug database, etc.
- Naming: e.g., good APIs follow a naming convention that the developer can guess.
A coding style for C++ could specify a variety of factors about which of the advanced language features to use (or avoid):
- Templates
- Operator overloading
- Class inheritance hierarchies
- Namespace management
I'm really not going to suggest your coding standard document should address indentation, variable names, comments, and so on, but some of these types of documents actually do.
There is also value in specifying standard suggested coding libraries and interfaces:
- Basic data types
- Basic coding libraries
- Basic data structures (e.g., hash tables, lookup tables, etc.)
- Unit testing library/APIs
- Regression testing tools and harnesses
- Assertions and self-testing
- Debug tracing code
- Exception handling
- Testing and debugging tools
I could go on, but I won't.
Project Estimation
Estimating project time and space requirements is an important part of software project management. Although estimating the efficiency of a proposed project is important in ascertaining its feasibility, it is difficult to find anything concrete to say about arriving at these estimates. Producing advance estimates is more of an art than a science, and a typical process goes like this:
1. Pick a random date.
2. Deny programmers sleep until this date.
3. Slip the date.
4. Time-box out all useful features.
5. Ship it!
Experience is probably the best source of methods for producing an accurate estimate. Hence, it is wise to seek out others who have implemented a similar project, or to perform a literature search for relevant papers and books. Unfortunately, neither of these methods is guaranteed to succeed and the implementor may be forced to go it alone. The only other realistic means of estimation relies on a good understanding of the various data structures and algorithms that will be used by the program. Making realistic assumptions about the input can provide some means of examining the performance of a data structure. How a data structure performs under worst case assumptions may also be of great importance.
An alternative to these methods of plucking estimates out of the air is to code up a prototype version of the program, which implements only the most important parts of the project (especially those which will have the biggest impact). The efficiency of the prototype can then be measured using standard techniques such as timing and profiling. Even if the prototype is too inefficient, at least the problem has been identified early in the development cycle, when the investment in the project is relatively low.
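As a rough illustration of measuring a prototype, here is a minimal sketch that times one component under a friendly input and under a worst-case input. Everything named here is an assumption for illustration only: `prototype_sort` is a stand-in for whatever the important part of your prototype is, and `time_ms`, `measure`, `ascending_input`, and `descending_input` are hypothetical helpers. The shift counter is included because wall-clock numbers vary from machine to machine, whereas an operation count is repeatable.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical prototype component: an insertion sort that also counts
// element shifts, so its worst case can be checked deterministically.
std::size_t prototype_sort(std::vector<int>& v)
{
    std::size_t shifts = 0;
    for (std::size_t i = 1; i < v.size(); ++i) {
        int key = v[i];
        std::size_t j = i;
        while (j > 0 && v[j - 1] > key) {
            v[j] = v[j - 1];  // shift the larger element one slot right
            --j;
            ++shifts;
        }
        v[j] = key;
    }
    return shifts;
}

// Hypothetical helper: run any callable and return the elapsed
// wall-clock time in milliseconds.
template <typename Fn>
double time_ms(Fn&& fn)
{
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

struct Measurement {
    std::size_t shifts;  // repeatable operation count
    double ms;           // machine-dependent elapsed time
};

// Takes the input by value so each measurement sorts its own private copy.
Measurement measure(std::vector<int> input)
{
    Measurement m{0, 0.0};
    m.ms = time_ms([&] { m.shifts = prototype_sort(input); });
    assert(std::is_sorted(input.begin(), input.end()));  // self-check
    return m;
}

// Already-sorted input: the friendly case for insertion sort.
std::vector<int> ascending_input(std::size_t n)
{
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);
    return v;
}

// Reverse-sorted input: the worst case for insertion sort.
std::vector<int> descending_input(std::size_t n)
{
    std::vector<int> v = ascending_input(n);
    std::reverse(v.begin(), v.end());
    return v;
}
```

Calling `measure(ascending_input(1000))` and `measure(descending_input(1000))` gives zero shifts for the first and 499,500 for the second. The millisecond figures will depend on your hardware and compiler settings, so treat them as indicative rather than exact.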
Code Quality
Everyone has their own opinions on the best way to write software, so I'll choose to simply offer some possible options for you to discuss. Here is my list of some of the more pragmatic and useful ways to ensure code reliability as a professional software developer:
- Lots of unit tests.
- Lots of assertions.
- Lots of bigger regression tests.
- Automated acceptance testing in CI/CD.
- Nightly builds that automatically re-run all the bigger tests that are too slow for CI/CD.
- Warning-free compilation (as a coding policy goal).
- Running Valgrind or other memory checkers in the nightly builds (Linux).
- Run big multi-platform tests in the nightly builds.
- Check return codes (as a coding policy).
- Validate incoming function parameters (as a coding policy).
- Use an error logger.
- Use a debug tracing library.
- Add some debug wrapper functions.
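To make a few of these items concrete, here is a small sketch that combines parameter validation, a return code, an error logger, an assertion, and a debug wrapper in one place. The names `log_error`, `g_error_count`, and `checked_strcpy` are all hypothetical, invented for this illustration rather than taken from any library, and a real error logger would likely do much more than write to `std::cerr`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iostream>

// Hypothetical error logger: reports to stderr and keeps a count,
// so tests and nightly builds can check that nothing was logged.
static int g_error_count = 0;

void log_error(const char* func, const char* msg)
{
    ++g_error_count;
    std::cerr << "ERROR in " << func << ": " << msg << '\n';
}

// Hypothetical debug wrapper around a string copy. It validates the
// incoming parameters and returns a status code, rather than silently
// overflowing the destination buffer.
bool checked_strcpy(char* dest, std::size_t dest_size, const char* src)
{
    if (dest == nullptr || src == nullptr) {
        log_error(__func__, "null pointer argument");
        return false;
    }
    if (dest_size == 0) {
        log_error(__func__, "zero-size destination");
        return false;
    }
    const std::size_t len = std::strlen(src);
    if (len >= dest_size) {
        log_error(__func__, "destination buffer too small");
        return false;
    }
    std::memcpy(dest, src, len + 1);  // +1 copies the null terminator
    assert(dest[len] == '\0');        // internal self-check
    return true;
}
```

The caller's half of the policy is to check the return code, for example `if (!checked_strcpy(buf, sizeof buf, name)) { /* handle the failure */ }`. Bad input from outside is reported through the return code and the logger, and the assertion is reserved for things that should never happen inside the function itself.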
Extensibility
Extensibility is allowing your customers to extend or customize your AI software. Although your first thought is going to be to run off and build an API or an SDK, there are a few things to consider first. The simpler ways to “extend” are:
- Just add more features.
- Add configuration settings.
- Add command-line options.
- Add minor personalization features.
Adding customer features. The basic problem that customers have is that they want to find a way to do something. If they're looking to extend your software, well, that means that some feature is lacking. If one customer finds this issue, other customers are probably silently suffering. So, rather than building an API, just listen to your customer, and add some more features to your code that will solve the issue, and other reasonably similar issues.
Configuration settings. Think about your AI's configuration settings from the point-of-view of extensibility. If you prefer, call them “declarative extensions.” It's much easier for a customer to change a config option than to write a program using your SDK. Consider elevating and documenting some of the different ways that your application can be configured, to give your customers more capabilities. Yes, this does significantly increase the error handling code and QA testing cycle, so this is a careful consideration: which of your internal config options do you hide or publicize?
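Here is one way the “hide or publicize” decision might look in code. This is only a sketch under my own assumptions: a plain `key = value` text format with `#` comments, and a whitelist of documented keys. The names `ConfigResult`, `trim`, `parse_config`, and `public_keys` are hypothetical, as are the example setting names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

struct ConfigResult {
    std::map<std::string, std::string> settings;  // accepted key/value pairs
    std::vector<std::string> errors;              // one message per bad line
};

// Strip leading and trailing whitespace.
std::string trim(const std::string& s)
{
    const char* whitespace = " \t\r\n";
    const std::size_t first = s.find_first_not_of(whitespace);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}

// Parse "key = value" lines, accepting only keys that have been
// deliberately publicized. Anything else is reported, not silently used.
ConfigResult parse_config(const std::string& text,
                          const std::set<std::string>& public_keys)
{
    ConfigResult result;
    std::istringstream in(text);
    std::string line;
    int line_no = 0;
    while (std::getline(in, line)) {
        ++line_no;
        line = trim(line);
        if (line.empty() || line[0] == '#') continue;  // blank or comment
        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) {
            result.errors.push_back("line " + std::to_string(line_no) +
                                    ": missing '='");
            continue;
        }
        assert(eq < line.size());  // internal self-check
        const std::string key = trim(line.substr(0, eq));
        const std::string value = trim(line.substr(eq + 1));
        if (public_keys.count(key) == 0) {
            result.errors.push_back("line " + std::to_string(line_no) +
                                    ": unknown setting '" + key + "'");
            continue;
        }
        result.settings[key] = value;
    }
    return result;
}
```

With `public_keys` set to `report_title` and `color`, a file containing `report_title = Acme Corp` is accepted, while an internal option such as `secret_knob=1` comes back as an error message. Notice how much of the function is error handling, which is the extra cost mentioned above.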
Personalization options. When you're deep in the guts of an AI application, you're thinking about really brain-intensive stuff like vectorizing your tokenizer. Your customer, however, just wants to put their company's name at the top of their AI-generated report. Hence, focus on adding some of the “smaller” functionality that seems trivial to engineers, but is what customers want. Maybe, like the wheel, the report could even have different colors?
And one final point about extensibility: your customers aren't programmers. They don't even know what the acronyms API and SDK stand for. Your customers need an API like a fish needs a bicycle.
Scalability
Almost this entire treatise is about scalability of your application. Getting a huge behemoth to run fast is the biggest challenge.
But the actual software code is not the only scalability concern. There's also the server on which your application resides, receiving and processing requests, sending them on to the backend application, and collating returned results. This server is itself a piece of software, and it could be an off-the-shelf server, or you could write your own in C++ if you like.
User interfaces are another overlooked point in regard to scalability. Not only must the backend be fast, but the user interface layer must handle all of the requirements in a way that people can cope with. The key point is this:
Humans don't scale.
What that means is that making your human user do anything is a hard problem. People cannot read reams of text fast, they cannot click on a thousand warning messages, and they do dumb things in the interface, like re-clicking the “Load” button a hundred times if it's taking too long. The fact that a human is part of the process flow means that you have to make sure that all of your steps are human-friendly. This is an often-underestimated aspect of scalability.
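One possible defense against the hundred-clicks problem is to ignore repeat requests while the first one is still running. The sketch below is my own illustration, not a feature of any particular UI toolkit: the `LoadButton` class and its `click`, `load_finished`, and `ignored_clicks` methods are hypothetical names, and a real interface would also want to show the user that something is happening.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical guard for a "Load" button: only the first click starts
// a load, and further clicks are ignored until that load has finished.
class LoadButton {
public:
    // Returns true if this click should start a load, false if ignored.
    bool click()
    {
        bool expected = false;
        if (!in_progress_.compare_exchange_strong(expected, true)) {
            ++ignored_;  // a load is already running
            return false;
        }
        return true;
    }

    // Call this when the load completes (or fails) to re-enable the button.
    void load_finished()
    {
        const bool was_loading = in_progress_.exchange(false);
        assert(was_loading);  // finishing a load that never started is a bug
        (void)was_loading;    // avoid an unused warning in release builds
    }

    int ignored_clicks() const { return ignored_.load(); }

private:
    std::atomic<bool> in_progress_{false};
    std::atomic<int> ignored_{0};
};
```

If an impatient user clicks a hundred times, `click()` returns true once and false for the other ninety-nine, so the backend sees one request instead of a hundred.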
Reusability
In our commercial world, it is frequently the cost of our own time that is the greatest. Using our own time efficiently can be more important than writing fast programs. Although improving programming productivity is not the main topic of this book, let us briefly consider a few methods here.
The basic method of reducing time spent programming is to build on the work of others. The use of libraries, including the wide variety of commercially available source code libraries, and the C++ standard library, is a good way to build on the work of others. There are a few concerns with using third-party libraries:
- Quality concerns: Is it bug-free? How well tested and supported?
- Security issues: Consider the source and its security protections.
- Legal licenses: Don't use “copyleft” or “share-alike” or “non-commercial” code.
Building on your own work is the other main method of productivity improvement. How often have you coded up a hash table? Have you ever written a sorting routine off the top of your head and then spent hours debugging it? You should perform tasks only once. This doesn’t necessarily mean writing reusable code in its most general sense, but just having the source code available for the most common problems. Modifying code that has already been debugged is far more time-efficient than writing it from scratch. Organizations should seek to create building blocks of code that programmers can use, but you can also do so in your own personal career.
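As a small example of building on the work of others, here is a sketch of a word-frequency counter that leans on the C++ standard library for both the hash table (`std::unordered_map`) and the sorting (`std::sort`), rather than hand-coding either. The function names `count_words` and `top_words` are hypothetical, chosen just for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Count whitespace-separated words using the standard hash table.
std::unordered_map<std::string, int> count_words(const std::string& text)
{
    std::unordered_map<std::string, int> counts;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        assert(!word.empty());  // operator>> never yields an empty word
        ++counts[word];
    }
    return counts;
}

// Return (word, count) pairs, most frequent first, with ties broken
// alphabetically so the order is repeatable.
std::vector<std::pair<std::string, int>> top_words(const std::string& text)
{
    const auto counts = count_words(text);
    std::vector<std::pair<std::string, int>> result(counts.begin(),
                                                    counts.end());
    std::sort(result.begin(), result.end(),
              [](const auto& a, const auto& b) {
                  if (a.second != b.second) return a.second > b.second;
                  return a.first < b.first;
              });
    return result;
}
```

For the input `"the cat and the hat and the bat"`, the first entry returned by `top_words` is `the` with a count of 3, followed by `and` with a count of 2. There is no hash function, no collision handling, and no sorting loop to debug, which is the whole point.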