Chapter 7. Order Book & Matching Engine

  • Book Excerpt from "C++ Ultra-Low Latency: Multithreading and Low-Level Optimizations"
  • by David Spuler, Ph.D.

What is an Order Book?

An order book is a data structure that tracks all the currently active buy orders (“bids”) and sell orders (“asks”). Typically, it has tons of data, because it has all the buys and sells from every trader who’s in the market today.

Tracking an order book involves handling many kinds of changes, but there are commonalities across all of them. There are a number of major assumptions when implementing an order book:

  1. Only a single stock — each financial instrument with a price has its own order book.
  2. All individual orders are seen — it’s a “market-by-depth” data feed with all the juice.
  3. No out-of-order transactions — we track order times implicitly as they are received via “order-of-insertion” rather than receiving an explicit timestamp and sorting based on its value.
  4. No missing transactions — this is often an invalid assumption (I’m looking at you, UDP!), so the order book must be robust for problems such as IDs that are not found.

It’s different for the exchange versus a trading engine client. The exchange is managing many trading clients, but a trading engine is only for one exchange. The way it works overall is:

  • Exchange — real order book and matching engine to find trades to execute.
  • Trading engine — an abstract “model” of what’s happening in the exchange.

The exchange has to detect trades via its “matching engine” and really execute them (and do so correctly!). A trading engine is just watching what’s happening, and using that information to decide whether or not to submit its own orders. It’s like the exchange has the “real” order book, and the trading engine is maintaining its own “model” of what it probably looks like.

In theory, a trading engine doesn’t need its own matching engine. In practice, maintaining an order book means that you know when the bids and asks have “crossed” (i.e., a buy is high enough to match a sell), and you can thereby predict when a trade message will be coming. In fact, this can be a useful consistency self-check to confirm that your predictions about incoming trades are correct.

Matching engines and order books are intricately linked, and their code uses the same data structure. It’s not much extra code, and, anyway, you’ll probably also need a matching engine for:

  • Unit testing of the order book
  • Backtesting

Note that a trading engine’s order book is about all the orders and trades, not just your own. Tracking your own submitted trades is a totally different component of your trading engine. For example, to track whether your order got executed, or to know cumulatively what your current “positions” are in the purchased stocks from your prior submitted orders, that’s not in the order book. It’s some other C++ coder’s problem.

Order Book Messages

The input to an order book is mainly a set of messages for:

  • New buys
  • New sells
  • Updates (to quantity)
  • Cancels (remove the order)
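
For concreteness, here’s a minimal sketch of what these messages might look like in C++. The field names and the fixed-point price representation are illustrative assumptions, not any particular exchange’s wire format:

    #include <cstdint>

    // Hypothetical message layout; real feeds define their own binary
    // formats, but the content is broadly similar.
    enum class MsgType : uint8_t { NewBuy, NewSell, Update, Cancel, Trade };

    struct OrderMessage {
        MsgType  type;
        uint64_t order_id;  // unique ID, referenced by later updates/cancels
        int64_t  price;     // price in integer ticks (fixed-point)
        uint32_t quantity;  // for Update, this is the new quantity
    };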

If you’re the exchange, these submitted orders or updates to already-submitted orders are received via TCP from multiple clients. If you’re one of the trading engines, you see all these messages coming across the network from a market data feed, usually via UDP multicast messages.

The main difference is in trade executions. The exchange has a matching engine that has to figure out whether to execute a trade, and then tell everyone about it. A client order book in a trading engine doesn’t need to figure this out, because it gets told about trades in the incoming messages from the exchange’s market data feed.

Let’s examine the types of messages that occur.

Usually, the bids are priced lower than the asks, in which case there are no trades. When orders come in like this, they go into the order book, and might be executed later when things change. These are called “passive orders” and can be buys or sells.

If a new bid in an incoming buy order reaches or exceeds the min-ask (min-sell), then that buy order will trigger a trade. Similarly, if a new sell order has an ask at or below the max-bid, then it may match with one of the bids, and a trade occurs.
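
In code, the crossing test is just a comparison against the current best price on the opposite side. A minimal sketch, assuming prices are stored as integer ticks:

    #include <cstdint>

    // A new buy crosses if its bid reaches the lowest resting ask.
    bool buy_crosses(int64_t bid_price, int64_t min_ask) {
        return bid_price >= min_ask;
    }

    // A new sell crosses if its ask reaches the highest resting bid.
    bool sell_crosses(int64_t ask_price, int64_t max_bid) {
        return ask_price <= max_bid;
    }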

Market Data Feed Issues

There are two main types of data feeds, where one is like a fire hose and the other more like a faucet. The amount of data affects how “deep” of an order book data structure you can maintain from the feed:

  • Market-by-Order (MBO) — full detail about every order, trade or update.
  • Market-by-Price (MBP) — only about trades and pricing, not every passive order.

An MBO data feed shows every order as it comes in from traders, with full detail of how many passive orders there are at every price level. MBP is more of an aggregation of pricing, showing the total quantities of bids and asks at every price level. Hence, an MBP feed is more detailed than just “quotes” about the most recent trade prices for a stock, but it’s much less data than an MBO feed.

Some issues in the structure of an MBO market data feed can affect how you update your fully-detailed client-side order book in a trading engine.

  • Are affected order IDs listed in a trade execution message?
  • Do orders affected by a trade execution get modified by followup messages changing status or quantity?
  • Is there order status tracking in the feed (e.g., partially filled, filled, or “done for day”)?

Some lesser-known types of data feed information:

  • Broken trade — undo a prior trade.
  • Execution with price — specific type of trade message.

Order Book versus Matching Engine

There’s a kind of symbiotic relationship between an order book data structure and a matching engine algorithm. The tentative distinction is:

  • Matching engine — detects if a trade is possible.
  • Order book — is told when a trade occurs (by the matching engine).

If you’re processing a market data feed, then the exchange is detecting trades (via its matching engine), and then sending trade messages down the feed. Your order book is then receiving a trade message, and doesn’t need to do its own matching to detect when trades occur. In fact, many of the market data feed protocols will supply the IDs of the orders involved in a trade, so your order book may not need to implement the matching logic.

Except that sometimes it does!

You need to do matching if you don’t get IDs in trade messages from the feed, which is sometimes the case. Another reason is when your trading engine needs to predict trades before hearing about them on the feed. Such issues depend on the algo that you’re running.

Matching Engine Logic

Consider the situation where you get a trade message from a market data feed containing only:

  • Trade price
  • Trade quantity

How do you know which orders did the trade? Short answer: you don’t. There could be many orders with the right price and enough quantity.

In these cases, a client-side trading engine effectively has to build its own emulation of a matching engine, as part of its order book maintenance code. Note that mapping an anonymous trade message, with only a price and quantity, to the resting orders is effectively the same logic as an exchange doing a matching process on an incoming new order.

You have to figure out which orders probably did the trade by looking at all the buys and sells at that price level, examining them in the order they were received (i.e., order-of-insertion), checking which orders have enough quantity, and so on.

Guess what you just coded in your order book ... a matching engine!
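
As a rough sketch, the core of that emulation is a FIFO walk over the resting orders at the trade’s price level, consuming quantity as it goes. The Order type and the deque-per-price-level layout are assumptions for illustration:

    #include <algorithm>
    #include <cstdint>
    #include <deque>

    struct Order { uint64_t id; uint32_t qty; };

    // Consume resting orders at one price level in FIFO order until the
    // traded quantity is accounted for. Returns any unmatched remainder.
    uint32_t match_at_level(std::deque<Order>& level, uint32_t trade_qty) {
        while (trade_qty > 0 && !level.empty()) {
            Order& front = level.front();
            uint32_t fill = std::min(front.qty, trade_qty);
            front.qty -= fill;
            trade_qty -= fill;
            if (front.qty == 0)
                level.pop_front();  // order fully filled, remove it
        }
        return trade_qty;  // nonzero means match at the next-best level
    }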

There are also various other scenarios where the division of labor between order books and matching algorithms differs:

    1. Exchange matching algorithm — if you’re working on the exchange side, then you actually need to implement a live matching engine for real trades (scary!).

    2. Backtesting — you may need to emulate a matching algorithm to simulate the effect of your fancy algorithm and its submission of orders.

    3. Unit testing — you may need to emulate a matching engine, unless you’re just replaying some recorded data feed messages.

And if you have to code up a real matching algorithm at an exchange, guess what data it has to maintain, for each financial instrument, to detect matches based on all the incoming trade submissions from its clients...an order book.

Data Structures for the Order Book

The first point to note about the order book is that it’s an incremental algorithm. You process each market data feed message in sequence, and can maintain an up-to-date order book in this way.

Is your order book correct? Not always, since messages can get delayed or even lost (usually in a UDP multicast data feed). There’s also a non-incremental secondary method called “snapshot synchronization” whereby you can correct your order book. Also, some protocols have partial status or statistics messages that can help validate that your order book is still correct.

Multiple data structures are required to address the various different types of requirements. The overall idea would be something like:

  • Hash table — mapping the IDs to order objects.
  • Doubly linked list — order-of-insertion sorted list for processing orders in the order they’re received.
  • Heap data structure (priority queue) — maintain the maximum-buy (max-heap) and minimum-sell (min-heap).
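
Here’s a skeleton of how these pieces might fit together. It’s a sketch only: the member names are illustrative, and a real implementation would add the methods that keep the three structures in sync:

    #include <cstdint>
    #include <functional>
    #include <list>
    #include <queue>
    #include <unordered_map>
    #include <vector>

    struct Order { uint64_t id; int64_t price; uint32_t qty; bool is_buy; };

    struct OrderBook {
        // Hash table: order ID -> iterator into its price level's FIFO list
        // (std::list iterators stay valid across other insertions/erasures).
        std::unordered_map<uint64_t, std::list<Order>::iterator> by_id;

        // Per-price FIFO lists, keyed by integer price ticks.
        std::unordered_map<int64_t, std::list<Order>> bid_levels, ask_levels;

        // Best-price tracking: max-heap of bids, min-heap of asks.
        std::priority_queue<int64_t> bid_prices;
        std::priority_queue<int64_t, std::vector<int64_t>,
                            std::greater<int64_t>> ask_prices;
    };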

Do you need all this stuff?

A hash table is hard to avoid because everything’s keyed off the order IDs. New orders come in with a unique ID, which is then used to modify or cancel the order. The IDs also usually appear in trade messages, although not always.

Whether you need a queue or linked list for the FIFO list of orders is discussed below. The first point is that it’s actually needed per price level, not necessarily as one huge list of all orders.

FIFO Order Lists

The idea of an order list is that it has FIFO logic based on order-of-insertion, which is usually equivalent to timestamps. Assuming they have the best price, multiple orders at the same price level are supposed to match in the order they were received by the exchange.

Do you always need a linked list or queue of orders?

If you’re the exchange then, yes, you definitely need to track FIFO status of orders at a price level, so as to implement “fairness” of trade execution in the matching engine. But what about on the trading engine side?

It depends on what algo you’re doing. If the market data feed is giving you order IDs for trades, then you don’t need to emulate a matching engine, and you don’t need those linked lists, unless the algo needs a signal derived from them.

This situation is a little similar to using a Market-by-Price (MBP) data feed, which doesn’t have all the individual orders. However, there are times when you want to track all the individual orders and trades, but you don’t really need to know the FIFO ordering of them. For example, you might want more price-level data than an MBP feed is giving you, such as maintaining all these price-level statistics incrementally:

  • Total volume
  • Total number of orders (queue length)
  • Time frequency of orders (timestamp computations)
  • Last order size and timestamp

An algo could be using these price-level data points, without necessarily needing the full tracking of all the orders in a queue data structure.
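
A sketch of such an order-less price-level record, updated incrementally as messages arrive (the names and fields are illustrative):

    #include <cstdint>

    // Per-price-level statistics, maintained incrementally without
    // storing the individual orders in a FIFO queue.
    struct PriceLevelStats {
        uint64_t total_volume   = 0;  // sum of resting quantity
        uint32_t order_count    = 0;  // queue length
        uint32_t last_order_qty = 0;  // size of the most recent order
        uint64_t last_order_ns  = 0;  // timestamp of the most recent order

        void on_add(uint32_t qty, uint64_t ts_ns) {
            total_volume += qty;
            ++order_count;
            last_order_qty = qty;
            last_order_ns  = ts_ns;
        }
        void on_remove(uint32_t qty) {  // cancel or trade execution
            total_volume -= qty;
            --order_count;
        }
    };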

Price-Level FIFO Ladders

An important aspect of maintaining an order book is that matching orders should be processed fairly in a FIFO order. Hence, to process a trade in a matching engine, we need a queue of orders. This gives a FIFO ordering with the orders stored according to order-of-insertion.

But the price matters more than the ordering! To process a matching trade, we need to find all the orders that are max-buy (or min-sell), and process them in FIFO order. There are two basic ways to set up a data structure:

  • One long FIFO queue
  • Per-price FIFO queues

Using a single FIFO queue is inefficient when we need to process a trade, because the whole list may need to be scanned to find the orders at the right price level.

Having a separate queue (list) for each price level is much faster, because all of the items on the list have the same price level (that we want). This aspect of having lists of orders for each price level is often called a “ladder” or a “bid/ask ladder.” The algorithm becomes:

    1. Find the linked list or queue of orders for that price level.

    2. Scan the order list processing each order record (in FIFO order).

    3. Continue until we have enough quantity or the list ends.

    4. Process the next-best price if we need more quantity (from Step 1).
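
Here is a minimal sketch of Steps 1–4 for the ask side, using std::map so that price levels are already sorted (begin() is always the best ask). A latency-focused implementation might prefer a flat array of levels, but the logic is the same:

    #include <algorithm>
    #include <cstdint>
    #include <list>
    #include <map>

    struct Order { uint64_t id; uint32_t qty; };
    using AskLadder = std::map<int64_t, std::list<Order>>;  // price -> FIFO

    void fill_buy(AskLadder& asks, int64_t limit_price, uint32_t qty) {
        while (qty > 0 && !asks.empty()) {
            auto level = asks.begin();              // Step 1: best (lowest) ask
            if (level->first > limit_price) break;  // no longer crosses
            auto& fifo = level->second;
            while (qty > 0 && !fifo.empty()) {      // Steps 2-3: FIFO scan
                uint32_t fill = std::min(fifo.front().qty, qty);
                fifo.front().qty -= fill;
                qty -= fill;
                if (fifo.front().qty == 0) fifo.pop_front();
            }
            if (fifo.empty()) asks.erase(level);    // Step 4: next-best price
        }
    }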

However, there is extra storage cost, because each price level needs its own object, which must contain the head and tail pointers of a deque or doubly-linked list. And we need a mapping data structure to find those linked lists for each price level (i.e., a hashmap or a very big bucket array).

Note that the above analysis of FIFO matching ignores some other types of matching rules. For example, pro-rata and hybrid FIFO/pro-rata matching are other possibilities, which introduce additional complications if any order has a large quantity (which is arguably “unfair”!).

Heap Data Structures

Heap data structures, also known as priority queues, are good at efficiently tracking the maximum (or minimum) of a set of values. They are also efficient at updating the maximum or minimum under many insertions and deletions of random values. Note that the term “heap” in this context has nothing to do with memory allocation!

There are three types of heaps:

  • Max-heap — tracks the maximum value.
  • Min-heap — for the minimum value.
  • Min-max-heap — does both efficiently (we don’t need this).

Do you need a heap for price levels? It seems like overkill to have a data structure just to calculate the maximum buy and minimum sell prices of the order book. However, see below for reasons why an incremental algorithm isn’t that easy. In short: deletions are tricky to handle without a heap.

Anyway, actually you don’t need a heap, because you need two! There are typically two heaps with a max-heap data structure for the buy orders to track max-bid, and another min-heap data structure for the sell orders (min-ask). These two heap data structures are completely independent.

In C++, a max-heap can be implemented using the std::priority_queue container class. A min-heap can be declared using a comparator that reverses the logic: the default comparator is std::less, which yields a max-heap, but you can specify std::greater to create a min-heap. Although using a custom comparator would often be inefficient, the standard C++ library probably (hopefully) has a builtin template specialization for this comparator. However, you need to check by benchmarking!
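
For example, the two declarations might look like this, with prices as integer ticks:

    #include <cstdint>
    #include <functional>
    #include <queue>
    #include <vector>

    // Max-heap for bid prices: std::less is the default, largest on top.
    std::priority_queue<int64_t> max_bid_heap;

    // Min-heap for ask prices: std::greater reverses the comparison.
    std::priority_queue<int64_t, std::vector<int64_t>,
                        std::greater<int64_t>> min_ask_heap;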

If your min-heap is slow with a custom comparator, there’s another weird optimization: negative numbers to the rescue! The maximum of the negated values corresponds to the minimum of the original values. You can make a min-heap from a max-heap by negating all the values on the way in, and negating again (to restore the original number) when returning an item from the min-heap.
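
A sketch of the negation trick:

    #include <cstdint>
    #include <queue>

    // Min-heap built from a plain max-heap by negating values both ways.
    std::priority_queue<int64_t> neg_asks;  // stores -price

    void push_ask(int64_t price) { neg_asks.push(-price); }
    int64_t min_ask() { return -neg_asks.top(); }  // negate back on the way out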

There are alternatives to using heaps, but you certainly need to track some information per price level, and be able to access it in sorted order (e.g., find the second-best price). A heap is the most straightforward data structure for doing this.

Ordering Out-of-Order Orders

The simplest type of order book and matching engine should be based on “order of insertion” so that older orders get processed first. However, even when trying to maintain this FIFO ordering, there can also be issues with timestamping and lost or delayed order messages. This is more of an issue for client-side maintenance of an implicit order book from a market data feed via UDP multicast, rather than an exchange server’s incoming transactions over more reliable TCP connections.

Some orders from the exchange market data feed may be received late due to a delay and therefore appear to be “out-of-order” when they arrive at the market data ingestion component. Typically, this is due to delays or lost packets in UDP multicast messages as they are sent from the exchange’s market data feed to the trading engine’s client code.

How to handle this?

As a client, we want to keep our order book accurate, but not at too great a performance cost. In the abstract, there are various ways to treat messages that arrive with a timestamp indicating they were not received in the correct order. Some options include:

  • Detection by tracking incoming timestamps versus previous messages.
  • Ignoring timestamps and doing order-of-insertion anyway.
  • Timestamp-based insertion to correctly place the orders in the FIFO list.

Detection is relatively straightforward if we assume it’s a rare event and only one transaction is received out-of-order. The idea is to simply track the timestamp of the most recent order, and compare it with each incoming message. However, things get more complex if there could be multiple out-of-order transactions, which could also be in the wrong order among themselves. Handling these obscure cases efficiently is more problematic.

One way to do it all efficiently is to detect out-of-order timestamps, but then ignore the issue (except maybe logging some in-memory statistics counters). In other words, just insert the order into the data structure in order-of-insertion, and it will wrongly sit behind some of the orders with a later timestamp.
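
A sketch of this detect-and-count approach (the names and counter are illustrative):

    #include <cstdint>

    // Track the highest timestamp seen; flag (but don't reorder) stragglers.
    struct OutOfOrderDetector {
        uint64_t last_ts_ns = 0;  // most recent timestamp seen
        uint64_t ooo_count  = 0;  // in-memory statistics counter

        bool on_message(uint64_t ts_ns) {
            if (ts_ns < last_ts_ns) {
                ++ooo_count;      // late arrival: count it, insert anyway
                return false;
            }
            last_ts_ns = ts_ns;
            return true;
        }
    };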

How risky is that?

The risk is that there’s a match at that price level, and some other orders get filled by the trade rather than this one (which was actually received earlier by the exchange). The importance of an order can also depend on its price level, since orders that are unlikely to be crossed won’t see any problem at all. Another point is that lost messages occur, so there are whole orders that get missed entirely, and the order book will occasionally get corrected via the “snapshot synchronization” methods.

In other words, out-of-order trade or order messages are not the only reason our order book can become an inaccurate model of the exchange’s order book. There are various trade-offs here, and nobody likes to leave it to chance. In this case, we actually know there’s a problem, whereas with a lost order, we do not.

Can we fix it?

The correction is to try to insert the order wherever it should have been. Timestamp-based insertion is inherently a slower operation on the default data structure of a FIFO queue of orders.

Insertion into an order-of-insertion queue is an O(1) append at the tail, which is only a couple of operations. But for an out-of-order insertion using its timestamp, you now have to scan the queue or deque. Finding exactly where to insert is an O(n) linear search examining the timestamps on the list. It may be a rare event, and the reverse scan down the list might only go back a few orders, but it still introduces a non-deterministic slowdown to inserting orders into the order book data structures.

Another separately indexed data structure may be considered here to map timestamps to linked list locations (i.e., unsigned long to a pointer), but that adds more complexity to the situation. And it’s not a hashmap, because we need lookups with relative ordering of the timestamp keys. Worse still, maintaining some other type of index will also slow down all of the other non-problematic order insertions.

Incremental Max-Buy and Min-Sell Prices

A useful optimization is to maintain the current maximum buy and minimum sell as two incrementally-updated price values. These have the same numerical type as the price, whether that’s a double or an integer. And there are two separate values:

  • Max-buy price (maximum bid)
  • Min-sell price (minimum ask)

The first thought is to get excited and think that maybe we don’t need a data structure at all to track the price levels. Maybe we can just incrementally maintain these two values, and done.

Does it work? Let’s try it out. The general idea for the incremental update algorithm is:

  • New buy orders — if price is more than the max-buy, update the incremental max-buy.
  • New sell orders — if price is less than the min-sell, update the incremental min-sell.
  • Cancel buy orders — if buy price equals the max-buy, and there are no other buy orders at that price level, find the next-highest buy price.
  • Cancel sell orders — if sell price equals the min-sell, and there are no other sell orders at that price level, find the next-lowest sell price.
  • Modify orders — these are not allowed to change the price of orders, so they don’t affect it.
  • Trades — treated like order cancels for updating the max-buy or min-sell values.
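
The fast paths for new orders then reduce to a compare and an assignment. A sketch, where the sentinel initial values are an assumption:

    #include <cstdint>
    #include <limits>

    struct BestPrices {
        // Sentinels meaning "no orders yet on this side."
        int64_t max_buy  = std::numeric_limits<int64_t>::min();
        int64_t min_sell = std::numeric_limits<int64_t>::max();

        void on_new_buy(int64_t p)  { if (p > max_buy)  max_buy  = p; }
        void on_new_sell(int64_t p) { if (p < min_sell) min_sell = p; }
        // Cancels and trades are the hard cases, as discussed below.
    };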

This would be a beautifully efficient algorithm ... if only it worked. It’s super-fast for new buy or sell orders, and modify orders don’t affect the incremental values. The problem is the deletions.

Deletions of orders, whether via a cancel message or a trade happening, need to find the new max-buy or min-sell. Canceled buy orders below the max-buy price, or sell orders above the min-sell price, don’t affect the incremental values, and are efficient. Also, even if the deletion is an order at the max-buy or min-sell, if there is even one other order at that price level, then we also don’t need to do anything.

But if a deletion removes the last order at the max-buy or min-sell price, then we need to scan all the other price levels. Note that when doing any type of deletion or cancel, there shouldn’t be a buy order at a higher price, or a sell order at a lower price, if we’ve been correctly tracking the incremental values. So, we’re looking for the second-best buy or sell price.

Alas, the hash table of order IDs is no help here. We need a data structure that tracks the price levels of all orders, and one that can efficiently find the maximum or minimum (a sketch follows this list), such as:

  • Red-black tree (e.g., std::map)
  • Heap (e.g., std::priority_queue)
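
One sketch of the heap approach for the bid side uses “lazy deletion”: a cancel merely decrements a per-level count, and stale heap entries are skipped the next time the best price is queried. The structure and names are illustrative:

    #include <cstdint>
    #include <queue>
    #include <unordered_map>

    struct BidPriceTracker {
        std::priority_queue<int64_t> prices;         // max-heap of bid prices
        std::unordered_map<int64_t, uint32_t> live;  // live orders per price

        void add(int64_t price) {
            if (live[price]++ == 0)
                prices.push(price);  // first live order at this level
        }
        void remove(int64_t price) {
            --live[price];           // lazy: heap entry left in place
        }
        int64_t best_bid() {         // the incremental max-buy value
            while (!prices.empty() && live[prices.top()] == 0)
                prices.pop();        // discard dead levels on demand
            return prices.empty() ? 0 : prices.top();  // 0 means empty side
        }
    };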

On the upside, you can see that the use of the price-level data structure (heap or tree) to find the next-best buy or sell prices is a relatively rare event. Hence, this incremental optimization can be very helpful in practical terms. For example, when computing mid-quotes or the bid-ask spread, we can just access these two scalar variables, rather than querying the price level data structure.

Finding whether there are any other orders at that price level actually requires a price-level data structure anyway. We need to map the new order’s price level to the linked list of other orders at that price level. There’s also another optimization here: to avoid needing to look up the price level, we could store a pointer to the price level object for the current max-buy or min-sell prices (i.e., we then have four incrementally maintained variables: two prices and two pointers to price level data structure objects). Also, we’ve probably already found the price level data structure object for other reasons (e.g., to add a new order to the linked list of orders for that price level).
