Why Low Latency

Nicolai Grodzitski edited this page Sep 13, 2018 · 4 revisions

Within high-frequency trading (HFT) it is almost impossible to avoid the subject of low latency. What does low latency mean in this context? Latency is the time taken to get a response to a given action. To achieve low latency a response must be fast, where fast in HFT is often measured in microseconds or nanoseconds. HFT systems must respond to market events in a timely manner; otherwise, at best a trading opportunity is missed, and at worst the trading algorithm is exposed to significant risk as it lags the market.

What the name low latency does not convey is the importance of consistent latency. It is often more important to be consistent than to be fast most of the time but occasionally slow. Trading systems and exchanges therefore strive not just to offer low latency but also to minimize its variance. Achieving low latency with minimal variance can be a real challenge when response times are measured in microseconds, especially at event rates that can reach many millions per second for the largest feeds.
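To make the difference between "fast on average" and "consistent" concrete, here is a minimal sketch that compares two made-up sets of latency samples. The class name `LatencyStats`, the helper methods, and the sample values are all hypothetical and chosen purely for illustration; they are not taken from SBE or from any real feed. Both sets have the same mean, yet their tail latencies differ by more than an order of magnitude.

```java
import java.util.Arrays;

// Illustrative only: class, methods, and sample data are assumptions.
final class LatencyStats {

    /** Arithmetic mean of the samples (nanoseconds). */
    static double mean(long[] samples) {
        long sum = 0;
        for (long s : samples) {
            sum += s;
        }
        return (double) sum / samples.length;
    }

    /** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
    static long percentile(long[] samples, double p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Hypothetical consistent system: every response takes 10 microseconds. */
    static long[] steady() {
        long[] samples = new long[100];
        Arrays.fill(samples, 10_000L);
        return samples;
    }

    /** Hypothetical spiky system: 98 responses at 5 microseconds, 2 at 255 microseconds. */
    static long[] spiky() {
        long[] samples = new long[100];
        Arrays.fill(samples, 5_000L);
        samples[17] = 255_000L;
        samples[64] = 255_000L;
        return samples;
    }

    public static void main(String[] args) {
        for (long[] samples : new long[][] {steady(), spiky()}) {
            System.out.printf("mean=%.0fns p50=%dns p99=%dns%n",
                mean(samples), percentile(samples, 50), percentile(samples, 99));
        }
    }
}
```

The spiky set looks better at the median, but any strategy that happens to hit one of its slow responses lags the market by a wide margin. This is why latency is usually reported as percentiles and maxima rather than as an average alone.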

How does one design a system to be so responsive?

For a system to be very responsive when faced with high event rates, it must be incredibly efficient. There is no room for waste in such a system. Each instruction executed must pay its own way and be purely focused on the goal of processing the incoming events and responding accordingly. Designing such systems calls for an approach similar to that used for aircraft or spacecraft. Spacecraft are designed to be as minimalist as possible while retaining the appropriate level of safety features. This requires a keen understanding of exactly what is required and a razor-sharp focus on efficiency. No extra baggage should be carried along the way.

How does one design a system to minimize the variance?

Minimizing variance requires a full-stack, or platform, understanding so that the system always amortizes the cost of expensive operations and all algorithms have a complexity of O(log n) or better.

Variance can be introduced at all layers of the stack. At a very low level it could be System Management Mode (SMM) interrupts checking the status of hardware, or system interrupts pre-empting execution. Further up the stack it could be the operating system or virtual machines managing memory that is being churned by the applications. Variance often comes from algorithms within the trading systems themselves, which traverse data structures in ways that cause cache misses and O(n) processing or searching.
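As a rough illustration of why O(n) searching is a source of variance, the sketch below counts how many elements are inspected when looking up a value in a sorted array by linear scan versus binary search. The class name `SearchCost`, the `probes` counter, and the idea of a sorted array of price levels are assumptions made for this example only; they do not describe how any particular trading system or the SBE codec is implemented.

```java
// Illustrative only: names and data layout are assumptions.
final class SearchCost {

    /** Number of elements inspected by the most recent search. Demo only, not thread-safe. */
    static int probes;

    /** O(n): cost depends on where the key sits, so response time varies widely. */
    static int linearSearch(long[] sorted, long key) {
        probes = 0;
        for (int i = 0; i < sorted.length; i++) {
            probes++;
            if (sorted[i] == key) {
                return i;
            }
        }
        return -1;
    }

    /** O(log n): cost is bounded by floor(log2(n)) + 1 probes wherever the key sits. */
    static int binarySearch(long[] sorted, long key) {
        probes = 0;
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1; // unsigned shift avoids int overflow
            probes++;
            if (sorted[mid] < key) {
                lo = mid + 1;
            } else if (sorted[mid] > key) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    /** Hypothetical sorted price levels, one per tick of 25 units. */
    static long[] priceLevels(int n) {
        long[] levels = new long[n];
        for (int i = 0; i < n; i++) {
            levels[i] = i * 25L;
        }
        return levels;
    }
}
```

With about a million levels, a linear scan inspects anywhere from one element to about a million depending on the key, whereas the binary search never inspects more than 21. The tighter bound on work per event is what keeps the worst case close to the typical case, although in practice memory layout and cache behaviour also matter.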

The design approach taken to the SBE codec has the goals of being as efficient as possible and keeping variance to a minimum, while taking a risk-appropriate approach to safety.