Alexander Krizhanovsky is CEO of Tempesta Technologies and the architect of Tempesta FW.
The cornerstone of a secure web architecture is a web application firewall (WAF). A WAF is essentially a web proxy that sits in front of your web application, detecting and blocking web attacks and application-layer DDoS attacks. Although some vendors focus exclusively on web API security, modern next-generation WAFs typically include API protection as well. For simplicity, “WAFs” will refer to solutions that also handle API security.
Performance Issues In Web Security
Although WAFs are designed to protect web applications, they often suffer from low performance, which leads not only to scalability limitations but also to security issues. A high-quality, deep-inspection WAF might process only 1,000 to 5,000 requests per second (RPS), depending on the vendor and workload. Under heavy load, latency at the 99.9th percentile and above can exceed one second. Given that an average web page triggers dozens of HTTP requests, WAF-induced latency may push total page load time to 10 seconds or more. In contrast, open-source web proxies like NGINX or HAProxy can easily handle 100,000 RPS while keeping 99.9th-percentile latency to a few milliseconds.
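As a rough back-of-the-envelope check of the page-load figure, the sketch below assumes a page with 60 requests fetched over six parallel connections, and pessimistically assumes that every request hits the tail latency. The request count, the connection count and the function name are illustrative assumptions, not measurements.

```python
import math


def page_load_seconds(requests_per_page: int,
                      parallel_connections: int,
                      latency_s: float) -> float:
    """Worst-case load time when requests are fetched in sequential
    waves of `parallel_connections` and each one takes `latency_s`."""
    waves = math.ceil(requests_per_page / parallel_connections)
    return waves * latency_s


# Overloaded WAF: one second per request at the tail.
print(page_load_seconds(60, 6, 1.0))  # 10.0
# Plain proxy: about five milliseconds per request at the tail.
print(page_load_seconds(60, 6, 0.005))
```

Real browsers multiplex and prioritize requests, so this is an upper-bound illustration rather than a model of actual page loads.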
The reason WAFs are so much slower is the intensive logic applied to every request and response: normalization of input data, hundreds of complex regular expressions, validation routines for various data types, logic tailored to specific frontend and backend frameworks, and dozens of other checks. Scaling a WAF to 100,000 RPS and beyond can be cost-prohibitive, especially for high-traffic infrastructures. Yet even smaller web services (100 to 1,000 RPS) can experience traffic surges from DDoS attacks or AI-feeding bots.
Web Bots And DDoS Attacks
Advanced proxy services now help web bots (e.g., scrapers) avoid detection. Although bots often aren’t intended to cause DDoS, careless fetching logic can have that effect. LWN.net, for instance, recently experienced such a case. The AI industry’s demand for data has also fueled the growth of tools that help bots evade detection.
WAFs’ DDoS Weakness
Any slow logic is a potential DDoS vector—WAFs included. A well-known example is a regular expression denial of service (ReDoS), where inefficient regular expressions, or regexes, are exploited to spike CPU usage.
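To make the ReDoS point concrete, here is a minimal sketch using Python's standard `re` module, which is a backtracking engine. The two patterns are textbook examples chosen for illustration and are not taken from any particular WAF rule set. Both accept exactly the same strings, but the nested quantifier in the first one forces the engine to try exponentially many ways of splitting the input when the match fails.

```python
import re
import time

# Nested quantifiers: the classic catastrophic-backtracking shape.
VULNERABLE = re.compile(r"^(a+)+$")
# Accepts the same strings without nested repetition.
LINEAR = re.compile(r"^a+$")


def timed_match(pattern, text):
    """Return (matched, elapsed_seconds) for one match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


# A trailing "!" makes the match fail, which triggers the backtracking.
# The input sizes are kept small on purpose: each extra character
# roughly doubles the work for the vulnerable pattern.
for n in (12, 16, 20):
    payload = "a" * n + "!"
    _, slow = timed_match(VULNERABLE, payload)
    _, fast = timed_match(LINEAR, payload)
    print(f"n={n}: nested={slow:.4f}s linear={fast:.6f}s")
```

An attacker needs only a few dozen bytes of such input per request to pin a CPU core, which is why every regex in a WAF rule set is part of its attack surface.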
Rate limits are a basic but effective mitigation tool. However, they’re hard to configure safely so they don’t affect normal users, are vulnerable to misjudged thresholds and can be inflexible with unexpected marketing surges or application degradations.
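A token bucket is one common way to implement such a limit. The sketch below is illustrative only: the class name, the parameters and the injectable clock are assumptions made for the example. It shows the two thresholds an operator has to guess correctly, the sustained rate and the burst size.

```python
import time


class TokenBucket:
    """Allow up to `burst` requests at once, refilled at `rate` per second."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate      # sustained requests per second
        self.burst = burst    # bucket capacity (largest tolerated burst)
        self.clock = clock    # injectable time source, handy for tests
        self.tokens = burst   # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill for the elapsed time, never above capacity.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Usage: 2 requests per second sustained, bursts of up to 3.
bucket = TokenBucket(rate=2.0, burst=3.0)
print([bucket.allow() for _ in range(5)])
```

Set `burst` too low and a legitimate marketing surge gets throttled; set it too high and an attack passes. A per-IP bucket also offers little protection when each address sends only one request.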
Thus, more sophisticated traffic analysis, often powered by machine learning (ML), is required. But ML-based systems need time (often several minutes) to observe and classify traffic before they can react. This means DDoS attacks can succeed temporarily before protection kicks in.
In LWN’s case, bots used a proxy network distributing requests across millions of IP addresses, with each sending a single browser-mimicking request. Such attacks are hard to detect and often slip through defenses.
Architectural Limits Of Traditional WAFs
Most WAFs are built on general-purpose web servers like NGINX or Envoy, which are optimized for daily workloads, not for aggressively filtering malicious traffic with minimal resource consumption. Common inefficiencies include redundant data copies and inspections of the same data. Also, the event handling model of a web server isn’t well suited to heavy request processing. Cloudflare, for example, has blogged extensively about optimizing and reworking NGINX internals to improve WAF performance.
Improving Web Security Performance
One solution is to introduce a lightweight application-level load balancer in front of the WAF. This component classifies requests and forwards to the WAF just the ones needing inspection, such as dynamic content requests, and offloads the WAF by caching responses.
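The classification step can be sketched as follows. This is a deliberately conservative illustration, not a description of any specific product: the extension list, the `route` function and its two return values are assumptions made for the example.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical allow-list of static asset types.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".svg", ".woff2", ".ico"}
SAFE_METHODS = {"GET", "HEAD"}


def route(method: str, url: str) -> str:
    """Return "cache" if the balancer may answer the request itself,
    or "waf" if the request needs deep inspection."""
    parts = urlsplit(url)
    path = parts.path
    # Anything unusual goes to the WAF: encoded bytes, dot segments,
    # query strings, and methods that can change state.
    if "%" in path or ".." in path or parts.query:
        return "waf"
    if method.upper() not in SAFE_METHODS:
        return "waf"
    ext = posixpath.splitext(path)[1].lower()
    return "cache" if ext in STATIC_EXTENSIONS else "waf"


print(route("GET", "/static/app.css"))   # cache
print(route("POST", "/api/orders"))      # waf
print(route("GET", "/search?q=shoes"))   # waf
```

The default is to inspect: a request is kept away from the WAF only when it clearly matches the static profile. An extension check alone is not sufficient in production, since a path can end in `.css` and still be served by dynamic code, so a real classifier would need stricter rules.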
Ideally, the load balancer should be a lightweight WAF itself so that it can block trivial attacks early. It should come from a different vendor than the main WAF to reduce the risk of shared weaknesses that attackers could exploit to bypass protection, and it should cache responses to reduce the load on both the WAF and the backend.
To resist cache-targeted attacks (e.g., web cache deception), this balancer must support secure caching logic. Keeping only a small, hot subset of resources in RAM helps ensure resilience to (semi-)random URL DDoS attacks, reducing disk I/O bottlenecks.
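A minimal sketch of both ideas, assuming an in-memory LRU cache and one widely used defense against web cache deception: store a response only if its status, Content-Type and Cache-Control agree with what the URL's extension promises. The class name, the type table and the method signatures are hypothetical.

```python
import posixpath
from collections import OrderedDict

# Hypothetical table: which Content-Types each extension may carry.
EXPECTED_TYPES = {
    ".css": ("text/css",),
    ".js": ("text/javascript", "application/javascript"),
    ".png": ("image/png",),
}


class HotCache:
    """Bounded in-memory LRU cache that refuses suspicious responses."""

    def __init__(self, max_entries: int):
        self.max_entries = max_entries
        self._items = OrderedDict()

    def get(self, path):
        body = self._items.get(path)
        if body is not None:
            self._items.move_to_end(path)  # mark as recently used
        return body

    def put(self, path, status, content_type, cache_control, body) -> bool:
        ext = posixpath.splitext(path)[1].lower()
        expected = EXPECTED_TYPES.get(ext)
        directives = {d.strip().lower() for d in cache_control.split(",")}
        if status != 200 or expected is None:
            return False
        # An HTML page served under a ".css" URL is a deception signal.
        if not content_type.lower().startswith(expected):
            return False
        if directives & {"private", "no-store"}:
            return False
        self._items[path] = body
        self._items.move_to_end(path)
        while len(self._items) > self.max_entries:
            self._items.popitem(last=False)  # evict least recently used
        return True


cache = HotCache(max_entries=2)
print(cache.put("/app.css", 200, "text/css", "public, max-age=60", b"body{}"))
print(cache.put("/account.php/x.css", 200, "text/html", "", b"<html>"))
```

The size bound keeps memory use constant under a flood of random URLs, and most such URLs return errors that are never stored. A flood of distinct cacheable URLs could still churn a plain LRU, so a production cache may add an admission policy on top.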
If your WAF uses ML for detection, it may benefit from visibility into all traffic, not just what passes through the cache. In such cases, extended access logs from the balancer should be fed into the WAF’s ML pipeline.
Lastly, the load balancer should include a programmable rules engine, allowing WAF rules to be offloaded when safe, improving efficiency by blocking bad traffic earlier.
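One simple shape for such an engine is an ordered list of cheap predicates where the first match wins and everything else falls through to the WAF. The sketch below is an assumption about how this could look: the rule names, the `Request` fields and the three sample rules are invented for illustration, and header names are assumed to be lowercased already.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class Request:
    method: str
    path: str
    headers: Mapping[str, str]  # assumed to use lowercase names


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[Request], bool]
    action: str  # for example "block"


# Cheap checks that are safe to run before the WAF sees the request.
RULES = (
    Rule("no-user-agent", lambda r: not r.headers.get("user-agent"), "block"),
    Rule("path-traversal", lambda r: "../" in r.path, "block"),
    Rule("oversized-path", lambda r: len(r.path) > 2048, "block"),
)


def evaluate(request, rules=RULES, default="forward_to_waf"):
    """Return (action, rule_name) for the first matching rule."""
    for rule in rules:
        if rule.matches(request):
            return rule.action, rule.name
    return default, None


print(evaluate(Request("GET", "/a/../../etc/passwd", {"user-agent": "curl"})))
print(evaluate(Request("GET", "/index.html", {"user-agent": "Mozilla/5.0"})))
```

Because rules are plain data, an operator can move a rule from the WAF to the balancer once it has proven to be free of false positives, without touching the evaluation loop.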
Conclusion
Web security and performance are tightly linked aspects of any web application: It’s impossible to achieve strong security on an underperforming infrastructure. The performance and scalability of WAFs are crucial for effective DDoS resistance. This becomes even more critical with the growth and diversity of AI-driven businesses employing advanced bots for scraping. These bots are hard to detect and may cause severe overloads.
WAF accelerators, such as the lightweight load balancer described above, not only mitigate the performance degradations introduced by WAFs but also enhance DDoS protection, enable cost-effective scalability and even improve the overall security of architectures that use WAFs.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.
