The GenAI Problem In Finance

Ashok Reddy is CEO of KX, provider of a leading high-performance analytical database for the AI era.

The hype around generative AI (GenAI) is undeniable. Tools like ChatGPT have captivated the public imagination, demonstrating an impressive ability to generate human-like text, create content and power chatbots. But in capital markets, where precision, speed and explainability are paramount, we’re seeing a different reality.

While GenAI has its uses, its current form falls short of meeting the rigorous demands of financial applications. That’s why I believe the future of AI in finance will not be driven by the biggest models but by the smartest—those built with a deep understanding of the industry’s specific challenges and needs.

The reason for this is fundamental: ChatGPT, and many models like it, can’t actually do math. They rely on sophisticated pattern recognition and statistical memory, not true mathematical computation. This reliance on memorization rather than calculation makes them unsuitable for many critical financial applications.

Additionally, these models struggle with understanding time-series data—a key component of market forecasting and risk assessment. Without a strong grasp of temporal relationships, they lack the ability to track, interpret and react to market shifts in real time.

New research from the University of Chicago’s Booth School of Business reinforces this limitation. Bradford Levy’s paper, “Caution Ahead: Numerical Reasoning and Look-ahead Bias in AI Models,” published on January 25, 2025, delivers a sobering assessment of the challenges facing large language models (LLMs) in financial contexts.

Why LLMs Struggle With Basic Math And Time-Series Data

Levy’s research provides compelling evidence that LLMs are not the financial whiz-kids some might believe. The paper highlights that much of the perceived accuracy of AI models stems from artifacts of the modeling process rather than mechanisms grounded in economics. His analysis identifies two major concerns: poor numerical reasoning and look-ahead bias, alongside fundamental weaknesses in handling time-series data.

Levy’s tests expose how LLMs rely on memorization rather than genuine numerical reasoning. For example, while they can correctly add two numbers between 0 and 100, their accuracy plummets when adding numbers between 0 and 10,000. To probe this weakness further, Levy conducted a novel test in which he manipulated real company accounting data by subtly changing the least significant digit (e.g., $7.334 billion to $7.335 billion).

The result? GPT-4’s accuracy in predicting earnings changes dropped from 60% to no better than random chance, demonstrating that these models aren’t analyzing financial data meaningfully but simply matching memorized patterns. This isn’t just a minor glitch; it’s a fatal flaw.
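
To make the mechanics of that test concrete, here is a minimal sketch of the perturbation idea, not Levy’s actual code: nudge the least significant digit of a reported figure and check whether the model’s answer flips. The `query_model` helper and the prompt wording are hypothetical placeholders for whatever model and phrasing a firm would actually test.

```python
from decimal import Decimal

def perturb_least_significant(value: str) -> str:
    """Bump the last digit of a figure such as '7.334' up by one."""
    d = Decimal(value)
    step = Decimal(1).scaleb(d.as_tuple().exponent)  # 0.001 for '7.334'
    return str(d + step)

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call; swap in a real client."""
    return "earnings will rise"  # placeholder response

original = "7.334"                               # $7.334 billion
perturbed = perturb_least_significant(original)  # -> '7.335'

template = "Company X reported revenue of ${} billion. Will earnings rise or fall?"
answer_original = query_model(template.format(original))
answer_perturbed = query_model(template.format(perturbed))

# If the model were actually reasoning over the figures, an economically
# meaningless one-digit change should not flip its answer.
print(answer_original == answer_perturbed)
```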

The paper also argues that commercial LLMs often exhibit significant look-ahead bias, meaning their seemingly strong performance may be due to implicit knowledge of future outcomes rather than true predictive ability. Compounding this issue, these models struggle with time-series forecasting.

Since LLMs fail to retain and understand the sequential nature of data, their ability to generate reliable financial predictions is severely constrained. Financial strategies often depend on precise timing, but GenAI models fundamentally lack the temporal awareness needed to interpret long-term dependencies.

The GenAI Mirage: Where Hype Meets Reality On Wall Street

These limitations, combined with the foundational design of many popular GenAI models, present significant challenges in the financial markets.

• The Illusion Of Calculation: As Levy’s research confirms, models like ChatGPT are not performing mathematical calculations but rather predicting the next word or number in a sequence based on probabilities derived from their training data. In finance, this inability to perform accurate calculations is a critical weakness.

• The Explainability Imperative: Explainability isn’t just a regulatory requirement on Wall Street; it’s essential for building trust and making sound investment decisions. The opaque, “black box” nature of many GenAI models makes them a liability in this respect. Without transparency in their decision-making processes, firms risk regulatory penalties and operational disruptions.

• The Cost-Benefit Disconnect: The computational costs of training and deploying large GenAI models are substantial, and the return on investment remains questionable for many financial applications compared to proven traditional AI techniques.

A Hybrid Approach: The Path Forward

Levy’s research underscores the limitations of GenAI in financial applications, but it also suggests a path forward. LLMs can be instructed to write and execute code that performs math, acting as intelligent agents that delegate tasks to more specialized tools. Additionally, analytical AI models, which are capable of accurate numerical computation and time-series analysis, can bridge the gap where GenAI models fall short.
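
As a toy illustration of that delegation pattern, the sketch below asks the model only for a machine-checkable arithmetic expression and hands the computation to a deterministic evaluator; `ask_llm_for_expression` is a hypothetical placeholder rather than any specific vendor’s API.

```python
import ast
import operator

# Arithmetic operators the evaluator is allowed to execute.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    """Evaluate a plain arithmetic expression deterministically, without eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

def ask_llm_for_expression(question: str) -> str:
    """Hypothetical LLM call that returns an expression to evaluate, not a number."""
    return "(7334 + 1) / 1000"  # placeholder; a real model would generate this

question = "Add $1 million to revenue of $7.334 billion; state the result in billions."
expression = ask_llm_for_expression(question)
print(safe_eval(expression))  # 7.335 -- the arithmetic happens outside the model
```

The important property is that the number the user ultimately sees is produced by the evaluator, not by the model’s token predictions.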

The future of AI in finance isn’t about abandoning GenAI altogether. It’s about a pragmatic, hybrid approach. Traditional AI techniques, such as machine learning and discriminative AI, remain the backbone of many financial applications, excelling in structured data analysis and real-time processing. However, there are areas where elements of GenAI can be strategically applied, provided they are approached with a quant-informed mindset. These include:

• Research Augmentation: GenAI could potentially assist in summarizing trusted financial news, research reports or earnings call transcripts.

• Code Generation For Finance: AI can be a powerful tool for generating and debugging code, including code for financial models.

• Streamlined Documentation: Certain aspects of regulatory reporting or compliance documentation might be automated with carefully tailored GenAI tools, provided they offer transparency and auditability.

• Combining Analytical AI With GenAI: A hybrid AI approach leverages analytical AI for rigorous quantitative and temporal analysis, while GenAI enhances pattern recognition and contextual processing. This combination can improve financial modeling, enhance alpha generation and reduce risk.
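
As a minimal sketch of that division of labor, assuming illustrative prices and a hypothetical `summarize_with_llm` helper, the analytical layer below owns the time-series math while the generative model is confined to narrating statistics it did not compute.

```python
import pandas as pd

def summarize_with_llm(prompt: str) -> str:
    """Hypothetical LLM call; it only narrates statistics computed elsewhere."""
    return f"Draft commentary based on: {prompt}"  # placeholder

# Analytical layer: deterministic, auditable time-series math.
prices = pd.Series([101.2, 102.8, 101.9, 103.5, 104.1, 103.2, 105.0],
                   index=pd.date_range("2025-01-06", periods=7, freq="B"))
returns = prices.pct_change().dropna()
realized_vol = returns.std() * (252 ** 0.5)      # annualized volatility
period_return = prices.iloc[-1] / prices.iloc[0] - 1

# GenAI layer: contextual narration of numbers it did not compute.
prompt = (f"Weekly return {period_return:.2%}, annualized volatility {realized_vol:.2%}. "
          "Summarize for a risk committee in two sentences.")
print(summarize_with_llm(prompt))
```

Keeping the numerical work outside the language model also preserves the audit trail regulators expect: every figure in the narrative can be traced back to a deterministic calculation.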

Redefining AI’s Role In Finance

The future of finance hinges on harnessing the power of data through advanced analytics and AI. Through my experience working with the world’s leading hedge funds and quants, I’ve seen the limitations of black-box models and the enduring value of rigorous, explainable and mathematically sound approaches.

The future belongs not to the biggest AI models but to the smartest ones—those built with a deep understanding of the specific needs and challenges of Wall Street. By embracing the rigor, precision and efficiency of quantitative finance, as well as strategically integrating elements of GenAI where appropriate, capital markets can unlock the true potential of AI.



