
The AI industry is undergoing a significant pivot in spending, shifting from the capital-intensive training of large language models (LLMs) to the deployment and use of those models through AI inference.
Historically, about 80% of AI spending went to training and 20% to inference; Lenovo CEO Yuanqing Yang forecasts a reversal, projecting 80% for inference and 20% for training. Deloitte's estimates corroborate this trend: inference workloads accounted for roughly 50% of all AI compute in 2025, a share expected to rise to two-thirds in 2026.
The Futurum Group likewise predicts that inference revenue will surpass training revenue by 2026. The shift is driven by enterprises moving beyond AI experimentation to widespread deployment, which is increasing demand for dedicated inference servers. Lenovo, a key player, launched three new inference servers at CES 2026, targeting applications ranging from manufacturing to retail.
Other major vendors, including AMD, Dell, and HPE, have also introduced or updated their inference server offerings. Key factors pushing enterprises toward on-premise inference include cost efficiency relative to the public cloud for predictable workloads, the need for data locality and real-time processing at the edge, and privacy, security, and data sovereignty concerns.
Together, these signals point to a robust and growing market for AI inference hardware and solutions.
AI Inference Spending Surges; Lenovo Leads Server Shift