
AI Inference · Data Centers · Edge Computing · Enterprise AI
The AI industry is undergoing a significant pivot in spending, shifting from the capital-intensive training of large language models (LLMs) to the deployment and use of those models through AI inference.
Historically, 80% of AI spending went to training and 20% to inference; Lenovo CEO Yuanqing Yang, however, forecasts a reversal, projecting 80% for inference and 20% for training in the future. The trend is corroborated by Deloitte, which estimated that inference workloads accounted for 50% of all AI compute in 2025 and expects that share to rise to two-thirds in 2026.
The Futurum Group also predicts inference revenue will surpass training revenue by 2026. This shift is driven by enterprises moving beyond AI experimentation to widespread deployment, increasing demand for dedicated inference servers. Lenovo, a key player, launched three new inference servers at CES 2026, targeting diverse applications from manufacturing to retail.
Other major players like AMD, Dell, and HPE have also introduced or updated their inference server offerings. Key drivers for enterprises adopting on-premise inference solutions include cost efficiency compared to public cloud for predictable workloads, the necessity for data locality and real-time processing at the edge, and critical privacy, security, and data sovereignty concerns.
This indicates a robust and growing market for AI inference hardware and solutions.
AI Inference Spending Surges; Lenovo Leads Server Shift