Rising cloud analytics costs are causing widespread AI/ML project failures in data-driven enterprises, prompting them to limit analytics and seek cost-effective solutions like GPU acceleration. (Image: FreePik)
The rise of cloud computing and generative AI (genAI) has empowered data-driven enterprises with robust analytics and business insights. Cloud services provide essential infrastructure and tools that facilitate the development and deployment of genAI technologies. Additionally, the availability of pre-trained models and software packages over the cloud has accelerated the integration of genAI into data analytics processes. However, this progress has also led to a surge in data volumes and unsustainable cloud infrastructure costs.
The 2024 State of Big Data Analytics report by SQream, a GPU-based big data platform, highlights the financial strain cloud analytics costs impose on data-driven enterprises. The study surveyed 300 senior data management professionals from US companies and found that 71 per cent frequently encounter unexpectedly high cloud analytics charges. Specifically, 5 per cent of companies experience cloud “bill shock” monthly, 25 per cent every two months, and 41 per cent quarterly.
Moreover, despite substantial budgets, a staggering 98 per cent of companies faced machine learning (ML) project failures in 2023 due to soaring cloud costs.
Bill shocks occur when data workflows are either too complex or too large for the existing cloud query engine. Because cloud platforms offer scalable compute capacity for handling large datasets and complex algorithms, enterprise AI and data analytics tech stacks now depend heavily on them. As compute power requirements grow, the associated cloud costs rise.
“As data and analytics advance, companies are forced to limit dataset size and reduce complexity to manage expenses, impacting the quality of their insights,” Deborah Leff, chief revenue officer at SQream, told Observer. “Many AI/ML projects are not initiated due to the high cost of experimentation over the cloud. Poor data preparation and inadequate data cleansing methods are other major contributors to project failures.”
The cost of running data and AI technologies over the cloud has been a significant deterrent. Cloud cost inflation is set to persist in 2024, prolonging the cost-cutting measures that enterprises intensified last year. US government economic data and vendor research point to a pattern of rising cloud costs. The Bureau of Labor Statistics’ Producer Price Index (PPI) for 2024 has shown a month-over-month increase in data processing and related services, a category that includes cloud computing. The current year-over-year uptick stands at 3.7 per cent.
Data queries and the volume of projects are also being compromised due to these costs. Nearly half of the enterprises (48 per cent) in the SQream study admitted to reducing the complexity of queries to manage analytics costs, particularly concerning cloud resources and compute loads. Meanwhile, 46 per cent are limiting AI-powered projects due to cost.
But the cost crunch extends beyond vendor pricing. Leff explained that businesses often do not thoroughly analyze which in-house IT assets would benefit from cloud migration.
“Cost is a major factor in project failures because expenses often escalate during experimentation. It’s not that machine learning architecture fails, rather management chooses to halt investment when costs spiral. Time to value is crucial, and experimenting often leads to high costs due to the size and complexity of modern data,” she added.
Regarding data preparation, a third of the companies surveyed (33 per cent) said they are using 5-10 solutions/platforms, which makes the task extremely complicated. When several users work with different tools in parallel, finding bottlenecks and analyzing processes becomes more difficult.
“The data center ecosystem, built on 40-year-old technology, needs modernization. Sticking with outdated methods is not the solution. Instead, companies should explore innovative approaches to avoid letting costs and data limitations restrict their analytics capabilities,” Leff said. “Tools like NVIDIA RAPIDS are valuable but require developer skills, highlighting the need for more accessible solutions. Companies must challenge the status quo and seek better options to overcome current constraints.”
As companies navigate market disruptions caused by generative AI and the rise of large language models (LLMs), the explosion in data volume and complexity makes ML technologies essential for market competitiveness. Limiting data queries for AI systems to manage costs yields superficial insights, which in turn lead to premature project termination. Ninety-two per cent of companies in the study said they are actively working to “rightsize” their cloud spending on analytics to better align with their budgets.
Leff explained that GPU acceleration, despite perceptions of high expense, can reduce costs significantly while speeding up processing. The solution provides the benefits of the cloud with right-sized parallel processing resources and a flexible pay-as-you-go pricing option for agility and simplified cloud management. Enterprises can rent the GPU resources they need and later scale automatically on demand.
“NCBA, a large online bank with up to 60 million daily users, initially took 37 hours to update their marketing models with daily click data. Despite optimizing their queries and exploring expensive hardware solutions, this delay left them unable to use data strategically. When they turned to GPU acceleration, it helped reduce their data pipeline cycle time to just seven hours, enabling them to update models rapidly each day,” she added.
Leff emphasized that companies must think proactively and push the boundaries of what’s possible. The rapid evolution of generative AI highlights that current data strategies may not be sufficient. She predicted that the next two years would bring dramatic changes within the IT sector.
“We must envision and prepare for a future where data grows and queries become more complex, but outdated limitations are removed. Embracing new methods such as GPU acceleration can unlock significant value, and those who act quickly will reap the rewards,” she said.