AI IS A BOOM, MAKE FUTURE FROM OUT OF IT.
Many companies say the cloud is their go-to when it comes to training and running large AI applications—but today, only a small portion of existing cloud infrastructure is actually set up to support that. The rest is not.
Now cloud providers, including Amazon Web
Service get skilled Microsoft Azure and Google Cloud are under pressure to
change that calculus to meet the computing demands of a major AI boom—and
as other hardware providers see a potential opening.
“There’s
a pretty big imbalance between demand and supply at the moment,” said Chetan
Kapoor, director of product management at Amazon Web Services’ Elastic Compute
Cloud division.
# CPMI LETS GET SKILLED
Most generative AI models today
are trained and run in the cloud. These models, designed to generate original text and analysis, can be
anywhere from 10 times to a 100 times bigger than older AI models, said Ziad
Asghar, senior vice president of product management at Qualcomm Technologies, adding that the number of use cases as
well as the number of users are also exploding.
“There is insatiable demand,” for
running large language models right now, including in industry sectors like
manufacturing and finance, said Nidhi Chappell, general manager of Azure AI
Infrastructure.
It is putting more pressure than
ever on a limited amount of computing capacity that relies on an even more limited number of specialized chips, such as graphic chips, or
GPUs, from NVidia. Companies like Johnson & Johnson, Visa, Chevron and others all said they anticipate using cloud
providers for generative AI-related use cases.
But
much of the infrastructure wasn’t built for running such large and complex
systems. Cloud sold itself as a convenient replacement for on-premise servers
that could easily scale up and down capacity with a pay-as-you-go pricing
model. Much of today’s cloud footprint consists of servers designed to run
multiple workloads at the same time that leverage general-purpose CPU
chips.
A minority of it, according to
analysts, runs on chips optimized for AI, such as GPUs and servers designed to
function in collaborative clusters to support bigger workloads, including large
AI models. GPUs are better for AI since they can handle many computations at
once, whereas CPUs handle fewer computations simultaneously.
At AWS, one cluster can contain up to 20,000 GPUs. AI-optimized
infrastructure is a small percentage of the company’s overall cloud footprint,
said Kapoor, but it is growing at a much faster rate. He said the company plans
to deploy multiple AI-optimized server clusters over the next 12 months.
Microsoft Azure and Google Cloud
Platform said they are similarly working to make AI infrastructure a greater
part of their overall fleets. However, Microsoft’s Chappell said that that
doesn’t mean the company is necessarily moving away from the shared
server—general purpose computing—which is still valuable for companies.
Other hardware providers have an
opportunity to make a play here, said Lee Sustar, principal analyst at tech
research and advisory firm Forrester, covering public cloud computing for the
enterprise.
Dell Technologies expects that high cloud costs, linked to heavy use—including
training models—could push some companies to consider on-premises
deployments. The computer maker has a server designed for that use.
#CPMI LETS GET SKILLED
“The existing economic models of
primarily the public cloud environment weren’t really optimized for the kind of
demand and activity level that we’re going to see as people move into these AI
systems,” Dell’s Global Chief Technology Officer John Roese said. On
premises, companies could save on costs like networking and data storage, Roese
said.
Cloud providers said they have several offerings
available at different costs and that in the long term, on-premises deployments
could end up costing more because enterprises would have to make huge
investments when they want to upgrade hardware.
Qualcomm said that in some cases it might be cheaper and
faster for companies to run models on individual devices, taking some pressure
off the cloud. The company is currently working to equip devices with the
ability to run larger and larger models.
Hewlett
Packard Enterprise headquarters in Spring, TX. HPE is rolling out its own
public cloud service, powered by a supercomputer, that will be available to
enterprises looking to train generative AI models in the second half of this
year.
And Hewlett PackardEnterprise is rolling out its own public cloud service, powered
by a supercomputer, that will be available to enterprises looking to train
generative AI models in the second half of 2023. Like some of the newer cloud
infrastructure, it has the advantage of being purposely built for large-scale
AI use cases, said Justin Hotard, executive vice president and general manager
of High Performance Computing, AI & Labs.
Hardware providers agree that it is still early days and
that the solution could ultimately be hybrid, with some computing happening on
the cloud and some on individual devices, for example.
In the long term, Sustar said, the raison d’être of cloud
is fundamentally changing from a replacement for companies’
difficult-to-maintain on-premise hardware to something qualitatively new:
Computing power available at a scale heretofore unavailable to
enterprises.
“It’s really a phase change in terms of how we look at
infrastructure, how we architected the structure, how we deliver the
infrastructure,” said Amin Vahdat, vice president and general manager of
machine learning, systems and Cloud AI at Google Cloud.
so , AI need boom with skilling
@MAHAK KESHYAP


Comments
Post a Comment