Oracle Expands NVIDIA GPU Instances on OCI for AI and Digital Twins


Terrill Dicki
Aug 01, 2024 02:03

Oracle Cloud Infrastructure (OCI) now offers NVIDIA L40S GPU bare-metal instances, enhancing AI and digital twin capabilities.


Oracle Cloud Infrastructure (OCI) has announced the availability of NVIDIA L40S GPU bare-metal instances, according to the NVIDIA Blog. This expansion aims to meet the growing demand for advanced technologies like generative AI, large language models (LLMs), and digital twins.

NVIDIA L40S Now Available to Order on OCI

The NVIDIA L40S GPU is designed to deliver multi-workload acceleration for various applications, including generative AI, graphics, and video. It features fourth-generation Tensor Cores and supports the FP8 data format, making it well suited for training and fine-tuning small- to mid-size LLMs and for performing inference across a wide range of use cases.
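To see why FP8 support matters for fitting small- to mid-size LLMs on a single card, a back-of-the-envelope calculation of weight memory is useful. The sketch below is ours, not from the announcement; it counts weights only (activations, KV cache, and optimizer state add more in practice) and assumes an 8B-parameter model like Llama 3 8B.

```python
# Rough weight footprint of an 8B-parameter LLM at different precisions.
# Weights only -- activations and KV cache add further memory in practice.
PARAMS = 8_000_000_000

def weight_gb(bytes_per_param: int) -> float:
    """Model weight size in GB for a given bytes-per-parameter precision."""
    return PARAMS * bytes_per_param / 1e9

fp16_gb = weight_gb(2)  # FP16/BF16: 2 bytes per parameter -> 16 GB
fp8_gb = weight_gb(1)   # FP8 (E4M3/E5M2): 1 byte per parameter -> 8 GB

print(f"FP16 weights: {fp16_gb:.0f} GB, FP8 weights: {fp8_gb:.0f} GB")
```

Halving the per-parameter footprint leaves considerably more headroom on a 48GB card for batching and long-context inference.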

For instance, a single L40S GPU can generate up to 1.4 times more tokens per second than a single NVIDIA A100 Tensor Core GPU for Llama 3 8B with NVIDIA TensorRT-LLM. The L40S also excels in graphics and media acceleration, making it suitable for advanced visualization and digital twin applications. It delivers up to 3.8 times the real-time ray-tracing performance of its predecessor and supports NVIDIA DLSS 3 for faster rendering and smoother frame rates.

OCI will offer the L40S GPU in its BM.GPU.L40S.4 bare-metal compute shape, featuring four NVIDIA L40S GPUs, each with 48GB of GDDR6 memory. The setup includes local NVMe drives with 7.38TB capacity, 4th Generation Intel Xeon CPUs with 112 cores, and 1TB of system memory. These bare-metal configurations eliminate virtualization overhead for high-throughput, latency-sensitive AI and machine learning workloads.
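The shape's published specifications can be summarized in a small structure; the figures below come from the announcement, while the dictionary layout and aggregate-memory calculation are our own illustration.

```python
# Summary of the BM.GPU.L40S.4 bare-metal shape as described in the
# announcement. The dict layout is illustrative, not an OCI API object.
BM_GPU_L40S_4 = {
    "gpus": 4,
    "gpu_memory_gb": 48,       # GDDR6 per GPU
    "local_nvme_tb": 7.38,     # local NVMe capacity
    "cpu": "4th Gen Intel Xeon",
    "cpu_cores": 112,
    "system_memory_gb": 1024,  # 1 TB of system memory
}

# Aggregate GPU memory across the four L40S cards: 4 x 48 GB = 192 GB.
total_gpu_memory_gb = BM_GPU_L40S_4["gpus"] * BM_GPU_L40S_4["gpu_memory_gb"]
print(f"Aggregate GPU memory: {total_gpu_memory_gb} GB")
```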

“We chose OCI AI infrastructure with bare-metal instances and NVIDIA L40S GPUs for 30% more efficient video encoding,” said Sharon Carmel, CEO of Beamr Cloud. “This will reduce storage and network bandwidth consumption by up to 50%, speeding up file transfers and increasing productivity for end users.”

Single-GPU H100 VMs Coming Soon on OCI

OCI will soon introduce the VM.GPU.H100.1 compute virtual machine shape, accelerated by a single NVIDIA H100 Tensor Core GPU. This new offering aims to provide cost-effective, on-demand access for enterprises looking to leverage the power of NVIDIA H100 GPUs for their generative AI and high-performance computing (HPC) workloads.

A single H100 GPU can generate more than 27,000 tokens per second for Llama 3 8B, offering up to four times the throughput of a single A100 GPU at FP16 precision. The VM.GPU.H100.1 shape includes 2×3.4TB of NVMe drive capacity, 13 cores of 4th Gen Intel Xeon processors, and 246GB of system memory, making it well suited for a range of AI tasks.
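The two throughput figures above imply a baseline for the A100: if 27,000 tokens per second is roughly four times the A100's rate, the A100 lands near 6,750 tokens per second on the same workload. A quick sanity check of that arithmetic:

```python
# Implied A100 throughput from the article's figures: the H100 delivers
# >27,000 tokens/s on Llama 3 8B, stated as up to 4x a single A100 at FP16.
h100_tokens_per_s = 27_000
speedup_vs_a100 = 4

implied_a100_tokens_per_s = h100_tokens_per_s / speedup_vs_a100
print(f"Implied A100 throughput: {implied_a100_tokens_per_s:,.0f} tokens/s")
```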

GH200 Bare-Metal Instances Available for Validation

OCI has also made the BM.GPU.GH200 compute shape available for customer testing. This shape features the NVIDIA Grace Hopper Superchip and NVLink-C2C, providing a high-bandwidth, cache-coherent 900GB/s connection between the NVIDIA Grace CPU and Hopper GPU. This setup enables up to 10 times higher performance for applications processing terabytes of data, compared with the NVIDIA A100 GPU.
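To put the 900GB/s NVLink-C2C figure in perspective, consider the time to stream 1TB of data between CPU and GPU over that link versus a conventional PCIe Gen5 x16 link. The PCIe figure of roughly 64GB/s per direction is our assumption for comparison, not a number from the article, and real transfers would see lower effective rates on both links.

```python
# Illustrative CPU<->GPU transfer times for 1 TB of data. The 900 GB/s
# NVLink-C2C figure is from the announcement; the ~64 GB/s PCIe Gen5 x16
# per-direction figure is our assumption for comparison.
def transfer_seconds(data_gb: float, bandwidth_gb_s: float) -> float:
    """Idealized time to move data_gb over a link of the given bandwidth."""
    return data_gb / bandwidth_gb_s

nvlink_c2c_s = transfer_seconds(1000, 900)  # just over a second
pcie_gen5_s = transfer_seconds(1000, 64)    # roughly 15-16 seconds

print(f"NVLink-C2C: {nvlink_c2c_s:.1f} s, PCIe Gen5 x16: {pcie_gen5_s:.1f} s")
```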

Optimized Software for Enterprise AI

Maximizing the potential of GPU-accelerated compute instances requires an optimized software layer. NVIDIA NIM, part of the NVIDIA AI Enterprise software platform available on the OCI Marketplace, offers a set of microservices designed for secure, reliable deployment of high-performance AI model inference.

Optimized for NVIDIA GPUs, NIM's pre-built containers offer a lower total cost of ownership, faster time to market, and enhanced security. These microservices can be easily deployed on OCI, enabling enterprises to develop world-class generative AI applications.
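NIM microservices expose an OpenAI-compatible HTTP API, so applications can talk to a deployed model with standard chat-completions requests. The sketch below builds (but does not send) such a request body; the endpoint URL and model name are placeholders, not values from the article.

```python
import json

# A deployed NIM service serves an OpenAI-compatible chat-completions
# endpoint. Both the URL and the model name below are placeholders.
NIM_ENDPOINT = "http://localhost:8000/v1/chat/completions"  # placeholder

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> str:
    """Serialize an OpenAI-style chat-completions request body."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(body)

payload = build_chat_request("meta/llama3-8b-instruct",
                             "Summarize OCI's GPU compute shapes.")
```

In production, this payload would be POSTed to the endpoint with an HTTP client, with authentication configured per the deployment.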

For more information, visit the NVIDIA Blog.

Image source: Shutterstock
