NVIDIA NIM Microservices Revolutionize AI Model Deployment


Caroline Bishop

Aug 06, 2024 17:07

NVIDIA NIM microservices are accelerating AI application development across various industries, offering optimized solutions for speech, translation, retrieval, digital biology, and more.

Delivered as optimized containers, NVIDIA NIM microservices are designed to accelerate AI application development for businesses of all sizes, paving the way for rapid production and deployment of AI technologies. The set of microservices can be used to build and deploy AI solutions across speech AI, data retrieval, digital biology, digital humans, simulation, and large language models (LLMs), according to the NVIDIA Technical Blog.

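Because each NIM ships as a container with a standard HTTP interface, integrating one into an application is largely a matter of calling its REST endpoints. The sketch below is illustrative only: it assumes an LLM NIM container is already running locally on port 8000 and exposing the OpenAI-compatible endpoints documented for LLM NIMs; adjust the host, port, and model selection for your own deployment.

```python
import requests

BASE_URL = "http://localhost:8000"  # assumed address of a locally running LLM NIM container

# List the models served by this NIM (OpenAI-compatible endpoint).
models = requests.get(f"{BASE_URL}/v1/models", timeout=30).json()
print([m["id"] for m in models["data"]])

# Send a simple chat completion request to the first served model.
payload = {
    "model": models["data"][0]["id"],
    "messages": [{"role": "user", "content": "Summarize what NIM microservices are in one sentence."}],
    "max_tokens": 128,
}
resp = requests.post(f"{BASE_URL}/v1/chat/completions", json=payload, timeout=60)
print(resp.json()["choices"][0]["message"]["content"])
```
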
Speech and Translation NIM Microservices

The latest NIM microservices for speech and translation enable organizations to integrate advanced multilingual speech and translation capabilities into their conversational applications. These include automatic speech recognition (ASR), text-to-speech (TTS), and neural machine translation (NMT), catering to diverse industry needs.

Parakeet ASR

The Parakeet ASR-CTC-1.1B-EnUS model, with 1.1 billion parameters, provides record-setting English-language transcription. It delivers exceptional accuracy and robustness, adeptly handling diverse speech patterns and noise levels, enabling businesses to advance their voice-based services.

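As a rough sketch of how an application might call a speech NIM, the example below assumes the Parakeet ASR microservice exposes the standard Riva gRPC interface on localhost:50051 and that the nvidia-riva-client Python package is installed; the port, audio file, and configuration values are illustrative.

```python
import riva.client

# Connect to the ASR microservice's Riva-compatible gRPC endpoint (address is an assumption).
auth = riva.client.Auth(uri="localhost:50051", use_ssl=False)
asr = riva.client.ASRService(auth)

config = riva.client.RecognitionConfig(
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    language_code="en-US",
    sample_rate_hertz=16000,       # must match the audio recording
    max_alternatives=1,
    enable_automatic_punctuation=True,
)

# Transcribe a local recording in offline (batch) mode.
with open("meeting_clip.wav", "rb") as f:  # illustrative file name
    audio_bytes = f.read()

response = asr.offline_recognize(audio_bytes, config)
print(response.results[0].alternatives[0].transcript)
```
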
FastPitch-HiFiGAN TTS

FastPitch-HiFiGAN-EN integrates the FastPitch and HiFiGAN models to generate high-fidelity audio from text. It enables businesses to create natural-sounding voices, elevating user engagement and delivering immersive experiences.

Megatron NMT

The Megatron 1B-En32 is a powerful NMT model excelling in real-time translation across multiple languages, facilitating seamless multilingual communication. It enables organizations to extend their global reach and engage diverse audiences.

Retrieval NIM Microservices

The latest NVIDIA NeMo Retriever NIM microservices help developers efficiently fetch the best proprietary data to generate knowledgeable responses for their AI applications. NeMo Retriever enables organizations to seamlessly connect custom models to diverse business data and deliver highly accurate responses using retrieval-augmented generation (RAG).

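A minimal sketch of the retrieval step in a RAG pipeline is shown below. It assumes a NeMo Retriever embedding NIM is running locally and follows the OpenAI-style embeddings schema with an additional input_type field; the endpoint URL, port, and model id are illustrative. The top-ranked passages would then be passed as context to an LLM NIM to generate the final answer.

```python
import numpy as np
import requests

EMBED_URL = "http://localhost:8001/v1/embeddings"  # assumed local embedding NIM endpoint
MODEL = "nvidia/nv-embedqa-e5-v5"                   # illustrative model id; check your deployment

def embed(texts, input_type):
    # NeMo Retriever embedding NIMs follow the OpenAI embeddings schema; the extra
    # input_type field ("query" vs. "passage") is assumed from the NIM documentation.
    payload = {"model": MODEL, "input": texts, "input_type": input_type}
    data = requests.post(EMBED_URL, json=payload, timeout=60).json()["data"]
    return np.array([d["embedding"] for d in data])

docs = [
    "NIM microservices are delivered as optimized containers.",
    "Parakeet is an automatic speech recognition model.",
]
doc_vecs = embed(docs, "passage")
query_vec = embed(["How are NIM microservices delivered?"], "query")[0]

# Rank documents by cosine similarity to the query vector.
scores = doc_vecs @ query_vec / (
    np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
)
print(docs[int(np.argmax(scores))])
```
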
Embedding QA E5

The NVIDIA NeMo Retriever QA E5 embedding model is optimized for text question-answering retrieval. It transforms textual information into dense vector representations, crucial for a text retrieval system.

Embedding QA Mistral 7B

The NVIDIA NeMo Retriever QA Mistral 7B embedding model is a multilingual community base model fine-tuned for high-accuracy question answering. This model is suitable for users building a question-and-answer application over a large text corpus.

Snowflake Arctic Embed

Snowflake Arctic Embed is a suite of text embedding models for high-quality retrieval, optimized for performance. These models are ready for commercial use, free of charge, and have achieved state-of-the-art performance on the MTEB/BEIR leaderboard.

Reranking QA Mistral 4B

The NVIDIA NeMo Retriever QA Mistral 4B reranking model provides a logit score representing a document's relevance to a query. It improves the overall accuracy of text retrieval systems and is often deployed in combination with embedding models.

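The sketch below shows how a reranking stage might slot in after embedding-based retrieval: a handful of candidate passages are sent to the reranking microservice, and the returned logit scores are used to reorder them. The endpoint path, payload shape, and model id are assumptions based on published NeMo Retriever examples rather than a verified contract, so check them against your deployment's documentation.

```python
import requests

RERANK_URL = "http://localhost:8002/v1/ranking"  # assumed local reranking NIM endpoint
MODEL = "nvidia/nv-rerankqa-mistral-4b-v3"       # illustrative model id

query = "How are NIM microservices delivered?"
candidates = [
    "NIM microservices are delivered as optimized containers.",
    "Parakeet is an automatic speech recognition model.",
]

# The request/response shape below is an assumption drawn from published
# NeMo Retriever reranking examples; verify it against your deployment's docs.
payload = {
    "model": MODEL,
    "query": {"text": query},
    "passages": [{"text": c} for c in candidates],
}
rankings = requests.post(RERANK_URL, json=payload, timeout=60).json()["rankings"]

# Each ranking carries the passage index and a relevance logit; higher is better.
for r in sorted(rankings, key=lambda r: r["logit"], reverse=True):
    print(round(r["logit"], 2), candidates[r["index"]])
```
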
Digital Biology NIM Microservices

In healthcare and life sciences, NVIDIA NIM microservices are transforming digital biology. These AI tools empower pharmaceutical companies, biotechnology firms, and healthcare facilities to expedite innovation and deliver life-saving medicines to patients.

MolMIM

MolMIM is a transformer-based model for controlled small-molecule generation, optimizing and sampling molecules for improved values of desired scoring functions. It can be deployed in the cloud or on-premises for computational drug discovery workflows.

DiffDock

The NVIDIA DiffDock NIM microservice is built for high-performance, scalable molecular docking. It predicts up to 7x more poses per second than baseline models, reducing the cost of computational drug discovery workflows.

LLM NIM Microservices

New NVIDIA NIM microservices for LLMs offer unprecedented performance and accuracy across various applications and languages.

Llama 3.1 8B and 70B

The Llama 3.1 8B and 70B models provide cutting-edge text generation and language understanding capabilities, serving as powerful tools for creating engaging and informative content. Deploying the Llama 3.1 8B NIM on NVIDIA H100 data center GPUs can deliver up to 2.5x higher throughput in tokens per second for content generation.

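Because LLM NIM microservices expose an OpenAI-compatible API, a deployed Llama 3.1 8B NIM can be queried with the standard OpenAI Python client, as in the sketch below. The base URL, placeholder API key, and model id reflect an assumed default local deployment and should be adjusted to your environment.

```python
from openai import OpenAI

# Point the standard OpenAI client at a locally deployed Llama 3.1 8B NIM.
# Base URL, API key placeholder, and model id are assumptions for a local setup.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")

stream = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Write a two-sentence product blurb for a smart thermostat."}],
    max_tokens=200,
    stream=True,  # stream tokens as they are generated
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```
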
Llama 3.1 405B

Llama 3.1 405B is the largest openly available model, suited to a wide range of use cases, including synthetic data generation. The Llama 3.1 405B NIM microservice can be downloaded from the NVIDIA API catalog and run anywhere.

Simulation NIM Microservices

New NVIDIA USD NIM microservices offer the ability to leverage generative AI copilots and agents to develop Universal Scene Description (OpenUSD) tools that accelerate the creation of 3D worlds.

USD Code

USD Code is a state-of-the-art LLM that answers OpenUSD knowledge queries and generates USD-Python code.

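For context, the snippet below is a hand-written example of the kind of USD-Python code such a copilot might produce for a prompt like "create a cube at the origin"; it is not output from USD Code itself and requires only the standard OpenUSD Python bindings (pxr).

```python
from pxr import Usd, UsdGeom

# Build a minimal OpenUSD stage containing a single cube prim.
stage = Usd.Stage.CreateNew("cube_scene.usda")  # illustrative output file name
world = UsdGeom.Xform.Define(stage, "/World")
cube = UsdGeom.Cube.Define(stage, "/World/Cube")
cube.GetSizeAttr().Set(2.0)  # edge length in stage units

stage.SetDefaultPrim(world.GetPrim())
stage.GetRootLayer().Save()
```
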
USD Search

USD Search provides AI-powered search for OpenUSD data, 3D models, images, and assets using text- or image-based inputs.

USD Validate

USD Validate verifies the compatibility of OpenUSD assets using instant RTX rendering and rule-based validation.

Video Conferencing NIM Microservices

NVIDIA Maxine simplifies the deployment of AI features that enhance audio, video, and augmented reality effects for video conferencing and telepresence.

Maxine Audio2Face-2D

Maxine Audio2Face-2D animates a 2D image in real time using speech audio. It enables head pose animation for natural delivery and can be coupled with chatbot output or translated speech.

Eye Contact

The NVIDIA Maxine Eye Contact NIM microservice uses AI to apply a real-time filter to the user’s webcam feed, redirecting their eye gaze toward the camera to enhance the user experience.

Accelerate AI Application Development

NVIDIA NIM streamlines the creation of complex AI applications by enabling the integration of specialized microservices across domains. Using NIM microservices, organizations can bypass the complexities of building AI models from scratch, saving time and resources. This allows for the assembly of customized AI solutions that meet specific business needs.

For example, a company can combine ACE NIM microservices, including speech recognition, with LLM NIM microservices to create digital humans for personalized customer service across industries such as healthcare, finance, and retail.

NIM microservices can also be integrated into supply chain management systems, combining the cuOpt NIM microservice for route optimization with NeMo Retriever NIM microservices for retrieval-augmented generation and LLM NIM microservices for business communication.

Get Started

NVIDIA NIM empowers enterprises to fully harness AI, accelerating innovation, maintaining a competitive edge, and delivering superior customer experiences. Explore the latest AI models available with NIM microservices and discover how these powerful tools can transform your business.

Image source: Shutterstock
