NVIDIA Unveils Llama 3.1 AI Models for Enterprise Applications


Joerg Hiller
Jul 24, 2024 03:16

NVIDIA adds support for Meta's Llama 3.1 large language models to enhance enterprise AI applications, offering customizable generative AI solutions.


Meta's newly unveiled Llama 3.1 collection of 8B, 70B, and 405B large language models (LLMs) is closing the gap between proprietary and open-source models. This development is attracting more developers and enterprises to integrate these models into their AI applications, according to the NVIDIA Technical Blog.

Capabilities of Llama 3.1

These models excel at various tasks including content generation, coding, and deep reasoning. They can be used to power enterprise applications for use cases like chatbots, natural language processing, and language translation. The Llama 3.1 405B model, thanks to its extensive training data, is particularly suited for generating synthetic data to tune other LLMs, which is beneficial in industries such as healthcare, finance, and retail, where real-world data is often restricted due to compliance requirements.

Additionally, Llama 3.1 405B can be tuned with domain-specific data to serve enterprise use cases, enabling better accuracy and customization for organizational requirements, including domain knowledge, company vocabulary, and cultural nuances.

Build Custom Generative AI Models with NVIDIA AI Foundry

NVIDIA AI Foundry is a platform and service designed for building custom generative AI models with enterprise data and domain-specific knowledge. Similar to how TSMC manufactures chips designed by other companies, NVIDIA AI Foundry allows organizations to develop their own AI models. This includes NVIDIA-created AI models like Nemotron and Edify, popular open foundation models, NVIDIA NeMo software for customizing models, and dedicated capacity on NVIDIA DGX Cloud.

The foundry outputs performance-optimized custom models packaged as NVIDIA NIM inference microservices for easy deployment on any accelerated cloud, data center, or workstation.

Generate Proprietary Synthetic Domain Data with Llama 3.1

Enterprises often face challenges with the lack of domain data or data accessibility due to compliance and security requirements. The Llama 3.1 405B model is ideal for synthetic data generation due to its enhanced ability to recognize complex patterns, generate high-quality data, generalize well, scale efficiently, reduce bias, and preserve privacy.
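As a sketch of how such a synthetic-data request might be issued: NIM microservices expose an OpenAI-compatible chat completions API, so plain HTTP is enough. The endpoint URL, model identifier, prompt, and environment variable below are illustrative assumptions, not details from the article.

```python
import json
import os
import urllib.request

# Assumed hosted endpoint and model id (check ai.nvidia.com for the real values).
API_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL = "meta/llama-3.1-405b-instruct"

def build_sdg_request(domain: str, n_examples: int) -> dict:
    """Build an OpenAI-style chat payload asking for synthetic domain records."""
    prompt = (
        f"Generate {n_examples} realistic but entirely fictional {domain} "
        "customer-support question/answer pairs as a JSON list."
    )
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.8,  # higher temperature encourages diverse samples
        "max_tokens": 1024,
    }

def generate(payload: dict, api_key: str) -> str:
    """POST the payload and return the model's text completion."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

payload = build_sdg_request("retail-banking", 5)

# Only reach the network when a key is actually configured.
if os.environ.get("NVIDIA_API_KEY"):
    print(generate(payload, os.environ["NVIDIA_API_KEY"]))
```

Because the API is OpenAI-compatible, the same payload shape works against a locally deployed NIM by swapping the base URL.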

The Nemotron-4 340B Reward model judges the data generated by the Llama 3.1 405B model, scoring it across various categories and filtering out lower-scored data to provide high-quality datasets that align with human preferences. This model has achieved best-in-class performance with an overall score of 92.0 on the RewardBench leaderboard.
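The score-and-filter step can be sketched in a few lines. The category names and threshold below are hypothetical placeholders for whatever the reward model actually returns, not its real API.

```python
# Hypothetical sketch of reward-based filtering: keep only synthetic
# samples whose aggregated reward score clears a quality bar.

def mean_score(scores: dict) -> float:
    """Aggregate per-category reward scores into one value."""
    return sum(scores.values()) / len(scores)

def filter_samples(samples: list, threshold: float = 3.5) -> list:
    """Drop samples the reward model scored below the threshold."""
    return [s for s in samples if mean_score(s["scores"]) >= threshold]

samples = [
    {"text": "good answer", "scores": {"helpfulness": 4.0, "correctness": 4.2}},
    {"text": "weak answer", "scores": {"helpfulness": 2.1, "correctness": 2.8}},
]
kept = filter_samples(samples)
# only the first sample's mean score (4.1) clears the 3.5 bar
```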

Curate, Customize, and Evaluate Models with NVIDIA NeMo

NVIDIA NeMo is an end-to-end platform for developing custom generative AI models. It includes tools for training, customization, retrieval-augmented generation (RAG), guardrailing toolkits, data curation tools, and model pretraining. NeMo supports several parameter-efficient fine-tuning techniques, such as p-tuning, low-rank adaptation (LoRA), and its quantized version (QLoRA).
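The idea behind LoRA can be shown with a toy calculation: the frozen weight matrix W is augmented with a trainable low-rank product B·A, scaled by alpha/r. This is a minimal pure-Python sketch of the math, not NeMo's implementation.

```python
# Minimal LoRA sketch: adapted weight = W + (alpha / r) * B @ A,
# where only the small matrices A (r x n) and B (m x r) are trained.

def matmul(X, Y):
    """Naive matrix multiply for small lists-of-lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_weight(W, A, B, alpha, r):
    """Merge a rank-r LoRA update into the frozen weight matrix."""
    BA = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, BA)]

# 2x2 frozen weight; rank-1 adapters: B is 2x1, A is 1x2.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.5]]
W_adapted = lora_weight(W, A, B, alpha=1.0, r=1)
# W + 1.0 * (B @ A) = [[1.5, 0.5], [1.0, 2.0]]
```

The payoff is parameter count: for an m×n weight, LoRA trains only r·(m+n) values instead of m·n, which is why NIM can serve many LoRA variants of one base model cheaply.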

NeMo also supports supervised fine-tuning (SFT) and alignment techniques such as reinforcement learning from human feedback (RLHF), direct preference optimization (DPO), and NeMo SteerLM. These techniques enable steering the model responses and aligning them with human preferences, making the LLMs ready to integrate into customer-facing applications.
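Of the alignment techniques listed, DPO is compact enough to show as a worked formula: per preference pair, the loss is -log σ(β·((log-prob margin of the policy) − (log-prob margin of the reference model))). The numbers below are illustrative, and this is a sketch of the standard DPO objective rather than NeMo's code.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one (chosen, rejected) pair:
    -log sigmoid(beta * (policy margin - reference margin))."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# If the policy prefers the chosen response more strongly than the
# reference model does, the margin is positive and the loss is small.
loss_aligned = dpo_loss(-1.0, -5.0, -2.0, -2.0)     # margin = +4
loss_misaligned = dpo_loss(-5.0, -1.0, -2.0, -2.0)  # margin = -4
```

Minimizing this loss pushes the policy to widen its log-probability gap between preferred and rejected responses, relative to the frozen reference model.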

High-Performance Inference with NVIDIA NIM

The custom models from the AI Foundry can be packaged as NVIDIA NIM inference microservices, part of NVIDIA AI Enterprise, for secure, reliable deployment of high-performance inferencing across the cloud, data center, and workstations. Supporting a wide range of AI models, including open foundation models, NIM ensures seamless, scalable AI inferencing using industry-standard APIs.

Use NIM for local deployment with a single command or autoscale on Kubernetes on NVIDIA accelerated infrastructure, anywhere. Get started with a simple guide to NIM deployment. Additionally, NIM supports deployment of models customized with LoRA.
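The single-command local deployment looks roughly like the following. The container image name and tag are assumptions based on the NGC catalog's naming scheme, not taken from the article; check the NIM documentation for the exact image.

```shell
# Illustrative single-command NIM deployment (image name is an assumption).
export NGC_API_KEY=<your-key>   # placeholder; needed to pull the container

docker run -it --rm \
    --gpus all \
    -e NGC_API_KEY \
    -p 8000:8000 \
    nvcr.io/nim/meta/llama-3.1-8b-instruct:latest

# Once running, the microservice serves an OpenAI-compatible API on port 8000,
# e.g.:  curl http://localhost:8000/v1/models
```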

Start Building Your Custom Models

Depending on where you are in your AI journey, there are different ways to get started. To build a custom Llama NIM for your enterprise, learn more at NVIDIA AI Foundry. Experience the new Llama 3.1 NIMs and other popular foundation models at ai.nvidia.com. You can access the model endpoints directly or download the NIMs and run them locally.
