Modelserve: Golem Network’s New AI Inference Service

Joerg Hiller
Jul 15, 2024 15:19

Golem Network introduces Modelserve, a scalable and cost-effective AI model inference service designed for developers and startups.

Golem Network has unveiled Modelserve, a new service aimed at providing scalable and affordable AI model inference, according to a recent announcement by the Golem Project. The service is designed to allow seamless deployment and inference of AI models through scalable endpoints, improving the efficiency and cost-effectiveness of AI applications.

What Is Modelserve?

Modelserve, developed in collaboration with an external team and Golem Factory, integrates into the Golem Network ecosystem. It aims to support the open-source AI community and to attract AI application developers to the network's GPU providers. The service enables seamless deployment and inference of AI models through scalable endpoints, ensuring efficient and cost-effective operation of AI applications.

Why Is Golem Network Introducing Modelserve?

The introduction of Modelserve aims to meet the growing demand for computing power in the AI industry. By leveraging consumer-grade GPU resources, which offer sufficient power and memory, the service can effectively run AI models such as diffusion models, automatic speech recognition, and small-to-medium language models. This approach is more cost-effective than traditional methods. The decentralized architecture of the Golem Network serves as a marketplace for matching supply and demand for these resources, enabling access to computing power well suited to AI applications.

The addition of Modelserve to the Golem ecosystem plays a key role in attracting AI use cases, driving demand for providers, and contributing to the broader adoption of the Golem Network.

Target Audience

Modelserve is designed for a diverse range of users, including service and product developers, startups, and companies operating in both Web 2.0 and Web 3.0 environments. These users typically:

  • Utilize small and medium-sized open-source models or create their own models from scratch
  • Require scalable AI model inference capabilities
  • Seek an environment to test and experiment with AI models

Technical Implementation

Modelserve comprises three key components:


  • Website: Allows users to create and manage endpoints
  • Backend: Manages GPU resources to handle inferences, featuring a load balancer and auto-scaling. It sources GPUs from Golem's open, decentralized marketplace as well as from other platforms offering GPU instances
  • API: Enables running AI model inferences and managing endpoints
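
As a rough illustration of how a client might talk to an endpoint-based inference API like the one described above, here is a minimal Python sketch. The base URL, path layout, header names, and JSON fields are all assumptions for illustration only; the announcement does not document the actual Modelserve API.

```python
import json
from urllib import request

# Hypothetical client sketch for an endpoint-based inference API.
# MODELSERVE_BASE, the URL path, and the JSON schema are placeholders,
# not the real Modelserve interface.
MODELSERVE_BASE = "https://modelserve.example/api/v1"

def build_inference_request(endpoint_id: str, api_key: str, prompt: str) -> request.Request:
    """Construct an HTTP POST for a single inference call against one endpoint."""
    body = json.dumps({"input": prompt}).encode("utf-8")
    return request.Request(
        url=f"{MODELSERVE_BASE}/endpoints/{endpoint_id}/inference",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",   # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending the request would then be a one-liner, e.g.:
# with request.urlopen(build_inference_request("ep-123", key, "hello")) as resp:
#     result = json.load(resp)
```

Separating request construction from transport, as above, keeps the endpoint-management and inference calls testable without network access.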

The service uses USD payments for user transactions, while settlements with Golem GPU providers are conducted in GLM, the native token of the Golem Network.

Benefits for Users


  • Maintenance-Free AI Infrastructure (AI IaaS): Users do not need to manage model deployment, inference, or GPU clusters, as Modelserve handles these tasks
  • Affordable Autoscaling: The system automatically scales GPU resources to meet application demand, without requiring user intervention
  • Cost-Effective Pricing: Users are charged for the actual processing time of their requests, avoiding the costs of hourly GPU rentals or of maintaining their own clusters
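
The pricing difference in the last point is easy to see with some arithmetic. The sketch below compares metered per-second billing with full-hour GPU rental; all rates are made-up numbers chosen only to show the mechanics, not actual Modelserve or market prices.

```python
import math

# Hypothetical rates for illustration only.
HOURLY_GPU_RATE = 1.20                 # assumed $/hour for a rented GPU
PER_SECOND_RATE = HOURLY_GPU_RATE / 3600  # same rate, billed per busy second

def hourly_cost(rented_seconds: float) -> float:
    """Hourly rental bills every started hour, regardless of utilization."""
    return math.ceil(rented_seconds / 3600) * HOURLY_GPU_RATE

def metered_cost(busy_seconds: float) -> float:
    """Metered billing charges only actual processing time."""
    return busy_seconds * PER_SECOND_RATE

# Example: 500 requests at 2 s each spread across a day is 1000 busy seconds.
# Metered billing charges those 1000 seconds; keeping a dedicated GPU
# rented for the full day bills all 24 hours.
busy = 500 * 2
day = 24 * 3600
```

Under these assumed numbers, `metered_cost(busy)` is roughly a third of a dollar, while `hourly_cost(day)` is the full 24-hour rental price, which is the gap the per-request pricing model is meant to close for bursty workloads.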

Synergy with Other AI/GPU Projects

Modelserve integrates with GamerHash AI, a GPU and AI provider, an integration currently at the proof-of-concept stage. Additionally, the first version of Golem-Workers was created as part of Modelserve and will be developed as a separate project in the future.

Milestones and Next Steps

  • Beta tests have been conducted with several AI-based startups and companies
  • The Golem Community Tests are scheduled for July
  • Commercialization of the service is set to begin in August

For more detailed information, visit the Golem Project blog.

Image source: Shutterstock
