Claude 3.5 Sonnet Empowers Audio Data Analysis with Python


Terrill Dicki
Jul 20, 2024 11:23

Learn to use Claude 3 models with audio data in Python, leveraging AssemblyAI’s LeMUR framework for seamless integration.

Claude 3.5 Sonnet, recently announced by Anthropic, sets new industry benchmarks for various LLM tasks. The model excels at complex coding and nuanced literary analysis, and showcases exceptional context awareness and creativity.

According to AssemblyAI, users can now learn how to use Claude 3.5 Sonnet, Claude 3 Opus, and Claude 3 Haiku with audio or video files in Python.

[Image: Pipeline for applying Claude 3 models to audio data]

Here are a few example use cases for this pipeline:

  • Creating summaries of long podcasts or YouTube videos
  • Asking questions about the audio content
  • Generating action items from meetings

How Does It Work?

Language models primarily work with text data, so the audio data must be transcribed first. Multimodal models can address this, though they are still in the early stages of development.

To achieve this, AssemblyAI’s LeMUR framework is used. LeMUR simplifies the process by letting you combine industry-leading Speech AI models with LLMs in just a few lines of code.
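
As a quick preview, the whole pipeline boils down to a handful of lines. The sketch below uses the same SDK calls that the rest of this post walks through step by step; the API key and file URL are placeholders.

import assemblyai as aai

# Placeholder key; get a real one from your AssemblyAI dashboard.
aai.settings.api_key = "YOUR_API_KEY"

# Step 1: transcribe the audio (local path or publicly accessible URL).
transcript = aai.Transcriber().transcribe("https://example.com/episode.m4a")

# Step 2: send the transcript plus a prompt to a Claude 3 model via LeMUR.
result = transcript.lemur.task(
    "Provide a brief summary of the transcript.",
    final_model=aai.LemurModel.claude3_5_sonnet
)
print(result.response)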

Set Up the SDK

To get started, install the AssemblyAI Python SDK, which includes all LeMUR functionality.

pip install assemblyai

Then, import the package and set your API key. You can get one for free here.

import assemblyai as aai
aai.settings.api_key = "YOUR_API_KEY"
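
If you prefer not to hardcode the key, a common alternative is to read it from an environment variable. This is a small optional variation, not something the SDK requires; the variable name ASSEMBLYAI_API_KEY is just a convention of this example.

import os
import assemblyai as aai

# Assumes you exported ASSEMBLYAI_API_KEY in your shell beforehand.
aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]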

Transcribe an Audio or Video File

Next, transcribe an audio or video file by setting up a Transcriber and calling the transcribe() function. You can pass in any local file or publicly accessible URL. For instance, this example uses an episode of Lenny’s Podcast featuring Dalton Caldwell from Y Combinator.

audio_url = "https://storage.googleapis.com/aai-web-samples/lennyspodcast-daltoncaldwell-ycstartups.m4a"
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(audio_url)
print(transcript.text)
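
Transcription can occasionally fail, for example if the URL is unreachable. Before sending anything to an LLM, it can be worth checking the transcript status; the status and error fields below follow my reading of the AssemblyAI SDK, so double-check the current docs.

# Bail out early if the transcription did not succeed.
if transcript.status == aai.TranscriptStatus.error:
    raise RuntimeError(f"Transcription failed: {transcript.error}")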

Use Claude 3.5 Sonnet with Audio Data

Claude 3.5 Sonnet is Anthropic’s most advanced model to date, outperforming Claude 3 Opus on a wide range of evaluations while remaining cost-effective.

To use Claude 3.5 Sonnet, call transcript.lemur.task(), a flexible endpoint that lets you specify any prompt. It automatically adds the transcript as additional context for the model.

Specify aai.LemurModel.claude3_5_sonnet for the model when calling the LLM. Here’s an example of a simple summarization prompt:

prompt = "Provide a brief summary of the transcript." result = transcript.lemur.task( prompt, final_model=aai.LemurModel.claude3_5_sonnet
) print(result.response)
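
Beyond the prompt and the final_model, lemur.task() accepts a few optional parameters. The names used below (max_output_size and temperature) reflect my understanding of the SDK and should be verified against the current LeMUR documentation.

# Hedged sketch: optional LeMUR parameters (verify the exact names in the docs).
result = transcript.lemur.task(
    prompt,
    final_model=aai.LemurModel.claude3_5_sonnet,
    max_output_size=1000,  # assumed name: caps the response length in tokens
    temperature=0.2        # assumed name: lower values give more deterministic output
)
print(result.response)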

Use Claude 3 Opus with Audio Data

Claude 3 Opus is adept at handling complex analysis, longer tasks with many steps, and higher-order math and coding tasks.

To use Opus, specify aai.LemurModel.claude3_opus for the model when calling the LLM. Here’s an example of a prompt to extract specific information from the transcript:

prompt = "Extract all advice Dalton gives in this podcast episode. Use bullet points." result = transcript.lemur.task( prompt, final_model=aai.LemurModel.claude3_opus
) print(result.response)

Use Claude 3 Haiku with Audio Data

Claude 3 Haiku is the fastest and most cost-effective model, ideal for executing lightweight actions.

To use Haiku, specify aai.LemurModel.claude3_haiku for the model when calling the LLM. Here’s an example of a simple prompt for asking a question about the content:

prompt = "What are tar pit ideas?" result = transcript.lemur.task( prompt, final_model=aai.LemurModel.claude3_haiku
) print(result.response)
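
If you want to ask several questions in one call, the SDK also exposes a dedicated question-and-answer endpoint. The sketch below follows my understanding of lemur.question() and aai.LemurQuestion; verify the exact names against the AssemblyAI docs before relying on them.

# Hedged sketch: batch Q&A over the transcript (verify API names in the docs).
questions = [
    aai.LemurQuestion(question="What are tar pit ideas?"),
    aai.LemurQuestion(question="What advice does Dalton give to first-time founders?"),
]

qa_result = transcript.lemur.question(
    questions,
    final_model=aai.LemurModel.claude3_haiku
)

for qa in qa_result.response:
    print(f"Q: {qa.question}\nA: {qa.answer}\n")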

Learn More About Prompt Engineering

Applying Claude 3 models to audio data with AssemblyAI and the LeMUR framework is straightforward. To maximize the benefits of LeMUR and the Claude 3 models, refer to the additional resources provided by AssemblyAI.

Image source: Shutterstock
