One voice. Four emotions. Infinite characters. Welcome to audio cloning.🎙️

We connected Claude App & ElevenLabs MCP server to build conversational voice agents

Monalisa Sethi
June 21, 2025

NANOBITS AI STORIES #1

📣 Before jumping into our newsletter, we have an announcement! 📣
We're launching a new weekly feature showcasing real AI experiences from our community.

This week's story comes from a reader who went from struggling to learn to code for five years to building apps in minutes using Cursor. That breakthrough led them to start "Build with AI" workshops in Bangalore, helping other non-technical people experience the same transformation.

"I spent five and a half years at a developer tools startup. It's a very engineering first company, and I was one of the handful of non technical employees at the firm. Over the years I tried to learn how to code a couple of times but I just didn't reach far enough. Last year some time though I learned about Cursor, and managed to build my own app in just a couple of minutes with some help from a friend of mine. After that, I've gone places! I managed to take on much more tech work at work, I freely use the terminal, push code to Github, build prototypes, call APIs. In fact that moment, and the realisation that now non technical folks can actually do real technical work that they couldn't have imagined till just a year ago, has been so powerful that me and my friend started Build with AI workshops in Bangalore where we help other non technical folks like us build their first web app using Cursor. Its become a powerful mission for our work and we're always excited to see the delight on peoples faces when they get that first app up & running!"

Shraddha Gupta

This captures something important about AI's current moment: the gap between having an idea and building something functional has shrunk dramatically. What used to require months of learning programming languages can now happen in an afternoon with the right AI coding assistant.

Have an AI story to share? Reply to this email or share it here. Breakthrough moments, epic failures, and unexpected discoveries are all welcome, and we will showcase them in one of our next editions.

EDITOR’S NOTE

Dear Future-Proof Humans,

What if your AI assistant could create a voice agent that uses your cloned voice to take drive-thru orders, handling customer interactions with natural conversation flow?

What if your AI assistant could clone your voice with different emotional tones so you sound happy when sharing exciting news, serious during presentations, or concerned when addressing problems?

What if it could generate entire audio experiences where your single voice recording becomes a full cast of characters, each with distinct personalities and speaking styles?

What if it could do all this while you're having a natural conversation, no complex audio editing software or technical skills required?

We've been exploring MCP connections for weeks now. After showing you how Claude works with Facebook's Ad Library, Gmail, Brave Search, Reddit, and investment portfolios, I wanted to explore something that opens up entirely new creative possibilities: voice AI and audio generation.

💡 Quick reminder:

MCP (Model Context Protocol) lets clients such as the Claude Desktop app or Cursor connect directly to external tools and services. Once a server is set up, no programming is needed: your AI assistant can interact with apps and data sources through plain-language prompts.

This week, we're connecting Claude to ElevenLabs' audio platform through an MCP server. You can now generate speech, clone voices, transcribe audio, and create voice agents to perform outbound calls, all from within your AI assistant and using simple text prompts.

In this edition of Nanobits, we will:

⚡️ Walk through setting up the ElevenLabs MCP server connection
🔗 Show you the actual voice generation and cloning prompts we tested
📊 Explore how Claude can help with audio content creation and voice agent development

Let's turn your voice and audio production into a conversation. Let's begin!

ARE YOU NEW TO MODEL CONTEXT PROTOCOL?

Think of MCP as a universal translator: it lets AI tools like Claude and Cursor actually interact with your apps and data (no more copy-paste purgatory).

Catch up fast: Here’s an MCP explainer breaking down how it works, why it matters, and where it’s headed.

These are the past experiments we have conducted with various MCP servers:

You can reach out to us if you get stuck.

HOW TO CONNECT ELEVENLABS MCP SERVER TO CLAUDE DESKTOP?

We'll work with Claude Desktop as our MCP host for this newsletter. Getting the ElevenLabs MCP server running takes about 5 minutes once you know the right steps.

What You Need First

Make sure you have these ready:

Claude Desktop App (download from Anthropic's website)
ElevenLabs API key (free tier gives you 10,000 credits per month)
Basic terminal access on your computer

Step 1: Get Your ElevenLabs API Key

Head to elevenlabs.io and create your free account. Once you're logged in:

Click on your profile and select API Keys
Generate a new API key (the free tier gives you plenty of credits for testing)
Copy this key somewhere safe for the next step

Step 2: Install the Required Tools

The ElevenLabs MCP server is a Python package, and Claude Desktop launches it through uvx, a command that ships with the uv package manager. Open your terminal and run:

curl -LsSf https://astral.sh/uv/install.sh | sh

This installs uv and uvx to your local user directory. Desktop apps on macOS don't always inherit the PATH your terminal uses, so next we need to make sure Claude can find these tools by creating symlinks in a global location.

First, check where uv installed:

which uv

You'll see something like /Users/yourname/.local/bin/uv. Now create the symlinks:

sudo ln -s /Users/yourname/.local/bin/uv /usr/local/bin/uv
sudo ln -s /Users/yourname/.local/bin/uvx /usr/local/bin/uvx

Replace the path with whatever which uv returned for your system.
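To confirm that both tools resolve after the symlink step, a quick check along these lines may help. This is a minimal sketch using only Python's standard library; the function name find_tools is our own and not part of uv or Claude Desktop. It reports what your terminal's PATH can see, which may still differ from what a desktop app sees.

```python
import shutil


def find_tools(names):
    """Return a dict mapping each tool name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Print where each tool was found, or a hint if it is missing.
    for tool, path in find_tools(["uv", "uvx"]).items():
        status = path if path else "NOT FOUND (re-check the symlink step)"
        print(f"{tool}: {status}")
```

If either tool shows as not found, re-run which uv and compare the path with the one used in the ln commands.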

Step 3: Install the ElevenLabs MCP Server

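Run this command to install the ElevenLabs MCP server package:

uv pip install elevenlabs-mcp --system

This installs the elevenlabs-mcp package into your system Python. The uvx command used in Step 4 can also fetch the package on its own the first time it runs, so treat this step as a way to download it ahead of time.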

Step 4: Configure Claude Desktop

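Open your Claude Desktop configuration file. The simplest route is Claude → Settings → Developer → Edit Config, which creates the file if it doesn't exist yet. On macOS, you can also open an existing file from the terminal:

open ~/Library/Application\ Support/Claude/claude_desktop_config.json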

Add this configuration, replacing the placeholder with your actual API key:

{
  "mcpServers": {
    "ElevenLabs": {
      "command": "uvx",
      "args": ["elevenlabs-mcp"],
      "env": {
        "ELEVENLABS_API_KEY": "sk_your-actual-api-key-here"
      }
    }
  }
}

Save the file and close it. The formatting matters here, so copy it exactly as shown.
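Because a single missing comma or quote can stop the server from loading, it may be worth checking the file before restarting Claude. The sketch below is one way to do that with Python's standard library. The checks mirror the shape of the configuration shown above rather than any official schema, and check_mcp_config is a hypothetical helper name of our own.

```python
import json


def check_mcp_config(text):
    """Parse config text and return a list of problems (an empty list means it looks fine)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"Invalid JSON: {err.msg} at line {err.lineno}, column {err.colno}"]

    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict):
        return ['Missing top-level "mcpServers" object']

    problems = []
    for name, entry in servers.items():
        if not isinstance(entry, dict):
            problems.append(f"{name}: entry must be a JSON object")
            continue
        if not isinstance(entry.get("command"), str):
            problems.append(f'{name}: missing "command" string')
        if not isinstance(entry.get("args", []), list):
            problems.append(f'{name}: "args" must be a list')
        env = entry.get("env", {})
        if not isinstance(env, dict):
            problems.append(f'{name}: "env" must be an object')
            continue
        # Catch the placeholder from the example config being left in place.
        for key, value in env.items():
            if "your-actual-api-key-here" in str(value):
                problems.append(f"{name}: {key} still holds the placeholder value")
    return problems


if __name__ == "__main__":
    sample = '{"mcpServers": {"ElevenLabs": {"command": "uvx", "args": ["elevenlabs-mcp"]}}}'
    print(check_mcp_config(sample) or "Config looks fine")
```

To check your real file, read its contents into a string and pass that to check_mcp_config instead of the sample.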

Step 5: Test Your Connection

Completely quit Claude Desktop using Cmd + Q, then reopen it fresh. Look for the tools icon in the chat interface to verify that ElevenLabs MCP tools are now available.

Start a new conversation and try:

Read the following lines aloud using ElevenLabs: 
The quick brown fox jumps over the lazy dog.

If Claude responds by actually creating an audio file and saving it on your desktop, you're connected and ready to go.

Troubleshooting

If something goes wrong, Claude Desktop writes log files you can inspect. Go to Claude → Settings → Developer → Open Logs Folder and look for the log named after your server entry (for example, mcp-server-ElevenLabs.log) to see what's happening behind the scenes.
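Log files can get long, so a small filter like the one below may make the relevant lines easier to spot. It is a rough sketch, not a Claude Desktop feature: the keyword list is our own guess at what failures tend to look like, and the path in the usage line is a placeholder you would replace with the file you find in the logs folder.

```python
from pathlib import Path


def error_lines(log_path, keywords=("error", "failed", "traceback")):
    """Return lines from a log file that contain any keyword (case-insensitive)."""
    path = Path(log_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8", errors="replace") as handle:
        return [
            line.rstrip("\n")
            for line in handle
            if any(word in line.lower() for word in keywords)
        ]


if __name__ == "__main__":
    # Placeholder path: point this at the log file from your own logs folder.
    for line in error_lines("path/to/your/log/file.log"):
        print(line)
```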

You can now ask Claude to generate speech, clone voices, transcribe audio, and create voice agents using simple text prompts. The MCP server handles all the technical communication with ElevenLabs' API.
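For the curious, here is roughly what that communication looks like under the hood. MCP messages are JSON-RPC 2.0 requests, and a client asks a server to run a tool with the tools/call method. The sketch below only builds and prints such a message; it sends nothing. The tool name text_to_speech and the text argument are illustrative assumptions, so check the tool list in Claude Desktop for the names your server actually exposes.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP-style JSON-RPC 2.0 request asking a server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Tool name and argument key are assumptions made for illustration only.
request = build_tool_call(
    request_id=1,
    tool_name="text_to_speech",
    arguments={"text": "The quick brown fox jumps over the lazy dog."},
)
print(json.dumps(request, indent=2))
```

You never write these messages yourself. Claude produces them from your prompt, which is why plain text is all you need.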

If these things feel alien to you, you can refresh your memory with our previous newsletter or reply to this email to learn more. We are also available on LinkedIn.

MCP IN ACTION: TASKS WITH ELEVENLABS MCP SERVER

Now you're ready to start using Claude with ElevenLabs' voice AI capabilities!

Connecting Claude to ElevenLabs opens up creative possibilities that transform how you approach audio content creation. We tested three specific voice generation tasks that would typically require specialized audio software, professional voice actors, and extensive post-production work.

Task 1: Voice AI Agent Development

Building conversational voice agents usually requires understanding complex APIs, managing audio processing pipelines, and coordinating multiple development tools. Getting a functional prototype running can take weeks of technical setup.

How It Works

Claude can create and deploy voice agents using your cloned voice for real-world applications:

Records and processes your voice sample for cloning
Designs conversation flows and response patterns
Integrates natural language processing with voice output

Example Prompt

Create a voice AI agent using Eleven Labs and my voice [Voice ID: <enter voice ID from my voices list on ElevenLabs>], working as a drive-thru order taker. Your responses will be spoken aloud through a voice interface. Keep all responses to 2-3 sentences maximum and always redirect the conversation towards taking the customer's order.

Here is the menu you will be working with:
• Cheeseburger - $3
• Double cheeseburger - $4
• Veggie burger - $3
• Fries - $2
• Drink - $1

When interacting with a customer, follow these guidelines:
1. Greet the customer and ask for their order.
2. Listen to the customer's input.
3. Respond appropriately to the customer's input, always steering the conversation towards completing their order.
4. If the customer orders an item, confirm their selection and ask if they would like anything else.
5. If the customer asks a question not related to ordering, politely redirect them to the menu items.
6. Once the customer indicates they are finished ordering, repeat their complete order for confirmation.
7. After confirming the order, inform the customer that their order is confirmed and direct them to the pickup kiosk.

Remember to keep your responses concise and focused on taking the order. Do not engage in unrelated conversations or provide information beyond what's necessary for completing the order.
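The prompt leaves it to the agent to keep track of the order and read it back in step 6. If you later want a deterministic check alongside the agent, the menu maps naturally onto a small lookup table. The sketch below is our own illustration of that idea: MENU and summarize_order are hypothetical names, and nothing here is part of ElevenLabs or the MCP server.

```python
# Prices in dollars, copied from the menu in the prompt above.
MENU = {
    "cheeseburger": 3,
    "double cheeseburger": 4,
    "veggie burger": 3,
    "fries": 2,
    "drink": 1,
}


def summarize_order(items):
    """Return (summary_text, total) for a dict of item -> quantity; reject unknown items."""
    unknown = [name for name in items if name not in MENU]
    if unknown:
        raise ValueError(f"Not on the menu: {', '.join(unknown)}")
    summary = ", ".join(f"{qty} x {name}" for name, qty in items.items())
    total = sum(MENU[name] * qty for name, qty in items.items())
    return summary, total


if __name__ == "__main__":
    summary, total = summarize_order({"cheeseburger": 2, "fries": 1, "drink": 1})
    print(f"{summary} = ${total}")  # 2 x cheeseburger, 1 x fries, 1 x drink = $9
```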

What Claude Returns

Claude walks you through the complete process: voice cloning setup, conversation design, and agent deployment. The system creates a working prototype that can handle customer interactions using your natural speaking voice.

The voice agent gives professional-quality responses that keep your vocal characteristics; however, the cloned voice carries a noticeable accent.

This voice AI agent would work best with an actual customer service number, which can be purchased from Twilio. Here’s a video of how the voice agent would work with an actual phone number.

Task 2: Emotional Voice Cloning for Content Creation

Creating multiple voice variations for different content types typically requires either hiring different voice actors or learning complex audio manipulation techniques. Professional voice modulation can cost hundreds of dollars per project.

How It Works

Claude generates distinct emotional versions of your voice for various content applications:

Analyzes your base voice characteristics and tone
Creates variations for different emotional contexts
Tests each version for consistency and naturalness
Organizes the voice library for easy content production

Example Prompt

Clone my voice [voice ID: <insert voice id from my voices list on ElevenLabs>] with different emotional tones (happy, serious, concerned, excited) so I can use them for different types of messages or presentations.

What Claude Returns

Claude produces four distinct voice variations, each calibrated for specific communication contexts. The system delivers audio samples demonstrating how the same message sounds across different emotional registers.

In our tests, the emotional cloning barely shifted the tone. The voice sounded even more accented than before and kept very few of the original's recognizable characteristics. The four variations sounded similar to one another, and each was only mildly suited to its intended use. You may get better results by adjusting the stability, style, similarity boost, and speed settings manually.

Happy

Serious

Concerned

Excited
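If you do experiment with the settings mentioned above, it can help to keep one preset per emotion so you can compare results side by side. The sketch below shows one way to organize that. The key names (stability, similarity_boost, style, speed) follow our understanding of ElevenLabs' voice settings, and every number is an untested starting guess rather than a recommended value, so verify both against the current ElevenLabs documentation.

```python
# Baseline settings; all values are illustrative assumptions.
BASE_SETTINGS = {"stability": 0.5, "similarity_boost": 0.75, "style": 0.0, "speed": 1.0}

# Per-emotion overrides: lower stability and higher style for more expressive reads.
EMOTION_OVERRIDES = {
    "happy": {"stability": 0.35, "style": 0.6, "speed": 1.05},
    "serious": {"stability": 0.8, "style": 0.1, "speed": 0.95},
    "concerned": {"stability": 0.6, "style": 0.4, "speed": 0.9},
    "excited": {"stability": 0.25, "style": 0.8, "speed": 1.1},
}


def build_payload(text, emotion):
    """Return a request-body dict with the base settings overridden for one emotion."""
    if emotion not in EMOTION_OVERRIDES:
        raise KeyError(f"Unknown emotion: {emotion}")
    settings = {**BASE_SETTINGS, **EMOTION_OVERRIDES[emotion]}
    return {"text": text, "voice_settings": settings}


if __name__ == "__main__":
    for emotion in EMOTION_OVERRIDES:
        print(emotion, build_payload("We have some news to share.", emotion))
```

Changing one value at a time makes it easier to hear which setting is responsible for a difference.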

Task 3: Multi-Character Voice Transformation

Creating immersive audio experiences with multiple characters traditionally requires a full voice acting team, extensive recording sessions, and professional audio editing. Independent creators often spend months developing single audio projects.

How It Works

Claude transforms your single voice recording into multiple distinct character voices:

Maps different character archetypes to voice modifications
Applies character-specific vocal characteristics and speech patterns
Creates seamless transitions between character voices
Maintains narrative continuity across character changes

Example Prompt

Create an atmospheric audio experience where my voice [voice ID: <insert voice ID from my voice list on ElevenLabs>] transforms into different mythical creatures as I narrate an adventure story - dragon, fairy, wizard, and forest spirit.

What Claude Returns

Claude generates a complete multi-character audio experience with your voice morphed into four distinct mythical beings.

The Ancient Dragon

Shimmer, the Fairy

Elderoak, the Wizard

The Forest Spirit

The four characters differ only slightly in their voice characteristics, and all of them sound heavily accented. You can adjust the stability, style, similarity boost, and speed settings manually for better results.

Each of these tasks demonstrates how Claude can transform voice content creation from a technical, resource-intensive process into conversational audio production that delivers professional-quality results.

You can experiment with these voice generation techniques using simple text prompts, opening up new possibilities for content creators, educators, and business professionals.

As a fun bonus exercise, I created an AI agent that speaks like a film noir detective and can answer questions about classic movies. Here’s how it went:

End Note

That's all from us for this week.

From WhatsApp to Reddit to Google Tasks to investment portfolios, we've explored ways to make Claude work for you. Now you're adding voice AI capabilities to your toolkit, turning complex audio production into simple conversations.

Having Claude clone your voice with different emotional tones, create conversational AI agents, and transform single recordings into multi-character audio experiences feels like having a professional voice studio available 24/7. The quality of voice cloning, character transformation, and audio agent development surprised us.

Next week, we'll explore even more practical MCP connections you can build in minutes. Until then, try connecting Claude to ElevenLabs and see what happens when your AI can generate professional voice content and create interactive audio experiences.

Three ElevenLabs + Claude experiments to try this weekend:

Ask Claude, "Create distinct voice personas for a fantasy podcast featuring a gruff dwarf merchant, an ethereal elf mage, and a cheerful halfling bard. Generate sample dialogue between them for a tavern scene." Watch it build character voices and craft interactive conversations.
Challenge Claude to create training content: "Generate voices for different workplace scenarios featuring a stern manager, an enthusiastic new employee, and a seasoned mentor. Create role-playing dialogue for a performance review meeting." Hear how AI-voiced role-play could stand in for expensive corporate training programs.
Have Claude build educational tools: "Transform my voice to sound like native speakers of Spanish, French, and Italian saying the same phrases. Help me practice pronunciation by creating clear audio examples for language learning." Get insights into accent patterns and pronunciation techniques.

Start small by testing voice cloning with simple phrases. If you get stuck, reply to this email and we'll walk you through the setup.

Which other daily content creation task would you automate first? Let us know what you'd like Claude to handle for you next.

Share the love ❤️ Tell your friends!

If you liked our newsletter, share this link with your friends and request them to subscribe too.

Check out our website to get the latest updates on AI.
