Since the release of the ElevenLabs v3 model, these have been known as Audio Tags. They are also called Prompting Tags or Speech Control Tags.

There isn’t a strict, finite dictionary of tags, because the AI model is trained to interpret natural language cues. You can often type any descriptive word or action in square brackets, and the model will attempt to perform it. However, ElevenLabs has documented a set of primary tags that the system reliably recognizes; they are grouped by category below.

1. Emotional Tone

These dictate the mood and feeling of the speaker’s delivery.
Tags: [happy], [excited], [sad], [angry], [nervous], [frustrated], [tired], [curious], [mischievously], [serious], [confident], [ironic], [cheerful], [sorrowful]

2. Reactions & Non-Verbal Sounds

These insert human-like sounds and unscripted natural reactions into the audio.
Tags: [laughs], [laughing], [laughs harder], [starts laughing], [wheezing], [sighs], [exhales], [crying], [clears throat], [gulps], [gasp], [breathes], [snorts]

3. Delivery Style & Volume

These adjust the energy, volume, and specific performance style of the read.
Tags: [whispers], [whispering], [shouts], [shouting], [quietly], [loudly], [flatly], [monotone], [with expression], [sarcastic], [dramatic], [matter-of-fact], [whiny], [explaining]

4. Pacing & Timing

These tags manipulate the rhythm, speed, and flow of the speech.
Tags: [pause], [long pause], [rushed], [slow], [slows down], [deliberate], [rapid-fire], [stammers], [drawn out], [repeats], [timidly], [continues after a beat]

5. Emphasis & Dialogue Mechanics

These are used to shape sentence stress or manage multi-character conversational dynamics.
Tags: [emphasized], [stress on next word], [understated], [interrupting], [overlapping]

6. Character Archetypes & Accents

You can direct the AI to take on a specific persona or accent mid-sentence without changing the base voice model.
Tags: [pirate voice], [French accent], [British accent], [Australian accent], [Southern US accent], [evil scientist voice], [childlike tone], [fantasy narrator], [sci-fi AI voice], [classic film noir]
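
Since any bracketed phrase is treated as a tag but only some are documented, it can help to check a script before generating audio. The following is a minimal sketch, not an ElevenLabs tool: the names `DOCUMENTED_TAGS`, `find_tags`, `undocumented_tags`, and `strip_tags` are hypothetical, and the tag set is only a small subset copied from the lists above.

```python
import re

# Hypothetical subset of the documented tags listed above, grouped by category.
DOCUMENTED_TAGS = {
    "emotion": {"happy", "excited", "sad", "angry", "nervous"},
    "reaction": {"laughs", "sighs", "exhales", "clears throat", "gulps"},
    "delivery": {"whispers", "whispering", "shouts", "sarcastic"},
    "pacing": {"pause", "long pause", "rushed", "slows down"},
}

# Matches anything between a single pair of square brackets.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def find_tags(script: str) -> list[str]:
    """Return every bracketed tag in the script, lowercased, in order."""
    return [tag.strip().lower() for tag in TAG_PATTERN.findall(script)]


def undocumented_tags(script: str) -> list[str]:
    """Return tags that are not in the (partial) documented set."""
    known = set().union(*DOCUMENTED_TAGS.values())
    return [tag for tag in find_tags(script) if tag not in known]


def strip_tags(script: str) -> str:
    """Remove all tags, e.g. to produce an untagged version for comparison."""
    without_tags = TAG_PATTERN.sub("", script)
    return re.sub(r"\s{2,}", " ", without_tags).strip()


script = "[nervous] I think someone is out there. [gulps] [spooky] Did you hear that?"
print(find_tags(script))
print(undocumented_tags(script))
print(strip_tags(script))
```

An undocumented tag is not necessarily wrong, because the model may still interpret it. The check only highlights which tags are outside the list you chose to trust.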

How to use them effectively:

Tags are case-insensitive: [SIGH] works the same as [sigh]. You can also stack tags to create layered performances. For example, "[nervous][whispering] I think someone is out there." combines the emotion with the delivery style. The effect of a tag generally persists until a new tag is introduced or the paragraph ends.
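
The stacking and persistence rules above can be sketched as a small script builder. This is an illustrative helper under stated assumptions, not part of any ElevenLabs SDK: `with_tags` and `build_script` are hypothetical names, and lowercasing the tags is only for consistency, since case is described as not mattering.

```python
def with_tags(text: str, *tags: str) -> str:
    """Prefix text with stacked tags, e.g. '[nervous][whispering] text'."""
    prefix = "".join(f"[{tag.strip().lower()}]" for tag in tags if tag.strip())
    return f"{prefix} {text}" if prefix else text


def build_script(paragraphs: list[tuple[str, list[str]]]) -> str:
    """Join tagged paragraphs with blank lines.

    Each paragraph carries its own tags because, as described above,
    a tag's effect is assumed to end when the paragraph ends.
    """
    return "\n\n".join(with_tags(text, *tags) for text, tags in paragraphs)


line = with_tags("I think someone is out there.", "nervous", "WHISPERING")
print(line)

script = build_script([
    ("I think someone is out there.", ["nervous", "whispering"]),
    ("Never mind, it was just the cat.", ["sighs"]),
])
print(script)
```

The resulting string is what you would paste into the text field or send as the text of a speech request. Repeating the tags at the start of each paragraph keeps the delivery consistent if the effect does reset at paragraph boundaries.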
