[{"data":1,"prerenderedAt":2175},["ShallowReactive",2],{"blog-index-en":3},[4,223,406,554,827,996,1214,1398,1660,1821,2001],{"id":5,"title":6,"author":7,"body":8,"description":208,"extension":209,"meta":210,"navigation":211,"ogImage":212,"path":213,"publishedAt":214,"readingTime":215,"seo":216,"stem":217,"tags":218,"updatedAt":212,"__hash__":222},"blog\u002Fblog\u002Fen\u002Fai-audiobook-guide-2026.md","How to turn any text into an AI audiobook in 2026","Narrator AI",{"type":9,"value":10,"toc":194},"minimark",[11,15,18,23,26,29,33,36,39,60,63,67,70,73,76,80,83,86,89,93,96,99,102,113,116,120,123,126,130,133,136,147,150,154,157,168,171,174,177,181,184,188,191],[12,13,14],"p",{},"For a long time I didn't believe you could just upload a file and get back an audiobook you'd actually want to finish. Every attempt before about 2024 ended the same way: five minutes in, ears go heavy, brain checks out, robot voice droning through the fog. Then I tried what's available now — and closed the freelance narrator price page that had been sitting in my browser tabs for six weeks.",[12,16,17],{},"This piece is about what actually changed, and where the holes still are so you don't fall into them.",[19,20,22],"h2",{"id":21},"what-ai-narration-even-means-now","What \"AI narration\" even means now",[12,24,25],{},"Short version: it sounds like a person. Longer version: it sounds like a person who, unlike the actual person, isn't tired, didn't trip on a word, didn't go on vacation, and didn't ask for a re-record fee.",[12,27,28],{},"The technical reason is straightforward. Old TTS stitched together prerecorded phonemes — that's where the call-center voice came from. New models generate the whole audio waveform conditioned on the surrounding sentence, sometimes the whole paragraph. So when a character is angry, the voice doesn't just hit a louder volume on emoji-style cues — it actually sounds tighter, lower, closer to the teeth. 
Not flawless, but in a blind test I'm right about half the time. Two years ago I was right within a second.",[19,30,32],{"id":31},"what-you-can-throw-at-it","What you can throw at it",[12,34,35],{},"Almost any text format you've got: epub, fb2, txt, markdown, doc. A fanfic from AO3 or FFN. A lecture transcript. A 4-year-old unpublished novel that's been sitting in your drafts being shy about being read. I've even pushed my own year of journal entries through it and listened in the car like a summary of my own head.",[12,37,38],{},"What it dislikes:",[40,41,42,46,54,57],"ul",{},[43,44,45],"li",{},"PDFs, especially scanned ones — extract the text first or you get garbage.",[43,47,48,49,53],{},"Books with math and formulas. AI still can't read ",[50,51,52],"code",{},"∫(x²+1)dx"," gracefully, and I wouldn't pretend it can.",[43,55,56],{},"Footnotes and reference markers. They either get dropped or read in the same flat tone, which destroys the flow in academic writing.",[43,58,59],{},"Poetry. It's better than it was, but most models still don't really get rhythm.",[12,61,62],{},"Languages: English and Russian are tier one. German, French, Spanish — fine. Smaller languages — varying.",[19,64,66],{"id":65},"how-the-pipeline-runs","How the pipeline runs",[12,68,69],{},"I'll spare you the twelve-step recipe, because once you open the actual UI it's mostly self-evident. Quick version:",[12,71,72],{},"You upload the file. The service splits it into chapters and paragraphs, walks the text, and tries to figure out who's in it. For a novel with dialogue, it picks up speakers and assigns each a voice — male, female, age, sometimes character archetype. You can take the suggestions or rework them. Then it renders.",[12,74,75],{},"Render time scales with length. A short story finishes in a few minutes. An essay collection in half an hour. A full novel in a few hours, sometimes overnight if the queue is busy or the provider is using batch APIs. 
Decent services (we're one) prioritize the first chapters so you can start listening before the rest is done — useful if quality turns out to be off and you don't want to wait through the whole render.",[19,77,79],{"id":78},"money","Money",[12,81,82],{},"Pricing is almost universally per character, not per audio minute — which is right, because 1000 characters comes out to roughly a minute of audio, and the exact ratio depends on language and voice tempo.",[12,84,85],{},"In real numbers: a 50k-character short story comes in cheaper than a movie ticket. A 600k-character novel costs about as much as a decent dinner out. A 1.5M-character tome lands around the price of a mid-range board game. Not \"cheap as coffee,\" but not the kind of money you regret either.",[12,87,88],{},"Subscriptions are mostly gone. Most services (us included) bill per use: you narrate, you pay, you keep. No silent monthly charges on something you forgot you signed up for.",[19,90,92],{"id":91},"per-character-voices","Per-character voices",[12,94,95],{},"The single most felt improvement in the last couple of years is that one narrator no longer reads everyone. Anna and Maria used to sound the same and you guessed by intonation which mom was talking, which daughter. Now Anna gets a low warm voice, Maria gets a clearer mid-range, the narrator stays neutral, and you stop flipping back to figure out who said what.",[12,97,98],{},"For thrillers and mysteries, where the plot literally hinges on \"who said that,\" this is a quality-of-life upgrade you don't realize you needed until you have it. 
For non-fiction, a single voice is fine.",[12,100,101],{},"What you typically get to control on top of voice picks:",[40,103,104,107,110],{},[43,105,106],{},"Style hints — \"cold, detached,\" \"warm, conversational.\"",[43,108,109],{},"Age range — teen, adult, older.",[43,111,112],{},"Per-character voice override.",[12,114,115],{},"A piece of advice that saved me hours of re-listening: don't trust the auto-assignment for your main characters. For background players it's fine, you can't really tell. But for the protagonist, listen to three voice options before committing the whole book.",[19,117,119],{"id":118},"emotion-without-overdoing-it","Emotion — without overdoing it",[12,121,122],{},"Modern models do emotional coloring, and the temptation is to crank it. Don't. Real audiobook narrators read more flatly than you'd think — most of the work happens in the pauses and the pacing, not in screams. Push the emotion sliders all the way and after ten minutes you need a break.",[12,124,125],{},"We have a note pinned in our internal memory on this: light coloring only when the intensity is genuinely high. Sadness — yes. Tension — yes. Hysterics and screaming — almost never. Not because the model can't, but because nobody actually wants to listen to that for fifteen hours.",[19,127,129],{"id":128},"how-long-is-one-book-really","How long is one book, really",[12,131,132],{},"A 300-page novel is 12-15 hours of audio at standard pace. That's a lot. 
Most people I know listen at 1.25x — clarity holds, and a book finishes in a couple of work weeks of commute time.",[12,134,135],{},"Render times in our setup:",[40,137,138,141,144],{},[43,139,140],{},"Tiny (under 10k chars) — minutes.",[43,142,143],{},"Average (under 100k) — about a quarter of an hour, give or take.",[43,145,146],{},"Full novel — a few hours.",[12,148,149],{},"If the queue is busy, the service will still ship the early chapters first so you can start listening while the tail catches up.",[19,151,153],{"id":152},"where-ai-still-breaks","Where AI still breaks",[12,155,156],{},"I'm not going to pretend it's perfect. There are real bumps.",[12,158,159,160,163,164,167],{},"Stress in proper nouns, especially non-English ones, lands wrong about half the time. \"Bertholt,\" \"Proust,\" \"Margarita Therese\" — flip a coin. On your own book this can be patched with stress markers in the source (",[50,161,162],{},"за+мок"," instead of ",[50,165,166],{},"замок","), but spending an evening hand-marking a list of names is not the high point of the workflow.",[12,169,170],{},"Charts, tables, infographics — gone. If a book leans on visual material, audio won't carry it, and no AI will fix that.",[12,172,173],{},"Russian and other Cyrillic-script languages occasionally trip on archaic vocabulary. 19th-century classics are listenable, but every once in a while a word comes out that the model clearly hadn't seen and is guessing on. Annoying if you're a linguist or an editor. Almost invisible if you're not.",[12,175,176],{},"And there's the personal-fatigue thing. Some listeners' brains, after about 40 minutes, sense that something's off, even if they can't say what. Most don't. Just a fact, no fixing it.",[19,178,180],{"id":179},"your-books-vs-the-public-catalog","Your books vs the public catalog",[12,182,183],{},"A lot of services (us included) split this in two: your own book — you upload, you listen, no copyright issue under fair personal use. 
Public catalog — the service licenses popular titles and gives them to everyone. Personal listening doesn't bump into rights. Commercial distribution of AI-narrated work is a separate topic, and by default it's not allowed almost everywhere — you need explicit licenses and rights-holder consent.",[19,185,187],{"id":186},"what-id-do-the-first-time","What I'd do the first time",[12,189,190],{},"Don't go straight to a thousand pages. Pick a short story or novella — something you'd read in a couple of hours, where you can feel how it actually sits in your ears. Within ten minutes of upload you'll have the first finished chapter. If it works, push the rest. If not, fiddle with voices, tempo, style, try again. Odds are by the third or fourth pass you'll find a combination where the book sounds like yours.",[12,192,193],{},"That's how I assembled my first one. It still lives in my commute playlist, and somewhere around hour eight I stop noticing it isn't a person reading.",{"title":195,"searchDepth":196,"depth":196,"links":197},"",2,[198,199,200,201,202,203,204,205,206,207],{"id":21,"depth":196,"text":22},{"id":31,"depth":196,"text":32},{"id":65,"depth":196,"text":66},{"id":78,"depth":196,"text":79},{"id":91,"depth":196,"text":92},{"id":118,"depth":196,"text":119},{"id":128,"depth":196,"text":129},{"id":152,"depth":196,"text":153},{"id":179,"depth":196,"text":180},{"id":186,"depth":196,"text":187},"What AI narration actually does in 2026, what it costs, where it still fails, and how to get started without wasting a weekend.","md",{},true,null,"\u002Fblog\u002Fen\u002Fai-audiobook-guide-2026","2026-04-15",8,{"title":6,"description":208},"blog\u002Fen\u002Fai-audiobook-guide-2026",[219,220,221],"guide","basics","getting 
started","Q0WeUB2tYOr9UMslRIMkNmq1OdywpWtWtVcS5QkPiBg",{"id":224,"title":225,"author":7,"body":226,"description":394,"extension":209,"meta":395,"navigation":211,"ogImage":212,"path":396,"publishedAt":397,"readingTime":398,"seo":399,"stem":400,"tags":401,"updatedAt":212,"__hash__":405},"blog\u002Fblog\u002Fen\u002Fbest-ai-voices-russian.md","Best AI voices for Russian audiobooks in 2026",{"type":9,"value":227,"toc":385},[228,231,234,238,241,248,254,260,266,272,275,279,282,285,288,291,294,302,306,309,312,315,319,322,325,335,339,342,345,348,352,355,358,361,365,368,382],[12,229,230],{},"The first time I heard one of my own books in synthesized Russian was about five years ago, and I closed the tab forty seconds in. It was the call-center robot reading my text like he was being interrogated. In 2026 I put an audiobook on in the background, do other things, and an hour later catch myself forgetting it's a machine. Quality stopped being the problem. The problem now is choice.",[12,232,233],{},"Here's what actually works on Russian in 2026, and what each option fits. No price lists, no marketing screenshots — just what I've heard with my own ears.",[19,235,237],{"id":236},"whats-on-the-market","What's on the market",[12,239,240],{},"The list of voices worth seriously considering for Russian is short.",[12,242,243,247],{},[244,245,246],"strong",{},"Gemini TTS"," by Google — what we run in production. The current generation sounds nearly indistinguishable from a live narrator, handles emotion sensibly, and is careful with stress. For literary fiction, the best I've heard.",[12,249,250,253],{},[244,251,252],{},"Silero"," — open-source, free for personal use, rare in production. Limited voice roster, but quality for home projects is more than adequate.",[12,255,256,259],{},[244,257,258],{},"ElevenLabs"," — leader in voice cloning, but their Russian historically trails their English. If you specifically need a clone of your own voice for a podcast, yes. 
For just reading a book, there are better options.",[12,261,262,265],{},[244,263,264],{},"Yandex SpeechKit"," — solid quality, narrow selection, not the obvious pick for fiction. For technical use (navigation, IVR, system messages) it's excellent.",[12,267,268,271],{},[244,269,270],{},"Tinkoff Voice TTS"," — corporate API. Quality close to Yandex, distinct voice roster.",[12,273,274],{},"If you want it stripped down: in 2026 I take Gemini for fiction, Yandex or Tinkoff for technical and system content, ElevenLabs when I need to clone a specific voice (and I accept the Russian quality trade-off).",[19,276,278],{"id":277},"voice-for-the-kind-of-text","Voice for the kind of text",[12,280,281],{},"There's no universal answer, just patterns that work most of the time.",[12,283,284],{},"Literary fiction wants a warm mid-to-low male voice or a soft female voice, no strong accent. Pace slightly slow, around 0.95x. In our roster something like Charon (low male) or Leda (soft female) for Russian, depending on the protagonist.",[12,286,287],{},"Non-fiction, business books, self-help — different register. Wants a more businesslike voice, fewer emotions. Pace can pick up, around 1.1x. Listeners in those genres often have it on in the background; they want a steady stream rather than artful pauses.",[12,289,290],{},"Mystery and thriller. I'd take a male voice slightly below mid-range, moderate pace, minimal emotion. Too much vocal performance hurts here — what you want is a steady delivery where the dread slips past you, and a second later you realize what you just heard.",[12,292,293],{},"Children's books — softer female delivery, slightly higher pitch, slightly slower. Many models now ship dedicated children's voices; a regular female voice with the right style hint also works.",[12,295,296,297,301],{},"Classics — neutral \"literary\" voice. Not too young, not too old, no emotional coloring, even pace. 
Goal: don't ",[298,299,300],"em",{},"perform"," the classic, just read it cleanly enough that the text stays in front.",[19,303,305],{"id":304},"casting-in-dialogue-heavy-books","Casting in dialogue-heavy books",[12,307,308],{},"When several characters speak, you're not picking one voice anymore — you're picking several. Auto-casting quality varies a lot between services.",[12,310,311],{},"If the text says \"Anna said,\" every service will route the line to her. If dialogue runs without attribution, services start guessing from context, and that's where misses happen. So before render I always walk the character list manually, particularly for the main cast. Background characters can ride on automation.",[12,313,314],{},"One small thing that saved me re-listens: don't make every character contrast dramatically. Ten radically different voices fatigue the ear. Three or four \"anchor\" voices for main characters and the rest in their neighborhood, with small timbre shifts. The book ends up sounding like an ensemble, not a costume drama.",[19,316,318],{"id":317},"the-russian-fear-stress","The Russian fear: stress",[12,320,321],{},"\"За́мок\" or \"замо́к\"? \"Доро́га\" or \"дорога́\"? Without context, no model can be sure, and older TTS broke on this constantly.",[12,323,324],{},"In 2026 Gemini lands stress correctly somewhere around 95–97% of the time. Misses tend to cluster in proper nouns, especially non-Russian — Bertholt, Proust, Jorge, Kierkegaard. And in archaic vocabulary that's underrepresented in training data. For 19th-century classics, you'll occasionally hear it.",[12,326,327,328,330,331,334],{},"A trick I lean on: some services accept manual stress markers in the source — ",[50,329,162],{},", ",[50,332,333],{},"до+рога",". If a recurring word in your book lands wrong, five minutes of marking fixes it for the whole book. 
I started doing this seriously after one book burned three full re-renders on a single name.",[19,336,338],{"id":337},"cloning-your-own-voice","Cloning your own voice",[12,340,341],{},"The question that comes up constantly: \"can I record five minutes of myself and have it narrate my book?\" Technically yes — ElevenLabs does this; Gemini doesn't officially yet.",[12,343,344],{},"Quality on English clones is genuinely good, and people do use them for podcasts. On Russian, clones are acceptable but audibly thinner than the original. Useful for personal projects: your own podcast in your own clone, memoirs in your own voice, journal entries. For commercial projects, it's harder — and not just technically. You're stepping into questions about voice rights, consent for cloning, ethics.",[12,346,347],{},"If I were shipping a book commercially right now, I wouldn't clone. In a year, maybe — the technology will catch up.",[19,349,351],{"id":350},"where-id-start-as-a-beginner","Where I'd start as a beginner",[12,353,354],{},"If this is your first time and you don't know the landscape, I'd do this.",[12,356,357],{},"Pick a service running on Gemini TTS — currently the leader on Russian, no need to overthink it. Don't manually pick voices for the first pass; let the automation do its thing and see what you get. Default pace 1.0; speed-up belongs in the player, not in the source render. Listen to the first chapter. In about 80% of cases it'll be fine as-is.",[12,359,360],{},"The remaining 20% is where the real fiddly work begins — picking specific voices, marking stress on names, adding style hints. There are no \"the right way\" answers here, only your ears and your book.",[19,362,364],{"id":363},"a-simple-quality-test","A simple quality test",[12,366,367],{},"A trick I run on any service. Take a five-minute slice of your real text — not a clean one, a real one, with dialogue, hard words, an emotional beat. Render, listen in headphones. 
Four things to catch:",[40,369,370,373,376,379],{},[43,371,372],{},"Unnatural pauses where there shouldn't be any.",[43,374,375],{},"Stress on hard words, correct or not.",[43,377,378],{},"Voice changes between speakers in dialogue.",[43,380,381],{},"\"Synthesis\" leaking on long sentences — that metallic aftertaste.",[12,383,384],{},"If all four are clean, push the rest of the book. If even one is genuinely off, find another voice or another service. Don't talk yourself into \"I'll get used to it.\" Over fifteen hours of listening, that's a very expensive habit to develop.",{"title":195,"searchDepth":196,"depth":196,"links":386},[387,388,389,390,391,392,393],{"id":236,"depth":196,"text":237},{"id":277,"depth":196,"text":278},{"id":304,"depth":196,"text":305},{"id":317,"depth":196,"text":318},{"id":337,"depth":196,"text":338},{"id":350,"depth":196,"text":351},{"id":363,"depth":196,"text":364},"Which AI models actually sound natural in Russian, which to avoid, and what to pick for fiction vs non-fiction.",{},"\u002Fblog\u002Fen\u002Fbest-ai-voices-russian","2026-04-10",6,{"title":225,"description":394},"blog\u002Fen\u002Fbest-ai-voices-russian",[402,403,404],"voices","comparison","russian","2v7EK3J-qqx4IGN97_k3v7PebrtM71sbUcQt44gco7w",{"id":407,"title":408,"author":7,"body":409,"description":543,"extension":209,"meta":544,"navigation":211,"ogImage":212,"path":545,"publishedAt":546,"readingTime":547,"seo":548,"stem":549,"tags":550,"updatedAt":212,"__hash__":553},"blog\u002Fblog\u002Fen\u002Fvoice-fanfic-tutorial.md","How to turn a fanfic into an audiobook in 10 minutes",{"type":9,"value":410,"toc":534},[411,414,417,421,424,427,430,434,437,447,450,453,456,460,463,466,469,472,475,478,482,485,488,491,494,498,501,504,511,514,518,521,524,527,531],[12,412,413],{},"Fanfic is probably the only literature that exists, in full, only for the eyes. Professional narrators don't read fanfic — there's no publisher to pay one. 
Meanwhile readers are out there grinding through 300- and 500-thousand-word epics on AO3, and by the end of a long sequel everyone's eyes hurt. I've heard \"I wish this was audio\" a lot — and ten minutes of work in a service like ours now closes that gap.",[12,415,416],{},"Here's how it actually goes, on a typical 70–100k-word fic.",[19,418,420],{"id":419},"what-to-gather-before-starting","What to gather before starting",[12,422,423],{},"The text itself, as a file. AO3 has a Download button in the top right of every work that gives you EPUB — that's the ideal source, full chapter structure included. FFN's export options are limited; easiest there is to copy text into a notepad and save as .txt. If the fic is split across many chapters on the site, stitch them into one file before uploading.",[12,425,426],{},"An account on an AI narration service. Could be ours, could be a competitor — for short fics the differences are small.",[12,428,429],{},"Headphones. Even cheap earbuds. Speakers smooth artifacts, and your impression of chapter one in headphones is the more honest one.",[19,431,433],{"id":432},"the-actual-steps","The actual steps",[12,435,436],{},"Upload the file. Then go through this in order.",[12,438,439,440,330,443,446],{},"Check how the service split the text into chapters. If it gave you one monolithic block, your source doesn't have chapter markers. Two minutes in a notepad fixes it: add ",[50,441,442],{},"# Chapter 1",[50,444,445],{},"# Chapter 2"," lines before each chapter break. Without structure, the player can't jump and listening gets harder.",[12,448,449],{},"Review the character list the service detected. Auto-detection works by scanning names and dialogue tags. Fandom fics with rare names sometimes confuse it. Walk through the list and make sure every canon character has something reasonable assigned.",[12,451,452],{},"Cast voices for the main pairing or core characters. 
This is the important part, and on a fic about a specific couple I'd spend ten minutes here, not five. The voice you hear Snape in, or Loki, or Aziraphale, is half the success. Side characters are fine on automation.",[12,454,455],{},"Start the render. A 70k-word fic finishes in ten to twenty minutes. Time enough to make tea.",[19,457,459],{"id":458},"voices-for-canon-characters","Voices for canon characters",[12,461,462],{},"There's no universal table, but a few patterns hold across fandoms.",[12,464,465],{},"Brooding adult men — Snape, Loki, post-timeskip Viktor, older Allen Walker — low male voice, measured pace, a touch of cold in the style hint. \"Cold, detached\" is usually enough.",[12,467,468],{},"Teen protagonists — slightly higher than mid-range, male or female by canon, moderate pace, more emotion than baseline.",[12,470,471],{},"Wise mentor figures — Dumbledore, Gandalf, Iruka, Ironwood when written kindly — older male, slow, warm. The Gandalf template still works, even outside its own fandom.",[12,473,474],{},"Villains. Don't reach for an \"evil voice.\" Seriously. Tom Riddle, Aizen, end-arc Ozpin all read best in an even, calm, slightly cold voice — no theatrics, no laughter. Cold restraint is more frightening than shouted rage.",[12,476,477],{},"Most decent services give you five to ten male voices and the same number of female. I'd spend three minutes listening to each on a short phrase before committing — it's worth the time.",[19,479,481],{"id":480},"where-this-usually-breaks","Where this usually breaks",[12,483,484],{},"No chapter breaks in the source. Already covered above — two minutes in notepad to add headers.",[12,486,487],{},"Dialogue without attribution. If a chunk of replies skip \"she said,\" \"he answered\" tags, AI starts mixing speakers up. In short scenes that's fine; in long two-character dialogues it's audibly off. For important scenes you can hand-add attributions; figure five minutes per chapter.",[12,489,490],{},"Foreign names. 
Especially from East Asian fandoms — manga, anime, K-pop. Transliterations of names like Satoru, Kyojuro, Taehyung, or Yuri Katsuki can land badly in English models. Hack: listen to the first two chapters, and if a name comes out warped, write a phonetic spelling in the source (or use stress markers if the service supports them).",[12,492,493],{},"Long author's notes. A lot of authors front-load chapters with \"A\u002FN: thanks to my beta,\" disclaimers, or \"sorry, this got angsty instead of fluffy.\" AI reads these as part of the chapter, and in audio it's jarring. Strip them out before upload, or move them to a separate section.",[19,495,497],{"id":496},"rights-and-the-unwritten-rules","Rights and the unwritten rules",[12,499,500],{},"Fanfic is copyright's grey zone, and there are unwritten fandom norms that you'd do well not to break.",[12,502,503],{},"For personal listening — no problem at all. You narrated your favorite fic for yourself, you're listening on the train, nobody's coming after you. Functionally identical to printing the text for your own reading.",[12,505,506,507,510],{},"For public distribution — different story. If you want to put the audio version on YouTube or in a Discord, three things to keep in mind. First, contact the author beforehand. Most fanfic authors are ",[298,508,509],{},"happy"," about fan adaptations as long as they're credited; some ask for explicit approval. Second, credit the author in every chapter, with a link to the original. Third, don't monetize. Donation-based (\"if you enjoy, support\") is generally accepted; outright sales of fan-made audio are not.",[12,512,513],{},"This isn't law, it's the fandom code. Breaking it gets your channels reported and sometimes pulled.",[19,515,517],{"id":516},"long-fics-are-their-own-thing","Long fics are their own thing",[12,519,520],{},"Half-million-word epics aren't \"narrate in ten minutes,\" they're a real project. 
Render takes hours, sometimes overnight.",[12,522,523],{},"Decent services (we're one) prioritize the early chapters — meaning your first hour of audio is ready in thirty minutes, while the remaining twenty hours cook in the background. Don't wait for the full thing — start listening, the tail catches up.",[12,525,526],{},"Casting for long fics matters more, especially for the main pairing and recurring side characters. For minor characters that show up in three out of a hundred scenes, I wouldn't bother — let the automation sort them.",[19,528,530],{"id":529},"short-version","Short version",[12,532,533],{},"Ten minutes for a one-shot, a couple of hours of attention for an epic. Your first audio fanfic is the hardest one — after that they stack up fast. I have dozens in the commute playlist now, and I've stopped noticing it isn't a person reading.",{"title":195,"searchDepth":196,"depth":196,"links":535},[536,537,538,539,540,541,542],{"id":419,"depth":196,"text":420},{"id":432,"depth":196,"text":433},{"id":458,"depth":196,"text":459},{"id":480,"depth":196,"text":481},{"id":496,"depth":196,"text":497},{"id":516,"depth":196,"text":517},{"id":529,"depth":196,"text":530},"Step-by-step guide for AO3 and fanfic readers. 
Convert text fanfic into a ready-to-stream audiobook without any magic.",{},"\u002Fblog\u002Fen\u002Fvoice-fanfic-tutorial","2026-04-01",5,{"title":408,"description":543},"blog\u002Fen\u002Fvoice-fanfic-tutorial",[551,219,552],"fanfic","practical","xj20wKXkaA8imZ0YSEMMUiwB0RDb6V30sz1CeF4IpI4",{"id":555,"title":556,"author":7,"body":557,"description":817,"extension":209,"meta":818,"navigation":211,"ogImage":212,"path":819,"publishedAt":820,"readingTime":821,"seo":822,"stem":823,"tags":824,"updatedAt":212,"__hash__":826},"blog\u002Fblog\u002Fen\u002Faudiobook-generators-comparison.md","5 AI audiobook generators compared — 2026",{"type":9,"value":558,"toc":805},[559,562,565,569,572,592,596,599,602,605,609,612,615,618,622,625,628,631,635,638,641,644,648,651,654,657,661,757,761,764,767,770,773,776,780,783,786,789,792,795,799,802],[12,560,561],{},"I sat down over a weekend and ran the same 30,000-character short story through five different AI narration services. Same conditions everywhere: prose with dialogue, mixed register, a couple of tricky words with non-standard stress thrown in as traps. No marketing screenshots, no promises — just what I heard in my headphones.",[12,563,564],{},"I'm calling them A through E. Not for drama; just because reviews of specific names go stale fast, while patterns (where the leader is, where the budget tier sits, where the dud is) hold. If you're in the space, you'll guess them.",[19,566,568],{"id":567},"how-i-scored","How I scored",[12,570,571],{},"One story. One reviewer (me). Listened on headphones, jotted impressions in real time, didn't go back for a second pass. 
I looked at:",[40,573,574,577,580,583,586,589],{},[43,575,576],{},"Audio quality — robotic feel, artifacts, flatness on long sentences;",[43,578,579],{},"Stress on twenty pre-selected \"trap\" words (names, Latinisms, jargon);",[43,581,582],{},"Dialogue handling — distinct voices or not;",[43,584,585],{},"Emotion and pace — natural or oversold;",[43,587,588],{},"Render time;",[43,590,591],{},"What it cost me.",[19,593,595],{"id":594},"service-a","Service A",[12,597,598],{},"Audio is nearly indistinguishable from a live narrator. If I played it through a speaker and someone walked into the room, they wouldn't know it was synthesis. Eighteen of twenty stresses correct; the two misses were a foreign name and a rare archaism — minor but audible.",[12,600,601],{},"Dialogue split perfectly, all four characters got their own voice. Emotion was restrained, no scenes oversold. Twelve minutes to render. About $2.50 for the whole story.",[12,603,604],{},"If I were picking a \"default for a book,\" this would be it. Main downside: cost adds up at scale.",[19,606,608],{"id":607},"service-b","Service B",[12,610,611],{},"The budget option, and you can hear it. Quality is good but with a faint electronic edge on long sentences. On factual content (news, weekly recaps, lectures) it doesn't matter at all. On literary prose, ten minutes in, it starts to nag.",[12,613,614],{},"Sixteen of twenty on stresses — not bad. Dialogue uses only two voices, male and female, so multiple male characters all sound the same. That's an audible limitation, not a stylistic choice.",[12,616,617],{},"Almost no emotion. Five minutes to render. Around $1.30. Has its niche: technical content at low cost.",[19,619,621],{"id":620},"service-c","Service C",[12,623,624],{},"Closer to A than to B in quality, but doesn't quite get there. Best stress score in the test — nineteen of twenty. 
Default voice is noticeably \"warmer\" than the rest, and I caught myself pairing texts to it on purpose.",[12,626,627],{},"Downside: it overshoots emotion in places. Where the AI should be calm with a slight edge, it goes nearly into shouting. A scene where the heroine is firmly chastising someone came out sounding like she'd caught them red-handed.",[12,629,630],{},"Eight minutes, $3.30. For romance and children's books I'd try this first.",[19,632,634],{"id":633},"service-d","Service D",[12,636,637],{},"Western multilingual service. Reportedly excellent on English; on Russian you can hear the accent. Not bad-synthesizer accent — capable-foreigner-who-learned-the-language accent. Technically clean, but not native.",[12,639,640],{},"Fourteen of twenty on stresses, which is the kind of number that matters. You can tell each word was pronounced \"by the rules\" without an understanding of the sentence. Dialogue split well, four voices. Emotion was flat.",[12,642,643],{},"Fifteen minutes, $4.10. Wouldn't recommend for Russian projects. For bilingual where English is primary — maybe.",[19,645,647],{"id":646},"service-e","Service E",[12,649,650],{},"Reliable middle. Eight on audio, seventeen of twenty on stresses, no surprises in either direction. It auto-detected the speakers and assigned voices; I had to swap two manually, the rest fit.",[12,652,653],{},"Emotion moderate, doesn't get in the way. Twenty minutes to render — slowest in the test, only real weakness. $3.00.",[12,655,656],{},"If someone asks \"give me something I won't regret,\" this is the answer. 
Doesn't shine in any single dimension, doesn't fail in any either.",[19,658,660],{"id":659},"on-one-line","On one line",[662,663,664,683],"table",{},[665,666,667],"thead",{},[668,669,670,674,677,680],"tr",{},[671,672,673],"th",{},"Rank",[671,675,676],{},"Service",[671,678,679],{},"Strength",[671,681,682],{},"Weakness",[684,685,686,701,715,729,743],"tbody",{},[668,687,688,692,695,698],{},[689,690,691],"td",{},"1",[689,693,694],{},"A",[689,696,697],{},"Audio + stress",[689,699,700],{},"Pricier",[668,702,703,706,709,712],{},[689,704,705],{},"2",[689,707,708],{},"C",[689,710,711],{},"Warm default voice",[689,713,714],{},"Oversells emotion",[668,716,717,720,723,726],{},[689,718,719],{},"3",[689,721,722],{},"E",[689,724,725],{},"Stability",[689,727,728],{},"Slow render",[668,730,731,734,737,740],{},[689,732,733],{},"4",[689,735,736],{},"B",[689,738,739],{},"Cost",[689,741,742],{},"Limited voices",[668,744,745,748,751,754],{},[689,746,747],{},"5",[689,749,750],{},"D",[689,752,753],{},"Multilingual",[689,755,756],{},"Russian struggles",[19,758,760],{"id":759},"picking-by-job","Picking by job",[12,762,763],{},"For a long book where quality matters, A. The audio gap is worth the money when you'll spend fifteen-plus hours listening.",[12,765,766],{},"For non-fiction or technical writing, B. The warmth is unnecessary; what matters is that the narrator doesn't tire.",[12,768,769],{},"For children's stories or romance, C. The default warmth gives you the right emotional register out of the box.",[12,771,772],{},"For long translation projects where consistency across many chapters matters most, A or E. On the long haul, \"decent everywhere\" beats \"excellent in some places.\"",[12,774,775],{},"For working in two languages at once, A. 
Across the test, it had the cleanest English-Russian parity.",[19,777,779],{"id":778},"whats-not-the-audio-but-matters","What's not the audio but matters",[12,781,782],{},"Beyond the voice itself, I'd look at things that don't show up in marketing copy but bite when you hit them.",[12,784,785],{},"File format support. EPUB and FB2 are mandatory for books, txt for fanfic, doc and markdown are nice extras. PDF — almost no one handles cleanly. If you specifically need PDF, plan ahead for OCR or manual extraction.",[12,787,788],{},"MP3 download. Sounds basic, but some services lock audio behind their own player, and that's a dead end if you want to move it to your own audiobook library or share it.",[12,790,791],{},"Billing. One-time vs subscription — for occasional use, one-time wins, and thank god in 2026 that's the standard. Subscriptions linger at a few Western services and a couple of dated providers.",[12,793,794],{},"Privacy. Is your uploaded text strictly yours, or can the service use it? For unpublished manuscripts and personal projects, this isn't a \"policy nuance,\" it's a deal-breaker. I read the TOS before uploading anything sensitive, and yes, I've walked away after reading.",[19,796,798],{"id":797},"bottom-line","Bottom line",[12,800,801],{},"The 2026 market has split into three tiers. Top tier (A and C) gives you audio that gets confused with a live narrator in blind tests. Mid tier (B and E) delivers enough quality at sane money. Bottom tier (D) is Western services that treat Russian as a side feature, and there's no point picking them for Russian work.",[12,803,804],{},"The advice I keep repeating: before committing a long project, run a short test passage through two or three services and listen back to back. The marketing copy won't tell you the difference between A and B — only your headphones will. 
Half an hour spent there saves a week of redo later.",{"title":195,"searchDepth":196,"depth":196,"links":806},[807,808,809,810,811,812,813,814,815,816],{"id":567,"depth":196,"text":568},{"id":594,"depth":196,"text":595},{"id":607,"depth":196,"text":608},{"id":620,"depth":196,"text":621},{"id":633,"depth":196,"text":634},{"id":646,"depth":196,"text":647},{"id":659,"depth":196,"text":660},{"id":759,"depth":196,"text":760},{"id":778,"depth":196,"text":779},{"id":797,"depth":196,"text":798},"Tested five popular AI narration services on the same Russian text. What they do well, what they skip, what they cost.",{},"\u002Fblog\u002Fen\u002Faudiobook-generators-comparison","2026-03-28",7,{"title":556,"description":817},"blog\u002Fen\u002Faudiobook-generators-comparison",[825,403],"review","VJRjulLqUUPh7yYyKoYN1YC_kVLIGGEP-h9gWBGBwmA",{"id":828,"title":829,"author":7,"body":830,"description":986,"extension":209,"meta":987,"navigation":211,"ogImage":212,"path":988,"publishedAt":989,"readingTime":398,"seo":990,"stem":991,"tags":992,"updatedAt":212,"__hash__":995},"blog\u002Fblog\u002Fen\u002Frussian-tts-2026.md","Text-to-speech for Russian — what 2026 models can do",{"type":9,"value":831,"toc":978},[832,835,839,846,856,859,863,866,869,872,875,879,886,889,892,895,898,901,904,908,911,914,917,920,927,931,934,937,940,943,953,956,960,963,966,969,972,975],[12,833,834],{},"In 2020 I worked on a project that needed to synthesize Russian speech for voice assistants. We tried everything on the market, settled on the least bad option, and for the next two years I couldn't listen to any TTS narration without flinching. Reflex. In 2026 I listen to AI-narrated audiobooks and ten minutes in I forget they're synthesized. 
What happened in five years — and where the holes still are — let me try to break it down.",[19,836,838],{"id":837},"what-changed-under-the-hood","What changed under the hood",[12,840,841,842,845],{},"The big shift is that the model stopped being a ",[298,843,844],{},"splicer",". Old pipelines went text → phonetic markup → library of pre-recorded sound bites → stitch into a waveform. That's the call-center sound, because stitching can't reason about context.",[12,847,848,849,852,853,855],{},"Modern models generate the audio waveform directly through a neural network that sees the surrounding sentence, often the whole paragraph. That means the pause after a comma is ",[298,850,851],{},"actually"," natural, the rise before a question mark is ",[298,854,851],{}," a rise, and a dramatic scene gets a softer drop in timbre. Not because the rules say so — because the model has seen millions of examples of how live people do it.",[12,857,858],{},"In 2026 the production stack is mostly transformer-based — same architecture as large language models, retrained for audio. Diffusion-based and flow-matching compete in research and sometimes win on quality, but they're slower. Gemini TTS, which we run, is transformer-based.",[19,860,862],{"id":861},"what-russian-tts-can-do-in-2026","What Russian TTS can do in 2026",[12,864,865],{},"In normal use, almost everything you'd want.",[12,867,868],{},"Read prose with natural intonation — yes. Stress correct 95–97% of the time — yes. Context-aware emotional coloring (sadness, joy, tension) — yes, and not overcooked. Voice differentiation in dialogue — works. Pace adjusts to text type (literary slower, non-fiction faster).",[12,870,871],{},"Middle ground: question and exclamation intonation (sometimes slightly oversold), proper nouns (native — fine, foreign — coin flip), pauses in the right places (mostly yes, occasionally breathing in odd spots).",[12,873,874],{},"Still hard: poetry with meter preserved — no. 
Subtle irony or sarcasm without explicit cues — almost none. Texts heavy with footnotes — confused. Specialized terminology (medical, legal) — many errors. Formulas and math — disaster.",[19,876,878],{"id":877},"why-russian-is-harder-than-english","Why Russian is harder than English",[12,880,881,882,885],{},"Briefly: because the language asks the model for more ",[298,883,884],{},"understanding"," than English does.",[12,887,888],{},"Homographs with different stress are a uniquely Russian pain. \"За́мок\" (castle) and \"замо́к\" (lock) are written identically. Without context the model is just guessing.",[12,890,891],{},"Cases. Russian word endings carry grammatical role — subject, object, instrument, destination. To intonate correctly the model has to figure out the structure of the sentence, not just read the words in order.",[12,893,894],{},"Free word order. \"Я написал книгу\" and \"Книгу я написал\" mean the same thing but emphasize different things. English word order is rigid, easier to navigate. In Russian the model has to infer what's important from context.",[12,896,897],{},"Training data. There's an order of magnitude more English audiobooks with transcripts in datasets than Russian ones. Pure math: the model learns from what's there, and English has more.",[12,899,900],{},"Verb aspect. \"Делать\" and \"сделать\" are different verbs — process versus result. English doesn't draw that line so cleanly; it works through tense and context.",[12,902,903],{},"Gemini handles all of this better than the alternatives — Google has put real effort into Russian localization. But head-to-head, the same model's English remains cleaner than its Russian. That's normal, and the gap will probably narrow over the next couple of years.",[19,905,907],{"id":906},"concrete-glitches-everyone-hits","Concrete glitches everyone hits",[12,909,910],{},"Numerals. 
\"1 500 000\" — one model reads \"one and a half million,\" another reads \"one five hundred thousand.\" If the number matters, write it out.",[12,912,913],{},"Dates. \"12.04.2026\" can come out as \"twelve dot zero four dot two thousand twenty-six.\" Brutal. Write \"April 12, 2026\" and the model handles it.",[12,915,916],{},"Abbreviations. \"USSR\" usually goes letter by letter, which is correct. \"NATO,\" \"VAT,\" shorter agency names — coin flip. Verify.",[12,918,919],{},"Foreign technical terms in Russian text. \"DevOps engineer\" might end up \"dev-ops engineer\" syllable-by-syllable. Or \"devops engineer,\" which is fine. Depends on the model and luck.",[12,921,922,923,926],{},"URLs and emails. AI doesn't know what to do with ",[50,924,925],{},"@"," — silent or read as \"at,\" which is awkward. If you have an email in the text that matters, write it out: \"name at domain dot com.\"",[19,928,930],{"id":929},"where-this-is-heading","Where this is heading",[12,932,933],{},"I don't love forecasts, but a few things look clear at the 2027–2028 horizon.",[12,935,936],{},"Voice cloning becomes routine. Right now it's largely an English-and-Western-services story. Russian quality will catch up to original, and that will reshape podcasting, first-person audiobooks, and personal archives.",[12,938,939],{},"Multilingual models with preserved character. Today, switching from Russian to English mid-book (a quotation, a name, a term) \"jumps\" voice — different timbre, different manner. Soon: smooth crossover without losing identity.",[12,941,942],{},"Real-time. Right now a full book renders in hours. That's a model limitation, not a service one. By 2028, expect real-time for most jobs — upload, listen.",[12,944,945,946,330,949,952],{},"Explicit emotional control. Style hints work today, but unevenly. The future is clear in-text tags — ",[50,947,948],{},"\u003Cgentle>",[50,950,951],{},"\u003Cgrim>"," — handled cleanly and predictably.",[12,954,955],{},"Book-level context. 
Today the model sees a paragraph at most. Soon — a chapter, eventually a whole book. That gives consistent character intonation from page one to page eight hundred, instead of \"cheerful sometimes, sad sometimes, no clear reason.\"",[19,957,959],{"id":958},"what-id-pick-today","What I'd pick today",[12,961,962],{},"If you have to choose a Russian TTS in 2026, here's how I'd shop.",[12,964,965],{},"For literary fiction and audiobooks — Gemini TTS. Currently the top of the market. We run on it; I don't know anything better.",[12,967,968],{},"For technical and system content (IVR, navigation, voice assistants) — Yandex SpeechKit. Stable, narrow voice roster, but the voices are quality and built for the job.",[12,970,971],{},"For multilingual projects with Russian — Gemini again. If you specifically need a clone of a voice for short pieces — ElevenLabs, with the awareness that Russian is weaker than English.",[12,973,974],{},"For open-source and personal pet projects — Silero. Free for personal use, narrow voice roster, quality is fine for home, not for production.",[12,976,977],{},"The list will shift in a year or two. The market moves fast, new models drop quarterly, and \"top three today\" is something you keep updating in your head. Worth checking back on reviews every six months instead of telling yourself your current pick is settled for years.",{"title":195,"searchDepth":196,"depth":196,"links":979},[980,981,982,983,984,985],{"id":837,"depth":196,"text":838},{"id":861,"depth":196,"text":862},{"id":877,"depth":196,"text":878},{"id":906,"depth":196,"text":907},{"id":929,"depth":196,"text":930},{"id":958,"depth":196,"text":959},"Technical overview of Russian TTS. 
Where we are now, what still breaks, where the industry is heading.",{},"\u002Fblog\u002Fen\u002Frussian-tts-2026","2026-03-22",{"title":829,"description":986},"blog\u002Fen\u002Frussian-tts-2026",[993,994,404],"technical","tts","Z8Bd_1xgYdK6_922ZwCVFG4itQqq7gg6kTnx4nbPikE",{"id":997,"title":998,"author":7,"body":999,"description":1203,"extension":209,"meta":1204,"navigation":211,"ogImage":212,"path":1205,"publishedAt":1206,"readingTime":821,"seo":1207,"stem":1208,"tags":1209,"updatedAt":212,"__hash__":1213},"blog\u002Fblog\u002Fen\u002Faudiobook-self-publishing.md","AI narration for self-published authors — a guide",{"type":9,"value":1000,"toc":1191},[1001,1004,1008,1015,1018,1021,1024,1027,1031,1038,1047,1050,1053,1057,1060,1068,1075,1079,1082,1089,1092,1095,1099,1102,1105,1108,1112,1115,1118,1121,1125,1128,1131,1134,1137,1141,1148,1151,1168,1172,1175,1178,1181,1184,1188],[12,1002,1003],{},"I know a fair number of authors who, five years ago, finished a good book, pushed it to KDP and a few indie storefronts, and then quietly let go of the idea of an audio edition. Because an audio edition meant $1,500 to $3,000 in a studio, and that was just a different budget — one indies don't get to. In 2026 the same job costs $50–100 and a couple of days, without leaving your desk. People keep asking how, so here's the actual workflow.",[19,1005,1007],{"id":1006},"first-is-it-even-worth-it","First — is it even worth it",[12,1009,1010,1011,1014],{},"There's a temptation to ship audio because you suddenly ",[298,1012,1013],{},"can",". Not every text needs it, and I'd answer three questions honestly before opening the wallet.",[12,1016,1017],{},"Ask your audience directly. In your social, your reader chat, your newsletter: \"would you actually listen to this?\" Sometimes 80% of replies tell you they read with their eyes and aren't switching. Then audio is an exercise for you, not a product for them.",[12,1019,1020],{},"Think about genre. 
Non-fiction, YA, genre fiction land cleanly in audio. Literary fiction is more reliably read than listened to. Poetry in AI narration is something I wouldn't ship.",[12,1022,1023],{},"Think about length. A short story or novella is a \"bonus to the ebook\" — a promo, a gift to subscribers. A long novel is its own product with its own pricing and promotion plan, and that's a different project.",[12,1025,1026],{},"If those answers line up — go.",[19,1028,1030],{"id":1029},"what-you-actually-need","What you actually need",[12,1032,1033,1034,1037],{},"A finished text in a sane format — epub, fb2, markdown, txt. ",[298,1035,1036],{},"Finished",", because any post-render edit means re-rendering, re-uploading, and a couple of extra hours of fiddly work.",[12,1039,1040,1041,1046],{},"An account with an AI narration service. Picking one is its own pain; I have a ",[1042,1043,1045],"a",{"href":1044},"\u002Fblog\u002Faudiobook-generators-comparison","comparison post"," where I ran five popular ones through the same text.",[12,1048,1049],{},"A budget of $30–100 per book, depending on length and service.",[12,1051,1052],{},"A day to three of your time, total. Most of it is listening to early renders and tweaking.",[19,1054,1056],{"id":1055},"step-one-clean-the-text","Step one. Clean the text",[12,1058,1059],{},"You can get the file you upload into reasonable shape in about half an hour, and that half hour saves you several later.",[12,1061,1062,1063,330,1065,1067],{},"Remove stray blank lines and ragged indents from your Word export. Add explicit chapter headers — ",[50,1064,442],{},[50,1066,445],{}," — regardless of what your editor outputs. Check dialogue: it should be marked with quotes or em-dashes, ideally with attribution (\"she said\"). 
Footnotes and \"see page 47\" cross-references should either go or get reformatted; AI doesn't read them as humans do, and they break the flow in audio.",[12,1069,1070,1071,1074],{},"If you know specific words your model botches — proper nouns, technical terms, place names — and your service supports manual stress markers, mark them in the source. I once lost three render passes to a single character name before I just put ",[50,1072,1073],{},"+"," on the right vowel and was done with it.",[19,1076,1078],{"id":1077},"step-two-voices","Step two. Voices",[12,1080,1081],{},"The thing I want to say up front: don't trust full automation. It's good enough for a draft, to feel out how the book sounds in principle. The final pass needs human judgment, at least on the main roles.",[12,1083,1084,1085,1088],{},"The narrator voice is the headline. That's seventy percent of your listening time, and if you don't like it, you're miserable for fifteen hours. Pick one ",[298,1086,1087],{},"you"," personally want to listen to. Not \"right for the genre\" — the one that doesn't grate after thirty minutes.",[12,1090,1091],{},"Main characters, three to five of them, deserve manual selection. The voice should read as \"different person\" by their first line; otherwise listeners get lost in dialogue. Secondary characters can take whatever automation gives them, but spend a minute scanning the assignments to make sure nobody got an actively wrong tone (a grandmother shouldn't sound twelve).",[12,1093,1094],{},"Style hints help. Short and direct beats long and cinematic: \"cold, detached,\" \"with irony,\" \"slow and thoughtful\" land cleanly. Elaborate descriptions (\"aggressive cold baritone of a senior officer\") tend to confuse the model more than they help.",[19,1096,1098],{"id":1097},"step-three-one-chapter-then-everything-else","Step three. One chapter, then everything else",[12,1100,1101],{},"Don't render the whole book in one shot. 
This rule cost me my first attempt — I queued 600,000 characters at once, came back hours later to a finished audio, and discovered the protagonist's last name had stress on the wrong syllable in all 200 mentions. Full re-render.",[12,1103,1104],{},"So: render chapter one. Listen end to end, in headphones, not at 2x. What to catch — stress on your own surnames and place names, voices (do they sound how you imagined), pause naturalness, long-sentence performance (where AI tends to thin out).",[12,1106,1107],{},"Chapter one's clean? Push the rest. Not? Adjust and try again. This isn't wasted time, it's insurance.",[19,1109,1111],{"id":1110},"step-four-cover-and-metadata","Step four. Cover and metadata",[12,1113,1114],{},"Audio lives by different rules than ebook on the platforms. The audiobook cover almost always needs to be made fresh — square 3000×3000, title legible at thumbnail size in a phone player. An ebook cover designed for portrait with a fine font and a wide landscape will turn into mush on a player.",[12,1116,1117],{},"Designer for the cover adaptation: $50–150. If you don't have one, you can put together a basic square in Canva in an evening.",[12,1119,1120],{},"Metadata duplicates: title, author, description, genre, keywords. Same content as the ebook, separate upload.",[19,1122,1124],{"id":1123},"step-five-where-to-put-it","Step five. Where to put it",[12,1126,1127],{},"The big four for English-language indie audio: Audible (via ACX), Findaway Voices (which fans out to many platforms), Storytel, Kobo. ACX is the obvious entry point if you're aiming at the Audible audience. Findaway gives you the widest distribution from one upload. Storytel and Kobo have their own indie programs.",[12,1129,1130],{},"Royalty splits through these channels run 30–50%. It's a lot, but it's the price of distribution.",[12,1132,1133],{},"The alternative is selling directly through your own site, Substack, Patreon, or Bandcamp. 
You keep everything, but you handle the traffic and payments yourself. For an established audience this works; for a new one, less so.",[12,1135,1136],{},"Most indies I know in 2026 do a hybrid: main launch through the big platforms, exclusive formats and early access through their own channel.",[19,1138,1140],{"id":1139},"step-six-it-does-not-market-itself","Step six. It does not market itself",[12,1142,1143,1144,1147],{},"This is the part I want to underline. AI narration is a ",[298,1145,1146],{},"production technology",", not marketing. Your book won't randomly surface in Audible's \"new in audio\" carousel. You have to bring your audience.",[12,1149,1150],{},"What actually moves the first hundred to thousand listens:",[40,1152,1153,1156,1159,1162,1165],{},[43,1154,1155],{},"Short clips on TikTok and Reels — the first interesting minute, with a \"find the rest at\" caption.",[43,1157,1158],{},"Promo codes for your existing audience — newsletter, Discord, Patreon.",[43,1160,1161],{},"Guest swaps with same-genre indie authors, honest cross-pollination.",[43,1163,1164],{},"Beta listeners — ten to twenty people get the audio free in exchange for an honest review.",[43,1166,1167],{},"Indie publishing podcasts (there are many, they always want guests).",[19,1169,1171],{"id":1170},"things-ive-stepped-in-in-case-you-dont-have-to","Things I've stepped in, in case you don't have to",[12,1173,1174],{},"Quotes from other authors. If your book has direct quotations, verify the rights. Fair use varies by jurisdiction, and audio sometimes has stricter limits.",[12,1176,1177],{},"Final proof. Once audio is published, fixing a typo means re-rendering the chapter, re-uploading to platforms, sometimes refreshing metadata. A pile of micro-work. So a week before launch, do one more text pass.",[12,1179,1180],{},"The cheap-service trap. Saving 50% of the budget can cost you reviews from listeners who hear the electronic edge. 
If this is a book you care about, don't optimize for the lowest price.",[12,1182,1183],{},"Transparency with readers. In the audio edition's description I'd just write \"AI narration\" out loud. Most people are fine with it. A few are categorically not. But nobody appreciates finding out themselves. Trust matters more than the awkwardness of the disclosure.",[19,1185,1187],{"id":1186},"in-one-paragraph","In one paragraph",[12,1189,1190],{},"AI narration isn't magic. It's a new working tool that's replaced the studio for most indie scenarios. It won't make your book a bestseller on its own. It will give you a door that didn't exist five years ago — shipping audio without going broke. If your book already has readers in text, audio is the natural next step. If it doesn't, fix marketing first, audio later. No service routes around that order.",{"title":195,"searchDepth":196,"depth":196,"links":1192},[1193,1194,1195,1196,1197,1198,1199,1200,1201,1202],{"id":1006,"depth":196,"text":1007},{"id":1029,"depth":196,"text":1030},{"id":1055,"depth":196,"text":1056},{"id":1077,"depth":196,"text":1078},{"id":1097,"depth":196,"text":1098},{"id":1110,"depth":196,"text":1111},{"id":1123,"depth":196,"text":1124},{"id":1139,"depth":196,"text":1140},{"id":1170,"depth":196,"text":1171},{"id":1186,"depth":196,"text":1187},"How an indie author can make an audio edition of their book for $100 instead of $2000 for a studio. 
What's possible, what's not.",{},"\u002Fblog\u002Fen\u002Faudiobook-self-publishing","2026-03-15",{"title":998,"description":1203},"blog\u002Fen\u002Faudiobook-self-publishing",[1210,1211,1212],"self-publishing","authors","audiobooks","yqri18xSaq-s9NBHoIno4ydtCKOUDP-pYgzfLDEi0T4",{"id":1215,"title":1216,"author":7,"body":1217,"description":1387,"extension":209,"meta":1388,"navigation":211,"ogImage":212,"path":1389,"publishedAt":1390,"readingTime":547,"seo":1391,"stem":1392,"tags":1393,"updatedAt":212,"__hash__":1397},"blog\u002Fblog\u002Fen\u002Fai-narrator-youtube-podcast.md","AI narrator for YouTube and podcasts — when it works",{"type":9,"value":1218,"toc":1378},[1219,1222,1225,1229,1235,1241,1247,1253,1259,1263,1269,1279,1285,1291,1297,1301,1304,1307,1314,1317,1319,1322,1325,1329,1332,1335,1346,1349,1353,1356,1360,1367],[12,1220,1221],{},"I struggled with my own voice for a long time. On a recording it sounds nothing like the version in my head, and I used to re-record every other video three times because something was always \"off.\" When AI narration finally stopped sounding like a robot from the subway, I started swapping it in for the parts of my content I'd been doing myself.",[12,1223,1224],{},"Some formats took to it immediately. Others still don't work, and probably won't anytime soon. Here's what I've actually shipped.",[19,1226,1228],{"id":1227},"where-ai-narration-sticks","Where AI narration sticks",[12,1230,1231,1234],{},[244,1232,1233],{},"News digests."," The most obvious case. Your listener showed up for facts, not for sighs and breath catches — they need structure and pace. AI handles that better than tired-me on a Friday evening.",[12,1236,1237,1240],{},[244,1238,1239],{},"Educational videos."," If you're explaining ancient Rome, the central bank's policy rate, or how CRISPR works, voice is secondary, content is primary. 
AI keeps a steady tempo even on long sentences, never forgets stress on tricky words, doesn't slump after an hour.",[12,1242,1243,1246],{},[244,1244,1245],{},"\"5 facts about…\" and listicles."," Nothing personal here, the energy comes from editing. I moved my whole list-format channel to AI narration and a month later nobody had emailed asking for me back.",[12,1248,1249,1252],{},[244,1250,1251],{},"Sleep stories and meditations."," This is one place where the very flatness of the voice is a feature. Live narrators tend to over-perform \"calming,\" and the result is the opposite of calming.",[12,1254,1255,1258],{},[244,1256,1257],{},"Book trailers."," A two- or three-minute excerpt of a book in AI narration as a social media ad — that's not \"cheap and cheerful\" anymore, it's a working tool.",[19,1260,1262],{"id":1261},"where-ai-breaks-down-audibly","Where AI breaks down, audibly",[12,1264,1265,1268],{},[244,1266,1267],{},"Personal vlogs."," If your channel is about you, your thoughts, your experience, swapping your voice for a neural net pulls out the most valuable thing in the video. I tried. It doesn't work. People subscribed to me, not to a TTS provider.",[12,1270,1271,1274,1275,1278],{},[244,1272,1273],{},"Interviews."," Obvious. The reaction, the pauses, the awkward laughs, the cross-talk — that ",[298,1276,1277],{},"is"," the interview. A scripted AI dialogue comes out sterile.",[12,1280,1281,1284],{},[244,1282,1283],{},"Emotional storytelling."," When you're telling a personal story about loss, or unexpected joy, or something you actually went through — the voice has to crack a little. AI reads it level. And in that levelness, it sounds fake.",[12,1286,1287,1290],{},[244,1288,1289],{},"Comedy."," This is the one place AI clearly loses. Joke timing is tenths of seconds between \"and then I realized\" and the punchline. AI doesn't feel that. So the joke lands a beat early or a beat late, and either way it lands flat. 
Comedy videos with AI narration are usually dead on arrival.",[12,1292,1293,1296],{},[244,1294,1295],{},"ASMR."," No.",[19,1298,1300],{"id":1299},"things-i-figured-out-by-doing-this","Things I figured out by doing this",[12,1302,1303],{},"Length isn't an issue. A ten-minute video renders in a few minutes, editing takes about as long as with my own voice, sometimes less because there are no retakes.",[12,1305,1306],{},"Sound mixing has a quirk. A raw AI track sits a little \"drier\" on background music than a live track does. Light reverb and a couple of dB of room ambience fix it — after that the mix stops outing the AI on careful listens.",[12,1308,1309,1310,1313],{},"YouTube playback speed surprised me. AI voices at 1.5x sound ",[298,1311,1312],{},"cleaner"," than live voices at 1.5x. The model has already baked in an optimal cadence, so speeding it up doesn't break anything.",[12,1315,1316],{},"If your service can spit out time-coded captions alongside the audio — take it. Saves a whole evening on subtitles.",[19,1318,79],{"id":78},[12,1320,1321],{},"Bluntly: a freelance narrator for a single 10-minute video starts around $80–150 and goes up. AI narration of the same video costs a few dollars, sometimes less. If you're publishing one or two videos a week, the quarterly difference is a vacation.",[12,1323,1324],{},"Your own voice is \"free,\" but there's an hour of work and an emotional tax on listening to your own ums. For me that tax was higher than the subscription.",[19,1326,1328],{"id":1327},"podcasts-are-their-own-thing","Podcasts are their own thing",[12,1330,1331],{},"I keep wanting to say \"podcasts work in AI\" but I can't. A podcast is two live people interrupting each other, getting surprised, leaving pauses. Without that, you don't have a podcast, you have an audio article.",[12,1333,1334],{},"What does work in podcast format:",[40,1336,1337,1340,1343],{},[43,1338,1339],{},"Monologue \"reads\" — author essays, longreads, breakdowns. 
That's basically an audio article anyway.",[43,1341,1342],{},"News podcasts — same as YouTube digests.",[43,1344,1345],{},"Audio versions of blog posts and books — basically a short audiobook.",[12,1347,1348],{},"I've heard experiments with two AI voices simulating dialogue. Technically it works, emotionally it's dead — there's no reaction to what was said, just turn-taking. Maybe in a couple of years. Not today.",[19,1350,1352],{"id":1351},"the-legal-piece-briefly","The legal piece, briefly",[12,1354,1355],{},"Most decent providers (us included) allow commercial use of synthesized voice — so monetizing YouTube with their voice is fine. But three things I'd keep in mind: read the terms of the specific service you're using; don't clone celebrity voices, that's a hard no almost everywhere; label \"AI narration\" in the description if the platform requires it (TikTok specifically does).",[19,1357,1359],{"id":1358},"what-id-try-first","What I'd try first",[12,1361,1362,1363,1366],{},"Take the script of an existing video — one you recorded yourself, where you can compare side by side. Run it through, no edits. In five minutes you'll have an audio file you can lay next to your own version and judge honestly. ",[298,1364,1365],{},"Then"," decide what to keep doing yourself and what to hand over.",[12,1368,1369,1370,1373,1374,1377],{},"The biggest takeaway from six months of this: AI doesn't replace your ",[298,1371,1372],{},"voice"," — it replaces the ",[298,1375,1376],{},"recording step",". There's a very specific slot for it in your production chain, and you shouldn't try to stretch it across everything. Your voice on a personal channel is an asset. 
The script of a faceless news digest is not.",{"title":195,"searchDepth":196,"depth":196,"links":1379},[1380,1381,1382,1383,1384,1385,1386],{"id":1227,"depth":196,"text":1228},{"id":1261,"depth":196,"text":1262},{"id":1299,"depth":196,"text":1300},{"id":78,"depth":196,"text":79},{"id":1327,"depth":196,"text":1328},{"id":1351,"depth":196,"text":1352},{"id":1358,"depth":196,"text":1359},"Can you replace a human voice with a neural net for YouTube videos or podcasts? Honest breakdown — where yes, where no.",{},"\u002Fblog\u002Fen\u002Fai-narrator-youtube-podcast","2026-03-08",{"title":1216,"description":1387},"blog\u002Fen\u002Fai-narrator-youtube-podcast",[1394,1395,1396],"youtube","podcasts","narration","taHBtycauCI-DPGNcrfiVOzEN0iaD-zNvGPxiYT5a4U",{"id":1399,"title":1400,"author":7,"body":1401,"description":1651,"extension":209,"meta":1652,"navigation":211,"ogImage":212,"path":1653,"publishedAt":1654,"readingTime":547,"seo":1655,"stem":1656,"tags":1657,"updatedAt":212,"__hash__":1659},"blog\u002Fblog\u002Fen\u002Fcharacter-voices-assign.md","How to assign different voices to audiobook characters",{"type":9,"value":1402,"toc":1641},[1403,1406,1409,1413,1416,1419,1433,1436,1450,1454,1457,1463,1469,1475,1481,1487,1493,1499,1505,1509,1515,1521,1527,1533,1537,1544,1547,1550,1553,1557,1560,1563,1566,1570,1573,1584,1588,1591,1594,1598,1601,1638],[12,1404,1405],{},"I once rendered my first book in a single voice, and two chapters in I caught myself scrolling back up the page to figure out who had just spoken. Which was a bad sign, because I'd written the text. Since then I never push a dialogue-heavy book through render without a casting pass. Ten or fifteen minutes before render saves hours of \"wait, who?\" later.",[12,1407,1408],{},"What follows is what I've worked out for myself over time. 
Not rules — patterns.",[19,1410,1412],{"id":1411},"when-automation-is-fine-when-it-isnt","When automation is fine, when it isn't",[12,1414,1415],{},"Most services these days auto-assign voices: parse characters out of the text, guess gender, age, type, match to descriptions.",[12,1417,1418],{},"Trust automation when:",[40,1420,1421,1424,1427,1430],{},[43,1422,1423],{},"The text is short, under thirty thousand characters.",[43,1425,1426],{},"There are no more than two to four speakers.",[43,1428,1429],{},"It's non-fiction without dialogue.",[43,1431,1432],{},"You're just running a draft to feel out the narrator voice.",[12,1434,1435],{},"Step in manually when:",[40,1437,1438,1441,1444,1447],{},[43,1439,1440],{},"The book is long, and you'll be living with it for hours.",[43,1442,1443],{},"There are characters you specifically care about (the main pairing in a fanfic, your protagonist).",[43,1445,1446],{},"Automation visibly missed — gave a stern adult man a teenage voice.",[43,1448,1449],{},"Two characters in dialogue end up sounding so similar you can't tell them apart.",[19,1451,1453],{"id":1452},"archetypes-that-work-most-of-the-time","Archetypes that work most of the time",[12,1455,1456],{},"I keep a few mental templates for casting. They're not \"right,\" they just keep working as starting points.",[12,1458,1459,1462],{},[244,1460,1461],{},"The narrator."," Neutral, mid-age, no accent, moderate pace. I default to male — it's the conventional choice in English audiobook tradition. For first-person novels with a female protagonist-narrator, female reads cleaner.",[12,1464,1465,1468],{},[244,1466,1467],{},"The young protagonist."," Slightly higher than mid, lively, emotional. Not bassy. A seventeen-year-old shouldn't sound forty.",[12,1470,1471,1474],{},[244,1472,1473],{},"The female lead."," Depends on character. \"Strong, independent\" — confident mid-range. \"Sensitive, artistic\" — softer, slightly below mid. 
Either way, no syrup.",[12,1476,1477,1480],{},[244,1478,1479],{},"The antagonist."," A common mistake is reaching for an \"evil voice.\" It doesn't land. Even, calm, cold, no obvious emotion is what works. Scarier than theatrical laughter.",[12,1482,1483,1486],{},[244,1484,1485],{},"The wise mentor."," Older, usually male, slow pace, warm. The Gandalf template hasn't gone anywhere, and it still does the job.",[12,1488,1489,1492],{},[244,1490,1491],{},"Love interest."," Mid-range, slightly more warmth than the narrator, no slipping into sweet. The line is thin and easy to cross.",[12,1494,1495,1498],{},[244,1496,1497],{},"Parents."," Adult, mature voices. Mother — warm. Father — even, authoritative without weight.",[12,1500,1501,1504],{},[244,1502,1503],{},"Children."," High, fast. Only when the text actually has children. A fifteen-year-old protagonist sounding eight is unintentionally funny.",[19,1506,1508],{"id":1507},"the-mistakes-almost-everyone-makes","The mistakes almost everyone makes",[12,1510,1511,1514],{},[244,1512,1513],{},"Too much contrast."," Ten characters, all radically different voices, from squeaky teen to bass grandpa — your ears tap out twenty minutes in. Better: three or four anchor voices for the main cast and others built around them, with small timbre shifts. The book should sound like an ensemble, not a zoo.",[12,1516,1517,1520],{},[244,1518,1519],{},"Stereotyped picks."," Villain: raspy and low. Fool: thin and high. Scientist: monotone. It works in cartoons; in audiobooks it goes from cute to caricature in an hour.",[12,1522,1523,1526],{},[244,1524,1525],{},"Ignoring the author's signals."," If the text says \"Anna's high voice\" and you cast her low and velvety, complaints will follow. Before casting I scan the text for any explicit voice descriptions the author baked in. 
Five minutes, saves embarrassment.",[12,1528,1529,1532],{},[244,1530,1531],{},"Gender slips in auto-cast."," Sometimes the automation routes a male role to a female voice or vice versa. A quick scan after auto-cast catches this in seconds.",[19,1534,1536],{"id":1535},"style-hints","Style hints",[12,1538,1539,1540,1543],{},"Most modern services let you describe the ",[298,1541,1542],{},"style"," on top of picking the voice itself. This works, and I lean on it.",[12,1545,1546],{},"What reliably lands: \"cold, detached,\" \"with a slight smirk,\" \"slow, thoughtful,\" \"fast, certain,\" \"raspy, tired.\" Two or three words is enough. The longer the description, the worse the model handles it.",[12,1548,1549],{},"What doesn't land: long cinematic sentences like \"the cold, aggressive baritone of an experienced officer who just received bad news.\" The model treats this worse than a plain \"cold, firm.\" Less is more.",[12,1551,1552],{},"A separate note: don't overdose on emotion. Light coloring is good, especially when the scene is genuinely heavy. Whispers, sobs, full hysterics — almost never. You can listen to that for five minutes. You cannot listen to that across a fifteen-hour book.",[19,1554,1556],{"id":1555},"long-books-are-their-own-pain","Long books are their own pain",[12,1558,1559],{},"In 500-plus-page novels there's a real risk you'll start losing track of which secondary character has which voice. What works for me is a small table: character — voice — style hint — a couple of distinguishing notes. Something like: \"Boris — Charon — 'tired, ironic' — always pauses before speaking.\"",[12,1561,1562],{},"For truly minor characters who appear in two or three scenes, I don't bother — let automation pick. Tuning \"the bookstore clerk in chapter seven\" isn't a good use of an evening.",[12,1564,1565],{},"For recurring characters, write them down. 
A hundred pages later you won't remember what you picked for the secretary in chapter four.",[19,1567,1569],{"id":1568},"listen-to-chapter-one-before-pushing-the-rest","Listen to chapter one before pushing the rest",[12,1571,1572],{},"After the first chapter renders, listen all the way through. I check three things:",[40,1574,1575,1578,1581],{},[43,1576,1577],{},"Can I tell characters apart in dialogue without prompts? If three voices in a single scene blur, something needs to change.",[43,1579,1580],{},"Do any two voices conflict? Sometimes my picks are so close they merge in a shared scene.",[43,1582,1583],{},"Do characters sound the way I imagined? If I cast \"cold antagonist\" and hear theatrical menace, the style hint needs work.",[19,1585,1587],{"id":1586},"re-rendering-is-your-friend","Re-rendering is your friend",[12,1589,1590],{},"Decent services (we're one) let you re-render specific chapters or scenes. That's the safety net: in chapter five you realize Ira's voice isn't right — override — render only chapter five. 
The rest stays.",[12,1592,1593],{},"Faster, cheaper, and removes the fear of \"now I have to redo everything.\" Don't be precious about pointwise fixes — they're a normal part of the process, not a sign you set things up wrong upfront.",[19,1595,1597],{"id":1596},"the-pre-render-checklist","The pre-render checklist",[12,1599,1600],{},"I run this myself before queueing a full book:",[40,1602,1605,1614,1620,1626,1632],{"className":1603},[1604],"contains-task-list",[43,1606,1609,1613],{"className":1607},[1608],"task-list-item",[1610,1611],"input",{"disabled":211,"type":1612},"checkbox"," Every key character got a manually picked voice.",[43,1615,1617,1619],{"className":1616},[1608],[1610,1618],{"disabled":211,"type":1612}," Voices are distinguishable in a single dialogue scene.",[43,1621,1623,1625],{"className":1622},[1608],[1610,1624],{"disabled":211,"type":1612}," Author-set voice descriptions in the text were respected.",[43,1627,1629,1631],{"className":1628},[1608],[1610,1630],{"disabled":211,"type":1612}," Style hints added for complex roles.",[43,1633,1635,1637],{"className":1634},[1608],[1610,1636],{"disabled":211,"type":1612}," The narrator doesn't pull attention to itself.",[12,1639,1640],{},"All five yes — push it. 
Most likely the result will hold up, and you'll only come back to this list for the next book.",{"title":195,"searchDepth":196,"depth":196,"links":1642},[1643,1644,1645,1646,1647,1648,1649,1650],{"id":1411,"depth":196,"text":1412},{"id":1452,"depth":196,"text":1453},{"id":1507,"depth":196,"text":1508},{"id":1535,"depth":196,"text":1536},{"id":1555,"depth":196,"text":1556},{"id":1568,"depth":196,"text":1569},{"id":1586,"depth":196,"text":1587},{"id":1596,"depth":196,"text":1597},"Voice casting for AI narration — what to pick for each character archetype, how to avoid typical mistakes.",{},"\u002Fblog\u002Fen\u002Fcharacter-voices-assign","2026-03-01",{"title":1400,"description":1651},"blog\u002Fen\u002Fcharacter-voices-assign",[402,1658,552],"casting","XWkVcTY8i-Arf4mpPlz3Meq56ms9PvS3LhzS-J-Kb2Y",{"id":1661,"title":1662,"author":7,"body":1663,"description":1811,"extension":209,"meta":1812,"navigation":211,"ogImage":212,"path":1813,"publishedAt":1814,"readingTime":398,"seo":1815,"stem":1816,"tags":1817,"updatedAt":212,"__hash__":1820},"blog\u002Fblog\u002Fen\u002Faudiobook-cost-comparison.md","How much does it cost to narrate a book — AI, studio, freelance",{"type":9,"value":1664,"toc":1801},[1665,1668,1671,1675,1678,1681,1684,1687,1691,1694,1697,1700,1704,1707,1710,1713,1716,1720,1723,1726,1730,1733,1736,1739,1742,1745,1748,1752,1755,1758,1761,1764,1768,1771,1774,1777,1780,1783,1786,1788,1791,1798],[12,1666,1667],{},"The most common author question I get is \"okay, but how much does it actually cost to make my book into an audiobook?\" The honest answer spans from $30 to $6,000+. The range is wide enough that no single number means anything, and almost every option in that range is legitimate — for different jobs and budgets.",[12,1669,1670],{},"Let me anchor in a concrete case: a 300-page novel, about 600,000 characters, around 15 hours of finished audio. English prose with dialogue. 
And walk through the four real options.",[19,1672,1674],{"id":1673},"option-1-ai-narration","Option 1. AI narration",[12,1676,1677],{},"A book like that, in our shop and most competitors', runs $30 to $100. Cheap services with a single voice and basic quality come in at the lower end. A premium tier with proper character casting, top-shelf voices, and human support sits closer to $100.",[12,1679,1680],{},"Time-wise: a few hours of actual rendering, plus a couple of your own hours setting up voices and listening to the early chapters. None of it blocks you in the moment — start the job, do something else, come back.",[12,1682,1683],{},"What's in the box besides the audio file: chapter-split MP3s, the ability to re-render if you fix something in the text (find a typo, hour later it's gone from the audio).",[12,1685,1686],{},"What can sting: rare-word stress occasionally lands wrong (one to three percent of words, often fixable with manual stress markers); emotional scenes don't quite hit a live actor's level; and, individually, some listeners get fatigued by synthesized voice after extended listening. Most don't, but it's not universal.",[19,1688,1690],{"id":1689},"option-2-freelance-narrator","Option 2. Freelance narrator",[12,1692,1693],{},"A young narrator without an obvious portfolio — around $300. Someone with credits whose voice you've already heard somewhere — $600–1200. With editing and a clean mix — add another $100–200.",[12,1695,1696],{},"Schedule is its own pain. The narrator isn't only working on you; figure three weeks to two months in normal conditions, no holidays or emergencies factored in.",[12,1698,1699],{},"What you get: one (occasionally two) live voices for the whole book, manual intonation, light editing — they'll clean their own takes and mistakes. But: one human reads everyone. Dialogue is conveyed through tone, not voice change. And typo fixes after delivery cost extra time and money — it's a re-record, then re-mix, then re-master. 
I once watched that process eat an extra week.",[19,1701,1703],{"id":1702},"option-3-professional-studio","Option 3. Professional studio",[12,1705,1706],{},"Different order of magnitude. Base studio narration starts around $1,500. Full post-production (edit, mix, master) — $2,500–4,000. Multiple narrators for a multi-character cast — $4,000–6,000, and I'm not exaggerating.",[12,1708,1709],{},"Timeline: one to three months from contract to delivered files.",[12,1711,1712],{},"What you get: studio-grade audio, possibly multiple narrators, optional background music and sound effects, full compliance with platform spec (Audible, Storytel — they like specific things). This isn't just a recording, it's an audio production pipeline.",[12,1714,1715],{},"The \"but\" here isn't even the price by itself, it's that for an indie author with a thousand-copy run, three thousand dollars doesn't recoup. You won't earn it back, and a studio is a tool for confirmed bestsellers, not a debut.",[19,1717,1719],{"id":1718},"option-4-record-yourself","Option 4. Record yourself",[12,1721,1722],{},"I know people who do this. They spend $50–500 on a mic, some treatment, and recording software, and then forty to eighty hours of their life on the actual recording, retakes, edits, and re-recording the chapters that didn't work. About a third of those hours produce usable audio; the rest is learning and frustration.",[12,1724,1725],{},"If you're writing a memoir or first-person non-fiction, this is potentially your strong move. Your own voice is uniqueness AI can't reproduce, and certain genres reward that. Otherwise it's a very expensive (in time) project that often doesn't get finished.",[19,1727,1729],{"id":1728},"the-honest-math","The honest math",[12,1731,1732],{},"Take an indie author. Run of 1,000 copies, $10 per audio sale.",[12,1734,1735],{},"AI at $50 breaks even on five sales. Everything past that is upside.",[12,1737,1738],{},"Freelance narrator at $800 breaks even on eighty. 
Will you sell eighty? For many, yes. Not everyone, and not instantly.",[12,1740,1741],{},"Studio at $3,000 breaks even on three hundred. If you don't have a confirmed audience that size, that's just money out of pocket.",[12,1743,1744],{},"Your own voice — \"free,\" but sixty hours of your time. If you value an hour at even $20, that's $1,200 in equivalent labor. Not \"free.\"",[12,1746,1747],{},"For most indies in 2026, AI is the only option that pencils out. Freelance and studio are tools for people who already have thousands of repeat readers.",[19,1749,1751],{"id":1750},"when-each-option-actually-makes-sense","When each option actually makes sense",[12,1753,1754],{},"AI fits almost every common case: a debut without a confirmed audience, a series that ships every few months, niche genres without mass demand, a parallel release with the ebook, a budget under $100. So roughly 90% of indie scenarios.",[12,1756,1757],{},"Freelance narrator earns its keep when you have a tested audience of three to ten thousand, you're writing literary fiction where \"warmth\" matters, and your budget runs $300–1,200.",[12,1759,1760],{},"Studio is for confirmed bestsellers with audiences in the tens of thousands, direct interest from the big platforms (when Audible reaches out, not the other way around), and a budget starting at $1,500.",[12,1762,1763],{},"Your own voice fits memoirs and first-person non-fiction, when you have time and patience to learn recording, and you're selling through your own channel (your site, Substack, Patreon — not platform-dependent).",[19,1765,1767],{"id":1766},"what-people-forget-when-they-budget","What people forget when they budget",[12,1769,1770],{},"Whatever number you came up with, add 20–40%. I rarely see people plan for these ahead.",[12,1772,1773],{},"The audiobook cover. Separate from the ebook cover, square 3000×3000. Designer: $50–150.",[12,1775,1776],{},"Promotion. Without it, audio doesn't sell, no matter who narrated. 
Minimum launch budget — $100 and up.",[12,1778,1779],{},"Platform commissions. Audible takes around half (more on non-exclusive). Storytel similar. A $10 sale leaves you $5.",[12,1781,1782],{},"Taxes. Self-employment adds 15–30% on top, depending on jurisdiction.",[12,1784,1785],{},"Sum it up and you're easily 20–40% over base. It's a normal planned cost, just don't pretend it's not coming.",[19,1787,530],{"id":529},[12,1789,1790],{},"In 2026, a typical indie author can't afford anything other than AI. It's not a preference, it's run-rate math. $800 on a freelance narrator doesn't return on a thousand-copy run, no matter how badly you want a live voice.",[12,1792,1793,1794,1797],{},"Exception: you already have an audience that buys every new book by the thousand. Then live narration earns its money — your reader showed up for ",[298,1795,1796],{},"this"," edition, not just the story.",[12,1799,1800],{},"For everyone else: AI. In 2026 it's good enough that it stops being a compromise.",{"title":195,"searchDepth":196,"depth":196,"links":1802},[1803,1804,1805,1806,1807,1808,1809,1810],{"id":1673,"depth":196,"text":1674},{"id":1689,"depth":196,"text":1690},{"id":1702,"depth":196,"text":1703},{"id":1718,"depth":196,"text":1719},{"id":1728,"depth":196,"text":1729},{"id":1750,"depth":196,"text":1751},{"id":1766,"depth":196,"text":1767},{"id":529,"depth":196,"text":530},"Real prices for book narration in 2026. 
Comparing AI, studio, and freelance narrator on a 300-page novel.",{},"\u002Fblog\u002Fen\u002Faudiobook-cost-comparison","2026-02-22",{"title":1662,"description":1811},"blog\u002Fen\u002Faudiobook-cost-comparison",[1818,403,1819],"pricing","economics","uC67I-RMU8g_on9D98_aKxD5qyej4H1Py1FIgyEzVCE",{"id":1822,"title":1823,"author":212,"body":1824,"description":1990,"extension":209,"meta":1991,"navigation":211,"ogImage":212,"path":1992,"publishedAt":212,"readingTime":212,"seo":1993,"stem":1998,"tags":1999,"updatedAt":212,"__hash__":2000},"blog\u002Fblog\u002Fen\u002Fai-vs-human-narrator.md","AI vs human narrator — who reads better in 2026",{"type":9,"value":1825,"toc":1983},[1826,1829,1832,1836,1842,1848,1854,1860,1866,1870,1876,1882,1887,1893,1899,1903,1906,1909,1912,1915,1918,1922,1925,1928,1931,1935,1938,1941,1958,1961,1977,1980],[12,1827,1828],{},"Once a week I get the same question: \"is your AI worse than a human narrator?\" I'm tired of answering \"depends,\" so I'm writing it down once. Over the last six months I've listened to a lot of books in both formats — sometimes the same novel in a live and an AI version, just to feel the difference in the moment.",[12,1830,1831],{},"Nobody wins across the board. Below — by category, where the difference is actually audible.",[19,1833,1835],{"id":1834},"where-ai-does-better","Where AI does better",[12,1837,1838,1841],{},[244,1839,1840],{},"Speed, and it's serious."," A studio recording of a 300-page novel is two to four weeks of narrator time. AI ships the same book in an evening. The difference isn't \"more convenient\" — it's that a whole class of books that nobody used to bother narrating (fanfic, small-press authors, niche translations) now exist in audio. That's a qualitative shift, not a quantitative one.",[12,1843,1844,1847],{},[244,1845,1846],{},"Cost."," A basic studio recording starts in the high three figures, usually four. AI is an order of magnitude lower. That changes who can afford an audio edition. 
Used to be: big publishers. Now: any author with a manuscript and the will to ship.",[12,1849,1850,1853],{},[244,1851,1852],{},"Character casting."," In a budget human recording, luxury is two narrators (one male, one female). Three is rare. AI gives you as many as you need and doesn't roll its eyes when you mention there are twelve speaking roles. For fantasy and crime fiction this isn't a \"nice bonus,\" it changes how dialogue feels.",[12,1855,1856,1859],{},[244,1857,1858],{},"Consistency."," Human narrators get tired by the end of a session. They flatten an emotion or, occasionally, oversell one. AI reads page one and page eight hundred the same way. Sometimes that's a good thing — especially on long books in headphones, where steady is more comfortable than performative.",[12,1861,1862,1865],{},[244,1863,1864],{},"Fixes."," Found a typo? With AI it's a two-sentence re-render, minutes. With a human narrator: book the studio again, re-record the line, splice it back into the master, pay extra. I once watched a name correction take a full week.",[19,1867,1869],{"id":1868},"where-humans-still-do-better","Where humans still do better",[12,1871,1872,1875],{},[244,1873,1874],{},"Subtext."," AI reads what's written. A good narrator reads what the author meant. Light irony, hidden frustration, tenderness behind sarcasm — AI captures only some of that, and in literary fiction the gap shows.",[12,1877,1878,1881],{},[244,1879,1880],{},"Edge cases."," A Latin phrase in the middle of an English text, an unexpected Yiddish word, a Japanese name — a human narrator handles it creatively. AI may pronounce \"amor vincit omnia\" by English letters, which is funny exactly once.",[12,1883,1884,1886],{},[244,1885,1289],{}," Weakest link, by far. Joke timing is a pause that runs a half-beat longer than expected and then breaks on a single word. That's an actor's instinct, and AI hasn't reproduced it. 
Comedic books in AI narration sag, and I personally don't recommend them.",[12,1888,1889,1892],{},[244,1890,1891],{},"Poetry."," Where meter matters and stress lands by sense rather than by dictionary, humans win. AI has improved, but I'd choose a live recording of poetry every time.",[12,1894,1895,1898],{},[244,1896,1897],{},"Performance in heavy scenes."," When a character cries, growls, whispers in horror, a great narrator gives you goosebumps. AI gives you \"technically correct.\" The gap is small, but it's real.",[19,1900,1902],{"id":1901},"where-the-comparison-goes-nowhere","Where the comparison goes nowhere",[12,1904,1905],{},"There are categories where \"AI vs human\" is just too close to call.",[12,1907,1908],{},"Business books, popular science, self-help — AI and an average human narrator are even. Sometimes AI wins because it doesn't tire, doesn't trip on long sentences, doesn't lose intonation by chapter forty.",[12,1910,1911],{},"Classics: an average narrator can't quite carry it — you can hear it's a job. A top narrator does it so well that's the reason you're listening. AI is steady and good. So in the middle, AI beats the mediocre human; at the top, the human still beats AI.",[12,1913,1914],{},"Children's books: a human with a sincere voice wins. Without that, well-tuned AI is reliably fine.",[12,1916,1917],{},"Thrillers and mysteries: split. AI is steadier, doesn't oversell tension. Live gives you depth. Pick what you're after.",[19,1919,1921],{"id":1920},"i-ran-a-blind-test","I ran a blind test",[12,1923,1924],{},"Not a real study. Just on friends, for fun. I took a three-minute passage of contemporary prose, made two versions — AI and a professional human narrator. Played them back to back without saying which was which.",[12,1926,1927],{},"Out of twenty people, fourteen couldn't tell with confidence. Four guessed right but said they weren't sure. 
Two got it wrong — mistook AI for human, and human for AI.",[12,1929,1930],{},"That's not proof, that's a snapshot. The 2026 gap is thin, and most people don't catch it. A few years ago you'd hear it in the first second.",[19,1932,1934],{"id":1933},"a-simple-rule","A simple rule",[12,1936,1937],{},"Stop framing it as \"AI or human\" and start asking \"what am I trying to do.\"",[12,1939,1940],{},"AI is a good fit for:",[40,1942,1943,1946,1949,1952,1955],{},[43,1944,1945],{},"Fanfic and personal manuscripts that wouldn't otherwise get an audio edition.",[43,1947,1948],{},"Non-fiction, business, self-help.",[43,1950,1951],{},"Podcast-style readings, audio versions of longreads.",[43,1953,1954],{},"Translations of obscure authors.",[43,1956,1957],{},"Personal projects — journals, memoirs, letters.",[12,1959,1960],{},"A human narrator earns the cost on:",[40,1962,1963,1966,1969,1971,1974],{},[43,1964,1965],{},"Bestsellers with a serious audio budget.",[43,1967,1968],{},"Children's books for the mass market.",[43,1970,1891],{},[43,1972,1973],{},"Comedic fiction.",[43,1975,1976],{},"Anniversary editions where the recording is part of the package.",[12,1978,1979],{},"For the vast majority of what people actually want to listen to in 2026, AI is enough. For the rest, a human narrator is worth the money. And that's fine, them coexisting — there used to be no choice, now there is, and people use it differently.",[12,1981,1982],{},"In 2020 I wouldn't have written this piece. 
In 2028 I'll probably need to rewrite it.",{"title":195,"searchDepth":196,"depth":196,"links":1984},[1985,1986,1987,1988,1989],{"id":1834,"depth":196,"text":1835},{"id":1868,"depth":196,"text":1869},{"id":1901,"depth":196,"text":1902},{"id":1920,"depth":196,"text":1921},{"id":1933,"depth":196,"text":1934},"[object Object]",{},"\u002Fblog\u002Fen\u002Fai-vs-human-narrator",{"title":1823,"description":1994},{"Honest comparison":1995,"publishedAt":1996,"author":7,"tags":1997,"readingTime":398},"where AI narration outperforms human voice actors, where it still falls short, and what to choose when.","2026-04-05",[403,1396,552],"blog\u002Fen\u002Fai-vs-human-narrator",[],"pc8HcJYSWs2NYETuAk2qiSswMgf_yDpkneWCXPF9zGc",{"id":2002,"title":2003,"author":212,"body":2004,"description":1990,"extension":209,"meta":2164,"navigation":211,"ogImage":212,"path":2165,"publishedAt":212,"readingTime":212,"seo":2166,"stem":2172,"tags":2173,"updatedAt":212,"__hash__":2174},"blog\u002Fblog\u002Fen\u002Fconvert-text-to-audiobook.md","Turn text into audio — real-world scenarios",{"type":9,"value":2005,"toc":2155},[2006,2009,2012,2016,2019,2022,2025,2029,2032,2039,2042,2045,2048,2052,2055,2058,2061,2068,2071,2075,2078,2085,2088,2095,2098,2102,2105,2108,2111,2114,2118,2121,2124,2127,2130,2133,2136,2139,2143,2146,2149,2152],[12,2007,2008],{},"When we launched the service, I assumed there'd be one use case — \"person uploads their book.\" Reality turned out more interesting. People bring everything: lecture notes before exams, longreads from Substack, a behavioral economics textbook they don't have time for, an unpublished novel, a grandmother's memoirs. Each needs different settings, and a generic guide doesn't help.",[12,2010,2011],{},"So below — six concrete scenarios. Not exhaustive, just the ones I see most. If yours doesn't match exactly, find the closest and adjust.",[19,2013,2015],{"id":2014},"_1-lecture-notes-before-an-exam","1. 
Lecture notes before an exam",[12,2017,2018],{},"Familiar setup: semester ends, forty lectures of notes, you can't read them in their dry form. I have a friend who spent her whole junior year listening to her own notes at the gym and on the way to class. Her take: it actually helped, not because of AI magic but because going through the same material with your ears, days after writing it, sticks better than re-reading it for the tenth time.",[12,2020,2021],{},"What to set: one narrator voice, neutral, businesslike. Pace 1.15–1.25, because notes aren't literature and listening fast doesn't hurt comprehension. No character casting.",[12,2023,2024],{},"What to expect to lose: formulas and charts just disappear. If your notes are mostly equations, listening doesn't help — go back to the original. Foreign-language terms can come out odd, and in critical spots it's worth transliterating them in the source. Splitting by topic into chapters helps you jump back to a specific section later.",[19,2026,2028],{"id":2027},"_2-a-long-article-off-the-internet","2. A long article off the internet",[12,2030,2031],{},"\"Found a great longread, thirty thousand characters, no time to read now, want to listen on the train\" — the most common scenario after books.",[12,2033,2034,2035,2038],{},"The prep step is the important one. Copy the text from the page into a notepad. Remove ad blocks, navigation, \"see also,\" image captions. Keep only the body. Save as .txt or .md. If the article has subheadings, keep them as ",[50,2036,2037],{},"## Heading"," — the service will turn them into chapters.",[12,2040,2041],{},"Voice: depends on topic. Technical — businesslike. Journalism — slightly livelier. Personal essay — warmer. Pace 1.0; speed up in the player on the move.",[12,2043,2044],{},"One thing that helped me. Long articles tend to come with filler — repeated points, surplus examples, rhetorical detours. Before render I usually trim. 
Saves both listening time and the characters you're paying for.",[12,2046,2047],{},"And one caveat. Not every article survives audio. If the piece relies on infographics, screenshots, tables — half the meaning stays on the page. Pick longreads where the substance is in the words.",[19,2049,2051],{"id":2050},"_3-a-textbook-for-self-study","3. A textbook for self-study",[12,2053,2054],{},"Sometimes you buy a 300-page book on psychology, history, or marketing, and sitting down to read it cover to cover never happens. You'd at least like to walk through the material.",[12,2056,2057],{},"Ideal source — epub or fb2. PDF is possible but means a text conversion that always comes out uneven; you'll spend half an hour cleaning.",[12,2059,2060],{},"Voice: one, neutral, like a competent lecturer. Not too young, not too old, somewhere in between. Pace 1.0 for new material, 1.2 for review. Casting usually unnecessary, except: if the book has lots of attributed direct quotes (\"Marx wrote…\", \"Jung observed…\") you can give the quotes a different voice — improves retention.",[12,2062,2063,2064,2067],{},"What doesn't work in textbooks: footnotes — strip them before render, they break flow. Tables and diagrams — gone. A statistics or econometrics textbook in audio loses half its value. A history, theory, or psychology textbook actually ",[298,2065,2066],{},"gains"," — verbal absorption beats visual for many readers.",[12,2069,2070],{},"And remember: a 500-page textbook is twenty-plus hours of audio. You don't listen to it in one sitting, don't try.",[19,2072,2074],{"id":2073},"_4-your-own-book-before-publication","4. Your own book before publication",[12,2076,2077],{},"This is the scenario I plug to every author I talk to. 
Pushing your own manuscript through AI narration is the best diagnostic you can give yourself before submitting to an editor.",[12,2079,2080,2081,2084],{},"For the narrator voice, deliberately ",[298,2082,2083],{},"don't"," pick the one that sounds like your inner reading voice. Pick the opposite, so you hear the text \"through someone else's ears.\" This is critical — your text in your imagined voice gets auto-completed mentally; you'll fill in what you meant. You need distance.",[12,2086,2087],{},"Pace 0.95–1.0. Slow enough that you can hear when a sentence limps. Faster — you'll skim past it.",[12,2089,2090,2091,2094],{},"Casting — just take the automation. The goal isn't a final product, it's hearing how your text walks. The important part is to listen ",[298,2092,2093],{},"with a notebook or notes app",". When you catch a clunky line, write it down, keep going. Three or four hours later you'll have a list of edits no eyes-only proofread will give you.",[12,2096,2097],{},"Pay extra attention to dialogue. Audio surfaces unnatural lines instantly. And to long paragraphs without breaks — if it sounds suffocating, you need to break it up.",[19,2099,2101],{"id":2100},"_5-translation-in-another-language","5. Translation in another language",[12,2103,2104],{},"Your translation of a book or article, or someone else's, and you want to hear how it lands in the target language.",[12,2106,2107],{},"The main thing is matching language to voice. English voice for English text, Russian for Russian. Don't experiment with cross-language combinations, they sound wrong. Pace standard — listening to translation in a non-native language fast is its own challenge.",[12,2109,2110],{},"Quality is excellent for English and Russian, good for German and French, medium for Polish and Ukrainian (stress sometimes drifts), still experimental for Korean, Japanese, Arabic. I wouldn't ship those last three.",[12,2112,2113],{},"Audio is great at catching translation errors. 
Awkward phrases or typos jump out within the first chapter. It's basically the same diagnostic as for your own book.",[19,2115,2117],{"id":2116},"_6-memoirs-and-letters-from-family","6. Memoirs and letters from family",[12,2119,2120],{},"A scenario I treat with extra care. A grandmother wrote her wartime memoirs. A grandfather kept a journal. Someone transcribed letters from the front. You'd like to keep this not just as text but as audio.",[12,2122,2123],{},"Source is often handwritten and needs to be digitized. Either type it out (manageable for short volumes) or run OCR with a careful proofread — handwriting recognition still misses things.",[12,2125,2126],{},"Voice: older, warm, matching the author's gender. Don't try to artificially imitate the actual person's specific voice — it won't work. Pick a general tone that doesn't clash.",[12,2128,2129],{},"Pace slow, around 0.9. Memoirs don't read fast, and in audio that's especially audible.",[12,2131,2132],{},"Emotion: restrained. Memoirs are often about heavy things, and an over-emoting AI voice turns the text into melodrama. Better to keep it even, calm, respectful.",[12,2134,2135],{},"What I usually tell people who come with these projects: don't edit the text. Keep the author's stylistic quirks, their phrasings, even what looks to you like errors. That's part of the person's voice, and for a family archive that matters more than literary polish.",[12,2137,2138],{},"And don't forget proper nouns. Cities, people, events — verify stress; for a personal archive, this is critical. Format: mp3 or ogg, so it opens on every relative's phone without trouble.",[19,2140,2142],{"id":2141},"whats-true-across-all-of-them","What's true across all of them",[12,2144,2145],{},"Text prep is almost always more important than service settings. Ten minutes cleaning the source saves hours of redo. I usually read the first two pages as-is, and it's already obvious what should go.",[12,2147,2148],{},"Test on a short chapter. 
Don't render 500,000 characters of something you haven't sanity-checked at 5,000. I learned this rule the hard way.",[12,2150,2151],{},"Listen on headphones at least the first time. Speakers smooth out artifacts, so a book can seem perfect on them. Headphones surface the details that send you back to the settings.",[12,2153,2154],{},"And finally, don't expect perfection. AI narration in 2026 is a good compromise, not a replacement for sitting down and reading attentively with a highlighter and margin notes. It's a different way of consuming text, with its own strengths and limits. Treated that way, it works.",{"title":195,"searchDepth":196,"depth":196,"links":2156},[2157,2158,2159,2160,2161,2162,2163],{"id":2014,"depth":196,"text":2015},{"id":2027,"depth":196,"text":2028},{"id":2050,"depth":196,"text":2051},{"id":2073,"depth":196,"text":2074},{"id":2100,"depth":196,"text":2101},{"id":2116,"depth":196,"text":2117},{"id":2141,"depth":196,"text":2142},{},"\u002Fblog\u002Fen\u002Fconvert-text-to-audiobook",{"title":2003,"description":2167},{"Typical use cases":2168,"publishedAt":2169,"author":7,"tags":2170,"readingTime":398},"lecture notes, textbooks, long articles, books. How to pick format and settings.","2026-02-15",[552,219,2171],"scenarios","blog\u002Fen\u002Fconvert-text-to-audiobook",[],"zvocP-faWdpMzM2kq2ui_qMQzcUkWMougZiQigpjYP0",1777641460869]