The Secrets of Successful Podcast Music

The producers, hosts, and composers behind This American Life, S-Town, Radiolab, and more on how they use music to enhance—but not manipulate—a listening experience.

By Jordan Kisner

June 19, 2017

My first encounter with what we’re going to call “the space” happened at age 13, sitting in the front seat of my mother’s black Ford Expedition. She and I were parked outside the local bank, and we’d come to a full halt in the middle of our errand to finish listening to the woman’s voice on the radio. The segment was on This American Life, from a 2001 episode called “Stories of Loss,” and something about its author, Genevieve Jurgensen, pinned us to our seats in a kind of reverent and abject horror.

“I would like you to have heard me talking to them just once, if only the telephone. I feel powerless in trying to make you accept this evidence: They were here. I was their mother.”

Underneath her voice, there was a simple riff on the electric guitar. It was a quiet, low bit of an A-chord that ebbed and flowed under her story about her daughters, who were there one May morning and gone by the evening. The guitar was there under the moment when she bellowed the eldest’s name into the hills. It was silent when she seemed to need silence. Mostly, it just repeated with neutral insistence, turned over itself again and again in the same way this mother circled her grief. I’ve heard that guitar riff three times in my life, under Jurgensen’s words, and I can tell you exactly where I was each time. (Bank parking lot, O’Hare International Airport, a cold kitchen.)

Podcast music has become its own cottage industry: WNYC, New York City’s NPR affiliate, employs house composers for its stable of shows, and most podcasts with any budget contract musicians to write original theme and background music. The worlds of high-profile podcasting and music are melding: Nick Thorburn from Islands composed music for Serial; Merrill Garbus of tUnE-yArDs wrote the New Yorker Radio Hour’s theme; Andrew Dost from fun. and That Dog. collaborated on the theme for Lena Dunham’s Women of the Hour. And Daniel Hart, Helado Negro, Trey Pollard, and Matt McGinley, who contributed to the score for the recent blockbuster podcast S-Town, had 16 million people hear their music in its first week of release.

There’s something fundamentally different about the music that’s written for radio, even by great pop writers whose work seems effortlessly catchy—primarily, it’s not supposed to be very catchy. Its function is to create a space for the human voice, and for the silence of a voice that’s stopped speaking. This notion of “the space” comes by way of JD Samson of Le Tigre, who recently wrote the season teaser for Radiotopia, the Public Radio Exchange’s podcast network. “My first bunch of songs I sent in were… songs,” Samson tells me. “And [the producers] were like ‘No, it can’t be a normal song. It has to be really boring.’” She’d take out an element and send it back, only to get another note: “More boring.” And again: “More boring than that.”

Samson is calling me from her car, driving somewhere by herself, and she points out that solitude is the basic state of the podcast listener: by oneself, interstitial, and yet tethered by voices to an invisible and vast network of human experience. “It’s this moment when I get to be truly alone,” she says, and yet she is entirely connected. The voice provides the primary score, and the music is there to facilitate its power to deliver you to that twinned space: solitude and communion.

There’s some consensus that while many radio creators are talented at this, the most consistent geniuses of magical podcast space creation are the team at This American Life, which first aired in 1995. (“They’re just fucking masters,” says one producer; “It just seems like they’re writing the playbook,” a composer attests.) They produced Serial, the true-crime phenomenon that kicked podcasts up to the stratum of media production worthy of critical attention. (It accrued more than 80 million downloads.)

The same team is behind S-Town, the true story of an unusual man living in what he refers to as Shit-town, Alabama. S-Town came out in late March and provoked a storm of praise, protests, and think pieces. Its most obvious innovation is grafting the shape and mood of a Southern Gothic novel onto a modern podcast. Jad Abumrad, who hosts, produces, and composes music for the WNYC radio show Radiolab, which is itself a revolutionary work of radio sound design, tells me that S-Town’s structure altered the way the medium is consumed: “It brought a Faulknerian novel approach to a disposable medium, so now [it] feels like a medium you might want to listen to again and again.” From a production and music standpoint, S-Town is exquisite.

When it came time to find a composer for S-Town, Ira Glass, the creator of This American Life and executive producer of its sister shows, asked St. Vincent’s Annie Clark for advice, and she suggested her onetime bandmate Daniel Hart, a composer and violinist. Glass sent Hart scratch tracks of the first two episodes and explained that the team wanted something that sounded Southern without being twangy. Or, as Julie Snyder, another S-Town producer, puts it to me: “Not Deliverance banjos. Not like we’re leaning too hard.” Maybe strings over a hip-hop beat? Strings that sound like a hip-hop beat?

Early requests from producers to composers often come in the form of vague questions, pastiche-y chains of adjectives, and slightly demented lists. Jenna Weiss-Berman, who has edited and produced podcast hits like Women of the Hour, Still Processing, Missing Richard Simmons, and Hillary Clinton’s podcast, kindly read me a sample of the average directive for musical vibe. It’s her job to ask show hosts what they want the music to sound like; this is the kind of thing they send back:

“Funky and sexy but not corny
Janet Jackson ‘Miss You Much’ is the perfect beat
Old school R&B vibes
Tropical bubblegum”

Or:

“Electronic steel drums
‘Sorry’ by Justin Bieber
A song with airplane sounds in it?
Possibly a song with airport announcements in it
Something soothing.”

Others go minimal. For her new podcast Never Before, author and activist Janet Mock just requested “something Solange-y.”

Typically, creating the musical profile of a podcast is about reflecting both its mood and its style. Weiss-Berman often devotes an entire production meeting to determining the 10 most prominent moods of the show so she can communicate them to musicians. For Lena Dunham’s show Women of the Hour, she and Dunham wanted something that evoked “girl group-y” and “’90s grunge” style, with a mix of energies and moods adaptable to different stories. For S-Town, Snyder, Glass, and host and creator Brian Reed needed a soundtrack that could suggest the Southern Gothic, with its history, suspense, desolation, humor; convey the feel of modern small-town Alabama; and, most of all, create momentum underneath long stretches of exposition or monologue.

Daniel Hart, who has lived more than half of his life in the South, liked the assignment: “I wanted it to sound like the complexity of Southern rural life as I had known it to be, as I had rarely seen it portrayed in media.” He decided to use instruments common in the South but pitched or played in a way that sounded slightly unfamiliar. He used a Bulgarian cousin to the mandolin called a tamboura, a dulcimer, and “a fair amount of strings, handclaps, and knee slaps.” (And, ultimately, lots of banjo.) He wrote the show’s theme, and then an album’s worth of background music to mix and match under each episode.

The formal demands of writing background music for podcasting are a bit rigid. Hart explains that the songs work best at between 90 seconds and two and a half minutes long, with a clear delineation between sections with different energy levels: an intro, a section that’s the full theme with all its instrumentation, a reprise of that theme with a sparer arrangement, and a satisfying final section. He says the music should have peaks and valleys “because that echoes the rhythm of the storytelling.”

Mainly, it shouldn’t be too interesting. The worst thing music can do in a podcast is call attention to itself, or compete with the voices over it. In some specific cases, this means stripping away instruments that tend to interfere with the human voice: Weiss-Berman is constantly telling musicians to nix the synth. Julie Snyder explained to Daniel Hart that S-Town narrator Brian Reed has a voice that tends to drop out at certain registers, so the instrumentation would have to work around the lacunas in his resonance. So: no horns.

The goal is emotional ambiguity or, more often, neutrality. For many podcast producers, the sign of good taste in music editing is letting the vocal tape and narration speak for themselves, never using music as what Abumrad calls “emotional steroids” to manipulate the listener’s feelings. Snyder points out that her rule of thumb is to play away from the obvious emotion. “You’re usually trying to undercut emotions. If it’s sad, then you don’t want the music to be sad, and if it’s funny, you don’t want circus music. There are obvious exceptions to that where you can go big, but mostly it’s like Pull back! Pull back!”

Martin Fowler, who scored the serialized mystery podcast Limetown and is one of the stable of composers This American Life calls on to supply their stock music library, says this type of music should be “so understated you have to try to pay attention to hear it at all—a melody that stays out of the way, so it gives forward propulsion without drawing your attention.” It’s a listening experience so passive that you might be moved by an episode’s score and then, 10 minutes after finishing, forget it had any music at all. I ask Fowler if there’s anything frustrating about writing music that by definition no one should remember, and he corrects me. “It should always have a melody you can sing and remember, even if you don’t realize you’re hearing it. You’re trying to make music that people don’t hear. It’s an important distinction.”

But if the music isn’t there to be heard, precisely, then what’s it there to do?

Firstly and maybe most basically, music can set tone. Consider the music that runs underneath the first time Brian Reed meets S-Town’s protagonist, John B. McLemore: a shimmery single note pedaling on the celeste over low pizzicato fourths. On its own it’s fairly neutral, but over the tape, which includes John’s voice and Reed’s narration, the music sounds curious, tentative, excited, suspicious—like everything is still to come.

Or, maybe, think of the scoring under an early episode of Radiolab, “Finding Emilie,” about a young art student who’s hit by an 18-wheeler shortly after she falls in love. In the early portions of the episode, when Emilie’s boyfriend Alan is describing how they met at a party during a snowstorm, the music is an irradiated, almost ghostly vibraphone floating downscale, layered over a low drone. When I ask him about it, Abumrad recalls writing this “love theme” to sound as if the young couple had been placed in a snow globe.

He remembers more sharply scoring the moment that he considers the emotional node of the story: when Emilie is lying in the hospital, now deaf and blind, able to speak and move but totally disoriented and unreachable. “Pull me out of the wall,” she starts saying out loud, “it’s dark in here.” The music is tight, tense, excruciatingly simple, just a series of drones that rise almost imperceptibly, maybe two semitones over the course of 45 seconds, just enough to feel disorienting escalation. “It’s almost non-music,” says Abumrad. “You want motion but you don’t want to inject too much feeling. [Emilie] was still in the hospital when we were telling that story. It didn’t feel right to overproduce—that would be impolite at best. [The music] is just holding the space.”

In this sense, music is a proxy for image. Where a film director can establish character or set a mood with tone, light, focus, and angle, a podcast producer has only music and sound design, which has to offset the storytelling voice without obscuring it or drowning it out. Podcast producer Brendan Baker suggests that, at its best, music can conjure an image or a scene or a quality of light with perfect clarity. “If you take a piece of music that already has a set of subconscious imagery in it, and match that to a story with particular descriptions, and then maybe add a set of subtle sound effects—then something really cool happens and the image solidifies. You can almost see it.” Done right, he says, it’s magical, a kind of synthesis.

Music also has the useful ability to expand or contract a listener’s experience of time. Editing S-Town, Snyder often had to bridge several months, or even several years between two pieces of tape in only seconds. And there’s a moment from the wonderfully strange and atmospheric podcast Love + Radio that does this. An episode called “Greetings from Coney Island” tells the true story of a modern woman who starts receiving love letters written in 1938; at one point, the story transitions away from the woman’s narration, and the voice of the letter writer—the voice from the past—comes in. Brendan Baker, who produced that show, describes the music’s effect as “changing the depth of field, like you’re zooming out and then zooming back into a different era. The music blossoms into this historical world.” The sound cue is only a minute long, but it pushes the listener back 80 years.

Conversely, producers often use music to create momentum to propel a listener forward, often through long pieces of text. Snyder, in editing S-Town, often used this to offset the effects of a main character like John, a savant-like hypomanic clockmaker prone to hours of entertaining but obtuse monologue. “When hearing one person talk, especially scripted talk, it’s hard to pay attention,” Snyder says. “You zone out a little bit.” So when Reed reads aloud an email John sent him detailing a list of global catastrophes (“99 percent of rhinos gone since 1914, 90 percent of big ocean fish gone since 1950, 50 percent of great barrier reef gone since 1985…”), Snyder adds a little music for energy: It’s an arpeggiating synth that loops, as if the music itself were rolling its eyes, but over a simple, joggy kick drum riff that seems to assure you that this is all going somewhere.

Nearly every producer I speak to uses the word “punctuation” to point out the way music serves to delineate between thoughts, create pauses, or emphasize important moments in a story. “If you tell an anecdote and then take a moment to reflect on what it means, then you’d have one piece of music under that whole unit to set it out,” Snyder says. “Or no music except right before and right after.” This helps the listener subconsciously organize discrete sections of the story in their mind, look for the resonances within it, and then patch those microsections into a coherent whole. Alternatively, a swell of music or a punchy beat can draw attention to a plot twist or a key piece of dialogue.

“In the roughest way,” Snyder says, “we’ve always thought that if you start music you pay attention to what happens right after the music starts, and then if you take it out you pay a lot of attention to what happens right after the music ends.” The most dramatic example of this in S-Town comes at the beginning of the third episode, which opens with queasy tones, a suspended, beating half-chord, and a faint single note on the piano sounding again and again, as if a child were dully beating the key far away in the house. It’s all already starting to fade by the time we hear a telephone call between Brian Reed and a friend of his protagonist. “We have some bad news to tell you,” the woman’s voice says, a little hollow sounding over the phone line. The music evaporates like mist. “John B. killed hisself Monday night.” The musical silence that follows is deafening. Suddenly, all you hear is human noise: the shift of clothes, a baby fussing in the background, the shake in her voice, the ragged sound of Reed’s breath as he starts to cry, a moan, long moments of speechlessness. As a production choice, it feels artless, and almost unnervingly naked. They speak this way, unadorned, for almost seven minutes.

This points to music’s most fundamental purpose in this setting: to expose the human voice and everything it carries, to facilitate everything profound and strange and funny about a voice speaking to be heard and our desire to hear it. At its best, it amplifies not only what’s being said but the act of saying.

“It’s that feeling of being lifted off your seat,” Jad Abumrad says. “These tiny inconsequential humans have this epic Shakespearean experience. It’s the sound of transcendence and this little human being so much bigger. And music can do that! It can make you step out of yourself and out of time.” This point about transcending self and time and space rings true. There are now millions of people who care about whether a man they never met from a little “shit town” they’ve never been to has experienced love; through Hart’s work, they’ve felt an approximation not only of what it feels like to be from a small town in Alabama, but what it felt like to be John B. McLemore. The music, in part, preserves the space he occupied in the world, and makes room for us to enter it. Sixteen years after I heard Jurgensen’s story on the radio, I remember her eldest daughter’s name. It was Mathilde.