Podcast Audio Editing With Audacity

This post is a work in progress and will be updated over time.

Here’s my process. A lot of this comes from a video from the Atheist Nomads. I’ve never been traditionally trained in audio editing. I’ve learned everything from the internet and from teaching myself.

Editing is a very personal process. You have to balance your time with the value you’re getting out of it. Audio quality is very important to me so I spend a lot of time editing and improving my editing process. But if audio quality is less important to you, save some time by doing less editing.

What’s the Point?

The first goal of editing your podcast is to make it more enjoyable to listen to. After editing, your audience should be able to easily listen to and understand your podcast.

The second goal of editing is to reduce your podcast’s impact on your audience’s resources. Those resources include time and storage space. After editing, your podcast should not waste time or space.

Recording

Start with a good recording. Prevent mic bleed, use one mic per person, and watch out for echoes in corners.

Remove Noise

Noise Reduction

Noise is any audio that you don’t want your audience to hear. Sometimes it’s the background hum of the air conditioner, sometimes it’s breath sounds.

You want to reduce noise as much as you can. Any time the audience has to pull your intended audio out of some noise it takes mental effort. Spending mental effort makes listening to the podcast less enjoyable so we don’t want that.

I find the interface for Noise Reduction in Audacity confusing, so don’t feel bad if you find it confusing too. It involves two steps:

  1. Sample some noise.
    1. Select some noise
    2. Effects > Noise Reduction
    3. Hit the top button (Get Noise Profile)
  2. Remove the noise.
    1. Select the whole area to remove the noise from
    2. Effects > Noise Reduction
    3. Hit OK

Here are the settings that I use for Noise Reduction. I started by copying the values from the Atheist Nomads video linked above and then tweaking from there.

[Screenshot of Noise Reduction settings]

Content Editing

This is the portion that takes the longest. Listen to your entire recording, cut any audio you don’t want and reorder any sections that don’t make sense. I’ve never gotten the label track in Audacity to work how I expect it to, which makes reordering audio very confusing, so I rarely do it. I do cut a ton of audio though.

[Example of content edited audio]

Audio I look for to cut:

  • Any mouth sounds like loud breaths or the darn clicking sound I make with my tongue before I speak.
  • Ums, uhs and ahs.
  • False starts and stutters.
  • Meta discussion such as asking someone to repeat a phrase for a better recording.

Cutting these will make your speakers sound more intelligent and professional, while also saving your listeners time.

[Example of desynced audio]

Do not desynchronize your recorded tracks. If you have a local recording and the tracks are out of time at all, it will be audible as an echo. Any cut that you make to one track should be made to all the tracks. If you just want to cut the audio from one track while maintaining the others, silence that portion of the track (Ctrl + L).

Truncate Silence

[Example of truncated silence]

Once you’ve done content editing you’re done with the hard part. Select everything and then go to Effects > Truncate Silence. I truncate any silence down to 0.5 seconds. This is a bit of a sledgehammer solution, and sometimes I wish I had left more of a gap between sections, but Truncate Silence is very easy to run and it removes a lot of useless audio from the podcast, saving your audience time.
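To make the idea concrete, here is a minimal Python sketch of what truncating silence does to a list of samples. It is not Audacity’s implementation. The function name, the threshold value and the toy sample rate are all assumptions made up for illustration.

```python
def truncate_silence(samples, sample_rate, threshold=0.01, max_silence_seconds=0.5):
    """Shorten every run of quiet samples to at most max_silence_seconds."""
    max_run = int(sample_rate * max_silence_seconds)
    output = []
    quiet_run = 0  # length of the current run of quiet samples
    for sample in samples:
        if abs(sample) < threshold:
            quiet_run += 1
            if quiet_run > max_run:
                continue  # drop silence beyond the allowed length
        else:
            quiet_run = 0
        output.append(sample)
    return output


# Toy example: 10 samples per second, 2 seconds of silence between two sounds.
audio = [0.5] * 5 + [0.0] * 20 + [0.4] * 5
shortened = truncate_silence(audio, sample_rate=10)
print(len(audio), len(shortened))
```

The two seconds of silence shrink to half a second while the audible samples on either side are left alone.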

Equalization

Equalization (or “EQ”) is the process where you can boost some frequency ranges and reduce other frequency ranges. EQ is where you can give yourself that radio announcer voice.

EQ is where I feel like my editing skills are weakest. If you have any tips or critiques, hit me up on twitter.

The best resource I’ve found for learning EQ is this interview with Rob Williams. The website hosting the vocal EQ cheatsheet mentioned in the interview is defunct, but you can find quite a few similar resources by searching online for “vocal EQ cheatsheet”.

Here are the settings that I use for EQ. If you’re just starting out I recommend starting by just cutting the very low end and the very high end since those are outside the human vocal range anyway so they’re just noise.

[Screenshot of Equalization settings]

And here that is in XML if you want to import it:

<equalizationeffect>
<curve name="Podcast Vocal5">
<point f="20.000000000000" d="-80.000000000000"/>
<point f="49.237316986327" d="-33.107692718506"/>
<point f="54.196034330446" d="-29.553844451904"/>
<point f="88.033573501041" d="-6.923076629639"/>
<point f="95.871851182279" d="-4.523078918457"/>
<point f="108.957037410504" d="-1.938461303711"/>
<point f="132.599316556226" d="2.445087432861"/>
<point f="156.339334382973" d="2.445087432861"/>
<point f="248.195108586157" d="-3.046241760254"/>
<point f="505.708456346672" d="-2.771678924561"/>
<point f="1016.395768252276" d="-0.300577163696"/>
<point f="1971.410215909012" d="4.367052078247"/>
<point f="5041.428276830616" d="4.916185379028"/>
<point f="10132.490968285008" d="4.367052078247"/>
<point f="14864.778932891884" d="-1.124279022217"/>
<point f="23998.298441881070" d="-24.736995697021"/>
<point f="23999.149205860322" d="-64.000000000000"/>
</curve>
</equalizationeffect>

[Embed image of frequency analysis]

Audacity has a frequency analysis tool but I’ve never been able to get any usable information from it. I’ve tried to sweep through and reduce the spikes at certain frequencies and it just makes it sound worse.

Compression

Compression is an audio process to reduce the range of volumes for sections of your podcast while keeping your quiet parts quiet. For example, if you have a moment where everyone laughed and the volume is a lot higher than the average volume, running compression will bring that volume back down towards the rest of the audio. This video helped me understand compression.

Compression is very important for reducing the difficulty of listening to your podcast. Have you ever watched a movie where the quiet parts were too quiet, but the loud parts were too loud? Without compression, your podcast may end up like that. Your audience will have to constantly adjust their volume to maintain a comfortable and clear output. That’s a lot of effort and you want to reduce the effort to listen to your podcast.
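Here is a rough Python sketch of the arithmetic behind simple downward compression, working on levels in dB. The threshold and ratio are hypothetical values picked for the example, not my Audacity settings, and a real compressor also deals with attack and release times that this ignores.

```python
def compress_level_db(level_db, threshold_db=-12.0, ratio=2.0):
    """Return the output level in dB for a simple downward compressor.

    Levels at or below the threshold pass through unchanged. Anything above
    the threshold only keeps 1/ratio of the amount it was over by.
    """
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


speech_db = -18.0  # normal talking
laugh_db = -2.0    # everyone laughing at once

print(compress_level_db(speech_db))
print(compress_level_db(laugh_db))
```

With these numbers the talking stays where it was and the laugh is pulled down, so the gap between them shrinks from 16 dB to 11 dB.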

Before running Compression I will usually run a limiter to reduce the damage from any egregious sections. For example, if someone claps their hands it’ll often peak the recording for a moment. A limiter will reduce that a bit. A limiter puts a hard cap on the volume of a section, clipping the audio. This clipping adds a bit of distortion, but these segments are usually short and the distortion is usually pretty minimal. I will usually run a limiter at -2 or -3 dB.
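As a sketch of what a hard cap at -3 dB means in numbers, here is a small Python example. It assumes samples are floats where 1.0 is full scale, and the function names are made up for illustration. It shows plain hard clipping only, not whatever Audacity does internally.

```python
def db_to_amplitude(db):
    """Convert dB relative to full scale into a linear amplitude (1.0 = full scale)."""
    return 10 ** (db / 20)


def hard_limit(samples, ceiling_db=-3.0):
    """Clip every sample so its magnitude never exceeds the ceiling."""
    ceiling = db_to_amplitude(ceiling_db)
    return [max(-ceiling, min(ceiling, sample)) for sample in samples]


clap = [0.2, 0.95, -1.0, 0.3]  # a short spike in otherwise quiet audio
limited = hard_limit(clap)
print(round(db_to_amplitude(-3.0), 3))
print(limited)
```

A ceiling of -3 dB works out to about 0.708 of full scale, so the two loud samples get flattened to that value and the quiet ones pass through untouched.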

[Screenshot of Compressor settings]

Then I run compression. Click compress based on peaks. I don’t entirely know what that checkbox does but the results are better with it clicked. The Audacity Wiki says that it applies upward compression instead of downward compression but I don’t know why that would affect the result.

Normalization

Normalize to -3 dB. This makes it fairly loud while not getting too close to the max. My least favorite thing is when a good podcast is too quiet. I listen to podcasts while commuting and if I can’t hear your podcast over passing cars with my phone volume all the way up I get sad.
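Peak normalization is just one multiplication. This Python sketch assumes float samples where 1.0 is full scale, and the function name is hypothetical.

```python
def normalize_peak(samples, target_db=-3.0):
    """Scale the audio so its loudest sample lands exactly on target_db."""
    target = 10 ** (target_db / 20)  # -3 dB is roughly 0.708 of full scale
    peak = max(abs(sample) for sample in samples)
    if peak == 0:
        return list(samples)  # pure silence, nothing to scale
    gain = target / peak
    return [sample * gain for sample in samples]


quiet = [0.1, -0.25, 0.2]
loud = normalize_peak(quiet)
print(max(abs(sample) for sample in loud))
```

Every sample gets the same gain, so the relative volumes inside the recording do not change. That is also why this only controls the peak and says nothing about perceived loudness.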

Normalization is actually not the most correct way to finalize audio. The most correct way would be to normalize “loudness” based on a value like LUFS. I haven’t gotten around to looking into how difficult this is to do in Audacity. Hit me up if you’ve got it figured out.

Export

Mono. mp3. 86 kbps. Download some podcasts you like and take a look at their files.
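If you want to see what a bitrate means for your audience’s storage, the arithmetic is simple. This Python sketch assumes a constant bitrate and ignores metadata like cover art, so treat the results as estimates. The function name is made up for illustration.

```python
def mp3_size_megabytes(bitrate_kbps, minutes):
    """Estimate the size of a constant bitrate file in megabytes."""
    bits = bitrate_kbps * 1000 * minutes * 60
    return bits / 8 / 1_000_000


print(round(mp3_size_megabytes(86, 60), 1))   # one hour at 86 kbps
print(round(mp3_size_megabytes(128, 60), 1))  # one hour at 128 kbps, for comparison
```

An hour at 86 kbps comes out to about 38.7 MB, versus about 57.6 MB at 128 kbps.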


		

How to Use the Metropolitan Museum of Art’s Collection Listing API

The Met has an Open Access Policy under which the museum makes public domain art images in its collection easily available for download. And that’s great if you want to use their website.

[Image: an example of the web view of an Open Access item]

But that’s not great if you want an automated way to get at the data and images. There are two ways to fetch the data: one is downloading the full dataset from a regularly updated GitHub repo and the other is a web API. The GitHub data notably doesn’t include references to the images. It would also require manually updating a local copy of the data at regular intervals.

But the web API exists and as far as I can tell it’s completely undocumented. The only mentions of the exact syntax I could find were in some comments on some social media posts like this.

I ended up using the API and stumbling around trying to figure it out, so I’m going to document the API for anyone else who has a similar need.

Word Chain Dev Log #5

I was able to steal away some time to work on Word Chain because I finished the July milestone for Train a couple days early. I prioritized getting the back end to generate word chains so that I can actually enjoy the game. Up until now I hand wrote every word chain so I was easily able to complete all the chains.
I think I was completely successful. I wrote two python scripts and use two json files to generate the chains. The first python script, wordPairInput.py, allows easy input of lots of word pairs. The pairs are stored in wordPairs.json. The data can be modelled as a directed graph, possibly cyclic. The chains will be paths in the graph of a constant length. The pairs are stored in a dictionary of lists of strings.

Here’s what an excerpt looks like:

"quitting": [
    "time"
],
"day": [
    "break",
    "off",
    "time"
],

The key in the dictionary is the first word in the pair. The list contains all possible second words. This is like an adjacency list. It allows for very quick lookup of the second words for a given first word including quick lookup of whether a full pair has already been entered.
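Here is a minimal Python sketch of that lookup. The add_pair helper is hypothetical and is not the actual code from wordPairInput.py. It only shows how the dictionary of lists makes the duplicate check cheap.

```python
import json


def add_pair(pair_dictionary, first, second):
    """Record the pair (first, second). Return False if it was already entered."""
    second_words = pair_dictionary.setdefault(first, [])
    if second in second_words:
        return False
    second_words.append(second)
    return True


pairs = {"quitting": ["time"], "day": ["break", "off", "time"]}
print(add_pair(pairs, "day", "time"))   # this pair is already in the dictionary
print(add_pair(pairs, "day", "dream"))  # this pair is new
print(json.dumps(pairs, indent=4))
```

Finding the list for a first word is a single dictionary lookup, and the lists stay short enough that scanning them for a repeat costs next to nothing.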

The second python script, generateChains.py, uses wordPairs.json to generate a list of chains in wordLists.json. I thought I would need a graph processing library to generate the chains in a reasonable time but I decided to try a very naive approach to start. I run through all the possible first words, then for each word, recursively try to add on all the possible second words from its entry in the pair dictionary. It ends the recursion once the chain has 6 words in it.

This process is probably best defined in code. partialChain is a list of strings representing the chain so far, retList is the list of finished chains and pairDictionary is the dictionary described by wordPairs.json.

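import copy

chainLength = 6  # number of words in a finished chain

def appendWordsToChain(partialChain, retList, pairDictionary):
    # The last word so far decides which words can come next.
    lastWord = partialChain[-1]

    # Dead end: no pair starts with the last word.
    if lastWord not in pairDictionary:
        return

    nextWords = pairDictionary[lastWord]

    for nextWord in nextWords:
        # Skip words already in the chain so it never repeats a word.
        if nextWord in partialChain:
            continue

        # Copy the list so sibling branches don't share one chain.
        newChain = copy.copy(partialChain)
        newChain.append(nextWord)

        if len(newChain) < chainLength:
            appendWordsToChain(newChain, retList, pairDictionary)
        else:
            retList.append(newChain)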

I was able to quickly get the basics of this algorithm down and only ran into a few snags. In my first iteration I mistakenly had “return” instead of “continue” after the check for repeat words in the partial chain. This prevented further options in a partial chain from being tried. I also had to refresh myself on how python works with passing by value/reference. Turns out it passes by object reference which barely makes sense to me. To replicate C style passing by value I used copy.copy. There may have been a more pythonic way of doing this but I’m used to C style recursion algorithms so I just replicated it here.
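For what it’s worth, here is a small Python sketch of the object reference behavior and one alternative to copy.copy. The variable names are made up for the example. List concatenation builds a new list, so it would give the same protection in the recursion.

```python
partial_chain = ["day", "time"]

# Both names point at the same list object, so appending through one
# name changes what the other name sees too.
alias = partial_chain
alias.append("share")

# Concatenation builds a brand new list and leaves the original untouched.
new_chain = partial_chain + ["crop"]

print(partial_chain)
print(new_chain)
print(new_chain is partial_chain)
```

The first append shows up in partial_chain because no copy was ever made, while new_chain is a separate list that can be extended without affecting anything else.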

This algorithm worked surprisingly well. It is able to generate about 900 chains now nearly instantly on my laptop. As far as I can tell, I don’t think I’ll need to do any more complex graph processing unless it slows down greatly as wordPairs.json gains more and more entries.

Once I had it working I aimed to have the chain generator output information that would let me easily add more chains. Currently, I have it store every failed chain in a list. After generating all the chains it prints all the failed chains that were one word short of being completed.

['muzzle', 'flash', 'back', 'space', 'station']
['muzzle', 'flash', 'back', 'seat', 'cushion']
['muzzle', 'flash', 'back', 'pack', 'mule']
['part', 'time', 'share', 'crop', 'circles']
['money', 'back', 'side', 'burn', 'notice']
['money', 'back', 'side', 'kick', 'stand']
['now', 'boarding', 'house', 'fire', 'works']
['chock', 'full', 'time', 'sink', 'hole']
['chock', 'full', 'house', 'fire', 'works']
['quitting', 'time', 'share', 'crop', 'circles']
['body', 'bag', 'lady', 'bug', 'zapper']
['quarter', 'note', 'book', 'case', 'worker']

This makes it very easy to look over the printed list and find a word at the start or end of a chain that I think I could find a pair for. For example, I could look at the output above and try to come up with a pair ending in muzzle because I know it will create at least 3 more chains. Achieving this also required computing a reversed pair dictionary, which is used to check whether or not a failed chain could be extended from the front. I may end up serializing the reversed pair dictionary while entering pairs as well.
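A reversed pair dictionary can be built in a few lines. This is a Python sketch under my own naming, not the code from generateChains.py, and the can_extend helper is a hypothetical way of doing the front and back check.

```python
def reverse_pairs(pair_dictionary):
    """Map each second word to the list of first words that lead to it."""
    reversed_dictionary = {}
    for first, second_words in pair_dictionary.items():
        for second in second_words:
            reversed_dictionary.setdefault(second, []).append(first)
    return reversed_dictionary


def can_extend(chain, pair_dictionary, reversed_dictionary):
    """Return (front, back): whether an unused word could be added at each end."""
    front = any(word not in chain for word in reversed_dictionary.get(chain[0], []))
    back = any(word not in chain for word in pair_dictionary.get(chain[-1], []))
    return front, back


pairs = {"muzzle": ["flash"], "flash": ["back"]}
reversed_pairs = reverse_pairs(pairs)
print(reversed_pairs)
print(can_extend(["flash", "back"], pairs, reversed_pairs))
```

In this toy data "flash back" can grow at the front because "muzzle" leads into "flash", but not at the back because nothing follows "back".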

With just these simple scripts and the output of the one word short failures I was able to go from 6 chains to over 900 in about a day.

Next I’d like to also print out any word pairs that are not connected to any others. I could then try to come up with pairs to connect those islands to the other pairs. I’d also like to output some general analysis data such as the number of pairs, the average number of second words for a first word, the average number of first words for a second word (using the reversed dictionary), the number of single pair islands and the longest nonrepeating chain.
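Most of those numbers fall out of the pair dictionary and its reverse. Here is a rough Python sketch. The function name and the exact definition of an island (a pair whose two words appear in no other pair) are my assumptions, and it leaves out the longest nonrepeating chain.

```python
def pair_stats(pair_dictionary):
    """Compute simple statistics about a dictionary of word pairs."""
    reversed_dictionary = {}
    for first, seconds in pair_dictionary.items():
        for second in seconds:
            reversed_dictionary.setdefault(second, []).append(first)

    pair_count = sum(len(seconds) for seconds in pair_dictionary.values())
    islands = [
        (first, seconds[0])
        for first, seconds in pair_dictionary.items()
        if len(seconds) == 1                           # first word has one pair
        and first not in reversed_dictionary           # nothing leads into it
        and not pair_dictionary.get(seconds[0])        # second word leads nowhere
        and len(reversed_dictionary[seconds[0]]) == 1  # only this pair reaches it
    ]
    return {
        "pairs": pair_count,
        "average_second_words": pair_count / max(len(pair_dictionary), 1),
        "average_first_words": pair_count / max(len(reversed_dictionary), 1),
        "islands": islands,
    }


pairs = {"quitting": ["time"], "day": ["break", "off", "time"], "bone": ["dry"]}
stats = pair_stats(pairs)
print(stats)
```

In this example "bone dry" is the only island, since "time" is shared by two first words.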

Doing some sort of automated cross checking with a known English dictionary to check for typos would also be valuable. I just noticed that I entered “done dry” instead of “bone dry”.
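A word level check could look something like this Python sketch. The known_words set here is a tiny stand-in. In practice it would be loaded from whatever word list file you have available, which is an assumption on my part.

```python
def find_unknown_words(pair_dictionary, known_words):
    """Return every word used in a pair that is missing from known_words."""
    words = set(pair_dictionary)
    for seconds in pair_dictionary.values():
        words.update(seconds)
    return sorted(word for word in words if word.lower() not in known_words)


known_words = {"bone", "done", "dry", "quitting", "time"}
pairs = {"done": ["dry"], "quiting": ["time"]}
print(find_unknown_words(pairs, known_words))
```

Note the limit of this approach: it flags the misspelled “quiting” but not “done dry”, because “done” is a real word. Catching that kind of mistake would need a list of known phrases rather than known words.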

I’d also love to be able to visualize the graph of the pairs. Being able to see it visually seems like it would be really really cool. 😀

Word Chain Dev Log #4

This is a continuation of a previous post.

This is a belated post of work from last Friday. I was planning on writing up a post then and continuing work on Saturday but our apartment got bedbugs and had to be prepared to be treated by today so I had to deal with that. 😦

The game started looking like this:

Word Chain Game Screen Day 3

And ended looking like this:

Word Chain Game Screen Day 4

I was concentrating on polishing up the turns system and giving more feedback. At the start, the only thing that alerted a player that it was their turn was when their screen showed the controls. If they weren’t watching their screen and hadn’t been paying attention to the turn order they would not know it was their turn. I added a highlighted border around the scoreline of the player whose turn it is and a sound that plays on their phone. This sound automatically worked even when my phone’s screen was turned off so I am very happy about that. I wish I could make their phone vibrate as well but I could not find an easy way of triggering that via Javascript.

The score lines are all dynamically generated html elements. I spent a while trying to make it so that when a player’s turn starts, the class of the element is set to a predefined class in the CSS file. That way if I wanted to redesign what the highlighted scoreline looked like I would just change a CSS file and not any code. I am disappointed that I could not get it to work and did end up having to write all the highlighting style in code.

The sound was taken from JumpJump but it is not just a simple sound file. It is an array of data that is used to dynamically generate the sound.

var sounds = {
coin: { jsfx: ["square",0.0000,0.4000,0.0000,0.0240,0.4080,0.3480,20.0000,909.0000,2400.0000,0.0000,0.0000,0.0000,0.0100,0.0003,0.0000,0.2540,0.1090,0.0000,0.0000,0.0000,0.0000,0.0000,1.0000,0.0000,0.0000,0.0000,0.0000], },
};
g_audioManager = new AudioManager(sounds);

The library being used is jsfx, a Javascript library inspired by sfxr. It has a sound generator interface very similar to sfxr that outputs the array of data. All this means you don’t have to package up any audio files with your game and you get interesting retro sounding effects.

It will be a while until I work on this again. Today was my first day back at NetherRealm and I will only have time to work on one side project. I will be working on Train and we’re hoping to have it finished by December. If I ever get time to work on this again, I will be working on a back end script that takes in pairs of words, processes a graph of those pairs and then outputs word chains to a json file. That way I won’t know all the chains beforehand and I’ll actually have fun playing the game!

Word Chain Dev Log #3

This is a continuation of a previous post.

I took a break yesterday. I ended up playing a bunch of Crusader Kings 2, watching Guardians of the Galaxy with my parents and then watching some late night tv for the first time in a long while.

At the start of today the game looked like this:

Word Chain Game Screen Day 2

And at the end of today it looked like this:

Word Chain Game Screen Day 3

I redid how colors were being assigned to players. In JumpJump it is a set of hsv offsets from the image that is used as the avatar. I changed it to be rgb colors that can be easily used as css values. I use the colors to set the player’s controller background and the color of the text in the scoreline.

I just realized it could be really cool to set the color of each completed word based on who finished it. I may do that and add a bit of flashiness to the scoreline and controller of the current player but other than that I think I’m done with visual stuff.

Next I want to try my hand at the back end of this project. Currently, the word chains are written out by hand in a json file. I came up with all of them so I know all of them. What I’d like is a system to dynamically generate the chains. I’m fine with keeping them stored in the json file. What I’d like is a python script that manages a graph representing word pairs that allows me to easily punch in word pairs on their own instead of in predetermined chains. Then the script would process the graph and spit out a number of chains into the json file. The json file is read by the front end and a chain is picked to use for each round.

That’s the dream. I don’t know if I’ll be able to finish it by this week. I want to be done after this weekend so I can shift my attention back to Train. If I can get it done, I can easily see myself punching in word pairs in my down time and generating more and more word chains.

Word Chain Dev Log #2

This is a continuation of a previous post.

Today I spent some nice vacation time playing Crusader Kings 2 and buying some new board games with my mom. We picked up Forbidden Desert and Carcassonne. We played Forbidden Desert today and it was great!

Today the game started looking like this:

Word Chain Game Screen Day 1

And ended looking pretty similar:

Word Chain Game Screen Day 2

That’s the controller screen in a separate window on the right. Ideally you’d see that on your phone.

I spent a few hours getting the turns to work in Word Chain. Yesterday I made the functionality for disabling and enabling parts of the controller so some of my work today was trying to manage those calls so that only one player has an enabled screen at a time and they still get buttons first and then the word choice afterward.

The more difficult task is managing player additions and removals. HappyFunTimes is designed so that players can leave or join at any time so I’m trying to support that. The tough cases are what to do when the current player leaves and how to add new players during the game.

My solution to the first problem is to just pass control on to the next player if the current player leaves.

My current solution to the second issue is to just push the new player onto the end of the list of players. This leads to inconsistent behavior, though. In some circumstances the new player could be up next. In other circumstances, the new player would have to wait for everyone else to go before their turn arrives. From a design perspective I’m not sure yet what I want the new player joining behavior to be. I know I don’t like putting them on the spot as soon as they join, but making them wait for all the other players to go first could be a while. Maybe I could find a sort of midpoint and insert them there.
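The midpoint idea could be sketched like this. The game itself is written in Javascript, so this Python is only an illustration of the logic, and the function name and the rule for picking the midpoint are assumptions rather than anything I have implemented.

```python
def insert_new_player(players, current_index, new_player):
    """Return a new turn order with new_player placed about halfway through.

    The returned list starts with the current player, so index 0 is always
    whoever is playing right now.
    """
    if not players:
        return [new_player]
    # Rotate so the current player comes first, which makes "halfway" easy to find.
    order = players[current_index:] + players[:current_index]
    midpoint = len(order) // 2 + 1
    return order[:midpoint] + [new_player] + order[midpoint:]


players = ["Ann", "Bob", "Cat", "Dan"]
print(insert_new_player(players, 2, "Eve"))  # it is currently Cat's turn
```

With four players and Cat currently up, Eve ends up after Dan and Ann but before Bob, so she is neither put on the spot immediately nor stuck waiting for a full round.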

From here on I have most of the functionality that I wanted in the front end of the game. Players can join and play the game. Their turns work. They get points for guessing words. After a set of words is completed a new one is chosen.

Now I want to get to more of the details and polish. First I’d like to work on enhancing the feel of turns.

I pulled out the code for the avatars from JumpJump today. I like the idea of associating each player with a random color. Currently the controller backgrounds are all the same pleasing purple color. I want to change that to the player’s color. I also want to show the player’s color in the score lines some way. I might replace the duck avatars with something like a square of color. I may just change the text color.

I’d also like to add some stuff to make the current player feel more special. I think they should get a star by their name in the score lines at least. I looked quickly into making their phone vibrate but it looks like there is no easy to use commonly supported api for phone vibration from javascript. A flashing background could work in its place.

Those features are what I want to conquer tomorrow.

Word Chain Dev Log #1

This year I’m taking a week to rest between finishing up Spring quarter at DePaul and starting my second internship at NetherRealm. I’ve been dying to make a game I’m calling Word Chain and now I finally have some time to do it. It’s based on the game show Chain Reaction and has an interface similar to Fibbage. I started a bit of work on it last week during finals but today is the first day of more concentrated work.

I’m using a library called HappyFunTimes and it takes care of getting the game onto players’ phones or laptops. I’ve been picking up Javascript, HTML and CSS as I go, spending most of my time in Javascript. I’m not concentrating on writing good code. I’m treating this like a game jam. My main goal is to have a fun playable game at the end of the week.

At the beginning of today the game looked like this:

Word Chain Game Screen 0

And at the end of today it looks like this:

Word Chain Game Screen Day 1

After today the player can select to add a letter to the top or bottom word and then guess that word, very similar to Chain Reaction. The hardest things I did today were getting the game to read in word chains from a json file and getting everything to work right with being able to play on the top or bottom words.

Tomorrow I’ll be aiming to take care of managing whose turn it is and how to deal with players leaving and joining. HappyFunTimes games are designed so that any number of players is supported and players can leave and join at any time so it may be a bit of a challenge.