17 Jul 2018 — SpeakUp Editor
Today we are pleased to welcome Brian Whitmer to share his expertise regarding message banking.
Did you watch any of the old Star Trek TV shows, spending an occasional hour with Captain Kirk, Picard, Sisko or Janeway? Do you think you’d recognize the voice of the computer (“Computer, locate Ensign Lynch”, “Ensign Lynch is not aboard the Enterprise”)?
The same actor, Majel Barrett-Roddenberry, actually recorded the computer voice for all Star Trek shows, movies and even video games up until her death in 2008. In the technology sector she is often known as the “original Siri”, and Google’s smart assistant was originally codenamed Project Majel. Interestingly enough, most smart digital assistants still carry the same flat, non-human style of voice that Barrett was best known for.
AAC systems, like smart speakers, come with synthesized speech as the default option, which means that many AAC communicators sound like different flavors of the computer from Star Trek. Synthesized speech is nice because it’s flexible enough to generate any phrase without having to be pre-recorded, but it doesn’t convey emotion well: it is hard to sound friendly, sarcastic, grumpy, or tired. AAC users find workarounds to help express the right tone. Google and Amazon are making progress in this area, but for the time being, synthesized speech is doomed to be a bit, well, robotic.
Voice banking, which lets you record or match voices to create a personalized voice (VocalID, Acapela, ModelTalker), is expensive and has the same limitations and strengths as synthesized speech. However, there is another option called message banking. Message banking, unlike synthesized speech, means storing actual sentences, phrases and stories as audio files recorded in the communicator’s own voice. It’s particularly valuable for individuals with degenerative conditions like ALS or for individuals with context-dependent apraxia. They may record (or “bank”) messages soon after their diagnosis with the intent to use them later. Since these are actual audio recordings, all of the communicator’s personality is preserved, and the result sounds and feels more authentic than synthesized speech output.
Emphasis should be placed on preserving some personality for these individuals. The drawback, obviously, is that you can only say what’s already been recorded. No novel speech is possible with message banking. Therefore, many communicators actually benefit from a hybrid approach that pairs audio-recorded phrases with synthesized output. Recording every possible phrase just isn’t practical. The question then becomes, what do I record?
John Costello of Boston Children’s Hospital helped pioneer message banking. He has developed resources and recommendations for good message banking strategies. Instead of focusing on core words and phrases, Costello suggests focusing on recording the “fringe” expressions and stories that are most personal to the individual. He suggests including jokes, personal catch phrases, sound effects, personal stories, nicknames or pet names. These are the phrases that are connected to personal identity, and have the most to lose if replaced with a synthesized voice. For examples of phrases to bank, see the figure.
When recording audio, it’s important to try to keep the original recording as authentic and high-quality as possible. Costello recommends carrying around a personal recording device like the Zoom H1 to record personal phrases on the fly. It’s also possible to record phrases using a podcasting microphone like the Blue Snowball. Although the microphone is cheaper, it has to be plugged into a computer, so recordings must be made at the computer. To ensure a quality recording, it is important for audio files to be at least 256 kbps.
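For uncompressed WAV recordings, the bit rate follows directly from the sample rate, bit depth, and number of channels, so it is easy to check a file against the 256 kbps minimum mentioned above. The following is a minimal sketch, assuming the recordings are saved as PCM WAV files; the function names `wav_kbps` and `meets_minimum` are made up for illustration and are not part of any AAC tool.

```python
import wave

MIN_KBPS = 256  # minimum bit rate suggested in the text above


def wav_kbps(path):
    """Return the bit rate of an uncompressed PCM WAV file in kilobits per second."""
    with wave.open(path, "rb") as wav:
        bits_per_second = (
            wav.getframerate()        # samples per second
            * wav.getsampwidth() * 8  # bits per sample
            * wav.getnchannels()      # 1 = mono, 2 = stereo
        )
    return bits_per_second / 1000


def meets_minimum(path, min_kbps=MIN_KBPS):
    """Return True if the WAV file is at or above the minimum bit rate."""
    return wav_kbps(path) >= min_kbps
```

As a worked example, a mono recording at 44.1 kHz and 16 bits per sample comes to 44,100 × 16 × 1 = 705,600 bits per second, or 705.6 kbps, which is comfortably above the minimum.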
When recording files you can either record the sound clips individually, or record them as large audio files and split them up using a free tool like Audacity. Either way, once you have the individual files, you can upload them to a number of different AAC systems from providers like Smartbox, CoughDrop and Tobii Dynavox. Message banking tools are getting easier and easier to use.
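Audacity lets you do the splitting by hand. If you prefer a script, the idea can be sketched as follows, assuming the long recording is a WAV file and that you have already noted the start and end time of each phrase. The function name `split_wav` and the clip names are hypothetical, chosen only for this example.

```python
import wave
from pathlib import Path


def split_wav(source, clips, out_dir):
    """Write one WAV file per (name, start_seconds, end_seconds) entry in clips."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(source), "rb") as src:
        rate = src.getframerate()
        total = src.getnframes()
        for name, start, end in clips:
            # Convert times to frame positions, clamped to the file length.
            first = min(int(start * rate), total)
            last = min(int(end * rate), total)
            src.setpos(first)
            frames = src.readframes(max(last - first, 0))
            target = out_dir / f"{name}.wav"
            with wave.open(str(target), "wb") as dst:
                # Keep the original audio format so no quality is lost.
                dst.setnchannels(src.getnchannels())
                dst.setsampwidth(src.getsampwidth())
                dst.setframerate(rate)
                dst.writeframes(frames)
            written.append(target)
    return written
```

For example, `split_wav("session.wav", [("good_morning", 0.0, 1.8), ("favorite_joke", 2.5, 9.0)], "clips")` would write `clips/good_morning.wav` and `clips/favorite_joke.wav`, ready to upload to an AAC system.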
Recently, Costello and his team have been working on a cross-platform collaboration called mymessagebanking.com to make it easier to manage and upload message banks over time using tools like Tobii Dynavox’s Message Banking tool, Smartbox’s Grid app, or CoughDrop’s Recordings tool. It’s exciting to see the progress in this area.
Both message banking and synthesized speech output have their strengths and weaknesses. Neither can ever completely replace a communicator’s own voice. Maybe someday, when we surpass Star Trek’s technology, we’ll get something that advanced, but in the meantime both can make a powerful difference in helping communicators make their voices heard.
Brian Whitmer is a computer programmer with extensive experience in human-computer interaction who graduated with a master’s degree in computer science from Brigham Young University. He co-created the learning management system Canvas, used by hundreds of districts and universities nationwide. In an effort to improve communication for his daughter, who has Rett syndrome and cannot speak, Brian worked with dozens of speech professionals to design and build the cross-platform augmentative communication app CoughDrop. CoughDrop is now used around the world to give voice to people with communication struggles.
Brian is employed by and is the founder of CoughDrop AAC, an AAC application that is mentioned and discussed in this post.
Jill E Senner, PhD, CCC-SLP