Captioning + Transcription
The Basics | Captioning, Transcription and Audio Description
Audio Description, captioning, and transcription are vital components of creating accessible artwork and performances, especially for D/deaf people and people who are hard-of-hearing. Studies have also shown that captioning can help reading and comprehension skills, as well as expand vocabulary among children and those learning a second language. Along with many other forms of accessibility, captioning, Audio Description, and transcription benefit a broad audience.
As a professional artist, you will primarily use Audio Description, transcription, and captioning for audio and video artworks, as well as for digital documentation of performance art and artist talks. (Live performances and artist talks can also include captions, usually CART, as well as Audio Description and live image description. See more in the Community and Audience Engagement section.)
As with other accessibility features, we encourage you to view this as an expansion or accompaniment to your artwork. These additional layers can enrich your artwork further and welcome a broader audience to engage with your time-based artworks, performances and artist talks.
Captioning (and Subtitles)
Captioning
Captions are time-synchronized text of the audio content and include non-speech elements like music and noises (versus subtitles, which include only language-based audio content). They can include spoken dialogue with identification of who is speaking and non-speech sounds that are important for understanding the content. Captions are an alternative to audio content primarily for people who are D/deaf or hard-of-hearing. However, many other people find them helpful, including some people who are neurodivergent, have sensory-processing differences, or have cognitive disabilities.
Captions vs Subtitles
Captions assume the viewer cannot hear. They are time-synchronized text of the audio content and include non-speech elements like music and noises.
Subtitles assume the viewer can hear but doesn’t understand the language. Subtitles translate the audio into another language and don’t include non-speech elements.
Closed Captions (CC) vs. Open Captions
Open Captions are embedded (burned) into the video and cannot be turned off. Open captions are best used when the video/media player is unable to support attaching a closed caption (CC) file (e.g., Instagram videos).
Closed Captions are provided as a separate file in addition to the original video/media file. This allows the user to have more control as they can turn the captions on and off as needed/desired. Additionally, when the video/media is being played on an accessible platform, attaching a CC file (instead of embedding open captions) allows the user to format the captions to best suit their own needs.
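To make the idea of a separate caption file concrete, here is a minimal sketch in Python that builds the text of a WebVTT file (a common closed caption format with the .vtt extension). The Cue class, the function names, and the example cues are all hypothetical and invented for illustration. Captioning software will normally produce this kind of file for you, and you should check which caption format your hosting platform accepts.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str     # caption text; may contain a newline for a second line


def vtt_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_vtt(cues: list[Cue]) -> str:
    """Return the full text of a WebVTT caption file for the given cues."""
    blocks = ["WEBVTT"]  # required first line of every WebVTT file
    for cue in cues:
        timing = f"{vtt_timestamp(cue.start)} --> {vtt_timestamp(cue.end)}"
        blocks.append(f"{timing}\n{cue.text}")
    # Blocks in a WebVTT file are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


# Hypothetical cues, invented for illustration (not taken from any artwork).
# Non-speech sound goes in brackets; the speaker is identified by a label.
example_cues = [
    Cue(0.0, 2.5, "[soft humming]"),
    Cue(2.5, 6.0, "PERFORMER 1: It's gonna rain later,\nya know."),
]

vtt_text = build_vtt(example_cues)
print(vtt_text)
```

Saving vtt_text to a file such as captions.vtt (a hypothetical filename) gives you a closed caption file that can be uploaded alongside the video on platforms that support WebVTT.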
Transcription
Transcript
A Transcript is a text document that serves as a word-for-word record of the spoken narration or dialogue and of the non-speech audio information needed to understand the content. Transcripts function as an alternative to auditory information for people who are D/deaf or hard-of-hearing, as well as some folks who have cognitive disabilities and/or sensory-processing differences.
Descriptive Transcript
A Descriptive Transcript is an extended version of a basic transcript that includes the visual information needed to understand video-only and audio-video content. When a descriptive transcript is provided, it is unnecessary to create a separate basic transcript. Descriptive transcripts generally assist people who are blind, low-vision, D/deaf, and/or hard-of-hearing.
The most accessible option is to provide closed captions (CC) and a descriptive transcript whenever possible.
Guidelines for Captioning and Transcription
Video work should include closed captions, a transcript, and a video description.
Audio work should include a transcript (if spoken language is a part of the piece) and a description of the audio.
Closed captioning is preferable so that those who require captioning can tailor the format of the captions to their specific needs.
Test your caption file on multiple platforms. Research which caption file types work with the platform that is hosting your video.
Videos that have no audio do not need closed captions, but the lack of audio must be noted in the video description (e.g. “video has no audio”). If the video was done in silence but the ambient noises haven’t been removed, use the term “in silence” at the beginning of the video description.
Captions should be written as closely as possible to the person’s actual speech (e.g. “ya know” or “gonna”).
Captions should appear at a comfortable pace that is easy to read, and the font size and color should be easily legible on screen.
Text should be broken into small chunks to avoid a “wall of text” that readers may not be able to process. A good rule is roughly two lines of text on screen at a time.
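The two-line rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 32-character line limit is an assumed value, not a figure from this guide, and the function name and sample sentence are hypothetical. Adjust the limit for your font size and platform, and always review the breaks by eye so that phrases are not split awkwardly.

```python
import textwrap

MAX_CHARS_PER_LINE = 32    # assumed limit; adjust for your font and platform
MAX_LINES_PER_CAPTION = 2  # "roughly two lines of text on screen at a time"


def chunk_captions(text: str) -> list[str]:
    """Split text into caption chunks of at most two short lines each."""
    # Wrap the text into lines no longer than the character limit,
    # breaking only between words.
    lines = textwrap.wrap(text, width=MAX_CHARS_PER_LINE)
    chunks = []
    # Group the wrapped lines two at a time; each group is one caption.
    for i in range(0, len(lines), MAX_LINES_PER_CAPTION):
        chunks.append("\n".join(lines[i:i + MAX_LINES_PER_CAPTION]))
    return chunks


# Hypothetical sample speech, written as spoken ("ya know", "gonna").
sample = (
    "Ya know, I think it's gonna rain later today, "
    "so I brought the plants inside and closed all the windows."
)

for chunk in chunk_captions(sample):
    print(chunk)
    print("---")  # marks where one caption ends and the next begins
```

Each printed chunk is one caption that would appear on screen by itself. Timing each chunk to the speech is a separate step that captioning software usually handles.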
Examples
Example | Video Art with Sounds
Captioning, Transcript, Video Description
bits of self, at once, in fragments | Akari Komura and Hannah Marcus | 2021 | Video | 7 min 14 sec
View Transcript | bits of self, at once, in fragments
Artwork Statement | bits of self, at once, in fragments. 2021. Live performance & digital media work on Zoom between a bedroom in Ann Arbor, MI and a basement in Chicago, IL. Two human beings, three cameras, one handheld mirror.
Video Description | The video unfolds over Zoom in real time. An Asian female with shoulder-length black hair carries a colorfully ornamented, handheld mirror. She is wearing a red sweater and sits in her room. The mirror in hand reflects a scaled down image of another seated performer, who is a white female with long, wavy brown hair, wearing a buttoned down denim shirt and jeans.
The performer in denim initiates a casual conversation with herself about the weather, as the first performer chimes in intermittently with sounds of acknowledgement, sans words. She moves away from her seated position, and an aerial shot replaces the frontal vantage point reflected in the mirror. The first performer begins to speak in fragmented sentences sourced from the original one-way conversation. She overlaps her consonant sounds and transforms her words into melodic phrases. A drone sound made up of the performer’s humming lingers in the background for the entire piece, entering softly and swelling in intensity.
The second performer eases into gestural movement of the upper body that begins in fidgets and eventually grows into limbs escaping outwards. The movement falls into repetitive loops that take her down to the ground to explore edges of her space. The final moment of the piece leaves her lower legs in the frame, still and limp. | Video as described by Hannah Marcus and Akari Komura |
Video Still Alt Text | A Japanese woman wearing headphones holds a small round, bejeweled mirror in front of her face, pointed out at the viewer. Reflected in the mirror is the slightly blurred image of a young white woman.
Learn More
Creative Approaches to Captioning and Transcription
Emily Watlington Writes About Creative Captioning
Critical Creative Corrective Cacophonous Comical: Closed Captions | Emily Watlington | Mousse Magazine: In this 2019 article, Emily Watlington, a writer, curator, and arts critic who focuses on feminism and disability justice, highlights several artists who are exploring closed captioning within their art practice. Watlington writes about projects by Christine Sun Kim, Carolyn Lazard, Joseph Grigely and Liza Sylvestre, who have all created artwork that engages with captioning as both an accessibility feature and an art material. (Also, note the image descriptions within the article.)
Resources about Captioning and Transcription
These articles and guides are fantastic entry points into captioning and transcription:
DaVinci Resolve | If you are looking for free software to add captioning, this is a good option.
Seattle University | Guidelines and Best Practices for Video Captioning