In a corner of the Digital Imaging Lab in the basement of UC Berkeley’s Moffitt Library, recent graduate Olivia Dill is checking on the latest shipment of fragile wax recordings from the Phoebe A. Hearst Museum of Anthropology. These hard wax tubes, invented by Thomas Edison in the 1880s, are one of the earliest sound recording media.
Dill is one of several part-time staffers tasked with scanning 2,713 of these rare and fragile items, made more than 100 years ago by the renowned anthropologist Alfred Kroeber. Taking a newly invented portable recorder into the field, Kroeber was able to capture the linguistic and cultural diversity of California’s Native American community, recording languages now threatened with extinction. The Hearst Museum’s wax cylinder holdings of native speakers are the third largest in the country. Only the Library of Congress and Indiana University’s archives have more.
The $450,000 digitization project is now one year into its three-year timeline. Leading the team is a quartet of researchers funded by the National Science Foundation, the National Endowment for the Humanities, and UC Berkeley. The co-investigators are UC Berkeley linguistics professor Andrew Garrett; Erik Mitchell, associate university librarian and director of digital initiatives and publication services; Ira Jacknis, The Hearst’s head of research and publications; and Dr. Carl Haber, a particle physicist at the Lawrence Berkeley National Laboratory.
“Without Carl,” laughs Mitchell, “the whole thing doesn’t happen.”
Dill, in fact, is soon joined by Haber. He’s stopped by for an hour before heading to UC Davis for another speaking engagement. Haber built the scanning machine, dubbed IRENE (which stands for “Image, Reconstruct, Erase Noise, Etc.”), with his Berkeley Lab colleagues. In 2013, the effort earned Haber a MacArthur “Genius” fellowship for extrapolating concepts from particle physics to create imaging technology capable of reproducing sound recordings without physical contact.
Dill and Haber examine the five wax cylinders that have completed an overnight cycle, threaded along a long metal spindle. For 24 hours, these cylinders have been exposed to an ultra-high-resolution scanner. An optical beam methodically probes each cylinder’s original sound grooves, capturing with each pass a single pixel’s image, less than a micron’s width of the wax cylinder’s surface. The resulting data files, opened on a computer screen, show only a tight group of squiggly lines. But fed through conversion software, the data represent the depth, pitch, tone, and amplitude of sound waves, unfolding to replicate the audio wave pattern of the original wax recording.
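For readers curious how a depth measurement can become a sound wave, here is a minimal sketch in Python. It is not IRENE’s actual software. It assumes, purely for illustration, that the scan has already been reduced to a list of groove-depth readings along the groove, and that the audio wave simply follows the rise and fall of that depth. The function name and the synthetic data are hypothetical.

```python
import math


def depth_profile_to_waveform(depths):
    """Center groove-depth samples on zero and scale them to the range [-1, 1].

    Illustrative assumption: the sound wave tracks the groove depth directly.
    """
    if not depths:
        return []
    mean = sum(depths) / len(depths)
    centered = [d - mean for d in depths]  # remove the constant groove depth
    peak = max(abs(c) for c in centered)
    if peak == 0:
        return [0.0] * len(depths)  # a perfectly flat groove is silence
    return [c / peak for c in centered]


# Synthetic groove (made-up numbers): a 10-micron baseline depth
# with a 2-micron sinusoidal wobble standing in for recorded sound.
synthetic = [10.0 + 2.0 * math.sin(2 * math.pi * i / 16) for i in range(64)]
waveform = depth_profile_to_waveform(synthetic)
```

The real conversion has far more to handle, including surface damage, tracking the groove across the cylinder, and noise removal, none of which this sketch attempts.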
“Basically,” Haber says, “we’re making sure technically that we’re capturing the content. Again, what IRENE does is take an object, transforming it into images, and extract sound.” He cautions me not to touch the workbench. “Even the lightest vibration will alter the process,” he says. “That spindle has to be absolutely stable. No tremors or bounce.”
On her computer, Dill pulls up a graphic image of a scan. “For me what’s surprising is finding the occasional variation,” she says. “Thirty or forty scans will look exactly the same. But then comes a change in the general shaping of the groove.” Dill points to an area where the lines close in as the recording progresses. “That’s evidence of something going wrong. A lead screw in the recording device has come loose. Or you can see where it’s too loud—the force of someone’s voice was so close to the diaphragm, it blew the needle out of the wax. I can see it bouncing across the cylinder’s surface.”
According to Moffitt Library’s Erik Mitchell, each wax cylinder costs $166 on average to scan. When he first saw Haber demonstrate IRENE at a 2015 conference at the Library of Congress, “my jaw dropped,” he says. Going on to visit the Smithsonian exhibit “Hear My Voice,” which used IRENE technology to reproduce Alexander Graham Bell’s voice from his earliest experiments, “was also very impressive.” Adds Mitchell, “Recording technology represents such a seismic shift in cultural preservation. Sound is so transient. IRENE increases the durability of these fragile objects.”
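The $166 average lines up with the project’s headline numbers. A quick check, assuming the figure is simply the $450,000 budget divided by the 2,713 cylinders (the project team does not say how it was calculated):

```python
# Figures quoted in the article; the division itself is an assumption
# about how the per-cylinder average was derived.
budget_dollars = 450_000
cylinders = 2_713

cost_per_cylinder = budget_dollars / cylinders
print(f"${cost_per_cylinder:.2f} per cylinder")  # about $165.87, which rounds to $166
```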
Mitchell ticks off several milestones as the three-year project completes its first year. “First,” he says, “we’ve become more efficient in scanning, making sure we hit our production target phase. Next year, we hope to scan eight cylinders a day.”
Saving Endangered Languages
In his book-lined Dwinelle Hall office, Garrett, who also chairs UC Berkeley’s department of linguistics, moves from desk to computer to show me the public web portal to the California Language Archive. This is one way The Hearst’s digitized cylinders will be made available to the public.
Garrett, who sports a bold purple streak in his snow-white hair, says the CLA contains 14,500 recordings of 265 different languages, plus another 5,296 online items, including transcriptions of speeches, field notes, and other materials collected by graduate students.
He notes that the Moffitt team has conducted 650 scans of the total 2,713 wax cylinders. Completed material includes the Wiyot, Ohlone, Wailaki, Salinan, Coast Miwok, Yuki, Yahi, and Yurok languages, 8 of the 45 languages in the collection. Garrett, whose work focuses on the Yurok and Karuk languages, plays a snippet uploaded to the CLA database. A woman’s crackly speaking voice from a hundred years ago fills the room. At the end, there is a faint, mumbled conversation.
“This is what’s frustrating and impressive about the IRENE scans,” says Garrett. “I remember listening to a previous version of this recording on cassette tape. I never heard that last bit before. The IRENE technology captures so much of the original audio signal, we want every second to be crisp. But we are still limited by the initial technology they used in the field.”
Alfred L. Kroeber made these field recordings beginning in late 1901, one month after being named curator of the newly formed Phoebe A. Hearst Museum of Anthropology. As the first Ph.D. student of famed Columbia University anthropologist Franz Boas, Kroeber continued his mentor’s technical approach to ethnography. Using Edison’s portable recording equipment, Kroeber and his colleagues recorded California’s Native American tribes as part of an ongoing survey that lasted until 1938. At the time, California held the greatest concentration of native tribes in North America, making it the continent’s most linguistically rich region, with 90 different native languages.
“Another unique element about the Hearst collection,” says Garrett, “is that Kroeber conducted spoken-word recordings, not just songs.” Although 75 percent of The Hearst’s 2,713 wax cylinders are songs, there are also funeral oratories, public speeches and ceremonies, oral myths and storytelling. Because each cylinder can hold only three minutes of sound, Kroeber’s use of multiple cylinders to record one story was innovative.
Scholars have always known about this collection, according to Jacknis, the Hearst Museum’s curator. Even the public was aware of certain highlights, such as the nearly six hours of recordings of Ishi, California’s last surviving Yahi Indian. That material formed the basis of several books, Hollywood films, documentaries, and even TV movies. Additionally, the entire wax cylinder collection was transferred to reel-to-reel tape in 1975.
Using tape cassettes made from those remastered reels in the early 1980s, ethnomusicologist Dr. Richard Keeling created the California Indian Music Project. Wider outreach continued with the work of Keeling’s then-graduate student, Dr. Lee Davis, professor emeritus at UCSF and former head of its California Studies Program, who launched the California Indian Library Collection. This outreach program distributed these cassettes to native tribes and educational centers. “I drove to the most remote areas of the state to distribute this material to tribal communities,” says Dr. Davis by telephone. “Having this available by Internet will be tremendously significant to a population that’s really suffered. In my experience, the more cassettes I handed out, the more requests I got. When people heard these songs or stories, they would collapse in tears. Sometimes, we’d meet with the grandchild of the person singing. It was very moving.”
Garrett, who is planning outreach efforts when the full collection is digitized, says that an additional strength of this new effort is that recordings are embedded in the online catalog. “Unlike a cassette, you can link to images and photos that Kroeber took of the performers. You can read detailed information about when the material was recorded and what the language is.”
The wealth of spoken material, including traditional stories, vocabulary lists, place names, legal texts, and public and funeral oratory, has never circulated. “Given the spiritual and personal nature of some of this material, our plan is to meet with tribal communities first to find out what they want to make public.” Some material will be restricted, requiring permission from tribal elders or teachers to gain access.
The fact that this material spans a century is linguistically significant to those working on language acquisition and reintroduction. “When you have a larger population of native speakers,” says Garrett, “you get a richer, more variegated picture of the language. Even more so than I found doing my own field research in the early 2000s.”
Garrett’s own field research emphasizes how endangered these native languages are. “As late as the 1990s,” he says, “there were still many speakers of 40 to 50 native languages in California. Between 2001, when I started working with the Yurok language, and 2008, the last Yurok First Speaker passed away.”
With the Yurok-related wax cylinders now digitized, Garrett is working on a scholarly publication of the Yurok language and vocabulary to be published by the research arm of UC Berkeley’s linguistics department, the Survey of California and Other Indian Languages. The publication, The Reports, due in 2017, is oriented toward scholars and the Yurok community, one of the largest native groups in California with more than 5,000 members. “We’ll include a CD with the publication,” says Garrett. “All those texts, stories, and anecdotes will be heard in Yurok and with English translations.”
The Long View
Eying the future, Erik Mitchell says the library will examine new ways to extend IRENE’s reach. “How can we leverage IRENE’s capabilities to unlock and preserve other early recording material in University of California or state archives? Because we don’t want to stop after scanning this collection. The project should also be a prototype for public access and ongoing scholarship.” With a note of hope, Mitchell predicts IRENE will remain at Moffitt Library scanning incoming material for another 10 to 20 years.
Barbara Tannenbaum is a journalist and fiction writer based in San Rafael. Her pieces on arts and culture have appeared in The New York Times, San Francisco magazine, the Los Angeles Times, Daily Beast, Salon.com, Craftsmanship Quarterly, and Catamaran Literary Reader.
Posted on January 3, 2017 - 12:44pm