DL Seminar | Saving Languages from Digital Extinction
By Hemanth Kondapalli | MS Student, Cornell Tech
Did you know that linguists expect 70 to 90% of languages to be extinct by the end of the century? Or that a third of the world’s languages have fewer than 1,000 speakers? Is digital technology exacerbating the current linguistic genocide or alleviating it? We talk a lot in society about racial diversity and gender equality, but how important is linguistic diversity to us? At the Digital Life Initiative today, Columbia University researcher Isabelle A. Zaugg gave an excellent talk on her research, titled “Saving Languages from Digital Extinction”. Below is a brief recap of her presentation, followed by an analysis.
Linguistic extinction is expected to have a drastic impact on our future culture and society. As Zaugg notes, losing languages leads to a loss of intergenerational cohesion and, subsequently, of identity and culture. Although modern technology makes it possible to transcribe languages, much if not most of the world’s information is not written down in books or online, and if these languages become extinct there is no way for that information to be passed down. In her talk, Zaugg told a story about the importance of languages thought to be extinct: when Mao Zedong was the leader of China, he asked his scholars to research old archives. In their research, they found a previously unknown remedy for malaria, which went on to save thousands of lives. This practical value, along with the loss of culture, shows how important it is to preserve languages.
Zaugg’s research focuses primarily on how digital technologies are affecting this extinction process. She began with an overview of character encoding in computers. In the 1960s, when computers were first being used for commercial purposes, a standard was developed to represent characters on a computer. Since most computer usage and research was in America, that standard became ASCII, a way to represent the characters used in English. Soon after, however, users realized that ASCII could not represent the scripts of other languages. By the late 1970s, ISO standards had been developed to support most Latin-based scripts. Finally, by the 2000s, Unicode had taken hold as a standard that supports around 150 scripts.
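The gap between these standards can be seen in a few lines of Python. This is only a minimal sketch, not something shown in the talk: the helper `try_encode` and the sample words are assumptions chosen for illustration, and the codec names `ascii`, `latin-1` (ISO 8859-1), and `utf-8` (a Unicode encoding) are the ones built into Python.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or None if the encoding cannot represent the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# Illustrative sample words (assumptions, not taken from the talk).
english = "language"
french = "caf\u00e9"                 # "cafe" with an acute accent on the e
hindi = "\u092d\u093e\u0937\u093e"   # Devanagari, Hindi for "language"

for word in (english, french, hindi):
    for encoding in ("ascii", "latin-1", "utf-8"):
        encoded = try_encode(word, encoding)
        status = "ok" if encoded is not None else "cannot encode"
        print(f"{word!r} as {encoding}: {status}")
```

In this sketch, ASCII handles only the English word, ISO 8859-1 also handles the accented Latin-script word, and only UTF-8 can encode the Devanagari word.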
While having Unicode support a script is a great step, it is not sufficient to make a language accessible on digital platforms. Each language needs “Full Stack Support”, as Zaugg puts it. This means messaging applications with language-specific keyboards (virtual and hardware), dictionaries, spell-check features, fonts, and so on. A lot of this development has to occur in the open source community.
Today, much communication occurs digitally, and the lack of accessibility in digital spheres has held these languages back. Many people take and send pictures of handwritten text to communicate, while others transliterate their script into the English script. Even on websites and profiles that should be predominantly in a native language, a shockingly high percentage of the content is still written in English.
Zaugg presents compelling evidence that digital technologies have made it very difficult for other languages to compete with English in digital communication. But how this translates into a language becoming extinct seems shaky to me. It’s plausible that a language could be minimally used digitally but flourish orally. Languages have been changing for thousands of years, and we may now be in the midst of one more transformation, this time into digitally dominant languages and orally dominant languages. Just as local languages have recently evolved to include more English words (there is no Hindi word for “computer”, for example), we could see the language sphere as a whole evolve into a set of two types of languages.