CYBORG MUSIC: Digital Voices and The Rise of the Virtual Pop Star

Photo Credit: http://time.com/author/daniel-daddario/page/14/

How does the voice survive in the digital 21st century, in a period when voices can be digitally synthesized with a convincing degree of humanness and when the recording machine is no longer a dedicated technology but a complex processing device that can rapidly generate a huge quantity and variety of sonic material from a single vocal syllable without demanding physical effort?

-David Toop, “Sound Body: The Ghost of a Program”

Singing is a fundamentally embodied act. It requires focus, attention to breathing, and awareness of the diaphragm and vocal cords. It relies on the synchronicity of many biological components, which coalesce into the expressive and wondrous organ that is the voice. Singing, and even speaking, presupposes an awareness that is a direct result of inhabiting a body. So when digital voices began to crop up more frequently (announcing arrivals and destinations on trains and planes, speaking directions from cash registers and ATMs, sounding from devices in our pockets, and even representing autonomous entities in popular films like 2001: A Space Odyssey or Her), I had to ask myself: What are digital voices? How are they made, and what does it mean that we have begun to deny the embodied act of vocalization in favor of something that in comparison seems soulless? What does it mean for contemporary urban cultures to have largely shifted the burden of communication to digital platforms? Digital voices are more precise, more efficient, and have more information readily available to back up whatever they say. They have already begun to replace human bank tellers, cashiers, and secretaries, but what happens when they begin to stand in for our celebrities, on stage with our idols, and generally among the people to whom we grant huge amounts of cultural capital? In 2006, Madonna famously took to the stage during the Grammys, weaving between animated holograms depicting the singing and instrument-wielding cartoon personas of the British virtual band Gorillaz. In 2012, a hologram of the late Tupac Shakur appeared on stage with Snoop Dogg and Dr. Dre at Coachella, speaking words that Tupac himself could never have said while still alive. Examples of the digital voice gaining power abound in popular music, which, as a self-aware, sound-driven medium, is a good place to begin dissecting the digital voice.

Our contemporary soundscape is crawling with an ever-increasing number of digitally created, or digitally rendered, voices. If you haven’t recognized their prevalence, now is a good time to start. Begin by tuning into the radio: ask who is singing, and try to discern what means, both digital and physical, were used to achieve the sound of that voice. We readily recognize some of the more obvious signifiers of digital manipulation, like the famed plug-in Auto-Tune, but far more digital manipulations go ‘unheard,’ as they very accurately imitate the physical apparatuses that give rise to effects like reverb, echo, and even the elusive ‘warmth.’ The voice is yet another physical apparatus that has itself been synthesized with a surprising degree of accuracy.

The leading vocal synthesis product today is Yamaha’s Vocaloid, a complex engine developed in the early 2000s that extracts phonemes, the smallest units of vocalized sound, from human vocal samples. By concatenating those phonemes and smoothing over the rough seams between them, the engine creates an entirely new voice that is able to speak any combination of letters in the target language. Vocaloid was first developed for producers who don’t sing and can’t afford to hire singers for all of their tracks, but the brilliant tool inspired an entirely new aesthetic appreciation, a fascination with a sound that is obviously cyborg-esque, yet chillingly human. Certain qualities, such as being too quantized and too pitch-perfect, mark Vocaloid voices as obviously non-human replicas of the voice. But leave it to the Internet and file-sharing to breed an entire subculture that, through its musical exploration of this new cyborg tool, unearths something ultra-human, revealing or highlighting many unexplored qualities lying dormant in the acoustic human voice that we had previously taken for granted or ignored entirely.
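For readers curious about what “concatenating and smoothing” might look like in practice, here is a minimal sketch. It is not Yamaha’s algorithm, which the discussion above does not detail; it only illustrates the general idea of joining short sound units and crossfading their seams. The sine tones are stand-ins for recorded phonemes, and the names sine_unit and crossfade_concat are hypothetical, invented for this illustration.

```python
import math


def sine_unit(freq_hz, n_samples, sample_rate=8000):
    """Stand-in for a recorded phoneme: a short sine tone (an assumption)."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


def crossfade_concat(units, overlap):
    """Join sample lists end to end, blending `overlap` samples at each seam."""
    out = list(units[0])
    for unit in units[1:]:
        if overlap > len(out) or overlap > len(unit):
            raise ValueError("overlap is longer than a unit")
        for i in range(overlap):
            # The weight ramps from the outgoing unit toward the incoming one.
            w = (i + 1) / (overlap + 1)
            idx = len(out) - overlap + i
            out[idx] = (1 - w) * out[idx] + w * unit[i]
        # Append whatever part of the incoming unit was not blended.
        out.extend(unit[overlap:])
    return out


# Three hypothetical "phonemes" of 400 samples each, joined with 80-sample seams.
units = [sine_unit(220, 400), sine_unit(330, 400), sine_unit(262, 400)]
voice = crossfade_concat(units, overlap=80)
print(len(voice))
```

With an overlap of zero the function reduces to plain concatenation, abrupt seams and all; the crossfade is the crude equivalent of the “smoothing” step.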

Vocaloids’ popularity is partially due to Crypton Future Media, a music technology company that released the Character Vocal Series of Vocaloids in 2007. In this series, for the first time, individually sold Vocaloid “voices,” commonly referred to as lower-case “vocaloids,” were given humanlike anime representations, complete with characteristics like age, gender, and voice type. The characters were even assigned musical preferences like favorite genre and tempo.

From there, an “open-source culture” erupted around vocaloids. The term is borrowed from a computer programming ideal whereby software code is openly released so that users can edit it and, in turn, release their updated code. The idea is that the software will continue to develop according to its users’ needs. In the case of Hatsune Miku, the first character of the series, whose name loosely translates to ‘first sound of the future,’ a user network of millions could now collaboratively contribute to her body of work. Their contributions consisted not only of songs but also of music videos made with a fan-developed software called MikuMikuDance, which takes its name from Miku and allows users to create “motion data” for models of vocaloid characters and share it with other users.

Miku became a global star. In 2009 she took the stage before a live audience for the first time in the form of a programmed, lip-syncing hologram accompanied by a band of live, human performers. It is this juxtaposition that makes the performance event so fascinating: the digital voice, physicalized in space between two very real masses of humans, no longer exists in a separate, digital landscape but moves freely through and interacts with our physical world. It suggests that digital voices can do more than offer an aesthetic alternative to, or something more efficient than, the human voice. It suggests that digital voices can enact physical change, become role models, even rally a fanbase. In “The Waldo Moment,” the third episode of the second series of the British television show Black Mirror, a bodiless personality called Waldo runs for office and takes second place. Waldo is voiced by a human and animated digitally, but a similar current runs through the story: a questioning of which roles must continue to be filled by humans and which can be delegated to machines. If a certain ubiquity makes animated characters more appealing icons to the public, perhaps a digital voice does have a shot at being a figurehead under which to organize social progress, create meaningful change, and inspire art. However, and perhaps tellingly, Black Mirror paints a darker picture of the future as mediated by machines.

Some will argue that Crypton Future Media’s innovation lends itself to the creation of an ideal pop star, one who can meet and transcend the demands of a popular performer without being subjected to the scrutiny that so many celebrities experience. But if Miku becomes the standard for what pop idols ought to be, we may soon be met with a music culture so bent on the fetishization and objectification of its stars that it begins to lose currency as an artistic form used to express and come to terms with lived experience. One could say that Miku represents a vision of posthumanism, a recently emerging body of theory contending that the entire human experience can be reimagined and articulated in full by machines. Posthumanism traces its beginnings to a general technological trend whereby information is continuously separated from the bodies that carry it and turned into data.

Posthumanism favors the utility of the voice over its emotional capacity, its embodied experience, and its ability to communicate or to resist, and resisting is precisely what Miku can’t do. While she offers the singing-and-dancing image of an idol onto which the public can project their musical ambitions and desires, she can’t resist any representation of herself in media because she has no agency. One could contend that most pop stars are stripped of agency when controlled by major labels, but the interplay between a celebrity’s public and private lives has informed the celebrity image since the notion of celebrity came about. Balancing a desire to propel certain aspects of one’s persona into the spotlight and to hide others away is the constant task of someone in the public eye. Consider how many celebrities have been crushed by their own fame, stepping out into the spotlight only to realize it was too bright and hot, an unsustainable perch. In the case of Hatsune Miku, there is no separate public and private life. There will never be a moment when her fame becomes too much, as her existence is but a simulation that accumulates representations any time her persona is invoked.

Vocaloids and other digital voice tools fall into a lineage of analogue and digital technology, dating back to the late 19th century, that has continuously disembodied the voice in favor of its ability to reach people. Included in this progression are the telephone, the radio, the phonograph, and the microphone. Many electronic musicians will tell you (and I will agree) that the future of music is to be found in digital technology, but the fact that a certain degree of distance from the human is built into electronic music troubles them. Music technology takes the potential of our acoustic instruments and voices and carries it into unexplored regions, in between the discrete pitches of piano keys, or above and below the physical range of the human voice. It opens up our potential to create new sounds infinitely. A developer of Vocaloids, Alex Loscos, contends that a declining number of original melodies producible acoustically has shifted attention to digital tools, and that authorship has been transferred from the artist to the artistic producer, who uses technology to make something unoriginal sound fresh and futuristic.

But what do we do about the anxieties that are stirred? The anxiety that digital voices infiltrating our physical world might stifle the power of the human voice? Hatsune Miku’s image is being propagated scandalously throughout consumer culture without demanding consent, and has already made appearances in Japan’s Playboy as well as various user-created interactive models, in which fans can touch her body or ruffle her skirt. What might the popularity of Miku and other vocaloids communicate to young performers, who are growing up in the moment after holograms have attained celebrity status, and who may decide to emulate them?

One thing we can do is be more conscious of how the voice is employed in the music we make and listen to. Are our favorite artists interacting innovatively with the technology they use, or using it as a crutch to augment some physical characteristic of the voice they feel is lacking? We can accept and even appreciate technology’s infiltration of music production while still being critical. In fact, we have to be, in order for the voice to survive as a resistant, expressive, and embodied agent. From Billie Holiday’s impassioned wailing on “Strange Fruit” to the cold, breathy melodies floating over futuristic instrumentals in the songs of FKA Twigs, the voice has long been employed to resist dominant enclosures. The digital voice can take this role in one of two directions: it can bring the power of the voice to new levels of resistance, or it can mute and disfigure the already-muffled voices of the underrepresented. It is up to the creators and consumers of popular music to promote positive models of virtuality.

Digital voices are a strange and useful new tool. Now it is time to use them and their propagation through the material world to explore questions both brand new and fundamental to the human-technology relationship.
