Or an attempt at one. After my research this week, I decided to make a simple app to test the experience of turning texted words into sounds. Nothing fancy. I just wanted a sketch that emits a different sound for each word submitted and, if time allowed, to integrate that into a simple chatroom using WebSockets.
I quickly realized that I knew next to nothing about making and working with sound in the browser. I taught myself about synths, envelopes, frequencies, octaves, amplitudes, and more. Remembering that the music peeps on the floor like Tone.js, I found some starter sketches on The Code of Music to help me with that library—thank you, Luisa!
Though I very much enjoy technical challenges, working with musical terminology and coding specific to this area took longer than I expected.
Currently my sketch can:
Receive words through an input field, count those words, and print the count to the console. Eventually this count will be passed into a function that emits the same number of tones.
Until then, a mouse press triggers three tones at random frequencies, each with a different duration, played in sequence without looping.
…but it only works on the first press; after that, I hear nothing, even though the console shows the function executing properly each time.
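The word-counting step above could look something like this. A minimal sketch; the function name and the commented-out input wiring are my own assumptions, not my actual code:

```javascript
// Minimal sketch of the word-counting step (names are assumptions).
// A "word" here is any run of non-space characters.
function countWords(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Hypothetical wiring: count words whenever the input changes.
// document.querySelector("#message").addEventListener("input", (e) => {
//   console.log(countWords(e.target.value)); // eventually: emit this many tones
// });
```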
My best guess right now is that I need to use objects: with a class, I could create a new AudioContext on every mouse press (or, eventually, whenever new words are submitted). I need to spend some time learning more about the Web Audio API and talk with music people.
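One caveat I've since learned: browsers limit how many AudioContexts a page can create, and an OscillatorNode can only be start()ed once, which might explain the one-press-then-silence behavior. A hedged sketch of the opposite pattern, reuse one context, build a fresh oscillator per press; the class and method names are my own, not an established API:

```javascript
// Sketch (an assumption, not the confirmed fix): share one AudioContext
// and create a brand-new OscillatorNode on every press, since each
// oscillator can only be started once.
class TonePlayer {
  // ContextClass is injectable for testing; in the browser it defaults
  // to the real AudioContext constructor.
  constructor(ContextClass = window.AudioContext) {
    this.ContextClass = ContextClass;
    this.ctx = null; // created lazily, after the first user gesture
  }

  // Play one tone; a fresh oscillator is built on every call.
  playTone(frequencyHz, durationSec) {
    if (!this.ctx) this.ctx = new this.ContextClass();
    const osc = this.ctx.createOscillator();
    osc.frequency.value = frequencyHz;
    osc.connect(this.ctx.destination);
    osc.start();
    osc.stop(this.ctx.currentTime + durationSec);
  }
}
```

Creating the context lazily matters because browsers refuse to start audio before a user gesture like a click.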
Most important, here are the questions that this process raised for me:
Um, do I really want to work with material and methods with which I’ve had very little experience so far (natural language processing and sound)?
How will participants know that the sounds are from their texts? They need immediate feedback from their actions to care. I found a video interview with Werthein in which he describes the development of the Samba Surdo project. He noted that people with sight wanted to see where the sound was originating.
It’s one thing to generate random sounds from words, but if it’s always random, then there is no meaning in the sonic translation. What if, for each new word, the associated randomly generated sound was saved, and a participant constructs a new audible language—an abstract sound lexicon—exploring and learning as they go?
Does each new participant have their own “animal” sounds? I suppose I can represent this by placing users in different frequency ranges for now.
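A crude sketch of the frequency-range idea: assign each participant a band and pick their tones from inside it. The band width, base frequency, and names here are arbitrary assumptions for illustration:

```javascript
// Hypothetical: each participant gets their own frequency band so their
// "voice" is recognizable. All constants are placeholder assumptions.
const BASE_FREQ = 220;  // Hz, bottom of the lowest band
const BAND_WIDTH = 200; // Hz allotted to each participant

function bandFor(userIndex) {
  const low = BASE_FREQ + userIndex * BAND_WIDTH;
  return { low, high: low + BAND_WIDTH };
}

// Pick a random frequency inside a participant's band.
function randomToneFor(userIndex) {
  const { low, high } = bandFor(userIndex);
  return low + Math.random() * (high - low);
}
```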
What if participants build a new audible language together?
What if the language is already coded for them, and participants need to figure out their words’ sounds?
Right now, my program recognizes a “word” as any string of characters surrounded by spaces. What’s to keep folks from typing in gibberish? What if they use emojis?
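One possible way to tighten that word definition: Unicode property escapes in regular expressions (supported in modern browsers) can at least separate letter-only tokens from emoji and everything else. The function name and the three categories are my own assumptions:

```javascript
// Hedged sketch: classify a token as a word, an emoji, or other input
// (gibberish, mixed symbols, etc.) using Unicode property escapes.
function classifyToken(token) {
  if (/^\p{L}+$/u.test(token)) return "word";           // letters only
  if (/\p{Extended_Pictographic}/u.test(token)) return "emoji";
  return "other";
}
```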
Speaking to the above research, are sounds somehow related to the meaning of words? If so, how do I make that clear and understandable to participants? And again, what about gibberish submissions?
In general, what’s the story arc of the experience?
How might participants hold conversations if they are unsure how others’ texts are translated? How might I offer expressive possibilities with sound to convey intent and/or meaning?
Finish this chatroom prototype! Visit Luisa during office hours (already scheduled) for assistance and for her advice on moving forward with a sound-related project.