Flagship

Virtual Audio Workstation

An immersive spatial computing environment for vocal performance and musical expression using eye tracking and gesture-controlled effects

Timeline

September 2020 – Present

Key Technologies

VR · Unity · C# · Ableton Live · visionOS

Vision Pro Live Looper

I'm exploring how augmented and spatial reality tools can help people express themselves more freely than ever before. This "pedalboard in the sky" is my first prototype that feels easier to use than a lot of hardware interfaces I've tried. It's changed the way I want to play and perform, and I'm excited to keep pushing it forward for myself and all of you~ ✨ This live looper was developed for the Apple Vision Pro, with earlier iterations on Meta Quest devices. It's still in its early stages, and I have plenty of ideas for future improvements and new features.

Latest Feature Demo

Vision Pro Demo

Background

The majority of audio effects are controlled primarily through two-dimensional interfaces. This limits both the functionality of the toolset and the embodied connection felt by users living and acting in 3D space.

Objective

Explore a variety of interaction schemas for controlling an audio landscape using the affordances of virtual reality.

Design Phase 1: Prior Art Review

The first step was to analyze the successes and shortcomings of sonic interfaces already being used to actively control audio in virtual reality. After testing a variety of the leading software and interviewing a collection of music producers and game designers, I learned that the majority of these tools relied heavily on mapping 2D interactions onto a 3D scheme and did not take advantage of the affordances VR has to offer.

Design Phase 2: Prototyping Beyond Skeuomorphism

The first prototype was based on an outer space scene, revolving around objects fitted with visual feedback of what was happening in Ableton. For example, rings around the planets labelled "Kick and Snare", "Hats", and "Guitars" would appear and vanish based on whether the player triggered their respective audio clips to play or pause. The large, central black hole orb was synced to grow in discrete steps every measure, and every 8 measures it would play an exploding animation and return to its original, smaller size, giving the performer a sense of where they were in the music's temporal space. The animations on the intergalactic dancers also synced up to the beat of the song being played, which gave users visual feedback of the groove and made the space feel more fun and bright, though their dancing was very genre-specific.
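
As a rough sketch of how this measure-synced feedback could be wired up in Unity, the component below grows an orb each measure, explodes and resets it every 8 measures, and toggles the clip rings. The OnMeasure/OnClipStateChanged hooks, the growth step, and the Animator trigger name are illustrative assumptions about the Ableton bridge, not the project's exact implementation.

```csharp
using UnityEngine;

// Minimal sketch of the measure-synced "black hole" orb and clip rings.
// Assumes some bridge (e.g. an OSC or MIDI listener) calls OnMeasure() once
// per measure and OnClipStateChanged() when an Ableton clip starts or stops;
// these hooks and values are placeholders rather than the project's setup.
public class MeasureSyncedOrb : MonoBehaviour
{
    public float growthPerMeasure = 0.15f; // discrete growth step each measure
    public int measuresPerCycle = 8;       // explode and reset every 8 measures
    public Animator explodeAnimator;       // plays the explosion animation
    public GameObject[] clipRings;         // rings for "Kick and Snare", "Hats", "Guitars", ...

    private Vector3 baseScale;
    private int measureCount;

    void Awake()
    {
        baseScale = transform.localScale;
    }

    // Called once per musical measure by the Ableton bridge (assumed hook).
    public void OnMeasure()
    {
        measureCount++;
        if (measureCount % measuresPerCycle == 0)
        {
            // Every 8 measures: play the explosion and snap back to the base size.
            if (explodeAnimator != null) explodeAnimator.SetTrigger("Explode");
            transform.localScale = baseScale;
        }
        else
        {
            // Grow in discrete steps so the performer can read their place in the phrase.
            transform.localScale += Vector3.one * growthPerMeasure;
        }
    }

    // Called when a labelled clip group is triggered to play or pause.
    public void OnClipStateChanged(int ringIndex, bool isPlaying)
    {
        if (ringIndex >= 0 && ringIndex < clipRings.Length)
            clipRings[ringIndex].SetActive(isPlaying);
    }
}
```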

Design Phase 3: Embodiment through the Affordances of VR

Through user testing and my own research, I learned that this method was essentially a remapping of a 2D DJ interface into 3D. As such, I wanted to reapproach the interaction schema so that the music producer was not tied to the action of "point-and-click", which is essentially a more labor-intensive mouse, and instead have every trackable movement of their body control different parameters of a virtual synthesizer.
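
A minimal sketch of this kind of full-body mapping, assuming a Unity VR rig exposes the head and hand transforms: each frame, normalized parameter values are derived from body geometry and handed off to a placeholder SendParameter, which in practice would be an OSC or MIDI message to the synthesizer. The parameter names and ranges are illustrative, not the project's actual mapping.

```csharp
using UnityEngine;

// Sketch of mapping tracked body movement to synth parameters.
// The tracked transforms come from the VR rig; SendParameter stands in for
// whatever OSC/MIDI bridge forwards values to Ableton. Parameter names and
// ranges here are assumptions for illustration.
public class BodySynthControl : MonoBehaviour
{
    public Transform head;        // HMD
    public Transform leftHand;    // left controller / tracked hand
    public Transform rightHand;   // right controller / tracked hand

    void Update()
    {
        // Right-hand height relative to the head -> filter cutoff (0..1).
        float cutoff = Mathf.InverseLerp(-0.5f, 0.5f, rightHand.position.y - head.position.y);

        // Distance between the hands -> reverb size (0..1).
        float reverb = Mathf.InverseLerp(0.1f, 1.2f, Vector3.Distance(leftHand.position, rightHand.position));

        // Head tilt (roll) -> vibrato depth (0..1).
        float vibrato = Mathf.Abs(Vector3.Dot(head.right, Vector3.up));

        SendParameter("filter_cutoff", cutoff);
        SendParameter("reverb_size", reverb);
        SendParameter("vibrato_depth", vibrato);
    }

    // Placeholder: in practice this would write an OSC or MIDI CC message.
    void SendParameter(string name, float value)
    {
        Debug.Log($"{name} = {value:F2}");
    }
}
```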

Design Phase 4: Improvements on Intuitive Control

The previous design schema allowed a plethora of fine details to be controlled through small changes in body position, but it was not intuitive to link each motion with the exact parameter it controlled, especially without a graphical user interface. After conducting interviews with music producers, vocalists, and game designers, I realized that it would be more beneficial to focus on designing a tool for vocalists. With earpiece microphones, they perform hands-free and can use all of the affordances a standard VR headset has to offer without sacrificing their ability to perform musically. This iteration included a variety of settings for vocal harmony generation based on the distance between the user's controllers and their headset, the ability to control background tracks and tempo, and a screen that displays lyrics, editable in the Unity file.
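
As a sketch of the distance-based harmony mapping, assuming the headset and controller transforms are available in Unity: the controller-to-headset distance is quantized into a small set of intervals and forwarded through a placeholder SetHarmonyInterval call. The interval table and distance thresholds are assumptions for illustration, not the project's actual values.

```csharp
using UnityEngine;

// Sketch of the vocal-harmony mapping: the distance between a controller and
// the headset selects a harmony interval. The interval table and thresholds
// are illustrative; in practice the value would drive a pitch-shifting device.
public class DistanceHarmonyControl : MonoBehaviour
{
    public Transform headset;
    public Transform controller;

    // Semitone offsets chosen to stay "musically correct" in a major key (assumed).
    private static readonly int[] intervals = { 0, 3, 4, 7, 12 };

    void Update()
    {
        float d = Vector3.Distance(controller.position, headset.position);

        // Quantize arm extension (roughly 0.1 m .. 0.9 m) into one of the intervals.
        float t = Mathf.InverseLerp(0.1f, 0.9f, d);
        int index = Mathf.Clamp(Mathf.FloorToInt(t * intervals.Length), 0, intervals.Length - 1);

        SetHarmonyInterval(intervals[index]);
    }

    // Placeholder for the message that tells the pitch shifter which interval to use.
    void SetHarmonyInterval(int semitones)
    {
        Debug.Log($"harmony interval: +{semitones} semitones");
    }
}
```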

Design Phase 5: Gestural Controls and Automatic Harmonization

After considering what level of control over vocal harmonies I wanted, I realized that it would be more user-friendly to automatically generate bulk chords rather than offer individual note control. This makes it possible for more than two "musically correct" harmonies to be generated at a time, since each individual harmony is no longer tied to one of the two handheld controllers. It does eliminate the ability for a vocalist to sing one root note and actively improvise the harmonies, but the peace of mind and newfound mobility allow other types of gestural controls to enter the space without interfering with the vocal controls. In the current environment, I decided to simply control the enabling and disabling of these harmonies, as well as a few other sought-after vocal effects (e.g. formant shifting), through simple body-language gestures.
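
A hedged sketch of how such gesture toggles might look in Unity: one held gesture flips the harmonies on or off, and a second momentary gesture engages formant shifting. The specific gestures, thresholds, and the SetEffect placeholder are assumptions, not the project's actual gesture set.

```csharp
using UnityEngine;

// Sketch of simple body-language gesture toggles for the harmonies and a
// formant-shift effect. Gestures, thresholds, and SetEffect are illustrative.
public class GestureEffectToggles : MonoBehaviour
{
    public Transform head;
    public Transform leftHand;
    public Transform rightHand;
    public float holdTime = 0.5f; // gesture must be held briefly to avoid accidental triggers

    private float raiseTimer;
    private bool raiseConsumed;
    private bool harmoniesOn;
    private bool formantOn;

    void Update()
    {
        // Gesture 1: hold the right hand well above the head to toggle harmonies.
        bool handRaised = rightHand.position.y > head.position.y + 0.3f;
        if (handRaised)
        {
            raiseTimer += Time.deltaTime;
            if (raiseTimer >= holdTime && !raiseConsumed)
            {
                harmoniesOn = !harmoniesOn;
                SetEffect("harmonies", harmoniesOn);
                raiseConsumed = true; // wait for the hand to come down before re-arming
            }
        }
        else
        {
            raiseTimer = 0f;
            raiseConsumed = false;
        }

        // Gesture 2: bring both hands together to engage formant shifting.
        bool handsTogether = Vector3.Distance(leftHand.position, rightHand.position) < 0.15f;
        if (handsTogether != formantOn)
        {
            formantOn = handsTogether;
            SetEffect("formant_shift", formantOn);
        }
    }

    // Placeholder for the enable/disable message sent to the audio engine.
    void SetEffect(string effect, bool enabled)
    {
        Debug.Log($"{effect}: {(enabled ? "on" : "off")}");
    }
}
```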

Design Phase 6: Next Steps

After rigorous user testing, I have learned about the successes of my design (primarily translating meaningful gestures into functional audio controls) and its shortcomings (primarily an excess of menu diving and non-modular setup options), and I have received a lot of feedback prompting me to redesign toward an advanced karaoke system. I am currently building a scraper that pulls lyrics from Genius.com and pipes audio from Spotify and YouTube into a collaborative performance environment. In this space, the software will automatically match the key and tempo of audio effects to the songs being queued, and everyone will be able to manipulate their own voice freely.
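
As a small sketch of the planned key and tempo matching, assuming each queued song is tagged with a BPM and a pitch class (0–11) for its key, helpers like these could compute the stretch ratio for tempo-synced effects and the semitone offset for the harmonizer; the names and the simple pitch-class math are illustrative assumptions.

```csharp
// Sketch of key/tempo matching helpers for queued songs (assumed interface).
public static class SongMatcher
{
    // Ratio applied to tempo-synced effects (e.g. delay times scale by 1/ratio).
    public static float TempoRatio(float songBpm, float effectBaseBpm)
    {
        return songBpm / effectBaseBpm;
    }

    // Smallest semitone shift that moves the harmonizer's root to the song's key.
    // Keys are pitch classes 0..11 (C..B).
    public static int KeyOffsetSemitones(int songKey, int harmonizerKey)
    {
        int diff = ((songKey - harmonizerKey) % 12 + 12) % 12;
        return diff > 6 ? diff - 12 : diff; // prefer the shorter direction (e.g. -5 over +7)
    }
}
```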

Outcome & Reflections

This multi-year research project has evolved through careful iteration and extensive user testing, generating valuable insights about embodied interaction in virtual environments. By reimagining how musicians and performers can engage with audio processing tools in three-dimensional space, Space Jam challenges conventional interfaces and explores more intuitive and expressive modes of sonic creation. The project has significantly expanded my technical expertise in spatial audio, real-time DSP implementation, gesture recognition systems, and cross-platform VR development. Perhaps more importantly, it has deepened my understanding of how technology can enhance creative expression through thoughtful design that prioritizes human movement and natural interaction patterns. As immersive technologies continue to evolve, the insights from Space Jam will inform future work in creating more embodied, intuitive tools for musical expression across different realities.

Key Features

  • Real-time vocal harmonization with gesture controls
  • Visual feedback synchronized with audio parameters
  • Automatic key and tempo matching
  • Body movement-based parameter control
  • Collaborative performance environment
  • Lyrics display and editing capabilities

Technologies

VR
Unity
C#
Ableton Live
visionOS