Virtual Musical Instruments: Technological Aspects and their Interactive Performance Issues


Suguru Goto




I have been creating various Gestural Interfaces (*1) for use in my compositions as part of the Virtual Musical Instruments (*2) project. These Virtual Musical Instruments do not merely refer to physical instruments, but also involve Sound Synthesis (*3) programming and Interactive Video (*4) to a great extent. Using the Virtual Musical Instruments, I have experimented with numerous compositions and performances. This paper is intended to report my experiences and the development of the instruments, rather than to present merely abstract theory. It also addresses issues of performance and the problem of the notion of interactivity.


I. Introduction

The problem we confront is that most artists use commercially-produced computers. Very few of them build their machines from scratch. Individuals tend to purchase computers that have evolved from marketing strategies, while institutions equip themselves with larger computers in accordance with their administrative requirements. Yet these commercially-produced devices are aimed at the mass market, rather than at the individual artist who wishes to develop original ideas based on an imaginary world of his own making. Even if the artist knows how to program his computer, his possibilities are limited by the computer's operating system and by the power of his machine. He might create a controller using sensors and a circuit of his own design, but its signals still end up being processed by the computer. The artist has to be aware of the fact that his creativity is always reliant on commercial considerations.

Likewise, we have to be selective with regard to the overwhelming mass of information surrounding us. Otherwise, we will succumb to the totalitarianism of the media. If we are to retain a certain amount of individuality, we have to be aware of these manipulative processes. Many artists are not, which is why many artistic creations are banal and conformist.

We are living at a time when technology is accessible to all. The computer may be regarded as the symbol of our democratic society, inasmuch as it is a product that is available world-wide. At the same time, it can be an instrument of power and authority.

However, the artist can also profit from society. Today, the relationship between art and technology is a much closer one than it ever was in the past. Likewise, technology is generally regarded as something that enriches our lives, but this does not always apply in the case of art. However, in some cases, technological developments can trigger new artistic genres. For instance, technological progress has led to the creation of multi-media art and interactive art, and artists have been able to profit from this development.

In recent times, this interaction has been increasingly apparent in music. Although technology does not invent a new art form by itself, it offers musicians many new choices. In this sense, the artist's function is no longer that of conveying traditional values and thoughts. The artist is an intermediary who offers his audience new values and perceptions based on his interaction with technology.

An attitude of naive optimism towards technology is no longer possible. The artist should rather consider how he can confront technology, while remaining aware of its dangers. Technology itself doesn't create a new sensibility, but the interaction between humans and machines can do so. Once we accept the notion of interaction, we can better exploit the new possibilities it offers. The artist will no longer express his traditional thoughts and emotions, but interact with the machine.

In my view, certain artistic trends, such as the development of Virtual Musical Instruments, can go some way towards resolving these difficulties. Unlike commercialized products, a Virtual Musical Instrument is created by an individual for his own artistic purposes. He can modify his instrument by programming it in accordance with his artistic leanings. Virtual Musical Instruments thus go some way towards answering a question that all art forms have to confront today.


II. Virtual Musical Instruments

Before continuing the discussion, I will attempt to define the Virtual Musical Instrument. A. Mulder describes Virtual Musical Instruments as follows [1]:

"..., analogous to a physical musical instrument, as a gestural interface, that will however provide for much greater freedom in the mapping of movement to sound. A musical performer may control therefore parameters of sound synthesis systems that in real time performance situations are currently not controlled to their full potential or simply not controlled at all."

H. Katayose et al. describe their system, the Virtual Performer [2]:

"... the Virtual Performer which is a system composed of gesture sensors, a module analyzing and responding obtained information, and a facility for presentation."

The latter definition emphasizes the whole system rather than merely defining the controller itself.

I first developed the idea of the Virtual Musical Instrument with the "SuperPolm" (Figure 1), which models the gestures *5 of a real violin without, however, producing any sound itself. The sound is generated with the aid of a computer. Depending on the algorithms of the Mapping Interface *6 and Sound Synthesis, the sound may be extensively varied (Figure 2).


Figure 1. View of the SuperPolm

Figure 2. The Transformation of Data


One of the Gestural Interfaces I have designed is the BodySuit (DataSuit), a suit fitted with bending sensors that are attached to each joint of the body (Figure 3). This suit is an ideal performance tool: it enables me to make wide, sweeping movements that can easily be observed by the audience. Another of my instruments, the PowerGlove, triggers sounds by means of the bending of the fingers. However, although it opens up new possibilities with regard to performance, its one weak point is that it does not allow for subtle musical nuances.

The "SuperPolm", by contrast, allows me to make small, subtle movements that can trigger far more complex sounds. It is better adapted to my music when a composition especially requires subtle musical nuances.

Self-produced instruments offer far more possibilities than traditional musical instruments that are augmented by sensors. Such augmented instruments can produce acoustic sounds and control a computer at the same time. Since Gestural Interfaces merely trigger sounds, their capabilities can be modified by programming. This is an essential factor in my compositions. One of my gestures might at one moment produce a sound similar to a traditional instrument, but in the following section the same gesture might trigger a very different sound. As well as allowing for more possibilities in terms of sound, this also allows for a certain theatricality in performance.


Figure 3. BodySuit (DataSuit)

Figure 4. Performance with the SuperPolm


A controller is adapted to a specific type of gesture. Here, a controller refers to a Gestural Interface, but the term also covers a remote control device that operates a computer from a distance through MIDI or similar means. Although the performer is not required to master complex fingering techniques, as with traditional instruments, he still needs to learn how to play it. A controller can lead to the creation of new gestures. For example, the "SuperPolm" contains a force sensor placed in the chin rest and an inclinometer, which respectively measure the pressure the performer applies to hold the instrument and its angle relative to the vertical. The performer can therefore control two additional parameters, by chin pressure and by bending the upper half of the body forward, without any hand movement (Figure 4).

However, these body movements do not convey any particular meaning, nor do they have any choreographic relevance. On the contrary, their sole function is to control the Virtual Musical Instrument. A dancer, for instance, might find it difficult to use such an instrument. Dancers need to control their bodies in order to execute intricate figures, and even a well-trained dancer is incapable of controlling musical materials at the same time as he dances.

Now this non-functional gesture raises some issues, especially in a situation involving sound and image in real time. The crucial point here is "Interaction", which refers both to the "possibilities" of interaction and to the "definition" of interaction in the performance context. A gesture does not signify anything by itself; rather, it may trigger sound, and its effect can be completely altered by a program (Figure 5).


Figure 5. The configuration of the transformation process

This article discusses each specific subject of my Virtual Musical Instrument project, following this flow chart.


III. Gestures and Music


"The interest is how electronic systems can extend performance parameters and how the body copes with the complexity of controlling information video and machine loops on-line and in real time".

- Stelarc, Host Body/Coupled Gestures: Event for Virtual Arm, Robot Manipulator and Third Hand [3]


We may now consider the relationship between gesture and music. The discussion that follows relates to the human perception that may occur during a performance with the Virtual Musical Instrument.

Before we continue the discussion, it is important to define "gesture" in the context of this article.

The French researcher Claude Cadoz defined gesture [4] as follows:

Associating a hand, we can consider three different functions, but complementary and overlapped. Each of the three different functions intervenes with the other in varying degrees.

Epistemic - The sensory function through tactile experience.

Ergotic - The physical movement, such as transformation and transportation, includes only the extremity of the human body with its skeleton joints or all its muscles.

Semiotic - The gestural behaviors that function to make others know: the gestures that produce an informative message destine for the environment.


The instrumental gesture is, however, the combination of these functions. If we think it together with instruments and music, we find a huge variety of ways to convert the gesture into sound information.


- It applies to a material object and there is a physical interaction with it.

- The different physical phenomenon that the forms and the dynamics of evolution can be self-controlled by the subject happen in the frame of the interaction.

- These phenomena could, therefore, become the supports of communicative message.

The instrumental gesture is the gesture of the production, but allows the ergotic gesture. It particularizes by doing that which it produces or it transforms: gestures are informative phenomena.


The gestures of a violin were originally imitated on the "SuperPolm"; however, it is not always necessary to play it in a similar manner. Gestures are translated into parameters. A Gestural Interface may be regarded as an interface between gesture and the computer insofar as it translates the energy derived from body movements into electrical signals in order to control sound or images.

Virtual Musical Instruments also allow new gestures to be learned, because they can be assigned new functions by programming. A controller can lead to the creation of new gestures. However, these body movements do not convey any particular meaning, nor do they have any choreographic relevance. While an audience member observes this non-articulated gesture, many aspects bear on his perception of the resulting artistic materials. We may define "gesture", especially in the context of the Virtual Musical Instrument, as an "individual experience of perception".

The German composer Dieter Schnebel discusses sound and body in his book Anschläge - Ausschläge, Texte zur Neuen Musik [5]:


Most sounds, especially music, are created by actions.

While looking at those, the sound production becomes the dramatic course of the event.

Therefore, the gestures of the musicians at the sound production have their own visual lives. If music is however, an art metalanguage of feelings with strong affinities to the consciousness, then the gestures essentially belong to it, and then the feelings appear in sound and gesture. The gesture is like a visual sound. Although a musician clearly expresses his large fantasy through his gesture occasionally on the other hand, it proves to reduce (in order to turn away the attention from sound). The gesture is never autonomous here, but is always related to the progression of music - therefore to the composition.

The listener's experience of gestures and music can also be affected by other factors, such as concurrent stimuli. Visual stimuli can stimulate an aural experience. The listener's environment can also modify his state of mind. In our home listening environment, we can play music at any time and in any circumstances. We can imagine the performer's gesture as we listen to it. We can moreover concentrate on the sound, as there are no visual stimuli to divert our attention from it. My projects, however, are oriented towards the concert environment, in which sound and vision interact.

Sounds are visible - therefore, also for the eyes.


Although music is originally for the ears, we don't like to listen to it in darkness - we prefer to listen to it in half-darkness. A concert however, where music comes from speakers, gives us some frustrations - maybe that is why electronic music was not fully accepted and the need of live electronics was created.

Nowadays it is quite easy to listen to music with audio equipment at any time of the day or night, yet people still go to concert halls to experience music. What is the difference between listening to a recording with headphones and listening to live music? In a concert hall, acoustic instruments can be heard without loss of quality. Moreover, the audience experiences a different kind of space, which differs from its living or working environment and, even more importantly, it can also observe the gestures of the performer on stage. Of course, there is generally a relationship between the sound and the performer's gestures. Broader gestures tend to signify greater dynamics, and audiences notice a difference in dynamics as a result of the musician's gestures, even when there is little difference in terms of decibels. To sum up, in the concert situation music becomes an aural and visual experience, and gestures are of paramount importance. This is where the Virtual Musical Instrument comes in, inasmuch as it makes it possible to incorporate gestures directly into performances of electronic music.

An essential aspect of my solo performances, which I call interactive media art performances, is the interaction between gesture, sound and image. With the "SuperPolm", a single body movement can control sound and images at the same time. This relationship can be clearly presented in real time, and can be an unexpected and complex one. It is a concept that could undergo considerable development inasmuch as I can play with notions such as simplicity and complexity, for instance by triggering complex textures with simple hand movements. Sound, image and gesture play an equal part in these events.


IV. Gestural Interfaces

I have chosen to focus on the use of Gestural Interfaces in a performance context. Gestural Interfaces differ from traditional instruments in that they cannot produce sounds by themselves. They merely send signals that produce sounds by means of a computer or a sound module. They may be regarded as an interface between the performer and the computer insofar as they translate the energy derived from body movements into electrical signals.


1. General Description of "BodySuit"

The "BodySuit" was built between 1997 and 1999. It is intended as a motion capture device for the entire body. Twelve bending sensors are attached to the joints: the left and right wrists, elbows, shoulders, ankles, knees, and groins.

Although the performer wears the "BodySuit", this does not mean that he merely controls a physical instrument. As a matter of fact, the physical constraints are lighter than those of playing a traditional musical instrument. The performer simply produces sounds with his gestures, that is, by bending and stretching each joint.

Because of the limitations of the human body, bending one joint can move other sensors and thus change parameters unexpectedly, against the performer's wishes. When one bends the left knee, for instance, it is impossible to keep the left groin straight. In such a case, the performer may switch each sensor on and off with buttons on his arm.


2. General Description of the "SuperPolm"

The MIDI violin "SuperPolm" was built in 1996. It was created in collaboration with two engineers at IRCAM, Patrice Pierrot and Alain Terrier. It was originally intended to complete a piece I composed for IRCAM in 1995 - 1996. It is based upon the idea of short range motion capture, such as finger, hand and arm movements. The signals are translated into MIDI signals so as to control generated sound in real time. In this project the fundamental concept of motion capture is divided into three categories. Short range movements include finger, hand, eye, and mouth movements. Medium range movements consist of movements of the shoulders, legs, head etc. These can easily be observed by the audience. Large range movements involve spatial displacement and include those of the feet and legs. The parameters of these movements can be translated as position or distance.


Figure 6. Finger board of the SuperPolm

Figure 7. The potentiometer on the bow


The "SuperPolm" may also be regarded as a controller that operates a computer from a distance. Needless to say, a controller does not generate sound by itself, but it nonetheless allows the performer to express complex musical ideas. The performer plays in a similar manner to a violin, except that the fingers touch sensors on a finger board instead of pressing strings (Figure 6). The velocity of the sound may be changed by a movement of the bow, which records variations in the resistance (Figure 7).

Sound may also be produced by means of chin pressure or by changing the angle at which the controller is held. The original design of the "SuperPolm" was modified during the course of its development for practical and technical reasons. A great deal of time was spent looking for sensors and investigating their possibilities in a laboratory and many of them had to be abandoned. Once the electrical circuits were ready, many more changes had to be made so as to adapt the fingerboard and bow to the needs of the performer.

The "SuperPolm" was designed as an interface for small-scale gestures, and one particular movement of the composition focuses specifically on the possibilities opened up by the controller. A traditional instrument physically limits the possibilities for creating sound. With the aid of a computer, one gesture can create a large number of complex notes at the same time. This also allows a sense of real interaction between sound and performance, and between sound and intuition, in real time.

In order to detect finger position and pressure, Interlink "FSR" sensors are used. Four position and pressure sensors are attached to the finger board. These four sensors, each measuring 10 cm × 2.5 cm, have to be placed irregularly on the finger board. However, this arrangement makes it possible to reach all the sensors at the same time in any position.

Capturing the bow movement was the most difficult issue. Many experimental models were built before arriving at the final model, which uses the bow as a potentiometer. More than 100 resistors are placed in series along the bow. A metallic bridge is fixed in the middle of the instrument's body. When playing, the bow touches the bridge. The output voltage depends on the point at which the bow contacts the bridge. The performer therefore bows in a manner similar to that of a traditional violin.

When the performer bends his body slightly forward, the angle of the instrument changes automatically. An accelerometer (Analog Devices "ADXL") is placed inside the body of the controller in order to measure this inclination. This allows the performer to play by changing the instrument's incline relative to gravity.

A violinist holds the instrument between his shoulder and chin. A pressure sensor is attached inside the instrument at the position the chin touches. The performer can change a parameter by varying the intensity of chin pressure.

The accelerometer and the chin pressure sensor each have a switch to start and stop them. There are buttons on the heel of the bow, where the player holds it, and at the root of the finger board.
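
The bow described above behaves like a voltage divider, and the relation between output voltage and contact point can be sketched as follows. This is only an illustration under stated assumptions, not the circuit actually built: the function name `bow_contact_position`, the 5 V reference, the count of exactly 100 resistors and the idea that all resistors are equal are hypothetical.

```python
def bow_contact_position(v_out, v_ref=5.0, n_resistors=100):
    """Estimate where the bow touches the bridge from the output voltage.

    Assumes an ideal divider made of equal resistors in series, so that
    the voltage is proportional to the position along the bow.
    Returns (fraction along the bow, index of the resistor segment).
    """
    if not 0.0 <= v_out <= v_ref:
        raise ValueError("voltage outside the supply range")
    fraction = v_out / v_ref
    # Clamp so that the full-scale voltage maps to the last segment.
    segment = min(int(fraction * n_resistors), n_resistors - 1)
    return fraction, segment


print(bow_contact_position(2.5))  # middle of the bow
```

With real resistors the steps would not be perfectly even, so a calibration table would probably replace the simple proportion.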
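
How an accelerometer reading can be turned into an inclination may be sketched as below. The sketch assumes a single sensing axis whose output has already been converted to units of g; the function name `tilt_angle_degrees` is hypothetical, and the actual scaling of the sensor used in the "SuperPolm" is not given in this article.

```python
import math


def tilt_angle_degrees(accel_g):
    """Convert a one-axis acceleration (in g) into a tilt angle in degrees.

    With the instrument at rest, the sensing axis measures the component
    of gravity along it, which equals sin(angle).
    """
    # Clamp to the valid domain of asin to tolerate noise or movement.
    clamped = max(-1.0, min(1.0, accel_g))
    return math.degrees(math.asin(clamped))


print(tilt_angle_degrees(0.5))  # about 30 degrees of forward tilt
```

Fast body movements add their own acceleration to gravity, so in performance the value would probably need smoothing before it is used as a parameter.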


3. BigEye

The software "BigEye" [6] was programmed by Tom Demeyer at the STEIM foundation in Amsterdam, Holland.

BigEye is an application designed to take a real-time video image and convert it into MIDI information. The program can be configured to extract objects based on color, brightness and size. These objects are captured (up to 16 channels at the same time) and their position is scanned according to a pre-defined series of zones.

Instead of a colored object, I have chosen two halogen lights so that their positions in space can be detected. One of the major reasons is that a colored object may be scanned unsteadily by the computer, depending on the lighting conditions. Although the light may not look much different to the human eye, the computer sees it differently.
Depending on the position on a concert stage, the lighting conditions can change. This may cause quite unexpected differences between preparation in a studio and performance on a stage, regardless of the function in the program that allows the brightness to be adjusted. Usually there are much stronger lights on a stage. If the performer holds two halogen lights in his hands, he can easily control the parameters without much disturbance from the lighting conditions. Differences in ambient light pose no problem, since the two halogen lights themselves emit light. Therefore, the scanned result is much more stable.
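
The principle of locating a bright light in a video frame and assigning it to a zone can be sketched as follows. This is not BigEye's code or interface: the functions `brightest_point` and `zone_index`, the grid of zones and the representation of a frame as rows of brightness values from 0 to 255 are all assumptions made for illustration.

```python
def brightest_point(frame):
    """Return (x, y) of the brightest pixel in a grayscale frame."""
    best = (0, 0)
    best_value = -1
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value > best_value:
                best_value = value
                best = (x, y)
    return best


def zone_index(x, y, width, height, cols=4, rows=4):
    """Map a pixel position to the index of a zone in a cols x rows grid."""
    col = min(x * cols // width, cols - 1)
    row = min(y * rows // height, rows - 1)
    return row * cols + col


frame = [
    [0, 0, 0, 0],
    [0, 0, 255, 0],   # the halogen light
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
x, y = brightest_point(frame)
print(x, y, zone_index(x, y, width=4, height=4, cols=2, rows=2))
```

Because a halogen light is far brighter than anything else in the frame, a simple brightness search of this kind stays stable when the stage lighting changes, which is the advantage described above.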

Large range movements involve spatial displacement and include those of the feet and legs. The parameters of these movements can be translated as position or distance. These two dimensions are explored with this video scan program.


4. Analog to MIDI Interface

Body motions are first transduced by sensors into electrical signals. We need to clarify how these signals are transformed for computer input, since the data captured from body motions must ultimately be transformed into sound. From this point of view, gesture and the Gestural Interface are located merely at the very beginning of this process. Although each sensor has a different construction, with an electric circuit the signal simply varies from 0 to 5 V. In order to communicate with a computer, these analog signals need to be translated into digital signals. For practical reasons, I chose an analog to MIDI interface. The MIDI signals are then conveyed to the computer and used as parameters to generate sound.

This "Analog to MIDI Interface" refers to the interface that converts the analog signals to MIDI in order to communicate with a computer. The "Analog to MIDI Interface" was built using the "AKI-80" board, which carries a Toshiba "TMPZ84C015BF", a powerful CPU for its time. The interface has 32 analog inputs and outputs. Each channel can be independently controlled by MIDI, for example from Max. The CPU was programmed in assembly language. Yoichi Nagashima made a major contribution to building this interface.
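
The conversion from a 0 to 5 V signal to a MIDI value can be sketched as follows. The sketch assumes that each sensor is sent as a MIDI Control Change message, which the article does not specify; the function names `voltage_to_midi` and `control_change` are hypothetical, and the real interface was programmed in assembly language rather than Python.

```python
def voltage_to_midi(voltage, v_max=5.0):
    """Quantize a sensor voltage (0 to v_max) to a 7-bit MIDI value (0-127)."""
    clamped = max(0.0, min(v_max, voltage))
    return round(clamped / v_max * 127)


def control_change(channel, controller, value):
    """Build the three bytes of a MIDI Control Change message.

    The status byte is 0xB0 plus the channel number (0-15); controller
    number and value are both 7-bit data bytes.
    """
    if not (0 <= channel <= 15 and 0 <= controller <= 127 and 0 <= value <= 127):
        raise ValueError("MIDI field out of range")
    return bytes([0xB0 | channel, controller, value])


message = control_change(channel=0, controller=7, value=voltage_to_midi(1.0))
print(message.hex())
```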

V. Mapping Interface, Algorithm, Sound Synthesis, and Interactive Video

These subjects are generally based upon the following ideas:

- the issue of the relationship and connection between one level and another

(the levels here are, e.g., gesture -> Gestural Interface -> mapping -> algorithm, etc.)

- the issue of the Virtual Musical Instrument and Interactivity

- the issue of application in composition and performance theory


1. Relationship between Gesture and Interaction

A clear interaction between gesture and sound recalls the relationship between the visual aspect and the aural aspect of traditional instruments. For example, the movement of the fingers produces differences of pitch, and the intensity of movement corresponds to differences of dynamics. With Virtual Musical Instruments, by contrast, the movement of raising a hand may create a difference in the density of a sound texture. This is closely related to the subject of sound algorithms.

This may also involve human cognition in observing the interaction process:

- simplicity and complexity

- expectation and the unexpected

A good interaction with gesture may promise success for an interactive performance. Gesture can be represented immediately as sound material. When the relationship is too simple, however, an audience easily loses attention. Gesture can also trigger repetitive patterns, sequences, or complex textures generated by an algorithm. These may perhaps bring higher musical quality than a simplistic approach; however, when the relationship is not obvious, an audience may lose interest after a while. These are not technological issues. Perhaps a composer needs to find his own solution in each piece.

The relationship between gesture and interaction can be flexibly changed during the course of a piece. The perception of interactivity can be integrated into a musical context. Gesture can be clearly reflected in the acoustic properties. For example, a slow movement of the body may be reflected in a lower dynamic level, softer articulation, or slower tempo. In another context, the same gesture may produce a cello-like sound, which implies bowing.

What matters is not only this abstract level of perception between gestures and their visual and aural results, but also a creative approach to how a physical action can be interpreted in the digital domain and how these signals are then efficiently expanded. The discussions of the Mapping Interface, algorithms, and sound synthesis in the following sections are therefore closely related to this issue.


2. Relationship between Gesture and Mapping Interface and Algorithm


Figure 8. Mapping Interface


As already pointed out, gesture can be mapped by an algorithm, such as fuzzy theory or neural networks (Figure 8). Gesture spontaneously interacts with parameters that control, for example, the degree of randomness, the speed of sound texture transformation, or the order in which data are selected.

The Mapping Interface refers to the distribution of the MIDI signals from the analog to digital interface into various hierarchies of algorithms. The application Max (Opcode) is used on a Macintosh PowerPC. The functions of the Mapping Interface are as follows:

a. The voltage from a Gestural Interface rarely covers the full range from 0 V to 5 V exactly. Therefore the MIDI value from the analog to digital interface does not span the full range from 0 to 127. In the Mapping Interface the value is scaled to the full range from minimum to maximum.

b. The response of each sensor varies in a different manner. For practical performance or compositional reasons, the value is treated as either linear or exponential.

c. Although it depends on the speed of the CPU, a performer needs to guard against MIDI overflow in a live performance. The Mapping Interface can limit the maximum scan rate.

d. The potentiometer of the bow on the "SuperPolm" merely captures the position of the bow. Gesture, however, needs to be linked to sound intensity. The Mapping Interface translates the position parameter into the energy of movement within a limited time: the difference in value over a short period is translated into the velocity of the sound.

e. MIDI noise is eliminated in the Mapping Interface.

f. Each sensor can be set to active or inactive according to the needs of the performance or of a section of the composition.
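
Functions (a), (b), (d) and (e) of the list above can be sketched in a few lines. The original Mapping Interface was a Max patch, so this Python version is only an illustration; all function names, the gain and threshold values, and the use of a power curve to stand in for the exponential response are assumptions.

```python
def rescale(value, in_min, in_max, out_min=0, out_max=127):
    """(a) Stretch the range a sensor actually produces to the full range."""
    if in_max == in_min:
        raise ValueError("input range must not be empty")
    clamped = max(in_min, min(in_max, value))
    ratio = (clamped - in_min) / (in_max - in_min)
    return round(out_min + ratio * (out_max - out_min))


def shape(value, exponent=1.0, full_scale=127):
    """(b) Response curve: exponent 1.0 is linear, larger values bend it."""
    return round(full_scale * (value / full_scale) ** exponent)


def position_to_velocity(previous, current, gain=4):
    """(d) Turn the change in bow position between two scans into a velocity."""
    return min(127, abs(current - previous) * gain)


def suppress_jitter(previous, current, threshold=2):
    """(e) Treat changes smaller than the threshold as noise and ignore them."""
    return current if abs(current - previous) >= threshold else previous


# A sensor that only ever outputs values between 20 and 100:
print(rescale(60, in_min=20, in_max=100))   # middle of the full range
print(shape(64, exponent=2.0))              # bent response, softer low end
print(position_to_velocity(10, 20))         # faster bow, louder note
print(suppress_jitter(50, 51))              # small flicker is ignored
```

The rate limiting of (c) and the switching of (f) are left out here, since they depend on timing and on the state of the performance rather than on a single value.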

In the distribution, a signal or signals are divided or combined in the following ways:

a. one sensor -> one parameter

b. one sensor -> multiple parameters

c. multiple sensors -> one parameter

d. multiple sensors -> multiple parameters
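
The two less obvious cases, (b) and (c), can be sketched as follows. The names `one_to_many` and `many_to_one` are hypothetical, and combining several sensors by a weighted average is only one possible choice; the article does not say how the signals were combined.

```python
def one_to_many(value, scalers):
    """(b) Send one sensor value to several parameters, each with its own scaling."""
    return [scale(value) for scale in scalers]


def many_to_one(values, weights):
    """(c) Combine several sensor values into one parameter by weighted average."""
    if len(values) != len(weights) or sum(weights) == 0:
        raise ValueError("values and weights must match and weights must not sum to 0")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# One sensor raises one parameter while lowering another.
print(one_to_many(64, [lambda v: v, lambda v: 127 - v]))
# Two sensors contribute equally to a single parameter.
print(many_to_one([100, 50], [1, 1]))
```

Case (d) follows by applying `many_to_one` once for each output parameter with a different set of weights.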

After the Mapping Interface, the signals are treated in various ways to generate sound. This is rather a matter of the purpose of the performance and of compositional taste. It may also help to clarify the connection between gesture and musical expression. Other aspects can also be considered, as follows:

a. A trigger can start and stop sequences, patterns, and lists of data as an accompaniment.

b. Sensor parameters can be translated into complex musical textures by calculation in an algorithm.

c. To enrich timbre, the algorithm can organize the sound according to parameters from the Gestural Interface.

d. The algorithm may adjust a parameter in order to communicate with another computer.

Algorithms can be integrated into sound synthesis and signal processing. Applied to Physical Modeling, gesture can be transformed into an imaginary instrument. Instead of being assigned directly to each parameter of signal processing, gesture can control the ratio of a morphing or interpolation algorithm.
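
The idea of letting one gesture control an interpolation ratio, rather than each processing parameter directly, can be sketched as below. The preset names and values, and the functions `morph` and `midi_to_ratio`, are hypothetical; linear interpolation is assumed as the simplest form of morphing.

```python
def midi_to_ratio(value):
    """Convert a MIDI value (0-127) into a morphing ratio between 0.0 and 1.0."""
    return value / 127


def morph(preset_a, preset_b, ratio):
    """Interpolate linearly between two sets of parameters.

    ratio 0.0 gives preset_a, ratio 1.0 gives preset_b.
    """
    ratio = max(0.0, min(1.0, ratio))
    return {
        name: preset_a[name] + (preset_b[name] - preset_a[name]) * ratio
        for name in preset_a
    }


dark = {"cutoff": 200.0, "gain": 0.0}
bright = {"cutoff": 1000.0, "gain": 1.0}
print(morph(dark, bright, 0.5))
print(morph(dark, bright, midi_to_ratio(127)))
```

A single sensor channel thus moves every parameter of the sound at once, which is one way of obtaining a complex change from a simple gesture.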


3. Sound Synthesis, Musical Sound Production, and Gesture

A Virtual Musical Instrument can be widely changed through sound synthesis programming. As a matter of fact, instrument design through sound synthesis is one of the important factors. Ultimately this relates deeply to gesture and the notion of interactivity.

One of the major problems is the limit of sound synthesis in real time. Although CPUs have developed greatly of late, real-time synthesis is still an enormous task for a computer. Johannes Goebel pointed out the critical issue of "poor" sound generated in real time and how it is perceived [7].

Since the 1960s, it has been a long-term computer music project to utilize the gestural control virtuosity of traditional trained artists (conductors, performers, dancers, etc.) in the digital domain. However, investigating the compositional tools specifically supplied by digital technology for the precision and control of sonic properties has dropped to a low level since digital sound became available as "ready made", and as the imitation of acoustical instruments became a major aim of real-time synthesis. "Low level" does not refer to the scientific and computational complexity to create such sounds, but rather refers to the compositional level linked to the auditory perceptible result. Listening to a "boring" piece of music set for acoustical instruments, I might still focus on the acoustical properties, of the instruments and find a sensual richness. When I listen to a "boring" piece of electro-acoustic music, however, it will usually also be presented with "boring" sounds, and if the sonic part of a piece with digital sound generation is "not boring", usually the whole piece is quite a bit closer to being "not boring". Rarely will we find a piece that supplies digital audio signal processing techniques with non-imitative and "convincing" results that we also find musically "boring".

It is not only in the case of a gestural controller, however, disappointment about "poor" sound is one of major problems in the interactive music generally.

Concerning this problem of sound synthesis in real time, the alternative possibilities are experimented in order to find the solutions. One of important element for those are to concern about the possibilities of controlling sound algorithm with gesture: controlling sound synthesis, changing sound effect, such as filter, delay/reverb, chorus/flanger, pitch shift/harmonizer, and spatialisation. In another word, the relationship between gesture and sound algorithm need to be much explored. While sending signals from gesture, this simultaneously conveys two elements, but those are not exactly same things: playing notes and controlling parameters. These may be interpreted to a performance of musical materials and controlling effect, especially in a compositional context.

With a complex algorithm, an intricate sound texture can be produced by a simple gesture. The gesture parameters pass through various levels of hierarchy. A signal in one channel can be transformed into a mass of sound with randomness. One single movement of the body can be spread over many channels of parameters. Just as a note produced by an acoustic instrument contains a great deal of information, a note of sound synthesis is assigned many parameters and is controlled by all of them at the same time. Conversely, many channels of gesture parameters can be combined into one single note. Eventually this allows subtle and complex musical expression.
Additional controls may be included at the same time in order to achieve subtle musical expression. The simultaneous control of parameters can also bring a richer sound, for example through jitter, complex envelopes, or interpolation. Since timbre depends on many factors in the time domain (in other words, the spectrum may change gradually over time), this technique can be applied to sound synthesis in the Virtual Musical Instrument. However, additive synthesis and FFT can place a heavy load on the CPU in real time. Perhaps this CPU load problem can be solved by using several computers that communicate with each other via MIDI or similar means.
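
The one-to-many and many-to-one mappings described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patch used in the project: the parameter names (cutoff_hz, grain_density, reverb_mix), their ranges, the curve exponents and the weighting scheme are all hypothetical.

```python
import random

# Hypothetical mapping table: one gesture value (0.0-1.0) fans out to
# several synthesis parameters, each with its own range and curve.
MAPPING = {
    "cutoff_hz":     (200.0, 8000.0, 2.0),   # (min, max, curve exponent)
    "grain_density": (5.0, 80.0, 1.0),
    "reverb_mix":    (0.0, 0.6, 0.5),
}

def spread_gesture(value, jitter=0.0, rng=None):
    """One-to-many: map one normalized gesture value onto many parameters.

    `jitter` adds a small random deviation (as a fraction of each range),
    so that repeated identical gestures do not sound identical.
    """
    value = min(1.0, max(0.0, value))
    rng = rng or random.Random()
    params = {}
    for name, (lo, hi, curve) in MAPPING.items():
        x = lo + (hi - lo) * value ** curve
        x += (hi - lo) * jitter * rng.uniform(-1.0, 1.0)
        params[name] = min(hi, max(lo, x))   # keep inside the allowed range
    return params

def combine_gestures(values, weights):
    """Many-to-one: merge several gesture channels into one control value."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total
```

A single sensor reading such as spread_gesture(0.5) then updates all three parameters at once, while combine_gestures merges, for example, a bend sensor and a pressure sensor into one control value.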

Sound Synthesis is a huge subject that is beyond the scope of this article. Only the way it is used can be discussed here.
FM sound can be varied extensively with a few parameters: carrier frequency, amplitude, modulator frequency ratio, and modulation index. A gesture can therefore easily change the timbre using only a few channels. When the timbre is not being modified, notes are simply triggered by the gesture, although pitch and amplitude can still be changed.
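
As a rough illustration of how few parameters FM needs, the following Python sketch computes a simple two-operator FM tone from the four parameters listed above. The function names, the gesture channels (bend, pressure) and their scaling are assumptions made for this example; they do not describe the actual instrument.

```python
import math

def fm_tone(carrier_hz, amplitude, ratio, index,
            duration=0.01, sample_rate=44100):
    """Return samples of simple two-operator FM.

    y(t) = amplitude * sin(2*pi*fc*t + index * sin(2*pi*fm*t)),
    where fm = fc * ratio.
    """
    modulator_hz = carrier_hz * ratio
    n = int(round(duration * sample_rate))
    samples = []
    for i in range(n):
        t = i / sample_rate
        phase = (2.0 * math.pi * carrier_hz * t
                 + index * math.sin(2.0 * math.pi * modulator_hz * t))
        samples.append(amplitude * math.sin(phase))
    return samples

def gesture_to_fm(bend, pressure):
    """Hypothetical mapping of two gesture channels (0.0-1.0) to timbre.

    Pressure scales the modulation index (brightness); bend selects an
    integer frequency ratio from 1 to 4, which keeps the spectrum harmonic.
    """
    index = 10.0 * pressure
    ratio = 1.0 + round(bend * 3)
    return ratio, index
```

With the index at zero the result is a plain sine wave; raising the pressure channel adds sidebands, so the timbre changes while the pitch set by the carrier stays the same.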

The parameters of Granular Synthesis can be altered by a gesture, either one parameter independently or several parameters at the same time:

a. number of samples

b. sample changes

c. pitch tables which are previously prepared

d. random values

e. speed for triggering notes

f. duration in sample

g. position in sample

h. random values of duration

One sample can be changed gradually into another through a foggy, grainy sound texture.
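
The parameter list above can be read as the input to a grain scheduler. The Python sketch below is a minimal illustration under assumptions of my own: the function name, the dictionary layout and the "morph" probability used to drift from one sample to another are hypothetical and are not taken from the original patch.

```python
import random

def make_grains(sample_len, count, position, duration, duration_spread,
                pitch_table, interval, morph, rng):
    """Build a list of grain descriptions for a granular texture.

    sample_len      -- length of the source samples, in sample frames
    count           -- number of grains to schedule
    position        -- read position in the sample (0.0-1.0)
    duration        -- nominal grain length, in sample frames
    duration_spread -- random deviation of the duration (0.0-1.0)
    pitch_table     -- previously prepared transposition ratios
    interval        -- frames between grain onsets (triggering speed)
    morph           -- probability (0.0-1.0) of reading sample "B"
                       instead of sample "A"
    """
    grains = []
    for k in range(count):
        length = int(duration * (1.0 + duration_spread * rng.uniform(-1.0, 1.0)))
        length = max(1, min(length, sample_len))
        start = int(position * (sample_len - length))
        grains.append({
            "onset": k * interval,
            "source": "B" if rng.random() < morph else "A",
            "start": start,
            "length": length,
            "pitch": rng.choice(pitch_table),
        })
    return grains
```

Each argument could be driven by its own gesture channel, or several could follow one channel. Sweeping morph slowly from 0.0 to 1.0 mixes grains of both samples, which gives the gradual, foggy transition from one sample to the other.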

According to the position of the Gestural Interface in space, the sound source can change in real time. Sound is reproduced by 4 speakers. As the Gestural Interface moves in a circle, the sound from the speakers moves in the same manner. The virtual size of the space in sound production may also be changed depending on the position of the performer on the stage. If necessary, the Doppler Effect may be included in order to simulate the movement of the sound source.
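
One common way to move a sound around four speakers is equal-power panning between the two speakers nearest to the source angle. The Python sketch below illustrates that technique only; the square speaker layout and the function name are assumptions, and the original system may have used a different panning law.

```python
import math

# Assumed square layout: speaker angles in degrees around the listener.
SPEAKER_ANGLES = [45.0, 135.0, 225.0, 315.0]

def quad_gains(angle_deg):
    """Equal-power gains for 4 speakers for a source at the given angle."""
    angle = angle_deg % 360.0
    gains = [0.0] * 4
    for i in range(4):
        a = SPEAKER_ANGLES[i]
        b = SPEAKER_ANGLES[(i + 1) % 4]
        span = (b - a) % 360.0             # 90 degrees between neighbours
        offset = (angle - a) % 360.0
        if offset < span:
            frac = offset / span           # 0.0 at speaker i, 1.0 at the next
            gains[i] = math.cos(frac * math.pi / 2.0)
            gains[(i + 1) % 4] = math.sin(frac * math.pi / 2.0)
            break
    return gains
```

If the angle is taken from the orientation of the Gestural Interface, a circular movement of the interface sweeps the sound through the speakers in the same direction, and the summed power stays constant along the way.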

The relationship between gesture and sound synthesis is a crucial theme to explore further, since it determines fundamental elements of interactive compositions. For example, the relationship between a composed gesture and predetermined sound synthesis can vary from section to section as a matter of compositional technique. This leads to further performance models that involve not only instrumental gesture but also visual aspects: theatrical or conceptual performance models. While the Gestural Interface remains physically the same, the function of the instrument model can change in each section according to the different sound synthesis. These relationships can therefore be integrated into a compositional concept.

In an improvisational context, free gesture can vary the sound flexibly, to the extent that the sound synthesis programming allows the original sound to be altered. However, it is no exaggeration to say that the sound synthesis has to be sufficiently prepared beforehand, and that the selected sound is merely changed during the performance. Free gesture can only change pitch/velocity, timbre parameters, sample presets, etc. From this point of view, it is rather a question of controlling the level of indeterminacy.
In the sense of indeterminate composition, sound synthesis can be improvised to a certain extent, but the presets can only be changed in a linear way. In other words, the sound synthesis can be changed in each section by a program change. If necessary, the presets can be changed in a non-linear way in order to cooperate with much freer improvisational gesture.

A primitive relationship between gesture and sound causes the audience to withdraw their attention from a performance. This may happen especially when the same effect remains for a long period, for example when a gesture simply triggers the start or stop of a sound, or when the bending of a joint simply corresponds to pitch bend. However, there is certainly a trade-off: clear interaction brings fewer sonic subtleties but a clearer understanding of what is happening, while complex interaction may yield a more complex sound texture but easily fatigues an observer after a while.

Therefore, the design of an instrument and its sound synthesis really depends on the needs of the composer or performer.

Originally, the interactive sound synthesis part was done in Max/FTS with the ISPW on a NeXT computer. However, this makes it difficult to give a performance outside an institution, since a larger computer such as a NeXT or SGI cannot easily be taken out of it.
More recently, as the CPU of the Macintosh has become much more powerful, it has been rewritten for Max/MSP.


4. Issue of Interaction in Musical and Performance Context

a. Criteria of flexible reaction and the degree of preparation and chance: in other words, how much of the material is prepared beforehand, and how much of it is improvised by a computer in real time in response to free gesture.

b. Feedback (this, however, does not refer to physical feedback): apart from automated randomness, there are not many possibilities for flexible feedback. As a matter of fact, there is merely one direction, from human gesture -> interface -> algorithm -> sound and image. Is this related to the fundamental limitations of artificial intelligence?

c. Compared with the complexity of human perception of musical events, the limited number of parameters (which can be supplied by a human) becomes apparent in the programming process.

d. Fundamental contradiction between the definition of "interaction" and interaction in a musical and performance context...

Many interactive artists experience problems with these systems. The problem does not come from the computer, but from the way in which we perceive the world. There is always some doubt as to whether an interactive system produces something real or not. The feedback that is triggered by a sensor is created only by a program which is prepared beforehand. The computer merely obeys an order from a signal.

Physical feedback, such as the friction of a bow or the vibration of the body, may perhaps help a traditional performer to facilitate his musical expression while unfamiliar sounds are constantly produced by a computer. However, in terms of this notion of interactivity, such physical feedback is also a one-way process, in which a computer sends a signal to start a motor in order to produce vibration.

During the development of the Virtual Musical Instruments, the notion of interactivity has been questioned more deeply than before. As a matter of fact, much further research is required to find out where the answer lies. Perhaps Neural Networks or Fuzzy logic [8] may help in the realization of the Mapping Interface, so that it can learn and be flexible in analysing the input data. We could apply a Genetic Algorithm to create interesting sound textures that evolve over the horizontal time progression of a composition. However, we have to admit that these capabilities of artificial intelligence are far from meeting our demands for musical expression and our keen sensibilities. Although these theories try to solve our problems through the study of living creatures (organic nature), our notion of interactivity obliges us to say that what we perceive is a one-way process of interaction. In musical performance, images and sounds that may contain complexities like those of Artificial Life can easily come to appear as primitive textures, since a performance involves much more complex time processing and dynamic space.


5. Interactive Video

The software "Image/ine" [9] was programmed by Tom Demeyer at the STEIM Foundation in Amsterdam, Holland.

Image/ine is a real-time image processing tool. It allows the parameters of all the functions of the program to be controlled by MIDI. Its effect functions include keying (chroma and luminance), displacement mapping, and image sampling and playback.
Image/ine accepts source material from live video taken by a video camera, QuickTime movies, image files, and scanned images.

The Virtual Musical Instruments can also control the parameters of images in real time. For instance, it can superimpose live or sampled images on top of each other, add effects, such as delay, and speed up, reverse or repeat these images. It can also mix several images in different proportions and modify their color, brightness and distortion, while the sampled images can be started and stopped at any point.
There is a small contact video camera at the top of the neck of the "SuperPolm". It can capture finger movements from a close distance. If I turn the camera so that it faces away from the fingerboard, the images move widely according to the angle of the "SuperPolm". In the same manner, those images are altered by the parameters of the "SuperPolm" in real time.

VI. Performance Issues

The Gestural Interface can be regarded as a medium for communication between human and computer. The Virtual Musical Instrument, on the other hand, contains the algorithm and the production of sound and image. The notion of interactivity is certainly called into question here, since a performer or an audience may be preoccupied with whether the gesture correctly corresponds to the sound in real time, as well as with whether the sound and images are rich enough in a compositional context.

The background to the development of the Virtual Musical Instruments is discussed here, together with the issues of interactivity, their technological aspects, and human perception.


1. Issues of Human Perception and the Limits of Computers

There is an underlying problem with multimedia pieces: although the way each medium is used may be interesting in itself, they don't always work together. Theater-based performances get around this problem by stressing the narrative element and adapting the background music and scenery accordingly. But this is not the direction I have chosen. Drama and narrative are of no interest to me. My focus is on perceptual experience. I consider sound, image and gesture as neutral elements that exist in parallel and interact with one another. This brings new perceptual possibilities to the audience and eventually creates multiple perceptions. The meaning is not given by the work; rather, the audience creates the meaning according to their own internal perceptual experience.

Our perceptual abilities are extremely complex in comparison to those of a computer. The question as to whether subtle artistic nuances may be conveyed by a computer is an issue that calls for considerable discussion. Indeed, it is one of the challenges that confronts artists working with new technologies. But although the computer offers a limited number of possibilities, it can still inspire an artist. He might wish to exploit its mechanical aspects, such as the repetitive figures and sounds that may be obtained from automated algorithms. Alternatively, he might try to develop his own individual approach to technology, by creating a new type of art work, or he might regard the computer as nothing more than a tool, which is an attitude frequently held by older generations. Admittedly, a computer cannot express subtle variations in pitch and time. While a human player expresses highly complex sound with vibrato, a mechanical vibrato produced by a computer does not bring subtle variation. In an ascending scale, each pitch is usually raised slightly according to the context of the phrase; a computer, however, does not adequately distinguish between varied contexts unless it is fully programmed beforehand. 3D images may hold an audience's attention for a while, but even the most entertaining images produced by a Head Mounted Display and DataGlove will pall after a time.

This is a conflict between the artist's aims and the limits of the computer's capabilities, especially in an interactive environment.
Claude Cadoz describes the problem in another way, concerning human perceptual parameters and the monitor parameters in analysis by computer [10]:

The most immediate representation of gesture information can consist in the visualization of the signals in amplitude/time coordinates. By graphic edition these signals can be manipulated and transformed. However, this is much too simplistic, and more so since such a representation gives no information concerning the perceptual parameters in question, it is not adapted to displaying the pertinent forms of the gesture. We can also notice that between raw gesture signals and the parameters of gesture behavior, there exists the same anisomorphy as between the perceptual parameters and the monitor parameters of the synthesis algorithms. The representation of the gesture is in this case the dual problem of psycho acoustic analysis. We would point out that this opens a research domain that cannot be approached with the idea of finding immediate and definitive solutions.

Confronted with such experiences, artists react in different ways. In some cases, they might give up on computer-based art, whereas others may feel inclined to take up the challenge and pursue their research. Yet others apply traditional aesthetic concepts and materials to computer art, but they are merely avoiding a fundamental problem that will recur time and again, as computers continue to play an increasingly important part in our lives.

Yet even artists who approach computer art in non-traditional ways come up against problems. Many interactive artists experience problems with the notion of interactivity. The feedback triggered by a sensor is merely created by a pre-prepared computer program. A computer's improvisational capabilities are extremely limited. The prepared materials are merely reproduced with triggers from a player. They are not usually able to react in a flexible manner to their surroundings, nor to specific events or accidents. So is interactivity really possible? We program our computers to react to signals from a sensor or some other such device. The computer merely obeys an order from a signal. In fact, what we call "interactive" is only a one-way process.


VII. Conclusion

Virtual Musical Instruments are much better adapted to the multifarious musical styles and developments in modern-day music than traditional ones. With repeated practice, the performer becomes increasingly adept at controlling the musical output of Virtual Musical Instruments. At this level, the controller functions not only as a Gestural Interface but also as a musically expressive tool. The connection between different types of finger and arm movement and the musical quality of the performance will be clear to most observers. However, the player is not obliged to perform as if he were playing an acoustic violin. The angle at which the "SuperPolm" is held can be modified by grasping it with both hands. By associating the position and pressure sensors on the fingerboard with the pressure sensor in the rubber block, it is possible to play the "SuperPolm" as if it were a percussion instrument.

In the near future, it will be possible to connect this instrument with a real-time physical modeling application. This will make it possible to play a non-existent instrument with ordinary gestures. Physical modeling makes it possible to create a non-existent instrument in a computer, such as a 10 m long violin or a glass bow, and to relate this construction to gesture in real time.
Virtual Musical Instruments may also be applied to internet technology, as well as to the control of sound installations and robots.

But the technical possibilities opened up by these instruments are not the most important consideration. The area that most interests me is their capacity to modify the audience's perception in new ways, thereby transcending traditional aesthetic values. With the emergence of these new technologies, it has become necessary to rethink the criteria by which we judge art, and hopefully these instruments will help us to do that.




*1 Gestural Interfaces: An interface which translates body movement into analog signals. It consists of a controller built with sensors and a video scanning system. It is usually created by the artist himself or with a collaborator. This does not include commercially produced MIDI controllers.
*2 Virtual Musical Instruments: This refers to a whole system which contains Gesture, Gestural Interface, Mapping Interface, algorithm, Sound Synthesis, and Interactive Video. It may vary extensively according to the programming and the artistic concept.
*3 Sound Synthesis: This is a domain of programming for generating sound with a computer. In this article, this programming emphasizes the relationship between gesture and sound production.
*4 Interactive Video: A video image which is altered in real time. In Virtual Musical Instruments, the image is changed by gesture. This image is usually projected on a screen in a live performance.
*5 Gesture: Movement of the body. In this article, gesture is defined as an experience of perception, in which an audience observes the relationship between movement of the body and visual/aural aspects in real time.
*6 Mapping Interface: An interface which translates input signals into a well-organized state. This differs from a gestural interface, such as the Virtual Musical Instrument or an analog-to-MIDI interface. Since it is programmed, its function may vary widely according to the algorithm.



1. A. Mulder, "Virtual Musical Instruments: Accessing the Sound Synthesis Universe as a Performer." Available on the World Wide Web at http://www.cs.sfu.ca/~amulder/personal/vmi/BSCM1.rev.html
2. H. Katayose et al., "Virtual Performer," in Proceedings of the International Computer Music Conference (ICMC '93), pp. 138-145
3. C. E. Loeffler and T. Anderson, "The Virtual Reality Casebook" (New York, Van Nostrand Reinhold), pp. 185-190
4. Claude Cadoz, "Le geste, canal de communication homme/machine, la communication <<instrumente>>" (Technique et science informatiques, Volume 13, no. 1/1994)
5. D. Schnebel, "Anschläge - Ausschläge, Text zur Neuen Musik" (Carl Hanser Verlag München Wien, Edition Akzente Hanser), pp. 37-49
6. Tom Demeyer, "Manual of BigEye" (STEIM Foundation, Amsterdam, Holland)
7. Johannes Goebel, "Freedom and Precision of Control" (Computer Music Journal, Spring 1996), pp. 46-48
8. Masafumi Hagiwara, "Neuro-Fuzzy, Genetic Algorithm" (Tokyo, Sangyou Tosho)
9. Tom Demeyer, "Manual.pdf" of Image/ine (STEIM Foundation, Amsterdam, Holland)
10. Claude Cadoz, "Instrumental Gesture and Musical Composition" (ICMC Proceedings 1988), pp. 1-12

This article was first published in a slightly different version in Trends in Gestural Control of Music, a CDROM available at http://www.ircam.fr


Suguru Goto is a composer and multi-media artist, regarded as one of the new generation of Japanese composers. He has received numerous prizes and fellowships, such as the Boston Symphony Orchestra Fellowship, the Koussevitzky Prize from the Tanglewood Music Center, and the first prize at the Marzena International Composition Competition in Seattle, U.S.A. He was also awarded the "Berliner Kompositionaufträge 1993" by the senate administration for cultural affairs, and a prize by the IMC International Rostrum of Composers in UNESCO, Paris. His compositions have been performed at major festivals, such as Resonaces/IRCAM, Sonar, CICV-Les Nuits Savoueuses, ICC, Electrofolie, International Theater Festival Berezillia, Les Rencontres Internationales Paris Berlin, Haus der Kultures der Welt - Haimat Kunst, ISEA2002, NIME 2004, Olhares-Outono, Ressonancias and Audiovisionen. His recent work involves new technologies in experimental performing art.











- © 2005 all rights reserved -