While audiophile entertainment at home is usually limited to stereophony, we expect quite a bit more from movie houses, theaters, and multimedia installations: an exciting, three-dimensional, immersive sound experience which envelops the listener. However, conventional 3D audio playback systems typically have a relatively narrow sweet spot. Outside the optimal listener position the sound image lacks definition, and localization suffers.

SpatialSound Wave technology, developed by the Fraunhofer Institute for Digital Media Technology IDMT, promises to solve this problem by creating a 3D sound experience within the entire room. In Ilmenau, in the German state of Thuringia, a new reference listening room has been created – equipped with Neumann loudspeakers. A welcome occasion for a visit!

3D Audio with Razor Sharp Localization

The centerpiece of the SpatialSound Wave system is a Windows computer with a professional audio interface (MADI or Dante) and, of course, the software developed here, which allows audio objects to be positioned as virtual sound sources anywhere in the room. The playback system of the new reference room consists of 30 Neumann KH 310 A powered monitors. Roughly half of the speakers are at ear level surrounding the workstation, while the rest are suspended from the ceiling. Four KH 805 subwoofers in the corners of the room enhance the low frequency reproduction.

The sound reproduction of this system is breathtaking: The localization of virtual sound sources is razor sharp, even far outside the central listening position. The sound image remains stable, and the overall impression does not suffer. The listener can move through the room almost as in an actual acoustic environment. As you move toward a sound source, it appears louder; the levels change in a natural way. Until now, sound reproduction of this quality was only possible using wave field synthesis, an extremely complex process which, due to its immense cost, has found little use outside of research projects.
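To put a number on "levels change in a natural way": for an idealized point source in a free field, the level rises by about 6 dB each time the distance to the source is halved. The short Python sketch below computes that change. It is only a textbook illustration under this free-field assumption, not a description of how SpatialSound Wave itself calculates levels, and the function name is made up for this example.

```python
import math


def level_change_db(old_distance_m: float, new_distance_m: float) -> float:
    """Level change in dB when a listener moves from old_distance_m to
    new_distance_m from an idealized point source (free field, 1/r law).

    Positive result: the source appears louder. Negative: quieter.
    """
    if old_distance_m <= 0 or new_distance_m <= 0:
        raise ValueError("distances must be positive")
    # Sound pressure falls off with 1/r, so the level difference
    # is 20 * log10 of the distance ratio.
    return 20.0 * math.log10(old_distance_m / new_distance_m)


# Walking from 4 m to 2 m away from the virtual source:
print(round(level_change_db(4.0, 2.0), 2))  # 6.02
```

In a real room, reflections mean the change is usually smaller than this idealized figure.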

Christoph Sladeczek, Head of Virtual Acoustics, explains: “Wave field synthesis attempts to reproduce the physical wave front correctly by using a great number of loudspeakers. In this room, this would require about 90 loudspeakers. However, there are de facto no applications that would justify such an investment. Therefore, we advanced this process to require fewer loudspeakers. We could actually work with fewer loudspeakers than are installed here.”

Dr.-Ing. Daniel Beer, Head of Electroacoustics, adds: “The intelligence is in the algorithm that controls the loudspeakers. Originally, SpatialSound Wave was based on wave field synthesis, which controls the loudspeakers so that the individual signals are superimposed in such a way that the sound field is reproduced. In recent years, however, we have included a lot of psychoacoustics. The algorithm was altered in such a way that reproduction of the sound field is not physically correct anymore, but the loudspeakers may now be spaced further apart.”

SpatialSound Wave is object-based and is ideally suited to working with discrete audio tracks. Stereo and surround formats may be used by defining virtual loudspeakers within the software. The system is surprisingly easy to handle: The graphical 3D authoring tool allows sound sources to be positioned or even “flown” through the room using a mouse or tablet. Placement does not alter the sound character of the source so the audio engineer is free to focus on the creative aspects.

“In our experience, it takes about fifteen minutes for people to learn how to use the system,” Christoph Sladeczek confirms. “In the old days, when you created an installation, you had to pan 20 loudspeakers by hand. And if the installation was moved to another building or went on tour, you had to start from scratch again – one reason being that you might not have the same number of loudspeakers available. With this system, that’s not necessary anymore. You do one mix, and it will always be reproduced perfectly.”

Unlike conventional surround formats, such as 5.1 and 9.1, the output format is not channel-based but independent of the loudspeaker installation. The SpatialSound Wave data format bundles object data and metadata, i.e. the sounds to be processed and their positioning. Loudspeaker signals are then generated by the 3D rendering engine, which automatically ensures optimal playback on each system. A mix can thus be created on a smaller system and then played back on a larger loudspeaker installation; the size of the scene can be readjusted using the metadata scaling function. As the SpatialSound Wave system operates in real time, it is able to process live input, too.
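The idea of an object-based scene that can be rescaled for a different room can be sketched in a few lines of Python. This is purely illustrative: the class, its fields, and the uniform scaling function are assumptions made for this example and do not reflect the actual SpatialSound Wave data format or its metadata scaling function.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AudioObject:
    """One sound source plus its position metadata (hypothetical layout)."""
    name: str  # label of the audio track
    x: float   # position in metres, relative to the room centre
    y: float
    z: float


def scale_scene(objects, factor):
    """Return a new scene with every object position scaled by factor.

    The audio itself is untouched; only the position metadata changes,
    which is what makes the mix independent of the loudspeaker layout.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [
        replace(obj, x=obj.x * factor, y=obj.y * factor, z=obj.z * factor)
        for obj in objects
    ]


# A mix authored in a small studio ...
scene = [
    AudioObject("violin", 1.0, 2.0, 0.5),
    AudioObject("choir", -2.0, 0.0, 1.5),
]
# ... stretched to a venue twice as large in every dimension.
large_scene = scale_scene(scene, 2.0)
```

A renderer would then derive the loudspeaker signals from these positions for whatever installation is present; that step is deliberately left out here.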


The main fields of application for SpatialSound Wave are theaters, installations, and planetariums. Its list of users includes the Opernhaus Zürich, the Staatsoper Berlin, and the Zeiss Planetarium in Jena.

“A system like that makes it possible to convey entirely new experiences, also in terms of content,” Christoph Sladeczek explains. “In 2016 the Zürich Opera had an interesting production of Verdi’s Messa da Requiem. The curtain lifts, and you see someone lying in bed, listening to voices. Beforehand, the sound engineer had recorded orchestra instruments playing wrong notes, and for the show, these were positioned three-dimensionally within the opera house. Suddenly theater is not just what happens on the stage in front of me; I become the one tossing and turning in bed, because I can’t get rid of the voices. That really gave me goose bumps!”

Moreover, the system has an additional use, as Daniel Beer explains: “For theaters in particular it is very interesting that the same computer that allows positioning of audio objects in a 3D space can also be used for room simulation. The reverberation time of the room can be increased. It is a regenerative room simulation system, so there are microphones above, which record everything that happens in the room. Their signals are then used for artificial reverb so the room acoustics can be modified according to the music genre. Straight theater, for instance, requires a different reverberation time than a symphony concert. The big advantage is that the actual acoustics are picked up and modified, so it is not an alien room that is grafted upon the existing one. You can move through the room without the sound appearing artificial. There are solutions for such applications by competitors, but our system has the advantage of being able to do both, room simulation and 3D audio playback.”

The Optimal Loudspeakers

Finding the optimal loudspeakers for the new reference listening room was not an easy task. Although an O 500 system has been in use for some time in another reference room, Neumann loudspeakers were not chosen “by default”. Monitor speakers from a large number of manufacturers were compared in an elaborate test setup.

Christoph Sladeczek: “We used classic listening test methods. Fifteen stereo pairs were compared to our reference system [Klein & Hummel O 500]. All systems were room calibrated and the levels were matched precisely. Next we prepared a test procedure in which fifteen participants reviewed various parameters which we later evaluated statistically.”

“An important criterion for us was a neutral reproduction which remains free from fatigue, even at high levels,” adds Daniel Beer. “We compared many loudspeakers from various manufacturers. In the first run we compared smaller models, including the Neumann KH 120. But at the levels that are sometimes required when you place a source on one speaker only, they reached their limits in terms of low frequency reproduction. Other loudspeakers of that size behaved similarly. We did like the sound and the price of the KH 120, but it did not give us enough level for our applications.” This necessitated a second test run with larger loudspeakers. Yet again, Neumann won: “In the end there were two loudspeaker models left, the Neumann KH 310 and one by a well-known competitor. The final argument for the Neumann speaker was that we preferred its high frequency reproduction.”

An extremely important criterion for multichannel systems is the localization of phantom sources. “Apart from sound and level handling, the KH 310 also impressed with its sharp localization,” Daniel Beer explains. Product consistency plays a major part in this: “Later, before installation, we measured the individual loudspeakers and could confirm that the differences in frequency response were below 1 dB. That’s what you expect, of course, because that’s what our underlying theory of multichannel reproduction is based on: All channels are equally linearized in themselves so I can use processing to do what I really want to do in order to present the source as desired. What I don’t expect is that the individual loudspeakers must be matched before I even start. Unfortunately, that happens sometimes. But we expect to be able to focus on what we really want to do, free from such worries.”
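A consistency check of this kind can be pictured as follows. The Python sketch takes the measured magnitude responses of several loudspeakers (in dB, sampled at the same frequencies) and reports the largest difference between any two units at any frequency. The function name and the measurement values are invented for illustration; they are not Fraunhofer IDMT's measurement data or procedure.

```python
def max_spread_db(responses):
    """Largest unit-to-unit difference in dB across all frequency points.

    responses maps a unit label to a list of magnitudes in dB, all
    measured at the same frequency points and in the same order.
    """
    curves = list(responses.values())
    if not curves:
        raise ValueError("need at least one measured unit")
    n_points = len(curves[0])
    if any(len(curve) != n_points for curve in curves):
        raise ValueError("all units need the same number of points")
    # zip(*curves) groups the values of all units per frequency point.
    return max(max(point) - min(point) for point in zip(*curves))


# Made-up example values for three units at three frequency points.
measured = {
    "unit_a": [0.0, 0.2, -0.1],
    "unit_b": [0.3, -0.1, 0.1],
    "unit_c": [-0.2, 0.1, 0.0],
}
spread = max_spread_db(measured)
print("matched within 1 dB:", spread < 1.0)
```

If the spread stays below the chosen tolerance, no per-unit correction is needed before the actual spatial processing begins.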

Which is exactly the philosophy of Neumann’s studio monitors: highest precision and linearity for a reliable, uncolored listening experience. The result speaks for itself: SpatialSound Wave in the new reference listening room of Fraunhofer IDMT in Ilmenau sounds absolutely breathtaking!