Hitachi
Research & Development

April 16, 2021

To contribute to comfortable listening environments, Hitachi has developed technology that uses a newly designed 27-channel microphone array based on a 3D sound propagation model to record acoustic environments that can be played back in a highly realistic manner through an HMD (head-mounted display) and a multi-channel loudspeaker system. In addition to recording and playback, this technology can analyze the direction and intensity of incoming sounds. By superimposing a visual representation of these sound sources on an omnidirectional image captured by a 360-degree camera, it is possible to create an audiovisual experience of a scene, including information about where sounds are coming from. In the future, we will combine this technology with acoustic simulation to produce acoustic environment design technology for urban and residential spaces. The aim is to contribute to comfortable, human-friendly living spaces where, for example, it is easy to hear and communicate with other people.

Video 1: Explanatory video on the developed technology (in Japanese)

A 360° video demonstration (in Japanese) is also available.

Fig. 1 Audiovisual experience of a surrounding acoustic environment

Background and issues addressed

  • Sound is one of the primary means of human communication, so it is important to provide urban and residential spaces with a quiet sound environment where it is easy to hear what people are saying.
  • Although sound pressure (loudness) provides a general evaluation index for creating an acoustic environment, it is also said that the comfort felt by people is affected by other factors including the sound arrival direction and reverberation from the environment*1.
  • Incorporating human subjectivity into the design of comfortable acoustic environments requires not only technology for recording and reproducing sounds from the surroundings with a high degree of realism, but also technology for visualizing information such as the direction and intensity of incoming sounds so that they can be experienced audiovisually.

Developed technologies

  • Technology for recording and reproducing an acoustic environment to facilitate a highly realistic acoustic experience
  • Sound analysis and visualization technology that enables a deeper physical understanding of acoustic environments

Verified effects

  • When the acoustic environment inside a railway carriage or on a station platform is recorded and played back via an HMD and multi-channel loudspeakers, listeners can experience these acoustic environments as if they were actually there.
  • By using information about the direction and intensity of incoming sounds to visualize the position of the locomotive, which is the main source of sound from a train, it is possible to visually understand the sound information that exists in the acoustic environment.

Published papers, conferences, events, etc.

This work was presented at the 2020 AES International Conference on Audio for Virtual and Augmented Reality, which was held online from August 17th to 19th, 2020 (title: "Tesseral Array for Group Based Spatial Audio Capture and Synthesis").

Acknowledgments

This result was achieved through a joint study with Professor Toru Kamekawa and Professor Atsushi Marui of Tokyo University of the Arts. In addition, the acoustic environments inside the train and on the station platform were recorded with the cooperation of Hitachinaka Seaside Railway Co., Ltd.

Details of developed technology

1. Technology for recording and reproducing acoustic environments to enable a highly realistic acoustic experience

Until now, one of the evaluation indexes used in the design of acoustic environments has been the sound level measured by an omnidirectional microphone (sound level meter), which has no directionality with regard to incoming sounds. However, because humans have two ears and hear sounds arriving from all directions, the sound pressure levels measured in this way differ from those a listener actually experiences, and so does the subjective impression of the acoustic environment.*1 In our technology, we capture both the sound pressure levels and the subjective impression of the sound environment by using a 27-channel omnidirectional microphone array with a tesseral system configuration*2 based on a three-dimensional sound propagation model*3 centered on the human listener. In this way, we have been able to record incoming sounds from all directions for the first time. By playing this audio back through a 26.1-channel loudspeaker system*4 with a similar configuration to the 27 microphones used for recording, or by converting it to stereo sound and playing it back through headphones, the listener can experience the recorded acoustic environment with a high degree of realism. Furthermore, by also recording video with a 360-degree camera, we can add visual information and perform subjective evaluations of acoustic environments at remote locations.
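The layer counts given in footnote *4 (one overhead, eight in each of three layers, one below) can be reproduced with a simple cubic construction. The following is a minimal sketch under an assumption that is not confirmed by the article: the 26 directions are taken to point from the listener toward the 6 face centres, 12 edge midpoints, and 8 vertices of a cube. The function name `cubic_directions` is hypothetical, and the position of the 27th microphone is not stated in the article, so it is left out.

```python
import itertools
import numpy as np

def cubic_directions():
    """26 unit vectors toward the face centres, edge midpoints and vertices of a cube.

    Assumed layout for illustration only; the article does not give exact angles.
    """
    # Every (x, y, z) with entries in {-1, 0, 1} except the origin: 3**3 - 1 = 26 points.
    points = [p for p in itertools.product((-1, 0, 1), repeat=3) if any(p)]
    vectors = np.array(points, dtype=float)
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

directions = cubic_directions()
z = directions[:, 2]  # height component of each unit vector

# Group by height to compare with the layer counts in footnote *4.
layer_counts = {
    "overhead": int(np.sum(z > 0.99)),
    "top": int(np.sum((z > 0.01) & (z <= 0.99))),
    "middle": int(np.sum(np.abs(z) <= 0.01)),
    "bottom": int(np.sum((z < -0.01) & (z >= -0.99))),
    "below": int(np.sum(z < -0.99)),
}
print(layer_counts)
```

Under this assumption the top and bottom layers mix two elevations (edge midpoints and vertices), so the real installation may well use different angles while keeping the same counts.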

Fig. 2 Microphone and loudspeaker placement based on 3D sound propagation model


Fig. 3 Installed microphone array and loudspeaker array


2. Sound analysis and visualization techniques that provide a deeper physical understanding of the acoustic environment

In working toward a comfortable acoustic environment, it is essential to understand how humans interpret sound information such as the direction and intensity of incoming sounds. However, since sounds are not visible, this information has had to be analyzed by a highly skilled sound engineer. To make the physical information conveyed by sounds easier to understand, we have developed technology that determines and visualizes the direction and intensity of incoming sounds from the 27 channels of recorded sound. This technology first reconstructs the recorded sound as a combination of multiple spherical harmonics,*5 taking advantage of the spherical shape of the omnidirectional microphone array, which has an equiaxial geometrical configuration. The spherical harmonics are then weighted to form a sharp directional peak, which measures the intensity of sound arriving from one particular direction. By steering this directional peak through all directions, the intensity of the incoming sounds can be broken down by direction. The sound intensities obtained in this way can be visualized as a color map and superimposed on the recorded 360-degree image to provide a visual representation of the sounds coming from all directions. Augmenting the highly realistic acoustic reproduction with this visual information makes the physical properties of a sound environment, such as the direction and intensity of incoming sounds, easy to understand.
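The analysis steps described above (spherical harmonic decomposition, weighting into a directional peak, and scanning that peak over all directions) can be sketched as follows. This is an illustrative plane-wave-decomposition beamformer, not Hitachi's implementation. The assumptions are:

- 26 omnidirectional capsules on an open sphere, pointing toward the face centres, edge midpoints, and vertices of a cube.
- The standard 26-point Lebedev quadrature weights for that layout.
- Harmonics truncated at order 2.
- A single-frequency plane wave as simulated input, with an arbitrary value of `kr` (wavenumber times array radius).

All function and variable names are hypothetical.

```python
import itertools
import numpy as np

def real_sh_order2(u):
    """Real spherical harmonics up to order 2 (N3D normalisation) for unit vectors u, shape (M, 3)."""
    x, y, z = u[:, 0], u[:, 1], u[:, 2]
    return np.stack([
        np.ones_like(x),                                  # order 0
        np.sqrt(3) * y, np.sqrt(3) * z, np.sqrt(3) * x,   # order 1
        np.sqrt(15) * x * y, np.sqrt(15) * y * z,         # order 2
        np.sqrt(5) / 2 * (3 * z**2 - 1),
        np.sqrt(15) * x * z, np.sqrt(15) / 2 * (x**2 - y**2),
    ], axis=1)

def unit_vectors(az_deg, el_deg):
    """Convert azimuth/elevation in degrees to Cartesian unit vectors (last axis = xyz)."""
    az, el = np.radians(az_deg), np.radians(el_deg)
    return np.stack([np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)], axis=-1)

# Assumed capsule directions: face centres, edge midpoints and vertices of a cube.
pts = np.array([p for p in itertools.product((-1, 0, 1), repeat=3) if any(p)], dtype=float)
mics = pts / np.linalg.norm(pts, axis=1, keepdims=True)
# 26-point Lebedev weights, indexed by the number of non-zero coordinates (1, 2 or 3).
quad_w = np.array([0.0, 1 / 21, 4 / 105, 9 / 280])[np.count_nonzero(pts, axis=1)]

# Simulated input: one plane wave arriving from a known direction (assumed test signal).
kr = 0.5
true_az_deg, true_el_deg = 40.0, 20.0
source = unit_vectors(true_az_deg, true_el_deg)
pressure = np.exp(1j * kr * (mics @ source))

# Step 1: spherical harmonic coefficients of the sampled pressure (weighted sum over capsules).
coeffs = real_sh_order2(mics).T @ (quad_w * pressure)

# Step 2: remove the open-sphere radial term i^n * j_n(kr) so every order has equal weight,
# which gives the sharpest peak available at this order.
order = np.array([0, 1, 1, 1, 2, 2, 2, 2, 2])
s, c = np.sin(kr), np.cos(kr)
jn = np.array([s / kr, s / kr**2 - c / kr, (3 / kr**2 - 1) * s / kr - 3 * c / kr**2])
equalised = coeffs / ((1j ** order) * jn[order])

# Step 3: steer the peak over an azimuth/elevation grid and record the power per direction.
az_deg = np.arange(-180, 180, 5)
el_deg = np.arange(-90, 91, 5)
el_grid, az_grid = np.meshgrid(el_deg, az_deg, indexing="ij")
look = unit_vectors(az_grid.ravel(), el_grid.ravel())
power_map = (np.abs(real_sh_order2(look) @ equalised) ** 2).reshape(el_grid.shape)

i_el, i_az = np.unravel_index(np.argmax(power_map), power_map.shape)
est_az_deg, est_el_deg = int(az_deg[i_az]), int(el_deg[i_el])
print("estimated direction (az, el):", est_az_deg, est_el_deg)
```

Because `power_map` is indexed by elevation and azimuth, it already has the layout of an equirectangular image, so it could be colour-mapped and blended over a 360-degree video frame. A real recording would also need per-frequency processing and regularisation of the division in step 2, both of which are omitted here.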

Fig. 4


*1 Genta Yamauchi and Akira Omoto, “Basic study on application of loudness correction in consideration of arrival direction of sound,” Journal of the Acoustical Society of Japan, Vol. 75, No. 5 (2019), pp. 258–260.
*2 Tesseral system: A crystal system that is known to have the highest symmetry. Also called the isometric crystal system.
*3 Tanabe, Y., et al., “Tesseral Array for Group Based Spatial Audio Capture and Synthesis,” Audio Engineering Society Conference: 2020 AES International Conference on Audio for Virtual and Augmented Reality, Audio Engineering Society (2020), Paper 2–7.
*4 26.1 multi-channel loudspeaker: Hitachi’s proprietary multi-channel loudspeaker system comprising 26 separate loudspeakers arranged at the vertices of an equiaxial polyhedron (one overhead, eight in the top layer, eight in the middle layer, eight in the bottom layer, and one below the listener), plus one subwoofer for low frequencies.
*5 Spherical harmonic function: A function that can represent the spatial distribution of sound pressure on a sphere.

For more information, use the enquiry form below to contact the Research & Development Group, Hitachi, Ltd. Please make sure to include the title of the article.
