Hard-of-hearing ‘Saojung’ AI speakers… solved with a single sensor

An artificial intelligence (AI) speaker developed by KT. This AI speaker, released last year, can control beds and curtains with voice commands. [Photo: KT]

A high-sensitivity voice recognition sensor that mimics the structure of the human ear has been developed. The technology could soon be commercialized in smartphones and artificial intelligence (AI) speakers.

KAIST research team develops AI voice recognition sensor
Improves voice recognition in smartphones and AI speakers
“Commercialization coming soon. Seeking partners”

The Korea Advanced Institute of Science and Technology (KAIST) announced on the 15th that it has developed the world’s first ‘resonant flexible piezoelectric voice sensor,’ which reduces voice recognition errors by up to 95%. The sensor combines two phenomena: resonance, in which the sensor vibrates with a large amplitude in a specific frequency range, and piezoelectricity, in which an electrical signal is generated spontaneously when pressure is applied.
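To illustrate the resonance principle only (this is a textbook driven-oscillator model with assumed damping values, not the team's actual device physics), the amplitude of a driven, damped oscillator peaks sharply when the driving frequency matches its natural frequency, which is how a resonant channel can selectively amplify one frequency band:

```python
import math

def response_amplitude(f_drive, f_natural, damping=0.05):
    """Relative amplitude of a driven, damped harmonic oscillator.

    Illustrative model only: the real sensor's mechanics are more complex.
    damping is a hypothetical dimensionless damping ratio.
    """
    r = f_drive / f_natural  # ratio of driving to natural frequency
    return 1.0 / math.sqrt((1 - r**2) ** 2 + (2 * damping * r) ** 2)

# A channel tuned to 1 kHz responds far more strongly at 1 kHz than at 300 Hz.
print(response_amplitude(1000, 1000))  # strong peak at resonance (10.0)
print(response_amplitude(300, 1000))   # weak off-resonance response (~1.1)
```

A bank of such channels, each tuned to a different band, gives frequency selectivity in hardware rather than in software.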

The KAIST researchers’ flexible piezoelectric voice sensor mounted in actual smartphones and artificial intelligence speakers. [Photo: KAIST]

Until now, AI speakers have tried to fix speech recognition errors mainly with software. In contrast, Lee Kun-jae and Wang Hee-seung, researchers in the Department of Materials Science and Engineering at KAIST, developed a sensor that recognizes speech physically. It is the same idea as a hard-of-hearing person hearing better with a hearing aid.

In the human ear, the trapezoidal basilar membrane in the cochlea amplifies sound by resonating across the audible frequency band. Although only about 30 mm long, the basilar membrane can induce resonance thanks to its micrometer (μm)-scale structure. The researchers built a very thin, flexible sensor modeled on this membrane. Professor Lee Kun-jae explained, “This is the first resonance-type voice sensor ever made by mimicking the structure of the human ear.”
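As background on how the cochlea maps frequency onto position along that roughly 30 mm membrane (general auditory-science context, not the paper's own model), the standard Greenwood place-frequency function with its textbook human-ear constants sketches the mapping:

```python
def greenwood_frequency(position_fraction):
    """Greenwood place-frequency map for the human cochlea.

    position_fraction: 0.0 at the apex (low frequencies) up to 1.0 at the
    base (high frequencies). Constants are the standard human-ear fit.
    """
    A, a, k = 165.4, 2.1, 0.88
    return A * (10 ** (a * position_fraction) - k)

print(round(greenwood_frequency(0.0)))  # ~20 Hz near the apex
print(round(greenwood_frequency(1.0)))  # ~20.7 kHz near the base
```

Each point along the membrane resonates at a different frequency, which is the behavior the flexible sensor imitates.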

The researchers report that the sensor identifies speech accurately: depending on conditions, the probability of misrecognizing a voice command fell by 60-95%. Professor Lee said, “When we tested each voice command 50 times, commercially available AI speakers and smartphones such as the iPhone and Galaxy misrecognized the command about 10 times on average, while products using our sensor misrecognized it only about 1 to 4 times.”
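The reported figures are internally consistent: going from roughly 10 misrecognitions per 50 trials down to 1-4 corresponds to a 60-90% reduction, within the 60-95% range above. A quick arithmetic check (illustrative only):

```python
def error_reduction(baseline_errors, sensor_errors):
    """Percent reduction in misrecognitions relative to the baseline."""
    return 100 * (baseline_errors - sensor_errors) / baseline_errors

# Reported figures: ~10 misses per 50 trials for commercial devices,
# versus 1-4 misses for the resonant-sensor prototype.
print(error_reduction(10, 4))  # 60.0
print(error_reduction(10, 1))  # 90.0
```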

The researchers also demonstrated the sensor by mounting it on smartphones and AI speakers. Pronix, a company founded by the researchers, is pursuing commercialization in collaboration with an information technology (IT) company in Silicon Valley, USA.

Professor Lee said, “This voice sensor is a key sensor that will drive AI technology, and products applying it will appear in everyday life before long. We are now looking for a leading foreign company as a partner.”

The research, supported by the National Research Foundation of Korea, was published on the 12th in the international journal Science Advances.

Reporter Moon Hee-cheol [email protected]

