An Overview of Effective Voice UI Design

Speech recognition technology has been around since the early 1950s, although many consumers today are likely more familiar with Siri or Cortana as their first brush with a voice user interface. Digital assistants and smart speakers, such as the Amazon Echo and Google Nest, have made a huge impression on the consumer market: forecasts suggest there will be more smart speakers than smart homes by 2024. It may seem that the platforms of Amazon, Google and Apple have met all market needs for voice technology.

In truth, these advancements are just the beginning of the shift from smart homes to full voice control. As incredible as the capabilities of existing smart speakers are, they are only version 1.0 of the great voice control revolution – a revolution powered by thoughtful design.

Designing to go beyond the limits

Consumers are increasingly looking for offerings that let their homes meet overlapping needs more comfortably. The essential deliverables of a smart home include entertainment, security, comfort, energy management and senior safety. These deliverables are all made simpler and more accessible through voice command. Owners expect their smart home devices to perform these tasks consistently, without latency, security, or reliability problems.

As technology advances, engineers design new hardware and software to overcome the limitations of today’s smart speakers. If smart speakers are the gateway, the next upgrade should deliver much more natural user experiences.

New digital signal processors (DSPs), designed specifically for audio edge processing, can be discreetly integrated into products and respond immediately to user voice commands without the latency of the cloud. They also eliminate processing constraints that original equipment manufacturers (OEMs) frequently encounter and that can hinder usability, such as power consumption, memory limitations, and integration compatibility. These products bring new levels of personalization, and with machine learning, edge devices will become more intelligent and more useful every day.

Design for far-field audio capture

Future voice user interfaces (VUIs) will also have an “anywhere” feel, where users can talk without the need for a nearby smart speaker. Lights, thermostats and other devices can turn the whole house into a listening zone, waiting for the wake-up word or sound.

Far-field voice capture, the general term for situations in which the speaker is not physically close to the microphone, requires specialized hardware and software and, most importantly, dedicated design. Notably, this includes port orientation, microphone arrays, and beamforming.
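To see why distance matters so much, consider a rough sketch of how speech level falls off with distance. It assumes an idealized free field (no walls or reflections), where sound pressure level drops by about 6 dB for every doubling of distance. The 60 dB SPL reference level at 1 m and the function names are illustrative assumptions, not measurements from any product.

```python
import math

def level_drop_db(near_m: float, far_m: float) -> float:
    """Free-field drop in sound pressure level when moving from near_m to far_m."""
    return 20.0 * math.log10(far_m / near_m)

def level_at_distance(ref_db: float, ref_m: float, distance_m: float) -> float:
    """Level at distance_m, given a reference level ref_db measured at ref_m."""
    return ref_db - level_drop_db(ref_m, distance_m)

# Assumption for illustration: conversational speech of about 60 dB SPL at 1 m.
for distance in (0.25, 1.0, 2.0, 4.0, 8.0):
    level = level_at_distance(60.0, 1.0, distance)
    print(f"{distance:5.2f} m: {level:5.1f} dB SPL")
```

Real rooms add reverberation and background noise, so the speech-to-noise ratio at the microphone usually degrades faster than this simple model suggests.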

Port orientation, meaning the placement and direction of the physical opening through which audio signals enter without obstruction, is a major concern. The acoustic port should be far enough away from speakers and noise sources, such as motors, to reduce extraneous noise at the source as much as possible. Improper port placement can lead to costly changes to printed circuit boards or plastics later in the product design cycle.

Microphone arrays and beamforming work together to mimic the human auditory system's ability to localize sound. Multiple microphones, or an array, allow a device to hear sounds from all directions simultaneously. Using beamforming, the array can be programmed to selectively pick up or reject sounds by locating the source of each incoming sound from timing, frequency, and amplitude cues. The microphones work together to capture both near and far sounds, then send that information to a DSP that helps the system distinguish which audio is important (speech) and which is unwanted (noise).
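The timing cue can be illustrated with a minimal delay-and-sum sketch for a two-microphone array. This is one textbook beamforming approach, not necessarily what any particular DSP implements, and it uses timing only (frequency and amplitude cues are ignored). The signals are synthetic, and the 16 kHz sample rate, 10 cm microphone spacing, and all function names are assumptions chosen for illustration.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C

def best_lag(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.
    A positive lag means the sound reached `other` later than `ref`."""
    def corr(lag):
        total = 0.0
        for n in range(len(ref)):
            m = n + lag
            if 0 <= m < len(other):
                total += ref[n] * other[m]
        return total
    return max(range(-max_lag, max_lag + 1), key=corr)

def delay_and_sum(channels, lags):
    """Shift each channel by its lag so the target lines up, then average."""
    length = len(channels[0])
    out = []
    for n in range(length):
        acc = 0.0
        for channel, lag in zip(channels, lags):
            m = n + lag
            if 0 <= m < length:
                acc += channel[m]
        out.append(acc / len(channels))
    return out

def lag_to_angle_deg(lag, sample_rate_hz, spacing_m):
    """Convert a sample lag between two mics into an arrival angle from broadside."""
    ratio = SPEED_OF_SOUND * (lag / sample_rate_hz) / spacing_m
    return math.degrees(math.asin(max(-1.0, min(1.0, ratio))))

def noise_power(signal, clean, count):
    """Mean squared error against the clean source over the first `count` samples."""
    return sum((signal[n] - clean[n]) ** 2 for n in range(count)) / count

# Synthetic demo: a broadband source reaches mic1 three samples after mic0,
# and each microphone adds its own independent noise.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(400)]
true_delay = 3
mic0 = [s + rng.gauss(0.0, 0.5) for s in source]
mic1 = [(source[n - true_delay] if n >= true_delay else 0.0) + rng.gauss(0.0, 0.5)
        for n in range(len(source))]

lag = best_lag(mic0, mic1, max_lag=8)
steered = delay_and_sum([mic0, mic1], [0, lag])
valid = len(source) - 8  # skip the tail, where the shifted channel runs out of samples

print("estimated lag (samples):", lag)
print("arrival angle (degrees):", round(lag_to_angle_deg(lag, 16000, 0.1), 1))
print("noise power, single mic:", round(noise_power(mic0, source, valid), 3))
print("noise power, beamformed:", round(noise_power(steered, source, valid), 3))
```

Aligning the channels before averaging reinforces sound from the estimated direction, while uncorrelated noise partially cancels. With more microphones the same idea yields a narrower, more selective beam.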

Design for convenience and usability

Voice is the ultimate control for devices. Using a VUI should feel as natural as any human conversation while still being incredibly responsive. Product developers face new challenges in creating an always-on device: the expectation of immediate wake-up and always-attentive behavior requires a design with extremely low power consumption. Conventional VUIs with a smart speaker interface send commands through multiple steps and depend on server speed and the home Wi-Fi connection.

For some devices, if latency exceeds a few seconds, the user’s command is canceled (much to the frustration of many consumers). Voice systems designed for the edge greatly alleviate these problems, increasing both convenience and usability.

Battery-powered devices, such as smart door locks, must conserve enough energy to be convenient for the end consumer, without frequent battery changes, while remaining active 24 hours a day, 7 days a week, listening for a wake-up word. Including a feature such as voice activity detection lets a system stay in a low-power state until speech is present, and pairing it with speaker recognition can restrict wake-up commands to specific individuals.
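As a rough illustration of how voice activity detection can stretch battery life, the sketch below flags audio frames whose energy exceeds a threshold and then estimates battery life from the fraction of time the device would need to be fully awake. Production detectors are considerably more sophisticated. The threshold, frame length, battery capacity, and current figures here are hypothetical values chosen for illustration, and the battery model ignores self-discharge and conversion losses.

```python
import math

def frame_energy_db(frame):
    """Mean-square energy of one frame in dB relative to full scale (1.0)."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)

def vad_flags(samples, frame_len=160, threshold_db=-40.0):
    """Flag each full frame True when its energy exceeds the threshold."""
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        flags.append(frame_energy_db(frame) > threshold_db)
    return flags

def battery_life_days(capacity_mah, sleep_ma, active_ma, active_fraction):
    """Idealized battery life given the fraction of time spent fully awake."""
    average_ma = sleep_ma * (1.0 - active_fraction) + active_ma * active_fraction
    return capacity_mah / average_ma / 24.0

# Synthetic one-second clip at 16 kHz: a quiet 200 Hz hum with a louder
# burst standing in for speech between 0.4 s and 0.6 s.
SAMPLE_RATE = 16000
samples = []
for n in range(SAMPLE_RATE):
    amplitude = 0.3 if 6400 <= n < 9600 else 0.001
    samples.append(amplitude * math.sin(2.0 * math.pi * 200.0 * n / SAMPLE_RATE))

flags = vad_flags(samples)
active_fraction = sum(flags) / len(flags)
print("active frames:", sum(flags), "of", len(flags))

# Hypothetical figures: 2000 mAh battery, 0.05 mA asleep, 5 mA fully awake.
print("always awake (days):", round(battery_life_days(2000, 0.05, 5.0, 1.0), 1))
print("VAD-gated (days):", round(battery_life_days(2000, 0.05, 5.0, active_fraction), 1))
```

The design point is that the always-listening stage should cost far less than the full recognition stage, so the expensive path runs only when there is likely to be speech.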

Thoughtful design enables VUIs and the technology they support to meet ever-increasing customer needs. The explosive growth of voice control, even in its infancy, is proof enough of the market opportunities for voice control in the smart home. Voice control as an integrated feature in the smart home is no longer a surprise – it has become table stakes at the top of the market. As the technology proliferates and improves, it will become the preferred method of interaction between consumers and their devices, which means electronics manufacturers should consider voice control a standard feature of any smart device.

About the Author

Mehul Kochar is the Senior Director of Business Development, Audio Solutions at Knowles, a market leader and global supplier of advanced micro-acoustic microphones and loudspeakers, audio solutions, high-performance capacitors and RF products. Mehul has nearly two decades of experience in setting and leading client strategy and execution. He excels at introducing new technologies to various user bases.