HUMANS LIVE THEIR lives trapped in a glass cage of perception. You can only see a limited range of visible light, you can only taste a limited range of tastes, you can only hear a limited range of sounds. Them’s the evolutionary breaks.
But machines can kind of leapfrog over the limitations of natural selection. By creating advanced robots, humans have invented a new kind of being, one that can theoretically sense a far greater range of stimuli. Which is presenting roboticists with some fascinating challenges, not only in creating artificial senses of touch and taste, but in figuring out what robots should ignore in a human world.
Take sound. I’m not talking about speech—that’s easy enough for machines to recognize at this point—but the galaxy of other sounds a robot would encounter. This is the domain of a company called Audio Analytic, which has developed a system for devices like smart speakers to detect non-speech noises, like the crash of broken glass (could be good for security bots) or the whining of sirens (could be good for self-driving cars).
Identifying those sounds in the world is a tough problem, because it works fundamentally differently from speech recognition. “There’s no language model driving the patterns of sound you’re looking for,” says Audio Analytic CEO Chris Mitchell. “So 20 or 30 years of research that went into language modeling doesn’t apply to sounds.” Convenient markers like the natural order of words or patterns of spoken sounds don’t work here, so Audio Analytic had to develop a system that breaks sounds down into building blocks, which the company calls ideophones. This is essentially the quantification of onomatopoeia, like in the Adam West Batman series. You know, bang, kapow, etc.
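To get a feel for why this works without a language model, here is a minimal sketch in Python. It is purely illustrative: Audio Analytic's actual system is proprietary, and every function and threshold below is an assumption. The idea shown is template matching on coarse spectral features, where a sound is reduced to a vector of per-band energies (a crude stand-in for ideophone-style building blocks) and matched to the nearest known sound.

```python
# Hypothetical sketch of non-speech sound classification via spectral
# "building blocks." Not Audio Analytic's method -- an illustration only.
import numpy as np

SR = 8000  # sample rate in Hz (assumed)

def features(signal, n_bands=16):
    """Average magnitude in n_bands frequency bands, normalized.
    A crude stand-in for decomposing a sound into building blocks."""
    spectrum = np.abs(np.fft.rfft(signal))
    bands = np.array_split(spectrum, n_bands)
    energy = np.array([b.mean() for b in bands])
    return energy / (energy.sum() + 1e-12)

def make_siren(seconds=1.0):
    """Synthetic siren: a tone warbling between 600 and 1200 Hz."""
    t = np.linspace(0, seconds, int(SR * seconds), endpoint=False)
    freq = 900 + 300 * np.sin(2 * np.pi * 2 * t)  # 2 Hz warble
    return np.sin(2 * np.pi * np.cumsum(freq) / SR)

def make_glass(seconds=1.0, seed=0):
    """Synthetic glass crash: a decaying broadband noise burst."""
    rng = np.random.default_rng(seed)
    n = int(SR * seconds)
    return rng.standard_normal(n) * np.exp(-np.linspace(0, 8, n))

# One reference template per known sound class.
templates = {"siren": features(make_siren()), "glass": features(make_glass())}

def classify(signal):
    """Return the template whose feature vector is nearest the input's."""
    f = features(signal)
    return min(templates, key=lambda k: np.linalg.norm(templates[k] - f))

print(classify(make_siren(0.5)))
print(classify(make_glass(0.5, seed=1)))
```

The siren's energy clusters in a few mid-frequency bands while the glass crash spreads across all of them, so even this naive nearest-template match separates the two. Real systems face far messier conditions (overlapping sounds, room acoustics, unseen classes), which is what makes the problem hard.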