While I eagerly await a built-in wake word function, I decided to see what other options are out there. I started looking into open source hotword detection projects and found Snowboy. Combining a Snowboy plugin with Tasker on an Android tablet connected all the dots: wake word detection runs locally and drops seamlessly into the Assist service.
The only downside is that recognition for custom phrases is fairly limited (custom wake words are based on only 3 audio samples from the intended speaker). It does seem possible to use them, maybe even to set up different access per speaker, but I haven’t gone down that path yet.
Thankfully there is a small selection of “universal” built-in wake words, including:
Hey/Ok Google
Alexa
Jarvis
Computer
Snowboy
Smart Mirror
and a few other oddball ones that can be found here: https://github.com/Kitt-AI/snowboy/tree/master/resources/models
In short: on any Android device (an old phone or tablet works), install “Tasker” and “HotwordPlugin”, which uses Snowboy for local wake word detection. Assuming you already have the Assist pipeline set up and the Home Assistant app installed, have Tasker call the HA app’s Assist service whenever HotwordPlugin detects the wake word and triggers the routine. Set it to launch a new Assist service with every trigger so it will take consecutive commands.
Depending on how wake word functionality is eventually implemented in HA, this may be just a temporary solution, but it’s working well enough for the moment.
Tasker: https://play.google.com/store/apps/details?id=net.dinglisch.android.taskerm (It’s a paid app, but well worth it considering what you get for $3.50)
HotwordPlugin: https://play.google.com/store/apps/details?id=nl.jolanrensen.hotwordPlugin (there is an ad-supported free version as well)
Interesting, can I ask a couple questions about that setup? How did you go about connecting all the pieces? Is it all handled locally? What hardware are you running it on? Do you have any good resources/tutorials you followed?
Sure, I ended up taking apart the Picovoice GitHub Python demo for Porcupine and integrating it into a ChatGPT-powered chatbot. It also uses Google Speech-to-Text and Text-to-Speech, so there is a fair bit of cloud stuff involved, but surprisingly the latency is pretty low. I’m running it on a small form factor Debian Linux box from MeLE. I tried to run it all on a Pi 4B, but it didn’t have the GPU I needed for the front end. The chatbot and Python side all ran OK there, albeit slower, but I abandoned the Pi. This is the example project I drew inspiration from: https://github.com/atxguitarist/BassGPT/blob/main/BassGPT.py
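For anyone wanting a starting point, here is a rough sketch of the kind of wake word loop described above. It is not the poster's actual code. It assumes the `pvporcupine` and `pvrecorder` packages are installed, that you have your own Picovoice access key, and that "jarvis" is available as a built-in keyword in your Porcupine version. The `run_wake_loop` and `microphone_frames` helpers are names made up for this example, and the `print` call is a placeholder for whatever hands off to speech-to-text and the chatbot.

```python
from typing import Callable, Iterable, Sequence


def run_wake_loop(
    frames: Iterable[Sequence[int]],
    detect: Callable[[Sequence[int]], int],
    on_wake: Callable[[int], None],
) -> int:
    """Feed audio frames to a detector and call on_wake for every hit.

    detect() follows Porcupine's convention: it returns the index of the
    detected keyword, or a negative number when nothing was heard.
    Returns the number of detections once the frame source is exhausted.
    """
    hits = 0
    for frame in frames:
        keyword_index = detect(frame)
        if keyword_index >= 0:
            hits += 1
            on_wake(keyword_index)
    return hits


def microphone_frames(recorder):
    """Yield audio frames from an already started recorder, forever."""
    while True:
        yield recorder.read()


def main(access_key: str) -> None:
    # Imported lazily so the loop above can be exercised without the SDK.
    import pvporcupine
    from pvrecorder import PvRecorder

    # Assumption: "jarvis" is one of the built-in keywords in your version.
    porcupine = pvporcupine.create(access_key=access_key, keywords=["jarvis"])
    recorder = PvRecorder(frame_length=porcupine.frame_length, device_index=-1)
    recorder.start()
    try:
        run_wake_loop(
            microphone_frames(recorder),
            porcupine.process,
            # Placeholder: this is where STT and the chatbot would take over.
            lambda index: print(f"wake word {index} detected"),
        )
    except KeyboardInterrupt:
        pass
    finally:
        # Release the microphone and the native Porcupine handle.
        recorder.stop()
        recorder.delete()
        porcupine.delete()
```

Keeping the detector and the callback as plain arguments means the loop can be tried with a fake detector first, before wiring in the microphone or any cloud services.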