Voice Assistant QOL updates and troubleshooting
This summer I am making a personal voice assistant, because I like the idea of Alexa but I don't like volunteering to be spied on. Read the setup post here.
This week I made some quality-of-life updates and did some troubleshooting. Overall the system is working pretty well, but there were some small bugs I wanted to fix and some things I wanted to make more user-friendly.
The main bug was that my speaker wasn't loading when the satellite Pi rebooted. Here's where I got a little sloppy at first: I just asked my LLM to fix it a couple of times, but on the third try it started suggesting some very strange workarounds that involved giving a ton of permissions to a script that would run at random times. As soon as I pushed back, the model surfaced the fact that Linux automatically suspends audio devices after a while, which is known to cause buggy behavior, and that you can disable that setting.
So I just did that. Sometimes the speaker still isn't found on boot, though, and I have to unplug it and plug it back in. Better hardware or a 3.5mm jack would likely solve this, but I hate making unnecessary e-waste and am not going to buy a new mic just for that. So I added a small chime on startup. If I don't hear the chime, I know to unplug the speaker and plug it back in, which is a pretty easy workaround for now.
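If you want to try the same trick, here is a minimal sketch of a startup chime. It is not the script from this project. The function names, the tone settings, and the use of ALSA's `aplay` for playback are all assumptions made for illustration. It synthesizes a short tone with the standard library and reports whether playback worked, so a silent boot can also show up in the logs.

```python
import math
import struct
import subprocess
import wave


def write_chime(path, freq_hz=880.0, duration_s=0.4, rate=16000):
    """Write a short, fading sine tone to `path` as 16-bit mono WAV."""
    n_frames = int(duration_s * rate)
    frames = bytearray()
    for i in range(n_frames):
        fade = 1.0 - i / n_frames  # linear fade-out avoids a click at the end
        sample = 0.5 * fade * math.sin(2 * math.pi * freq_hz * i / rate)
        frames += struct.pack("<h", int(sample * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))
    return n_frames


def play_chime(path):
    """Play the WAV with aplay (assumed installed); True if playback succeeded."""
    try:
        result = subprocess.run(
            ["aplay", "-q", path], capture_output=True, timeout=10
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


# Hypothetical usage at the top of the satellite's startup code:
#   write_chime("/tmp/startup_chime.wav")
#   if not play_chime("/tmp/startup_chime.wav"):
#       print("Chime failed: the speaker may need to be replugged")
```

A successful `aplay` exit only means the audio system accepted the sound, so listening for the chime is still the real test.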
Enhancements:
- Added a chime that plays once the satellite has captured my question, since there is a delay between the question going to the server and the response coming back. That way I know the model heard me without having to look at the logs.
- Wrote a bash script to sync code so that I don't have to keep looking up the rsync command.
- Upgraded the end-of-speech detection. For testing I was just using an amplitude-based measurement, but it wasn't precise enough. There's a lot of ambient noise where I live, and often the Pi would just listen for the maximum time instead of detecting the end of my speech. Turns out the model can help with that, too!
  First I had my LLM write me a script to measure the ambient noise in the room, just to get a baseline of what I was working with. It found that the microphone levels were VERY high, which didn't help. Turning the gain down improved things significantly, but I still decided to go with the model-based detection, testing it against the recording of the ambient noise. This should help me tune future satellites to the room they're in.
- Added a few more scripts and some documentation to make setting up new satellites as frictionless as possible for future me.
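To make the end-of-speech item above more concrete, here is a rough sketch of an ambient-noise baseline feeding an amplitude-based detector. It shows the simpler approach that was replaced, not the model-based detection, and it is not the script from this project. The function names, the 16-bit mono PCM frame format, the 95th-percentile rule, and the default margins are all assumptions.

```python
import math
import struct


def frame_rms(frame: bytes) -> float:
    """RMS amplitude of one frame of 16-bit little-endian mono PCM."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def noise_threshold(ambient_frames, margin=2.0):
    """Derive a silence threshold from a recording of the room with no speech."""
    levels = sorted(frame_rms(f) for f in ambient_frames)
    if not levels:
        raise ValueError("need at least one ambient frame")
    # Use the 95th percentile so a single loud spike does not set the floor.
    p95 = levels[min(len(levels) - 1, int(0.95 * len(levels)))]
    return p95 * margin


def end_of_speech_index(frames, threshold, quiet_frames_needed=25):
    """Return the index of the frame where speech is judged to have ended,
    or None if the audio never stayed quiet for long enough."""
    heard_speech = False
    quiet_run = 0
    for i, frame in enumerate(frames):
        if frame_rms(frame) >= threshold:
            heard_speech = True
            quiet_run = 0  # any loud frame restarts the silence counter
        elif heard_speech:
            quiet_run += 1
            if quiet_run >= quiet_frames_needed:
                return i
    return None
```

If the room noise regularly rises above the threshold, for example because the threshold was set too low or the gain is too high, the silence counter keeps resetting and the function returns `None`. That is the "listen for the max time" behavior described above, and it is why measuring the room first is worthwhile.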