Voice Assistant Skill: Timers and Reminders

This summer I am building a personal voice assistant, because I like the idea of Alexa but don't like volunteering to be spied on. Read the setup post here. The architecture has two parts: a server that runs a local model (usually one of the Llama models, but in theory it could be anything) and a satellite Raspberry Pi that handles wake word detection and query collection before handing the request to the server for processing. I have settled on a modular, skills-based architecture to keep things clean and extensible.

The first skill is timers. Timers are very easy to write in Python: I wrote a class that holds a deadline and a tick interval of half a second, plus a function that runs on every tick and checks whether the current count is past the deadline. The interesting part was the response, because the assistant has to speak up on its own instead of waiting for the wake word. For that, I set up a simple websocket server that pushes the timer-up message to the satellite to read aloud. That is exactly why I chose this skill first: the logic is really simple, but it lays the foundation for a wide variety of other skills later.
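Here is a minimal sketch of that kind of timer, not the actual class from this project. The names `Timer`, `on_done`, and `thread` are made up for illustration, it compares against a monotonic clock instead of keeping a count, and the callback stands in for whatever pushes the timer-up message over the websocket.

```python
import threading
import time


class Timer:
    """Checks a deadline on a fixed tick and fires a callback once it passes."""

    def __init__(self, seconds, on_done, tick=0.5):
        # A monotonic clock is not affected by system clock adjustments.
        self.deadline = time.monotonic() + seconds
        self.tick = tick          # half-second tick by default
        self.on_done = on_done    # hypothetical hook, e.g. a websocket send
        self.thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self.thread.start()
        return self

    def _run(self):
        # Sleep one tick at a time until the deadline has passed.
        while time.monotonic() < self.deadline:
            time.sleep(self.tick)
        self.on_done()
```

Calling `Timer(5, callback).start()` returns immediately and runs `callback` roughly five seconds later on a background thread, so the server stays free to handle other requests in the meantime.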

I also added a function that sends the time remaining on a timer to the satellite, which is not much use for short timers but is good for longer ones, as well as the ability to cancel a timer.
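For the remaining-time response, the satellite needs something it can read aloud, not a raw number of seconds. A rough sketch of that formatting step follows; `spoken_remaining` is a hypothetical helper name, and the exact wording is an assumption.

```python
def spoken_remaining(seconds):
    """Turn a number of seconds into a phrase suitable for text-to-speech."""
    total = max(0, int(round(seconds)))
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    parts = []
    for value, unit in ((hours, "hour"), (minutes, "minute"), (secs, "second")):
        if value:
            # Only mention non-zero units, and pluralize when needed.
            parts.append(f"{value} {unit}" + ("" if value == 1 else "s"))
    if not parts:
        return "no time remaining"
    return " ".join(parts) + " remaining"
```

For example, `spoken_remaining(3725)` gives "1 hour 2 minutes 5 seconds remaining".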

One interesting note is that at first the model insisted on setting the timer in minutes no matter what I said. I had to add explicit instructions to the voice parsing model telling it not to assume the user said minutes. It's a strange new paradigm, having to wrangle models.
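One way to back up that kind of instruction is to have the code refuse a missing unit, so it never quietly falls back to minutes. This is only a sketch under assumptions: the prompt text is invented for illustration, and it assumes the model is asked to reply with a small JSON object, which may not match how this assistant actually passes durations around.

```python
import json

# Hypothetical instruction text, not the actual prompt used here.
TIMER_INSTRUCTIONS = (
    "Reply with JSON like {\"amount\": 90, \"unit\": \"seconds\"}. "
    "Use the unit the user actually said. Do not assume minutes."
)

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def duration_seconds(model_output):
    """Convert the model's JSON reply to seconds, rejecting unclear units."""
    data = json.loads(model_output)
    # Normalize "Hours" / "hours" / "hour" to the same key.
    unit = str(data.get("unit", "")).lower().rstrip("s")
    if unit not in UNIT_SECONDS:
        raise ValueError(f"missing or unknown unit: {data.get('unit')!r}")
    return float(data["amount"]) * UNIT_SECONDS[unit]
```

Raising an error here means a bad parse turns into a "sorry, how long?" response instead of a timer that is silently sixty times too long.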

Expanding the timers skill into a reminders skill was the natural next step. Timers and reminders share a lot of code, but reminders add a date parsing function that can handle more complex timeframes, as well as a note that gets read out when the reminder goes off. I also added a function that lists all active timers and reminders, which is really helpful for testing and will be helpful in general once I have more skills.
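To give a feel for the date parsing and listing pieces, here is a small sketch. It only handles two phrase shapes ("in 2 hours" and "tomorrow at 9:30"), which is far less than a real parser would need, and the names `Reminder`, `parse_when`, and `list_active` are hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta

# Maps the spoken unit to the matching timedelta keyword argument.
UNITS = {"second": "seconds", "minute": "minutes", "hour": "hours",
         "day": "days", "week": "weeks"}


@dataclass
class Reminder:
    due: datetime
    note: str


def parse_when(text, now):
    """Resolve phrases like 'in 2 hours' or 'tomorrow at 9:30' to a datetime."""
    text = text.lower().strip()
    m = re.fullmatch(r"in (\d+) (second|minute|hour|day|week)s?", text)
    if m:
        return now + timedelta(**{UNITS[m.group(2)]: int(m.group(1))})
    m = re.fullmatch(r"tomorrow at (\d{1,2}):(\d{2})", text)
    if m:
        day = now + timedelta(days=1)
        return day.replace(hour=int(m.group(1)), minute=int(m.group(2)),
                           second=0, microsecond=0)
    raise ValueError(f"unrecognized timeframe: {text!r}")


def list_active(reminders, now):
    """Return readable lines for reminders that have not fired yet, soonest first."""
    active = sorted((r for r in reminders if r.due > now), key=lambda r: r.due)
    return [f"{r.due:%A %H:%M}: {r.note}" for r in active]
```

Passing `now` in explicitly, instead of calling `datetime.now()` inside the functions, makes both of them easy to test with fixed dates.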

The main extra requirement here is that reminders need to persist through server restarts, so I set up a simple JSON file that the reminders get written to. In the future I'll probably set up a small SQLite database or something, but this is fine for now.
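A JSON-backed store along these lines can be very small. The sketch below is an assumption about the shape, not the actual file format used here: reminders are treated as `(due, note)` pairs, and `save_reminders` and `load_reminders` are hypothetical names.

```python
import json
import os
from datetime import datetime


def save_reminders(reminders, path):
    """Write (due, note) pairs to disk as JSON."""
    payload = [{"due": due.isoformat(), "note": note} for due, note in reminders]
    # Write to a temp file, then swap it in, so a crash mid-write
    # cannot leave a half-written reminders file behind.
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(payload, f, indent=2)
    os.replace(tmp, path)


def load_reminders(path):
    """Read reminders back on startup; a missing file means none were saved."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [(datetime.fromisoformat(item["due"]), item["note"])
                for item in json.load(f)]
```

Calling `save_reminders` after every add or cancel and `load_reminders` once at server start is enough to survive a restart.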

One issue I'm having is that the model doesn't always hear the notes correctly. I think this is a combination of the model not being great at parsing speech and my not having a great microphone setup on the satellite. I'm not interested in buying a bunch of hardware right now, so for now I stay really close to the mic when setting a reminder and accept that the note will sometimes be misheard, until I can get around to wrangling the model, improving the mic, and so on.