I did pick some unique options, but in general i am pretty happy, because my design is pretty straightforward.
When it came to the high level design i didnt want to use an AI Realtime kit, because that would frankly be pretty lame.
My goal was to have a custom voice, and the bland AI ones didnt seem cool. Then i remembered the GlaDOS voice, from the portal video game.
I had it running on my GPU as a self hosted option, but that wouldnt cut it because the home automation would be reliant on the status of my PC.
This means i had to find an audio provider who would be able to do this for cheap and with an API. Fish audio seemed like a good choice, so i went with that
For STT they offer pretty cheap rates and it came out as the best choice.
When it came to the actual intelligence i had to make a choice:
- use a custom trained model
- add examples in the instructions
Not having a high end GPU makes the latter a better choice because its easy to set up and customisable-er. This also allows me to switch to more
voice options without retraining the AI on the patterns of the other characters.
When first making this i had picked a cheap GPT model, because i didnt have the reason to pay for a high end AI, as it would only have to do actual tasks without talking.
Furthermore it was easy to track, easy to integrate etc. In playtesting this was a bit slow, we will come to that in the next part though.
My friend told me about Groq AI and after some checking i found out they boasted a speed of 500 tokens/sec, which definetly would change the game.
What finally hooked me, was that it was FREE... Like hats off, fast and free? Count me in!