Month 4 - Milestone 3

This week, work on the GPU Kubernetes cluster and finding the right balance of GPU resources between Whisper and the LLM were challenging. I finally got everything running after swapping the power supply, since the machine kept powering off suddenly while I was fine-tuning GPU usage. After the swap I was able to sustain 30 concurrent calls by running a script that simulates a user and holds each call open for up to 50 s. The GPU cluster performed very well, with headroom left for more simultaneous users. The CPU cluster had its own challenges because it runs more services, so I will need to add more nodes and scale it further. Overall I am happy with the progress despite the fine-tuning challenges. The remaining issue is getting audio forks to work well with WebSockets. Although my Python load-testing scripts show that everything is working, I have yet to get a full end-to-end (E2E) scenario running. This will be my focus this week.
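For readers who want to try something similar, here is a minimal sketch of what a concurrent-call load test could look like in Python. It is not the script used above: simulate_call and run_load_test are hypothetical names, and the call itself is replaced by a sleep, so a real version would open the call (for example over SIP or a WebSocket) and stream audio at that point. Only the 30 concurrent calls and the 50 s maximum hold time come from the test described here.

```python
import asyncio
import random
import time


async def simulate_call(call_id, max_hold_s):
    """Hypothetical stand-in for one simulated caller.

    A real script would place the call and stream audio here;
    this sketch only keeps the slot busy for the hold time.
    """
    # Assumption: hold times are spread between half and the full maximum.
    hold_s = random.uniform(0.5 * max_hold_s, max_hold_s)
    started = time.monotonic()
    await asyncio.sleep(hold_s)  # placeholder for the live call
    return {"call_id": call_id, "held_s": time.monotonic() - started}


async def run_load_test(concurrent_calls=30, max_hold_s=50.0):
    """Start all calls at once and wait for every one to finish."""
    tasks = [simulate_call(i, max_hold_s) for i in range(concurrent_calls)]
    # return_exceptions=True keeps one failed call from cancelling the rest.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    ok = [r for r in results if not isinstance(r, Exception)]
    failed = [r for r in results if isinstance(r, Exception)]
    return ok, failed
```

Calling asyncio.run(run_load_test()) runs the full 30-call, 50 s scenario and returns the completed and failed calls as two lists. Passing smaller values, such as concurrent_calls=5 and max_hold_s=1.0, gives a quick smoke test before the long run.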
