Author: admin

Month 5 Week 1

The milestone of receiving audio from the AI pipeline has been reached. It needs further optimization for speed, but end to end works! After replacing the B2BUA last week, further work ensued to get a Stasis application to orchestrate audio from the user and direct it into the Kubernetes cluster with the GPU. After tweaking the Python code I found my way to something usable. The key issues are chunking, sentence formation, silence, user talk, and AI response, all of which have to work together to make the experience meaningful. Next week will be about wiring in short/long-term memory to keep conversations context aware….
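A minimal sketch of the chunking and silence problem, not the code running in the pipeline. It assumes 16-bit little-endian mono PCM arriving in 20 ms frames, and it uses a simple energy (RMS) threshold to decide when the user has stopped talking. The names frame_rms and split_utterances and all threshold values are hypothetical, picked only for illustration.

```python
import math
import struct
from typing import Iterable, Iterator, List

FRAME_MS = 20              # assumed frame duration
SILENCE_RMS = 500.0        # assumed energy threshold for "silence"
END_OF_UTTERANCE_MS = 600  # assumed pause length that ends an utterance


def frame_rms(frame: bytes) -> float:
    """Root-mean-square energy of one 16-bit little-endian PCM frame."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def split_utterances(
    frames: Iterable[bytes],
    silence_rms: float = SILENCE_RMS,
    end_ms: int = END_OF_UTTERANCE_MS,
    frame_ms: int = FRAME_MS,
) -> Iterator[bytes]:
    """Yield one chunk of audio per utterance, cut at long enough pauses."""
    needed = max(1, end_ms // frame_ms)  # silent frames that close a chunk
    buffered: List[bytes] = []
    silent_run = 0
    heard_speech = False
    for frame in frames:
        if frame_rms(frame) >= silence_rms:
            heard_speech = True
            silent_run = 0
            buffered.append(frame)
        elif heard_speech:
            silent_run += 1
            buffered.append(frame)
            if silent_run >= needed:
                # Drop the trailing silence and hand the speech downstream.
                yield b"".join(buffered[: len(buffered) - silent_run])
                buffered, silent_run, heard_speech = [], 0, False
        # Silence before any speech is simply skipped.
    if heard_speech:
        yield b"".join(buffered[: len(buffered) - silent_run])
```

Each yielded chunk could then be handed to speech-to-text as one candidate sentence. A real deployment would likely need a proper voice activity detector rather than a fixed energy threshold.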

November 2, 2025
Month 4- Milestone 4

This was a busy week swapping the back-to-back user agent (B2BUA). Iterating, looking for outside help, and doing research led me to consult with the lead developer at https://fusionpbx.com/

I became a member, committing 100 dollars a month to support the Fusion project, but most importantly I had a 1:1 discussion. It was a delightful discussion where we exchanged pain points and committed to supporting each other in our endeavor to solve this correctly. We share a common vision of user privacy and of what AI is meant for.

Our solution uses AI to augment humans and lift them up in memorable experiences, not to displace them.

He directed me to pursue a specific strategy which, lo and behold, he was working on at that very moment. He also mentioned that there is no documentation for this with AI, as it is at the bleeding edge, and that I would need to develop my own. I shared my architecture and design, and he acknowledged that I was ahead of his efforts and going in the right direction….

October 25, 2025
Month 4- Milestone 3

This week, work on the GPU Kubernetes cluster and getting the right balance of GPU for Whisper and the LLM were challenging. I was finally able to get everything going after I swapped the power supply, as I was getting sudden power-offs while fine-tuning GPU usage. After the power supply swap I was able to sustain 30 concurrent calls by running a script that simulates a user and holds each call for up to 50 s. The GPU performed excellently, with more headroom available on the GPU cluster for more simultaneous users. The CPU cluster, since it runs more services, had its challenges as well. I will need to add more nodes and scale the cluster further. Overall I am happy with the progress despite the fine-tuning challenges. The remaining issue is to get audio forks to work well with WebSockets. Although my Python load-testing scripts show everything is working, I have yet to get to an E2E scenario. This will be my focus this week.
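The shape of such a load test can be sketched with asyncio. This is not my actual script: simulate_call is a hypothetical stand-in that only sleeps where a real test would place a call and stream audio, and LoadStats and run_load_test are names made up for the example.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class LoadStats:
    completed: int = 0  # calls that finished
    active: int = 0     # calls currently held open
    peak: int = 0       # highest number of simultaneous calls seen


async def simulate_call(stats: LoadStats, hold_seconds: float) -> None:
    """Hypothetical simulated user: open a call, hold it, hang up."""
    stats.active += 1
    stats.peak = max(stats.peak, stats.active)
    try:
        # Placeholder for dialing in and streaming audio for the hold time.
        await asyncio.sleep(hold_seconds)
    finally:
        stats.active -= 1
    stats.completed += 1


async def run_load_test(concurrent_calls: int = 30,
                        hold_seconds: float = 50.0) -> LoadStats:
    """Start all calls at once and wait until every one has finished."""
    stats = LoadStats()
    await asyncio.gather(
        *(simulate_call(stats, hold_seconds) for _ in range(concurrent_calls))
    )
    return stats


if __name__ == "__main__":
    # Short hold so the sketch finishes quickly; use 50.0 for a full run.
    result = asyncio.run(run_load_test(concurrent_calls=30, hold_seconds=0.1))
    print(result)
```

Tracking the peak alongside the completed count makes it easy to confirm that all 30 calls really overlapped instead of running one after another.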

October 19, 2025
Month 4- Milestone 2

This week a lot of progress was made toward the end-to-end milestone. Ingress works on all clusters, with some minor issues regarding analytics: one of the Supabase services seems to be running into a race condition related to DDL migrations and the tables it expects to create. I am happy with the progress of a multicluster distributed system that is ready to do what it was designed to do. Forking of audio will happen this week, with a demo of the system's capability coming shortly. Humbled to be able to create something meaningful for businesses leveraging AI to attend to customers in a responsible and ethical way… Stay tuned….

October 11, 2025
Month 4- Milestone 1

Part-1 Costs: This week lots of work went into deployments and changing my strategy as AWS costs kept increasing. So I set up dual IPsec tunnels to my home lab and moved all the infra to run there.

Part-2 E2E: The next tasks I worked on were forking the audio and beginning to connect each subcluster, as I move closer to reaching the E2E milestone and work on the automation pieces of the voicerag app.

October 4, 2025
Month 3 Milestone 4

This week saw many advancements in deployments. The site is up, comprising four K8s clusters:

Voicerag App cluster – Front end SAAS, Backend Postgres Db, Flyway DDL Scripts

Voicerag AI Cluster – vLLM, short term memory (open router) is up

Voicerag Audio Cluster – STT, TTS is up

Voicerag Supabase Cluster – is up

Connecting these clusters is up next, to pass forked audio to the audio cluster and bounce data and voice to the LLM and back…

September 28, 2025
Month 3 Milestone 3

As we near the month's end I have battled competing Flyway migrations and consolidated 27 DDL scripts to be put into the Supabase persistence store. This has proved challenging but interesting. I am at the tail end of this contextual engine deployment and hope that by next week I can finally move to end-to-end testing. Fingers crossed….

September 22, 2025
Month 3 Milestone 2

Complexities with context. Supabase, the long-term memory persistence store, has competing migrations from two services: the database itself and another application from the Supabase stack called storage. This introduces race conditions in how DDL scripts affect the schema and tables for two different services.

As I move to add short-term memory (Redis) and long-term memory (Supabase) to the stack, these complexities have delayed the end-to-end workflow demo.

As a workaround I will extract these DDL SQL scripts and add them to the SQL migration engine to maintain a single source of truth for the database, regardless of whether schema updates come from the database or from another app that uses it (Supabase). I am still navigating these issues and addressing those problem statements, but I am convinced that pushing back the E2E demo by a week will provide the time necessary to solve the context engine memory issue.
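One way to picture the workaround: give every extracted script a slot in a single ordered sequence, so only one engine ever decides what runs when. The sketch below assumes Flyway is the migration engine and uses its versioned file naming (V<version>__<description>.sql). The function names and the example script names are hypothetical, not the real files from the Supabase stack.

```python
import re
from typing import Iterable, List, Tuple


def slug(text: str) -> str:
    """Lowercase and collapse anything non-alphanumeric to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def plan_migrations(scripts: Iterable[Tuple[str, str]],
                    start_version: int = 1) -> List[str]:
    """Map (owner, original_name) pairs to versioned migration file names.

    The input order is the order in which the scripts must be applied.
    Exact duplicates are skipped so the same DDL is never queued twice.
    """
    planned: List[str] = []
    seen = set()
    version = start_version
    for owner, name in scripts:
        key = (owner, name)
        if key in seen:
            continue
        seen.add(key)
        stem = name.rsplit(".", 1)[0]  # drop the file extension
        planned.append(f"V{version}__{slug(owner)}_{slug(stem)}.sql")
        version += 1
    return planned
```

For example, plan_migrations([("database", "init-schema.sql"), ("storage", "Create Buckets.sql")]) gives V1__database_init_schema.sql followed by V2__storage_create_buckets.sql, so the storage DDL can never race ahead of the database DDL.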

Nonetheless, development is moving at a fast pace and voicerag has something coming soon. I am very excited about this product, and this week an existing customer approached me to build this for them without knowing I was already working on it.

Expect a video for e2e on milestone 3…..

September 15, 2025
Month 3 Milestone 1

This week a lot of research went into the context engine and the AI portion of the app. To make the AI apps scalable and robust from the get-go, six services were built across two Kubernetes clusters. First up, the CPU K8s cluster receives all services required for the LLM engine to run that don't require a GPU. Five services were assigned to this cluster, namely:

-audio-processor (STT)
-redis (short term memory)
-supabase (llm optimized persistency store based on postgresql)
-tts
-webhook

On the other cluster, the GPU cluster, the LLM selected is Mistral, a fine-tuned model trained on multimodal conversations. It is empathic and patient in dealing with the nuances of human speech.
It supports barge-in and interruptions, whether to loop in a human when required or to handle interruptions from the user that are natural to human conversation.

This is the territory of the unknown, where research and connecting the dots between these microservices have been instrumental in getting a functioning end-to-end workflow. I am planning to complete E2E testing this week as I close out my third sprint.

Plans are to demo this initial test next week.

September 8, 2025
Milestone Month 2

This week has been a challenge navigating complexities with FreeSWITCH, but I finally got inbound and outbound working. YAY!

September 1, 2025