
In this blog post, we will explore how to host your very own ChatGPT using the powerful combination of Ollama and Open WebUI, all powered by VMware Cloud Foundation. By leveraging these technologies, you’ll be able to create a seamless and interactive chatbot experience that will impress your users. Get ready to dive into the world of AI and virtualization as we walk you through the steps to set up your own ChatGPT. Exciting times are ahead, so let’s get started on this journey together!

This walkthrough only covers how to set up Ollama and Open WebUI. You will need to provide your own Linux VM; for my deployment I used Ubuntu 22.04.

In the next blog post we will go into customizing and extending Ollama and Open WebUI with, for example, Automatic1111 and diffusion-based image generation models.

The Hardware:

2 x Intel Platinum 8158 3.0GHz 12 Cores
1 x NVIDIA Tesla P40 24GB GDDR5
1 x Dell PERC H740P RAID Card
4 x 32GB Samsung DDR4 2666MHz (128GB)
2 x 50Gb/s Mellanox ConnectX-4 Data Traffic
4 x 10Gb/s X710 NDC for NSX Overlay
1 x Boss S1 w/ M.2 SSD for ESXi Boot
2 x 2000 watt PSUs
8 x 800GB SAS SSD – Capacity Storage
2 x 280GB Intel Optane Storage – Fast Storage

The Virtual Machine:

Deploy Ubuntu or any Debian-based distro of your choice if you want to use the commands in this walkthrough as written.
Enable hardware device passthrough for the GPU, and add the following entries to the VM’s VMX file:

pciPassthru.use64bitMMIO="TRUE"
pciPassthru.64bitMMIOSizeGB="128"

Once the virtual machine is deployed, make sure your server or desktop hardware has a GPU installed and ready for passthrough. My Dell PowerEdge R740XD has an NVIDIA Tesla P40.

The Specifications:

This will vary. For my initial deployment I set up 8 vCPUs with Automatic CPU Topology enabled, but it all depends on your use case. Since I have powerful CPUs and lots of memory, I can increase the resource allocation later on.

The Software:

Download & Install Ollama:

curl -fsSL https://ollama.com/install.sh | sh
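
Before moving on, it can help to confirm that the installer left a working Ollama service behind. The sketch below is not part of the original setup: it assumes the default API address of http://127.0.0.1:11434 (the same one passed to Open WebUI later) and that curl is installed. The function name ollama_api_up is just for illustration.

```shell
#!/bin/sh
# Return 0 if an Ollama API answers at the given base URL, non-zero otherwise.
# Assumption: Ollama is listening on its default address, 127.0.0.1:11434.
ollama_api_up() {
    base_url="${1:-http://127.0.0.1:11434}"
    # /api/version returns a small JSON document containing the server version.
    curl -fsS --max-time 5 "$base_url/api/version" >/dev/null 2>&1
}

if ollama_api_up; then
    echo "Ollama API is reachable"
else
    echo "Ollama API is not reachable; try: systemctl status ollama"
fi
```

If the check fails right after the install, looking at the service with systemctl status ollama is usually the quickest next step.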

Time to Shut Down the Virtual Machine and Pass Through the NVIDIA Tesla P40

With the above requirements satisfied, two entries must be added to the VM’s VMX file, either by modifying the file directly or by using the vSphere client to add these capabilities. The first entry is:

pciPassthru.use64bitMMIO="TRUE"

Specifying the second entry requires a simple calculation. Sum the GPU memory sizes of all GPU devices you intend to pass into the VM and then round up to the next power of two. For example, to use passthrough with four 16 GB A2 devices, the value would be: 16 + 16 + 16 + 16 = 64, rounded up to the next power of two to yield 128. Use this value in the second entry:

pciPassthru.64bitMMIOSizeGB="128"
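
The calculation above can be sketched as a small shell function. This is only an illustration: the name mmio_size_gb is made up, and it reads "next power of two" as strictly greater than the sum, which is how the worked example treats it (64 becomes 128).

```shell
#!/bin/sh
# Print a value for pciPassthru.64bitMMIOSizeGB from per-GPU memory sizes in GB.
# Assumption: "next power of two" means strictly above the sum, as in the
# worked example, where a sum of 64 yields 128.
mmio_size_gb() {
    total=0
    for gb in "$@"; do
        total=$((total + gb))
    done
    size=1
    # Keep doubling until the value is larger than the summed GPU memory.
    while [ "$size" -le "$total" ]; do
        size=$((size * 2))
    done
    echo "$size"
}

mmio_size_gb 16 16 16 16   # four 16 GB GPUs, prints 128
```

Pass one argument per GPU you plan to hand to the VM, then copy the printed number into the VMX entry.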

Add Docker’s official GPG key:

sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

Add the repository to Apt sources:

echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

Next, we will install docker and all its dependencies:

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
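
Before starting any containers, a quick sanity check of the Docker install can save some head scratching. The function below is a minimal sketch with a made-up name (check_docker); it only reports whether the docker CLI exists and whether the daemon answers for the current user.

```shell
#!/bin/sh
# Return 0 when the docker CLI is installed and the daemon answers, 1 otherwise.
check_docker() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker CLI not found"
        return 1
    fi
    # "docker info" talks to the daemon; it fails if the service is not running
    # or if the current user is not allowed to use the Docker socket.
    if ! docker info >/dev/null 2>&1; then
        echo "docker is installed, but the daemon is not reachable (try sudo)"
        return 1
    fi
    echo "docker is ready"
}

check_docker || true
```

Docker's own smoke test, sudo docker run hello-world, is another easy way to confirm that images can be pulled and run.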

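Now we will set up the Docker container for Open WebUI. It connects to Ollama on port 11434 (yeah, the port number looks like LLAMA, haha), and because the container uses host networking, the web interface itself listens on port 8080.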

docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
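
The first start can take a little while as the image is pulled and the app initializes. As a rough sketch (the helper name wait_for_http is hypothetical), you could poll the web interface until it answers. This assumes curl is installed and that, with --network=host, Open WebUI listens on port 8080 of the VM.

```shell
#!/bin/sh
# Poll a URL until it answers or the attempts run out.
# Usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_http() {
    url="$1"
    attempts="${2:-30}"
    delay="${3:-2}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f makes curl fail on HTTP errors; output is discarded.
        if curl -fsS --max-time 5 -o /dev/null "$url" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Assumption: host networking, so the UI is on port 8080.
# wait_for_http http://127.0.0.1:8080 && echo "Open WebUI is up"
```

If the wait times out, docker logs open-webui shows what the container is doing.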

After you run the command above, the web server should start. Open your browser, go to port 8080 on the VM, and log in! You will be asked to set up a username and email. Once that is done, have fun and enjoy your own private AI!

Now you will be presented with the dashboard for your very own privately hosted ChatGPT!

Enjoy! 🙂

Depending on which LLMs you want to pull, here is an example of how to do it via the CLI:

ollama pull aya

There, we pulled the aya LLM!
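
To double-check that a pull worked, you can look for the model in the output of ollama list. The helper below is an illustrative sketch (has_model is a made-up name) that reads that output from standard input and assumes the first column holds the model name followed by a tag, such as aya:latest.

```shell
#!/bin/sh
# Succeed if a model name appears in "ollama list"-style output read from stdin.
# Assumption: line 1 is a header and column 1 looks like "name:tag".
has_model() {
    awk -v want="$1" '
        NR > 1 { split($1, parts, ":"); if (parts[1] == want) found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Usage (commented out so the snippet is safe on a machine without Ollama):
# ollama list | has_model aya && ollama run aya "Say hello in three languages."
```

Once the model shows up, it should also appear in the model picker in Open WebUI.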

Well That’s a Wrap…

Let’s recap! The VMware Explore event was an absolute blast! We immersed ourselves in the fun-filled adventures of a multi-cloud universe, diving headfirst into tons of Artificial Intelligence content and exploring the endless possibilities of multi-cloud technology. From mind-boggling demos to thought-provoking sessions, this event truly ignited our excitement for the future. With the incredible innovations and breakthroughs showcased, we can’t wait to see where this journey takes us next. The world of technology is evolving at lightning speed, and VMware Explore has undoubtedly left us buzzing with anticipation for what lies ahead in this exhilarating multi-cloud universe. Stay tuned for the next event, VMware Explore 2024 in Las Vegas.

Previous Post of VMware Explore’s General Session here

Celebrating VMware’s 25 Years of Innovation

Introducing VMware Private AI Foundation

During the general session, VMware CEO Raghu Raghuram and NVIDIA CEO Jensen Huang announced a new partnership to bring the next generation of Artificial Intelligence and VMware together. This innovation around Generative AI and Large Language Models will allow organizations to embark on a new era of computing and assisted computing, optimized for their business needs and requirements. Together, NVIDIA AI Enterprise and VMware Cloud Foundation will expand the VMware portfolio tremendously.

The Hub

The Hub! Fun-filled adventures with VMware Communities, VMTN Theater, Broadcast Booth, VMware {code}, VMware Rewards, and so much more. Can’t forget the cozy relaxing area with bean bags! This was the best place to kick back and relax with your fellow Explorers before attending your next session!

Incredible Friendships

My time spent with these incredible VMware VCDXs (Mark Gabryjelski, #023, on the left, and John Arrasjid, #001, on the right) has been nothing short of exhilarating! Getting to dive deep into the world of virtualization and cloud computing with the best in the field has been an absolute dream come true. I’ve gained invaluable insights, learned new techniques, and made lasting friendships that will undoubtedly propel my career forward. I am truly grateful for the opportunity to spend quality time with these masters of their craft, and I cannot wait to apply all that I’ve learned. The future is looking brighter than ever, and I’m thrilled to continue this exciting journey in the world of VMware!

My Favorite Sessions!

Keep the Attacker Out: Infrastructure Security Hardening and Auditing
Why and How to Apply the DISA vSphere STIGs and Other Hardening
Elevate Your Application Modernization Journey with Developer-Ready Cloud

VMware Communities

I want to thank the VMware Communities, especially Corey Romero, for allowing me to be a part of the vExpert team, as well as for coming to VMware Explore 2023 as a blogger! This year’s Explore was beyond amazing: meeting many different executives, engineers, and architects, and creating long-lasting connections and friendships!

Highlights of VMware Explore 2023 – Las Vegas (Gallery)

VMware Explore 2023 was filled with tons of joy, laughter, and connections. From meeting Raghu Raghuram, CEO of VMware, and Sumit Dhawan, President of VMware, to exploring Las Vegas and The Venetian Resort, it was an awesome experience. I am excited for VMware Explore 2024, here again in Las Vegas next year!