Getting Started with NVIDIA Triton Server


NVIDIA Triton Inference Server

What is an Inference Server?

The role of an Inference Server is to accept user input data, pass it to an underlying trained model in the required format, and return the results. It is also widely known as a Prediction Server, since the results are, in most cases, predictions.

The NVIDIA Triton server is the gold standard for standardizing AI model deployment and execution across every workload, and it is important to know how it works internally, whether you are running custom or off-the-shelf models.
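To make the request/response flow concrete, here is a minimal sketch of the JSON body a client sends to Triton's KServe v2 HTTP endpoint. The model name "simple", the tensor name "INPUT0", and the shape are assumptions for illustration; they must match your actual model configuration.

```python
import json

# Sketch of an inference request for Triton's KServe v2 HTTP API.
# Assumptions: a model named "simple" with one FP32 input tensor "INPUT0"
# of shape [1, 4]. The payload would be POSTed to:
#   http://<triton-host>:8000/v2/models/simple/infer
payload = {
    "inputs": [
        {
            "name": "INPUT0",        # must match the name in the model config
            "shape": [1, 4],
            "datatype": "FP32",
            "data": [[0.1, 0.2, 0.3, 0.4]],
        }
    ]
}

body = json.dumps(payload)
```

The response comes back in the same protocol: an "outputs" list with the predicted tensors, which is what makes Triton interchangeable across frameworks.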

Getting Started with Ollama

Ollama is an excellent tool that helps you run Large Language Models (LLMs) such as Llama 2 locally on your computer.

In this article, I decided to test whether Ollama can work with my consumer-grade GPU, an MSI GTX 1660 Super.
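Once Ollama is running, it exposes a local REST API on port 11434. Here is a small sketch of the request body for its /api/generate endpoint; the model name and prompt are placeholders, and the server must already have the model pulled.

```python
import json

# Sketch of a request body for Ollama's local REST API (POST /api/generate).
# Assumes the Ollama server is running on its default port 11434 and the
# "llama2" model has been pulled with `ollama pull llama2`.
request = {
    "model": "llama2",
    "prompt": "Why is the sky blue?",
    "stream": False,  # return a single JSON object instead of a token stream
}
body = json.dumps(request)

# To actually send it (requires the running server):
#   curl http://localhost:11434/api/generate -d @- <<< "$body"
```

With `stream` set to true (the default), Ollama instead returns one JSON object per generated token, which is what the interactive CLI uses.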


A list of things to remember when deploying a Large Language Model (LLM) in production



A small checklist intended to make your LLM deployment easier.

It should help you get started with deploying your own model, or an open-source one like Llama 2.

Free Kubernetes cluster on Oracle Cloud using k3s



Oracle Cloud has a very generous forever-free tier. I have been using k3s on my Raspberry Pi 4 machines for some time on my local home network, and it works amazingly well.

I was very keen to deploy k3s for free on a cloud provider as a backup to my local clusters, and I finally managed to do that recently.
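For reference, the basic install is the same on Oracle Cloud VMs as on a Raspberry Pi, using the official script from k3s.io. This is only a sketch: SERVER_IP and TOKEN are placeholders you fill in from the first two steps, and Oracle's security lists must allow port 6443 between the nodes.

```shell
# On the server node (control plane) -- official k3s install script:
curl -sfL https://get.k3s.io | sh -

# Print the join token that agents will need:
sudo cat /var/lib/rancher/k3s/server/node-token

# On each worker node (SERVER_IP and TOKEN come from the steps above):
curl -sfL https://get.k3s.io | K3S_URL=https://SERVER_IP:6443 K3S_TOKEN=TOKEN sh -

# Back on the server, verify that all nodes joined:
sudo k3s kubectl get nodes
```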


US Elections 2020 - The best way to keep track of what is happening


US Elections 2020 is too close to call

The US Elections 2020 are one of the most widely watched events in the world.

We look at the ways tech and media companies are trying to make sense of the data.

Microsoft Clarity - An easy-to-use User Behavior Analytics Tool

Microsoft Clarity

Microsoft Clarity is a free service that helps you understand how users interact with your website.

The service offers the ability to record user sessions and also generates a heat map of clicks.

Using Gitlab for backing up data to AWS S3 and Google Cloud Storage

Gitlab has a generous free tier for CI/CD called "Gitlab Pipelines" that can be used to store build artifacts (or anything else) in AWS S3 and Google Cloud Storage for free.

Grab the code here: GitlabToCloud.
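As a rough sketch of the idea, a single pipeline job can push a backup file to both clouds. The job name, image, and bucket names below are hypothetical, and the credentials are assumed to be configured as masked CI/CD variables in the Gitlab project settings.

```yaml
# Hypothetical .gitlab-ci.yml job: copy a backup file to S3 and GCS.
# Assumes AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY and a GCP
# service-account key ($GCP_SERVICE_ACCOUNT_KEY) are set as CI/CD
# variables; bucket names are placeholders.
backup:
  image: google/cloud-sdk:slim   # ships gcloud and gsutil
  script:
    - pip install awscli
    - aws s3 cp backup.tar.gz s3://my-backup-bucket/backup.tar.gz
    - echo "$GCP_SERVICE_ACCOUNT_KEY" > key.json
    - gcloud auth activate-service-account --key-file key.json
    - gsutil cp backup.tar.gz gs://my-backup-bucket/backup.tar.gz
```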

The content is copyrighted to Sundeep Machado

Note: The author is not responsible for damages related to improper use of software, techniques, tips and copyright claims.