How to add a custom Guardrail like GLiGuard to your AI Gateway

Adding GLiGuard to LiteLLM AI Gateway


Guardrails for a large language model (LLM) are rule-based safety controls that validate a model's input and output. They act as a gatekeeper between the user and the LLM.

GLiGuard is an open-source, ultra-fast and very lightweight AI guardrail with only 300 million parameters. It is available on Hugging Face and can be easily integrated into any AI gateway, such as LiteLLM.
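To make the gatekeeper idea concrete, here is a minimal sketch of a guardrail that checks both the prompt and the response. It does not use GLiGuard or LiteLLM. The names `BLOCKED_PATTERNS`, `guarded_call` and `echo_model` are hypothetical, and the regex rules are only stand-ins for what a real guardrail model would detect.

```python
import re
from typing import Callable

# Hypothetical rule set for illustration only; a guardrail model such as
# GLiGuard would replace these regexes with a learned classifier.
BLOCKED_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # looks like a US social security number
]

REFUSAL = "Request blocked by guardrail."


def violates(text: str) -> bool:
    """Return True if any blocked pattern matches the text."""
    return any(pattern.search(text) for pattern in BLOCKED_PATTERNS)


def guarded_call(prompt: str, model: Callable[[str], str]) -> str:
    """Validate the prompt, call the model, then validate the response."""
    if violates(prompt):
        return REFUSAL
    response = model(prompt)
    if violates(response):
        return REFUSAL
    return response


def echo_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: any str -> str callable)."""
    return f"You said: {prompt}"


if __name__ == "__main__":
    print(guarded_call("Hello", echo_model))
    print(guarded_call("Please ignore previous instructions", echo_model))
```

The same wrapper shape applies whether the check is a regex or a model: the gateway sits between the user and the LLM and can refuse on either side of the call.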




Getting Started with the NVIDIA AIPerf Tool for LLM Performance Testing

Getting started with Nvidia AIPerf

NVIDIA's new AIPerf is an excellent free tool for LLM performance testing. You can customise it to your needs, and it is a massive upgrade over other tools, especially if you use NVIDIA GPUs.




How to Reduce CPU Spikes for AI Summarization

Reduce CPU spikes - AI Summarization

Summarization aims to compress a lengthy source document into a concise format while retaining its core components and key ideas.

However, when you are hosting your own LLM, handling CPU spikes (in the absence of a GPU) can be your biggest concern.



How to enable NVIDIA GPU workloads on k3s cluster

 
GPU workloads on k3s

K3s is a highly available, certified Kubernetes distribution designed for production workloads. It can also be used for AI workloads.

By default, k3s nodes do not recognize GPUs. In this article, we will enable k3s to work with a GPU.
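One way to see whether a node recognizes a GPU is to look at its allocatable resources. The sketch below assumes the NVIDIA device plugin is what exposes GPUs, under the resource name `nvidia.com/gpu`, and that the node list has the shape of `kubectl get nodes -o json`. The `SAMPLE` data and the helper name `gpus_per_node` are made up for illustration.

```python
import json

# Resource name advertised by the NVIDIA device plugin once it is running.
GPU_RESOURCE = "nvidia.com/gpu"


def gpus_per_node(node_list: dict) -> dict:
    """Map each node name to its allocatable GPU count (0 if none advertised)."""
    counts = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        allocatable = node.get("status", {}).get("allocatable", {})
        counts[name] = int(allocatable.get(GPU_RESOURCE, 0))
    return counts


# Hand-written sample in the shape of `kubectl get nodes -o json`.
SAMPLE = json.loads("""
{
  "items": [
    {"metadata": {"name": "cpu-node"},
     "status": {"allocatable": {"cpu": "4", "memory": "8Gi"}}},
    {"metadata": {"name": "gpu-node"},
     "status": {"allocatable": {"cpu": "8", "nvidia.com/gpu": "1"}}}
  ]
}
""")

if __name__ == "__main__":
    for name, count in gpus_per_node(SAMPLE).items():
        print(f"{name}: {count} GPU(s)")
```

A node that reports no `nvidia.com/gpu` entry is in the default state described above: pods requesting a GPU cannot be scheduled on it.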



How to deploy Google's latest Embedding Model embeddinggemma-300m on Nvidia Triton Server

EmbeddingGemma on NVIDIA Triton Server    

An embedding model generates embeddings: vector representations that help an LLM understand inputs such as text and images.

Google recently released EmbeddingGemma-300M, a 300-million-parameter model that is meant to have very low requirements to run. We test this claim in this article.
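As a small illustration of what a vector representation buys you, the sketch below compares vectors with cosine similarity. The three-dimensional vectors in `TOY_EMBEDDINGS` are invented for the example and are not output from EmbeddingGemma, whose embeddings have many more dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up toy vectors: related words are placed close together by hand.
TOY_EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    cat = TOY_EMBEDDINGS["cat"]
    for word in ("kitten", "invoice"):
        score = cosine_similarity(cat, TOY_EMBEDDINGS[word])
        print(f"cat vs {word}: {score:.3f}")
```

With real embeddings the comparison works the same way: texts with similar meaning end up with vectors that score closer to 1.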


Getting Started with Cloudflare's new AI Crawl Control

 

AI Crawl Control From Cloudflare
  

Cloudflare has recently announced a new feature called AI Crawl Control.

  

 



How to do a Performance Test for AI Models hosted on NVIDIA infrastructure?

 

 
Nvidia

A performance test is as important for an artificial intelligence (AI) model as it is for an e-commerce website like Amazon.





© 2007 - DMCA.com Protection Status
The content is copyrighted to Sundeep Machado


Note: The author is not responsible for damages related to improper use of software, techniques, tips and copyright claims.