How do large language models (LLMs) work under the hood?
Just the very basics!!

✅ The Basics:
- In a nutshell, LLMs like ChatGPT are autocomplete systems.
- They take text as input and use deep learning to predict the most likely next word (token), then the next, and so on.
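The autocomplete idea above can be sketched with a toy next-token table. The probabilities below are made up for illustration; a real LLM computes them with a neural network over tens of thousands of tokens.

```python
# Toy sketch of autocomplete-style generation (hypothetical probabilities,
# not a real model): repeatedly pick the most likely next token.
NEXT_TOKEN_PROBS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"down": 0.9, "<end>": 0.1},
    "down": {"<end>": 1.0},
}

def generate(prompt: str, max_tokens: int = 10) -> str:
    tokens = prompt.split()
    for _ in range(max_tokens):
        last = tokens[-1]
        candidates = NEXT_TOKEN_PROBS.get(last)
        if not candidates:
            break
        # Greedy decoding: always choose the highest-probability next token.
        next_token = max(candidates, key=candidates.get)
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return " ".join(tokens)

print(generate("the"))  # -> "the cat sat down"
```

Real models sample from the distribution rather than always taking the top token, which is why the same prompt can produce different answers.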
✅ The Training:
To learn to predict the next token, LLMs go through rigorous pre-training on enormous datasets (think many thousands of gigabytes of text).
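A grossly simplified stand-in for pre-training: just count which token follows which in a corpus. Real LLMs instead learn billions of neural-network weights by gradient descent, but the goal is the same, modelling the next-token distribution from data.

```python
from collections import Counter, defaultdict

# Simplified "pre-training" on a tiny corpus: count next-token frequencies
# and turn them into probabilities. Illustrative only.
corpus = "the cat sat on the mat the cat ran"
words = corpus.split()

counts = defaultdict(Counter)
for current, nxt in zip(words, words[1:]):
    counts[current][nxt] += 1

# Convert raw counts into a probability distribution per context token.
probs = {
    ctx: {tok: c / sum(ctr.values()) for tok, c in ctr.items()}
    for ctx, ctr in counts.items()
}

print(probs["the"])  # "cat" is twice as likely as "mat" after "the"
```

Scale the corpus up to most of the public internet and swap the counting for a transformer network, and you have the essence of pre-training.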
✅ The Architecture:
LLM applications use the traditional client-server model.
➡ The client (a website or an app) sends the user’s text input to the server.
➡ The server sends it through the LLM to generate a response.
➡ The response gets sent back to the client to display to the user.
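The three steps above can be sketched as plain functions. The model call is stubbed out with an echo, and all names here are hypothetical; in a real deployment the client would make an HTTPS request and the server would invoke the actual LLM.

```python
# Minimal sketch of the client-server flow for an LLM application.

def run_llm(prompt: str) -> str:
    """Stand-in for the actual model inference on the server."""
    return f"Echo: {prompt}"

def handle_request(payload: dict) -> dict:
    """Server side: unpack the client's text, run the model, package a reply."""
    prompt = payload["text"]
    return {"response": run_llm(prompt)}

def client_send(text: str) -> str:
    """Client side: in practice this would be an HTTPS POST to the server."""
    reply = handle_request({"text": text})
    return reply["response"]

print(client_send("Hello"))  # the client displays the server's response
```

The important point is that the model itself lives on the server; the website or app is just a thin messenger.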
But the real magic behind these LLMs is, how these models are adapted and fine-tuned for a specific use case. It's like giving the artist direction - "Hey, we need you to paint something inspirational for a children's hospital." With some guidance and fine-tuning, that same talent can be channelled into something truly impactful.
A competent but unrefined LLM is adapted to specific needs and goals with a few techniques:
➡ Prompting: Carefully crafting the input to steer the model’s output to a desired direction.
➡ Constitutional AI: Instilling ethics and principles via training against a written set of rules.
➡ RLHF: Reinforcement learning from human feedback, used to optimise for quality. The model iteratively receives feedback on whether an output was good or bad.
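Of the three, prompting is the easiest to show. The template below illustrates the children's-hospital direction from earlier; the exact wording and the role/constraints structure are illustrative conventions, not an API requirement.

```python
# Hypothetical prompt template: steering a general-purpose model toward a
# specific use case purely through the input text.

def build_prompt(user_question: str) -> str:
    return (
        "You are a helpful assistant for a children's hospital.\n"
        "Answer warmly, in simple words, and keep it under 100 words.\n\n"
        f"Question: {user_question}\n"
        "Answer:"
    )

print(build_prompt("Why do I need a bandage?"))
```

The same underlying model, given a different opening instruction, would happily write legal memos instead; that is the power (and fragility) of prompting.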
- Large language models (LLMs) like GPT-4 are incredibly impressive feats of deep learning, like a raw diamond with brilliant potential.
- But it's the meticulous process of cutting, polishing, and setting that gemstone into a custom AI assistant that allows its true brilliance to shine. 🔥