LLM Fine-tuning
The why and how of various model tuning and optimization techniques
Fine-tuning a Large Language Model (LLM) means adapting a pre-trained model to perform better on a specific task or domain by training it further on a smaller, task-specific dataset. Fine-tuning not only lowers computational costs but also lets you leverage state-of-the-art models without building them from scratch.
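To make this concrete, here is a minimal sketch of further training a small pre-trained causal language model on a task-specific corpus with Hugging Face transformers. The base model, hyperparameters, and the domain_corpus.txt file are placeholder assumptions, not a prescribed recipe.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2"  # assumed small base model; swap in any causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# A small domain-specific corpus stands in for the task dataset
# (domain_corpus.txt is a hypothetical file).
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized["train"],
    # The collator builds next-token-prediction labels from the inputs.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```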
There are many techniques for fine-tuning an LLM, such as standard (full) fine-tuning, adapters, prefix-tuning, chain-of-thought prompting, sequential regularization, zero-shot approaches, and so on.
For the sake of simplicity, we can divide the various fine-tuning techniques into two categories:
Instruction tuning: the goal is to train the model to follow instructions better by providing a dataset of prompts and desired responses, for example:
{
  "instruction": "...",
  "output": "..."
}
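A minimal instruction-tuning sketch using TRL's SFTTrainer might look like the following. The prompt template, base model, and hyperparameters are illustrative assumptions, and exact argument names can vary across TRL versions.

```python
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

# Tiny illustrative dataset in the instruction/output format shown above.
records = [
    {"instruction": "Translate to French: Hello", "output": "Bonjour"},
    {"instruction": "Name the largest planet.", "output": "Jupiter"},
]

# Flatten each record into a single "text" field, which SFTTrainer
# picks up by default; this prompt template is an assumption.
def to_text(example):
    return {
        "text": f"### Instruction:\n{example['instruction']}\n"
                f"### Response:\n{example['output']}"
    }

dataset = Dataset.from_list(records).map(to_text)

trainer = SFTTrainer(
    model="gpt2",  # assumed small base model; any causal LM works
    train_dataset=dataset,
    args=SFTConfig(output_dir="sft-out", num_train_epochs=1),
)
trainer.train()
```

In practice the template matters: the same format used during training should be reused at inference time so the model recognizes where the instruction ends and the response begins.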
Preference optimization: a technique used to align a model's behavior or outputs with specific preferences, usually those of human users, for example:
{
  "instruction": "...",
  "preferred": "...",
  "rejected": "..."
}
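One popular preference-optimization method is Direct Preference Optimization (DPO). Below is a minimal sketch using TRL's DPOTrainer; note that TRL expects the columns prompt, chosen, and rejected rather than instruction, preferred, and rejected. The model, data, and hyperparameters are illustrative, and argument names (e.g. processing_class) vary across TRL versions.

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "gpt2"  # assumed small base model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Tiny illustrative preference dataset; DPOTrainer expects the
# columns "prompt", "chosen", and "rejected".
records = [
    {"prompt": "Explain gravity simply.",
     "chosen": "Gravity pulls objects toward each other.",
     "rejected": "Gravity is when things are heavy."},
]
dataset = Dataset.from_list(records)

trainer = DPOTrainer(
    model=model,  # a frozen reference copy is created internally
    args=DPOConfig(output_dir="dpo-out", num_train_epochs=1),
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```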
Instruction tuning can be used on its own, or it can be combined with preference optimization.
Instruction tuning