Fine-Tuning LLMs
What is Fine-tuning?
Fine-tuning means taking a pre-trained model and training at least one of its internal parameters (i.e., weights). In the context of LLMs, fine-tuning transforms a general-purpose base model into a specialized model for a particular use case.
Why Fine-tune?
A smaller, fine-tuned model can often outperform larger (and more expensive) models on the set of tasks it was trained on.
How to Fine-tune?
Self-supervised Learning
Self-supervised learning trains a model on the inherent structure of the training data rather than on manually labeled examples. For LLMs, the canonical instance is next-token prediction: the model learns to predict each token from the text that precedes it, so the raw text supplies its own labels.
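As a minimal sketch of why no labels are needed, here is how raw text can be turned into (context, next-token) training pairs. The helper name and the whitespace tokenizer are illustrative; real LLMs use subword tokenizers.

```python
# Self-supervised next-token prediction: the "labels" come from the text
# itself, so no manual annotation is needed. Toy whitespace tokenization.

def make_next_token_pairs(text):
    """Turn raw text into (context, next_token) training pairs."""
    tokens = text.split()
    pairs = []
    for i in range(1, len(tokens)):
        context = tokens[:i]   # everything seen so far
        target = tokens[i]     # the token the model must learn to predict
        pairs.append((context, target))
    return pairs

for context, target in make_next_token_pairs("the cat sat on the mat"):
    print(context, "->", target)
```

Each position in the corpus yields one training example, which is why pre-training can scale to unlabeled web-scale text.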
Supervised Learning
Supervised Learning may be the most popular way to fine-tune a model. This approach involves training a model on input-output pairs for a particular task.
The key step in supervised learning is curating a dataset. We can apply prompt engineering techniques to design our own datasets or adapt existing ones.
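One common way to curate such a dataset is to wrap raw input-output pairs in a consistent prompt template. The template and examples below are illustrative, not a standard format:

```python
# Supervised dataset curation: format (input, output) pairs with a prompt
# template so every training example follows the same instruction layout.
# TEMPLATE and the example pairs are hypothetical.

TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n{response}"

raw_pairs = [
    ("Summarize: The meeting was moved to Friday.", "Meeting moved to Friday."),
    ("Translate to French: Hello.", "Bonjour."),
]

dataset = [
    TEMPLATE.format(instruction=inp, response=out)
    for inp, out in raw_pairs
]

print(dataset[0])
```

Keeping the format consistent matters: at inference time the model is prompted with the same template (minus the response), so it knows what kind of completion is expected.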
A high-level procedure for supervised fine-tuning looks like this:
- Choose fine-tuning task
- Prepare training dataset
- Choose a base model
- Fine-tune model via supervised learning
- Evaluate model performance
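The five steps above can be sketched as a runnable skeleton. Every function here is a hypothetical stand-in with toy logic (a word-memorizing "model" for a sentiment task), shown only to make the workflow concrete:

```python
# Hypothetical end-to-end skeleton of the five fine-tuning steps.

def prepare_dataset():
    # Steps 1-2: choose a task (here: toy sentiment) and prepare pairs.
    return [("great movie", "positive"), ("terrible plot", "negative")]

def load_base_model():
    # Step 3: stand-in for loading a pre-trained base model.
    return {"positive_words": set(), "negative_words": set()}

def fine_tune(model, dataset):
    # Step 4: "training" here just memorizes word-label associations.
    for text, label in dataset:
        for word in text.split():
            model[f"{label}_words"].add(word)
    return model

def evaluate(model, dataset):
    # Step 5: exact-match accuracy over the evaluation examples.
    correct = 0
    for text, label in dataset:
        words = set(text.split())
        pred = "positive" if words & model["positive_words"] else "negative"
        correct += pred == label
    return correct / len(dataset)

data = prepare_dataset()
model = fine_tune(load_base_model(), data)
print("accuracy:", evaluate(model, data))  # evaluated on training data here
```

In practice, evaluation should use held-out data rather than the training set, and steps 3-4 would involve a real pre-trained checkpoint and gradient-based training.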
When fine-tuning a model with a huge number of parameters, it is important to choose the right strategy. Here are three general options:
- Retrain all parameters: Train every internal model parameter, also called full-parameter tuning. This is intuitive and conceptually simple, but it is computationally expensive, and it does not overcome the problem of catastrophic 'forgetting'.
- Transfer Learning (TL): Preserve the useful features the model learned during pre-training while applying it to a new task, typically by leaving most parameters untouched and training only the final layers. This keeps compute costs down, but it still does not mitigate the problem of 'forgetting'.
- Parameter-Efficient Fine-tuning (PEFT): Augment the base model with a relatively small number of trainable parameters while the original weights stay frozen. One popular PEFT method is LoRA (Low-Rank Adaptation).
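A quick bit of arithmetic shows where LoRA's efficiency comes from: instead of updating a full d x k weight matrix, it learns a low-rank update B @ A, with B of shape d x r and A of shape r x k. The dimensions below are illustrative (a 4096-wide layer and rank 8):

```python
# Trainable-parameter count: full fine-tuning vs. a LoRA adapter.

def full_params(d, k):
    # Updating the whole d x k weight matrix.
    return d * k

def lora_params(d, k, r):
    # Only the low-rank factors are trained: B (d x r) plus A (r x k).
    return d * r + r * k

d, k, r = 4096, 4096, 8
full = full_params(d, k)      # 16,777,216 trainable weights
lora = lora_params(d, k, r)   # 65,536 trainable weights
print(f"LoRA trains {100 * lora / full:.2f}% of the full matrix")  # 0.39%
```

Because r is tiny compared to d and k, the adapter trains well under 1% of the original parameters for this layer, and the frozen base weights also sidestep much of the 'forgetting' problem.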
Reinforcement Learning
Reinforcement Learning (RL) is another way to fine-tune a model. RL uses a reward model to guide the training of the base model: the reward model is trained to score language-model completions according to human preference rankings, and the base model is then optimized to produce completions that score highly. In this way, the model learns human preferences.
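The preference-learning step can be sketched in miniature with a pairwise (Bradley-Terry style) loss: the loss is small when the reward model scores the human-preferred completion above the rejected one. The `score` function here is a toy stand-in for a real learned reward model:

```python
import math

# Reward-model preference learning in miniature. score() is a hypothetical
# heuristic standing in for a trained neural reward model.

def score(completion):
    # Toy reward: longer completions score higher.
    return float(len(completion.split()))

def preference_loss(chosen, rejected):
    """-log sigmoid(r_chosen - r_rejected): small when chosen outscores rejected."""
    margin = score(chosen) - score(rejected)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

chosen = "Sure, here is a detailed answer to your question."
rejected = "No."
print(preference_loss(chosen, rejected))  # low loss: reward agrees with the human
print(preference_loss(rejected, chosen))  # high loss: reward disagrees
```

In a real RLHF pipeline, this loss trains the reward model on human-ranked completion pairs; the base model is then fine-tuned (e.g., with a policy-gradient method such as PPO) to maximize the learned reward.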