
From my understanding, for the pretraining of a GPT model, we need to perform a next-token prediction task.

In this case,

Input -> The GPT models are general-purpose language models that can perform ... (2048 tokens)

Output -> GPT models are general-purpose language models that can perform a ... (2048 tokens)
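
To make the shift concrete, here is a minimal Python sketch of my understanding of the pretraining pair. The word-level token list is only illustrative; a real tokenizer would produce token IDs:

```python
# A minimal sketch of the next-token prediction pair, assuming the text
# is already tokenized; the word-level tokens here are illustrative only.
tokens = ["The", "GPT", "models", "are", "general-purpose",
          "language", "models", "that", "can", "perform"]

inputs = tokens[:-1]    # what the model reads
targets = tokens[1:]    # the same sequence shifted left by one token

for x, y in zip(inputs, targets):
    print(f"after {x!r} -> predict {y!r}")
```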

Using the next-token prediction task, I think we can do pre-training. However, I do not know how to fine-tune the GPT model on question-and-answer pairs.

For example, my question and answer are as follows:

Q) What is the GPT model?

A) The GPT models are general-purpose language models that can perform a broad range of tasks, from creating original content to writing code, summarizing text, and extracting data from documents.

In this case, does the question become the input of the GPT model and the answer become the target output of the model for fine-tuning? Or

do we concatenate the question and answer as below and fine-tune on that (in which case the task becomes next-token prediction again)?

Input -> What is the GPT model? The GPT models are general-purpose language models that can perform a broad range of tasks, from creating original content to writing code, summarizing text, and extracting data from documents.

Output -> is the GPT model? The GPT models are general-purpose language models that can perform a broad range of tasks, from creating original content to writing code, summarizing text, and extracting data from documents. [Padding]
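
In code, I imagine this second option would look something like the sketch below. Whitespace splitting stands in for a real tokenizer, and `max_len` and `[PAD]` are made-up placeholders:

```python
# A rough sketch of the concatenation variant: question and answer form
# one sequence, trained with the same next-token objective as pretraining.
question = "What is the GPT model?"
answer = ("The GPT models are general-purpose language models that can "
          "perform a broad range of tasks, from creating original content "
          "to writing code, summarizing text, and extracting data from "
          "documents.")

tokens = (question + " " + answer).split()  # stand-in for a real tokenizer
max_len = 48            # hypothetical fixed context length
PAD = "[PAD]"

inputs = tokens[:-1]    # the full concatenated sequence
targets = tokens[1:]    # the same sequence shifted left by one token

# Pad both to the fixed length; loss on padding positions is usually masked out.
inputs += [PAD] * (max_len - len(inputs))
targets += [PAD] * (max_len - len(targets))

print(list(zip(inputs, targets))[:5])
```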

I want to know which one is correct for fine-tuning.

I need your support.

Kyuwan