Questions tagged [gpt]

88 questions
10 votes · 2 answers

Does BERT have any advantage over GPT-3?

I have read a couple of documents that explain in detail the edge that GPT-3 (Generative Pre-trained Transformer 3) has over BERT (Bidirectional Encoder Representations from Transformers). So I am curious to know whether BERT scores better…
Bipin · 203
10 votes · 1 answer

How is GPT able to handle large vocabularies?

From what I understand, GPT and GPT-2 are trained to predict the $N^{th}$ word in a sentence given the previous $N-1$ words. When the vocabulary size is very large (100k+ words), how is it able to generate any meaningful prediction? Shouldn't it…
AAC · 499
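A minimal sketch of why the vocabulary stays tractable, assuming the Hugging Face `transformers` package: GPT-2 uses byte-pair encoding, so the output softmax covers roughly 50k subword units rather than every surface word.

```python
# Sketch: GPT-2's BPE vocabulary is fixed at ~50k subword units, so the final
# softmax never has to cover every possible surface word.
# Assumes the Hugging Face `transformers` package is installed.
from transformers import GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
print(tok.vocab_size)  # 50257, regardless of how many distinct words the corpus has
print(tok.tokenize("antidisestablishmentarianism"))
# A rare word is split into several subword pieces, so the model predicts the
# next *subword*, not the next whole word.
```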
9 votes · 1 answer

How to summarize a long text using GPT-3

What is the best way to summarize a long text that exceeds the 4096-token limit (like a podcast transcript, for example)? As I understand it, I need to split the text into chunks to summarize, then concatenate the results and summarize those. Is there…
Poma · 193
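One common answer is exactly the chunk-then-recombine loop the asker describes. A minimal sketch, where `summarize_chunk` is a hypothetical placeholder for whatever GPT-3 completion call is used:

```python
# Sketch of recursive ("map-reduce") summarization. `summarize_chunk` stands in
# for an actual GPT-3 completion call; it is not a real API function.
def summarize_chunk(text: str) -> str:
    raise NotImplementedError  # e.g. prompt the model with "Summarize:\n" + text

def split_into_chunks(text: str, max_words: int = 2500) -> list[str]:
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def summarize_long(text: str, max_words: int = 2500) -> str:
    # Summarize each chunk, join the partial summaries, and repeat until the
    # remaining text fits into a single prompt.
    while len(text.split()) > max_words:
        parts = [summarize_chunk(c) for c in split_into_chunks(text, max_words)]
        text = "\n".join(parts)
    return summarize_chunk(text)
```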
8 votes · 1 answer

What tokenizer does OpenAI's GPT-3 API use?

I'm building an application for the API, but I would like to be able to count the number of tokens my prompt will use, before I submit an API call. Currently I often submit prompts that yield a 'too-many-tokens' error. The closest I got to an answer…
Herman Autore · 83
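One way to count tokens locally before submitting a prompt is OpenAI's `tiktoken` library; the sketch below assumes it is installed and has an encoding registered for the target model.

```python
# Sketch: count tokens client-side with tiktoken before calling the API.
import tiktoken

enc = tiktoken.encoding_for_model("text-davinci-003")  # GPT-3-era BPE encoding
prompt = "Translate the following sentence into French: Hello, world!"
print(len(enc.encode(prompt)))  # compare against the model's context limit first
```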
8 votes · 1 answer

How does an LLM "parameter" relate to a "weight" in a neural network?

I keep reading about how the latest and greatest LLMs have billions of parameters. As someone who is more familiar with standard neural nets but is trying to better understand LLMs, I'm curious whether an LLM parameter is the same as an NN weight, i.e. is…
slim_wizard · 83
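A short PyTorch sketch of what "parameters" means in practice: they are exactly the trainable tensors (weights and biases), the same objects one counts in a small feed-forward net.

```python
# Sketch: LLM "parameters" are the same kind of thing as ordinary NN weights;
# the parameter count is just the total number of trainable scalars.
import torch.nn as nn

model = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 768))
n_params = sum(p.numel() for p in model.parameters())
print(n_params)  # 768*3072 + 3072 + 3072*768 + 768 = 4,722,432 (weights + biases)
```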
8 votes · 1 answer

BERT vs GPT: architectural, conceptual and implementational differences

In the BERT paper, I learnt that BERT is an encoder-only model, that is, it involves only transformer encoder blocks. In the GPT paper, I learnt that GPT is a decoder-only model, that is, it involves only transformer decoder blocks. I was guessing what's…
Rnj · 205
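A tiny sketch of the core mechanical difference between the two stacks: GPT-style decoder blocks apply a causal attention mask, while BERT-style encoder blocks let every token attend to every other token.

```python
# Sketch of the attention-mask difference between decoder-only (GPT) and
# encoder-only (BERT) blocks.
import torch

seq_len = 5
causal_mask = torch.tril(torch.ones(seq_len, seq_len))  # GPT: token i attends to j <= i
bidirectional_mask = torch.ones(seq_len, seq_len)        # BERT: token i attends to all j
print(causal_mask)
```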
7 votes · 5 answers

ChatGPT's Architecture - Decoder Only? Or Encoder-Decoder?

Does ChatGPT use an encoder-decoder architecture, or a decoder-only architecture? I have been coming across Medium and TowardsDataScience articles suggesting that it has an encoder-decoder architecture (see sources below): --…
user141493 · 191
7 votes · 1 answer

How Exactly Does In-Context Few-Shot Learning Actually Work in Theory (Under the Hood), Despite only Having a "Few" Support Examples to "Train On"?

Recent models like the GPT-3 Language Model (Brown et al., 2020) and the Flamingo Visual-Language Model (Alayrac et al., 2022) use in-context few-shot learning. The models are able to make highly accurate predictions even when only presented with a…
user141493 · 191
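For readers new to the term, a minimal illustration of what "in-context few-shot learning" looks like: the support examples are simply placed in the prompt, and the frozen model conditions on them; no weights are updated.

```python
# Sketch: the "few shots" are just examples embedded in the prompt; the only
# "training" is the model's forward pass over this context.
few_shot_prompt = """\
Review: The plot dragged and the acting was wooden. Sentiment: negative
Review: A delightful surprise from start to finish. Sentiment: positive
Review: I walked out halfway through. Sentiment:"""
# Send few_shot_prompt to the model and read off the next predicted token.
```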
5 votes · 5 answers

Is using GPT-4 to label data advisable?

If I have a lot of text data that needs to be labeled (e.g. sentiment analysis), and given the high accuracy of GPT-4, could I use it to label data? Or would that introduce bias or some other issues?
4 votes · 1 answer

What's the right input for GPT-2 in NLP?

I'm fine-tuning pre-trained GPT-2 for text summarization. The dataset contains 'text' and 'reference summary'. So my question is how to add special tokens to get the right input format. Currently I'm thinking of doing it like this: example1 text …
yuqiong11 · 61
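A minimal sketch of one common (not the only) input format for GPT-2 summarization fine-tuning with Hugging Face `transformers`: join text and summary with a custom separator token and end with EOS. The token name `<|summarize|>` is an arbitrary choice for this example.

```python
# Sketch: one common input format for GPT-2 summarization fine-tuning.
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.add_special_tokens({"additional_special_tokens": ["<|summarize|>"]})

model = GPT2LMHeadModel.from_pretrained("gpt2")
model.resize_token_embeddings(len(tokenizer))  # make room for the new token

example = "long source text ... <|summarize|> reference summary" + tokenizer.eos_token
input_ids = tokenizer(example, return_tensors="pt").input_ids
# For causal-LM fine-tuning the labels are the same ids: the model learns to
# continue the text after <|summarize|> with the summary.
```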
4 votes · 2 answers

ChatGPT: How to use long texts in prompt?

I like the website chatpdf.com a lot. You can upload a PDF file and then discuss the textual content of the file with the file "itself". It uses ChatGPT. I would like to program something similar. But I wonder how to use the content of long PDF…
meyer_mit_ai · 63
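What tools like chatpdf.com typically do is retrieval, not one giant prompt. A minimal sketch, where `embed` is a hypothetical placeholder for any text-embedding model:

```python
# Sketch: chunk the PDF text, embed each chunk once, then at question time put
# only the most similar chunks into the ChatGPT prompt. `embed` is a placeholder.
import numpy as np

def embed(text: str) -> np.ndarray:
    raise NotImplementedError  # e.g. an embeddings endpoint or sentence-transformers

def build_prompt(question: str, chunks: list[str], k: int = 3) -> str:
    q = embed(question)
    vecs = [embed(c) for c in chunks]
    scores = [float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v))) for v in vecs]
    best = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)[:k]
    context = "\n\n".join(chunks[i] for i in best)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```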
4 votes · 2 answers

Does fine-tuning require retraining the entire model?

Would it be necessary to retrain the entire model if we were to perform fine-tuning? Let's say we somehow got the GPT-3 model from OpenAI (I know GPT-3 is closed source). Would anyone with access to a couple of RTX 3080 GPUs be able to fine-tune it…
Exploring · 125
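A short sketch of the usual answer: fine-tuning does not have to touch every weight. Freezing most of the network (or using adapters/LoRA) keeps memory needs far below full retraining. Shown on GPT-2 here, since GPT-3's weights are not available.

```python
# Sketch: freeze a pretrained model and fine-tune only the top blocks.
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")
for param in model.parameters():
    param.requires_grad = False                     # freeze everything

for param in model.transformer.h[-2:].parameters():
    param.requires_grad = True                      # un-freeze only the last two blocks

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)                                    # a small fraction of the ~124M total
```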
3 votes · 2 answers

Does the transformer decoder reuse previous tokens' intermediate states like GPT2?

I recently read Jay Alammar's blog post about GPT-2 (http://jalammar.github.io/illustrated-gpt2/), which I found quite clear apart from one point: he explains that the decoder of GPT-2 processes input tokens one at a time, only actively processing…
Johncowk · 195
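A minimal sketch of the cached-state mechanism the blog post describes, using Hugging Face's `past_key_values`: at generation time only the newest token is fed in, and the keys/values already computed for earlier positions are reused.

```python
# Sketch: GPT-2 reuses cached keys/values instead of re-processing old tokens.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

ids = tok("The decoder reuses", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(ids, use_cache=True)                 # full pass, K/V cached per layer
    next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)
    out2 = model(next_id, past_key_values=out.past_key_values, use_cache=True)
    # Only the single new token is processed; earlier states come from the cache.
```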
3 votes · 2 answers

How to generate a sentence with exactly N words?

Thanks to the pretrained GPT-2 model, it is now possible to generate meaningful sequences of words with or without a prefix. However, a sentence should end with a proper ending (., !, ?). I am just wondering how to generate a sentence (with a proper ending) of…
user185597 · 31
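A crude but workable sketch: sample completions and keep only those whose first sentence has exactly N words and proper ending punctuation (constrained decoding would be more efficient; this just illustrates the idea).

```python
# Sketch: rejection sampling for a sentence with exactly n words ending in . ! or ?
import re
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sentence_with_n_words(prefix: str, n: int, tries: int = 50) -> str | None:
    ids = tok(prefix, return_tensors="pt").input_ids
    for _ in range(tries):
        out = model.generate(ids, do_sample=True, top_p=0.9,
                             max_new_tokens=4 * n, pad_token_id=tok.eos_token_id)
        text = tok.decode(out[0], skip_special_tokens=True)
        first = re.match(r"[^.!?]*[.!?]", text)      # first complete sentence
        if first and len(first.group(0).split()) == n:
            return first.group(0).strip()
    return None
```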
3 votes · 0 answers

Fine-tune GPT-2 via the Hugging Face API for a domain-specific LM

I am using the script in the examples folder to fine-tune the LM for a bot meant to deal with insurance-related queries. So if someone were to type "I am looking to modify my ...", the autocomplete suggestions would be "modify my name", "modify…
Vikram Murthy · 328
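For reference, the fine-tuning the question describes can also be done directly with the Trainer API (the `run_clm.py` example script wraps essentially the same steps). A minimal sketch; `insurance_queries.txt` is a hypothetical file name.

```python
# Sketch: causal-LM fine-tuning of GPT-2 on a domain-specific text file.
from datasets import load_dataset
from transformers import (DataCollatorForLanguageModeling, GPT2LMHeadModel,
                          GPT2Tokenizer, Trainer, TrainingArguments)

tok = GPT2Tokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
model = GPT2LMHeadModel.from_pretrained("gpt2")

data = load_dataset("text", data_files={"train": "insurance_queries.txt"})
data = data.map(lambda b: tok(b["text"], truncation=True, max_length=128),
                batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-insurance", num_train_epochs=3),
    train_dataset=data["train"],
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```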