Questions tagged [pretraining]

32 questions
5
votes
2 answers

Further Training a pre-trained LLM

My goal is to use the general knowledge and language understanding of a pre-trained LLM and to continue training on a smaller domain-specific corpus to improve the model's knowledge of the domain. What is the best practice approach here without…
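A minimal sketch of what such continued (domain-adaptive) pretraining could look like with the Hugging Face Trainer; the base model name `gpt2`, the file `domain_corpus.txt`, and the hyperparameters are placeholder assumptions, not part of the question:

```python
# Minimal sketch of continued (domain-adaptive) pretraining with Hugging Face
# transformers. The model name and "domain_corpus.txt" are placeholders.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

model_name = "gpt2"  # stand-in for whichever pretrained LLM is being continued
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Plain-text domain corpus, one document per line (hypothetical file).
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

# mlm=False -> ordinary next-token (causal LM) objective, i.e. continued pretraining.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(output_dir="domain-adapted-lm",
                         num_train_epochs=1,
                         per_device_train_batch_size=4,
                         learning_rate=5e-5)
Trainer(model=model, args=args, train_dataset=dataset,
        data_collator=collator).train()
```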
3
votes
1 answer

Are there any objections to using the same (unlabelled) data for pre-training of a BERT-based model and the downstream task?

I'm looking to train an Electra model using unlabelled data in a specific field. Are there any objections to using the same data for unsupervised learning and then using the same data downstream for the supervised learning task?
user103134
  • 31
  • 1
2
votes
1 answer

Fine-tuning pre-trained Word2Vec model with Gensim 4.0

With Gensim < 4.0, we can retrain a word2vec model using the following code: model = Word2Vec.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True) model.train(my_corpus, total_examples=len(my_corpus),…
NST
  • 51
  • 4
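In Gensim 4.x, `load_word2vec_format` only yields `KeyedVectors` with no training state, so further training needs a workaround. One common sketch is to seed a fresh `Word2Vec` model with the pretrained vectors and then train; `my_corpus` below is a placeholder list of tokenised sentences:

```python
# Rough sketch for Gensim 4.x: the GoogleNews file can only be loaded as
# KeyedVectors, so we seed a new Word2Vec model with those vectors and
# continue training on our own corpus. `my_corpus` is a placeholder.
from gensim.models import Word2Vec, KeyedVectors

my_corpus = [["a", "tokenised", "sentence"], ["another", "one"]]  # placeholder

pretrained = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

model = Word2Vec(vector_size=300, min_count=1)
model.build_vocab(my_corpus)

# Copy pretrained vectors for every word that also appears in the new vocab.
for word in model.wv.key_to_index:
    if word in pretrained.key_to_index:
        model.wv.vectors[model.wv.key_to_index[word]] = pretrained[word]

model.train(my_corpus, total_examples=len(my_corpus), epochs=model.epochs)
```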
2
votes
2 answers

Does finetuning BERT involve updating all of the parameters or just the final classification layer?

Currently learning and reading about transformer models, I get that during the pretraining stage the BERT model is trained on a large corpus via MLM and NSP. But during finetuning, for example trying to classify sentiment based on another text, are…
spnc
  • 21
  • 2
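In standard practice all BERT parameters are updated during fine-tuning, but freezing the encoder and training only the head is also possible. A short sketch of the two options, assuming the Hugging Face `BertForSequenceClassification` wrapper:

```python
# Sketch contrasting full fine-tuning with head-only training, assuming the
# Hugging Face BertForSequenceClassification wrapper.
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# Default behaviour: every parameter (encoder + classifier head) receives gradients.
full_finetune_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

# Alternative: freeze the pretrained encoder and train only the classifier head.
for param in model.bert.parameters():
    param.requires_grad = False
head_only_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f"trainable params, full fine-tune: {full_finetune_params:,}")
print(f"trainable params, head only:      {head_only_params:,}")
```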
2
votes
0 answers

How to improve recall by retraining a model on its feedback

I am creating a supervised model using sensitive and scarce data. For the sake of discussion, I've simplified the problem statement by assuming that I'm creating a model for identifying dogs. Let's say I am creating a model to identify dogs in…
1
vote
0 answers

Working on an image classification project (microscopic images), have some doubts

Currently, I am working on an image classification project. The data set contains very high resolution images taken via an electron microscope. Hence, I have only a few instances. I have done EDA and built a deep CNN for the task. The…
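With very few images, a common starting point is transfer learning: freeze an ImageNet-pretrained backbone, train only a new head, and rely on augmentation. A hedged sketch using torchvision; the data path `data/train`, the class count, and the choice of ResNet18 are placeholder assumptions:

```python
# One common route with few images: freeze an ImageNet backbone, replace the
# classifier head, and rely on augmentation. Paths and class count are placeholders.
import torch
from torch import nn
from torchvision import datasets, models, transforms

num_classes = 3  # placeholder
train_tfms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
train_ds = datasets.ImageFolder("data/train", transform=train_tfms)  # hypothetical path
loader = torch.utils.data.DataLoader(train_ds, batch_size=8, shuffle=True)

# Weight-enum name assumes torchvision >= 0.13.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, num_classes)  # only the new head trains

opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for images, labels in loader:
    opt.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    opt.step()
```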
1
vote
1 answer

What are the differences between self-supervised and semi-supervised learning in NLP?

GPT-1 mentions both semi-supervised learning and unsupervised pre-training, but they seem like the same thing to me. Moreover, "Semi-supervised Sequence Learning" by Dai and Le also looks more like self-supervised learning. So what are the key differences between…
1
vote
2 answers

Would there be any reason to pretrain BERT on specific texts?

So the official BERT English model is trained on Wikipedia and BookCorpus (source). Now, for example, let's say I want to use BERT for movie tag recommendation. Is there any reason for me to pretrain a new BERT model from scratch on movie-related…
Moradnejad
  • 265
  • 1
  • 2
  • 6
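A middle ground between reusing the stock model and pretraining from scratch is domain-adaptive pretraining: continue BERT's masked-LM objective on movie text before fine-tuning. A rough sketch with Hugging Face transformers; the corpus file `movie_plots.txt` and the hyperparameters are placeholders:

```python
# Sketch of domain-adaptive pretraining: continue BERT's masked-LM objective on
# movie-related text rather than pretraining from scratch. "movie_plots.txt" is
# a placeholder corpus file.
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

dataset = load_dataset("text", data_files={"train": "movie_plots.txt"})["train"]
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=256),
                      batched=True, remove_columns=["text"])

# mlm=True keeps BERT's original masked-token objective.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)
args = TrainingArguments(output_dir="bert-movies", num_train_epochs=1,
                         per_device_train_batch_size=8)
Trainer(model=model, args=args, train_dataset=dataset, data_collator=collator).train()
```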
1
vote
1 answer

How to access GPT-3, BERT, or the like?

I am interested in accessing NLP models mentioned in scientific papers, to replicate some results and experiment. But I only see waiting lists (https://openai.com/blog/openai-api/) and licenses granted in large commercial deals…
user305883
  • 165
  • 9
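GPT-3 is gated behind the OpenAI API, but BERT and most similar open models can be downloaded directly from the Hugging Face Hub with no waiting list. A minimal example:

```python
# BERT and many similar models are openly downloadable from the Hugging Face Hub.
# Minimal sketch using the fill-mask pipeline; the first call downloads the weights.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
print(fill_mask("Pretraining a language model requires a lot of [MASK]."))
```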
1
vote
2 answers

Deploying multiple pre-trained models (tar.gz files) on SageMaker in a single endpoint

We have followed the following steps: Trained 5 TensorFlow models on a local machine using 5 different training sets. Saved those in .h5 format. Converted those into tar.gz (Model1.tar.gz,...Model5.tar.gz) and uploaded them to the S3…
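One way to serve several `tar.gz` archives behind a single endpoint is a SageMaker multi-model endpoint via `MultiDataModel` in the SageMaker Python SDK v2. A heavily hedged sketch; the S3 prefix, role ARN, framework version, and instance type are placeholders, and details vary with SDK and container versions:

```python
# Heavily hedged sketch of a SageMaker multi-model endpoint (SageMaker Python SDK v2).
# S3 prefix, role ARN, framework version and instance type are placeholders.
import sagemaker
from sagemaker.multidatamodel import MultiDataModel
from sagemaker.tensorflow import TensorFlowModel

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"   # placeholder role ARN
prefix = "s3://my-bucket/models/"                        # holds Model1.tar.gz ... Model5.tar.gz

# One model definition supplies the serving container that all archives share.
container_model = TensorFlowModel(model_data=prefix + "Model1.tar.gz",
                                  role=role, framework_version="2.8",
                                  sagemaker_session=session)

mme = MultiDataModel(name="tf-multi-model",
                     model_data_prefix=prefix,
                     model=container_model,
                     sagemaker_session=session)

predictor = mme.deploy(initial_instance_count=1,
                       instance_type="ml.m5.large",
                       endpoint_name="tf-multi-endpoint")

# At inference time the archive is chosen per request via the TargetModel
# invocation parameter (e.g. "Model2.tar.gz"); how it is passed to predict()
# depends on the predictor class and SDK version.
```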
1
vote
2 answers

Semantic segmentation with greyscale images

I'm trying to reproduce a piece of research with greyscale images instead of colour images. I have found that there are networks pre-trained on ImageNet, like VGG16. But that dataset has colour images, and I can't use it because I'm going to use greyscale…
VansFannel
  • 229
  • 1
  • 11
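Two common ways to reuse an ImageNet-pretrained VGG16 with greyscale input are to repeat the single channel three times, or to replace the first convolution and collapse its pretrained RGB filters. A sketch with torchvision; the weights-enum name assumes torchvision >= 0.13:

```python
# Two common ways to reuse ImageNet-pretrained VGG16 with greyscale input.
import torch
from torchvision import models

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

# Option 1: leave the network untouched and repeat the grey channel 3 times.
grey_batch = torch.rand(4, 1, 224, 224)
out = vgg(grey_batch.repeat(1, 3, 1, 1))

# Option 2: swap the first conv for a 1-channel version and reuse the pretrained
# filters by summing them over the RGB dimension.
first = vgg.features[0]                      # Conv2d(3, 64, kernel_size=3, padding=1)
new_first = torch.nn.Conv2d(1, first.out_channels,
                            kernel_size=first.kernel_size,
                            stride=first.stride, padding=first.padding)
with torch.no_grad():
    new_first.weight.copy_(first.weight.sum(dim=1, keepdim=True))
    new_first.bias.copy_(first.bias)
vgg.features[0] = new_first
out = vgg(grey_batch)
```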
1
vote
0 answers

Is there an established practice for training a language-to-code transformer (multi-modal transformer) using uni-modal pretrained transformers?

Language-to-code transformation/generation requires multiple skills: language and reasoning skills to digest the core problem from the natural language specification, and programming language knowledge. There are separate pre-trained models for the…
TomR
  • 141
  • 4
1
vote
0 answers

Is it possible to "fine-tune" a pre-trained logistic regression model?

Fine-tuning is a concept commonly used in deep learning. We may have a pre-trained model and then fine-tune it to our specific task. Does that apply to simple models, such as logistic regression? For example, let's say I have a dataset with…
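Scikit-learn has two close analogues to fine-tuning a logistic regression: refitting with `warm_start=True`, and incremental updates via `SGDClassifier.partial_fit` with the logistic loss. A sketch on synthetic placeholder data:

```python
# Sketch of the closest scikit-learn analogues to fine-tuning a logistic
# regression. The data below is synthetic placeholder data.
import numpy as np
from sklearn.linear_model import LogisticRegression, SGDClassifier

rng = np.random.default_rng(0)
X_old, y_old = rng.normal(size=(500, 10)), rng.integers(0, 2, 500)
X_new, y_new = rng.normal(size=(100, 10)), rng.integers(0, 2, 100)

# Option 1: warm_start reuses the previous coefficients as the starting point.
clf = LogisticRegression(warm_start=True, max_iter=200)
clf.fit(X_old, y_old)      # "pre-training" on the original data
clf.fit(X_new, y_new)      # refit starts from the learned coefficients

# Option 2: incremental updates with SGD on the logistic loss
# (loss="log_loss" in scikit-learn >= 1.1; older versions use loss="log").
sgd = SGDClassifier(loss="log_loss")
sgd.partial_fit(X_old, y_old, classes=np.array([0, 1]))
sgd.partial_fit(X_new, y_new)  # further updates on the new data
```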
1
vote
1 answer

Pretrained vs. finetuned model

I have a doubt regarding terminology. When dealing with huggingface transformer models, I often read about "using pretrained models for classification" vs. "fine-tuning a pretrained model for classification." I fail to understand what the exact…
lazarea
  • 289
  • 1
  • 11
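Roughly, "using a pretrained model" means taking its frozen outputs as features, while "fine-tuning" means continuing to train its weights on labelled task data. A small sketch of the two usages with Hugging Face transformers:

```python
# Sketch of the two usages the question contrasts, using Hugging Face transformers.
import torch
from transformers import AutoModel, AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# "Using a pretrained model": keep the weights frozen and take its hidden states
# (here the CLS embedding) as fixed features for a separate classifier.
encoder = AutoModel.from_pretrained("bert-base-uncased")
inputs = tokenizer("an example sentence", return_tensors="pt")
with torch.no_grad():
    features = encoder(**inputs).last_hidden_state[:, 0]   # frozen CLS feature vector

# "Fine-tuning a pretrained model": add a classification head and continue
# training all (or most) weights on the labelled task data.
classifier = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
# classifier would then be trained, e.g. with the Trainer API, on labelled examples.
```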
1
vote
1 answer

test data is not a good representation of train data

I have predefined train and test sets. On generating some statistics like value_counts and checking the unique values, I feel that there is a 'lot' of difference between the distributions of the variables. What should be done about this? Suppose I…
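One standard diagnostic here is adversarial validation: train a classifier to distinguish train rows from test rows; an ROC AUC near 0.5 means the two sets look alike, while an AUC near 1.0 confirms the suspected mismatch. A sketch on placeholder data:

```python
# Hedged sketch of adversarial validation: can a classifier tell training rows
# from test rows? X_train / X_test below are placeholders for the real features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 8))          # placeholder for the real train features
X_test = rng.normal(loc=0.5, size=(200, 8))  # placeholder for the real test features

X = np.vstack([X_train, X_test])
is_test = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])

auc = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=0),
                      X, is_test, cv=5, scoring="roc_auc").mean()
print(f"adversarial-validation AUC: {auc:.2f}")  # ~0.5 = similar, ~1.0 = mismatch
```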