Questions tagged [pretraining]
32 questions
5
votes
2 answers
Further Training a pre-trained LLM
My goal is to use the general knowledge and language understanding of a pre-trained LLM and to continue training on a smaller, domain-specific corpus to improve the model's knowledge of the domain. What is the best practice approach here without…
Arthuro
- 81
- 3
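For the continued-pretraining question above, a minimal sketch using the Hugging Face Trainer, assuming a causal LM such as GPT-2 and a plain-text domain file (the checkpoint name, file name and hyperparameters are illustrative placeholders, not a recommendation):

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Start from a general-purpose pretrained checkpoint.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# "domain_corpus.txt" is a placeholder for the smaller domain-specific corpus.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

# Continue training with the same causal-LM objective used during pretraining.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-adapted", num_train_epochs=1,
                           per_device_train_batch_size=4, learning_rate=5e-5),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

A lower learning rate than the original pretraining run and early stopping are typical choices here, to limit catastrophic forgetting of the general-domain knowledge.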
3
votes
1 answer
Are there any objections to using the same (unlabelled) data for pre-training of a BERT-Based model and the downstream task?
I'm looking to train an Electra model using unlabelled data in a specific field. Are there any objections to using the same data for unsupervised learning and then using the same data downstream for the supervised learning task?
user103134
- 31
- 1
2
votes
1 answer
Fine-tuning pre-trained Word2Vec model with Gensim 4.0
With Gensim < 4.0, we can retrain a word2vec model using the following code:
model = Word2Vec.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)
model.train(my_corpus, total_examples=len(my_corpus),…
NST
- 51
- 4
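For reference, Gensim 4.x removed Word2Vec.load_word2vec_format. A common workaround (a sketch, assuming my_corpus is a list of tokenised sentences and the 300-dimensional GoogleNews vectors) is to load the pretrained vectors as KeyedVectors, seed a fresh Word2Vec model with them, and continue training:

from gensim.models import KeyedVectors, Word2Vec

my_corpus = [["a", "domain", "specific", "sentence"]]  # placeholder corpus

# Gensim 4.x: the pretrained vectors load as KeyedVectors,
# which cannot be trained further on their own.
pretrained = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)

# Build a fresh Word2Vec model over the new corpus, then copy pretrained vectors
# for every word that also appears in the GoogleNews vocabulary.
model = Word2Vec(vector_size=300, min_count=1)
model.build_vocab(my_corpus)
for word in model.wv.index_to_key:
    if word in pretrained.key_to_index:
        model.wv.vectors[model.wv.get_index(word)] = pretrained[word]

# Continue training from that initialisation.
model.train(my_corpus, total_examples=model.corpus_count, epochs=model.epochs)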
2
votes
2 answers
Does finetuning BERT involve updating all of the parameters or just the final classification layer?
I am currently learning about transformer models, and I understand that during the pretraining stage the BERT model is trained on a large corpus via MLM and NSP. But during finetuning, for example when trying to classify sentiment on another text, are…
spnc
- 21
- 2
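Both options are possible; a sketch with Hugging Face Transformers (checkpoint and label count are illustrative) contrasting full fine-tuning with freezing the encoder and training only the classification head:

from transformers import AutoModelForSequenceClassification

# Pretrained BERT encoder plus a freshly initialised classification head.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# By default, fine-tuning updates all parameters (encoder and head).
# To train only the final classification layer, freeze the encoder first:
for param in model.bert.parameters():
    param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable:,} / {total:,}")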
2
votes
0 answers
how to improve recall by retraining a model on its feedback
I am creating a supervised model using sensitive and scarce data. For the sake of discussion, I've simplified the problem statement by assuming that I'm creating a model for identifying dogs.
Let's say I am creating a model to identify dogs in…
learnlifelong
- 33
- 3
1
vote
0 answers
Working on an image classification project (microscopic images), have some doubts
Currently, I am working on an image classification project. The data set contains very high resolution images taken via an electron microscope. Hence, I have only a few instances.
I have done EDA and made up a deep CNN to go about it. The…
Aditi
- 11
- 1
1
vote
1 answer
What are the differences between self-supervised and semi-supervised learning in NLP?
GPT-1 mentions both semi-supervised learning and unsupervised pre-training, but they seem the same to me. Moreover, "Semi-supervised Sequence Learning" by Dai and Le also looks more like self-supervised learning. So what are the key differences between…
Inhyeok Yoo
- 33
- 4
1
vote
2 answers
Would there be any reason to pretrain BERT on specific texts?
So the official BERT English model is trained on Wikipedia and BookCorpus (source).
Now, for example, let's say I want to use BERT for movie tag recommendation. Is there any reason for me to pretrain a new BERT model from scratch on movie-related…
Moradnejad
- 265
- 1
- 2
- 6
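One alternative to pretraining from scratch is often called domain-adaptive pretraining: continue BERT's masked-language-modelling objective on the movie-related text before fine-tuning. A sketch of the setup (checkpoint name illustrative):

from transformers import AutoModelForMaskedLM, AutoTokenizer, DataCollatorForLanguageModeling

# Start from the official checkpoint rather than random weights.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# The masked-LM collator reproduces BERT's pretraining objective on the new corpus.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)
# ...tokenise the movie corpus, train with Trainer as usual, then fine-tune the
# adapted checkpoint on the tag-recommendation task.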
1
vote
1 answer
How to access GPT-3, BERT, or the like?
I am interested in accessing NLP models mentioned in scientific papers, to replicate some results and experiment.
But I only see waiting lists (https://openai.com/blog/openai-api/) and licenses granted in large commercial deals…
user305883
- 165
- 9
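BERT and many similar checkpoints are freely downloadable; a sketch using the Hugging Face transformers library (GPT-3 itself is only reachable through the OpenAI API):

from transformers import AutoModel, AutoTokenizer

# Publicly hosted checkpoints download on first use; no waiting list is involved.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Replicating a paper result.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)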
1
vote
2 answers
Deploying multiple pre-trained models (tar.gz files) on SageMaker in a single endpoint
We have followed the following steps:
Trained 5 TensorFlow models on a local machine using 5 different training sets.
Saved those in .h5 format.
Converted those into tar.gz (Model1.tar.gz,...Model5.tar.gz) and uploaded them to the S3…
Subh2608
- 13
- 1
- 4
1
vote
2 answers
Semantic segmentation with greyscale images
I'm trying to reproduce a piece of research with greyscale images instead of colour images.
I have found that there are pre-trained networks, like VGG16, trained on ImageNet. But that dataset has colour images, and I can't use it because I'm going to use greyscale…
VansFannel
- 229
- 1
- 11
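A common workaround (a sketch, assuming Keras/TensorFlow, the stock ImageNet VGG16 and 224x224 inputs) is to replicate the single grey channel three times so the pretrained RGB filters still apply, and use the result as the encoder of a segmentation network:

import tensorflow as tf

# Pretrained VGG16 (ImageNet) expects 3-channel input; greyscale images have 1 channel.
backbone = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                       input_shape=(224, 224, 3))

inputs = tf.keras.Input(shape=(224, 224, 1))                   # greyscale input
rgb = tf.keras.layers.Concatenate()([inputs, inputs, inputs])  # replicate channel -> pseudo-RGB
features = backbone(rgb)                                       # encoder features for a segmentation decoder
encoder = tf.keras.Model(inputs, features)
encoder.summary()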
1
vote
0 answers
Is there an established practice of training a language-to-code transformer (multi-modal transformer) using uni-modal pretrained transformers?
Language-to-code transformation/generation requires multiple skills: language and reasoning skills to digest the core problem from the natural language specification, and programming language knowledge. There are separate pre-trained models for the…
TomR
- 141
- 4
1
vote
0 answers
Is it possible to "fine-tune" a pre-trained logistic regression model?
Fine-tuning is a concept commonly used in deep learning. We may have a pre-trained model and then fine-tune it to our specific task.
Does that apply to simple models, such as logistic regression?
For example, let's say I have a dataset with…
eduardokapp
- 111
- 2
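There is no pretraining/fine-tuning split for plain logistic regression, but the closest scikit-learn analogue (a sketch with made-up arrays) is incremental training: SGDClassifier with logistic loss keeps its learned weights across partial_fit calls:

import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X_old, y_old = rng.normal(size=(1000, 5)), rng.integers(0, 2, 1000)  # "pre-training" data (illustrative)
X_new, y_new = rng.normal(size=(100, 5)), rng.integers(0, 2, 100)    # smaller task-specific data

# Logistic regression fitted by SGD; partial_fit nudges the existing weights
# with the new data instead of refitting from scratch.
clf = SGDClassifier(loss="log_loss", random_state=0)  # loss="log" in scikit-learn < 1.1
clf.partial_fit(X_old, y_old, classes=np.array([0, 1]))  # fit on the original data
clf.partial_fit(X_new, y_new)                            # "fine-tune" on the new data
print(clf.score(X_new, y_new))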
1
vote
1 answer
Pretrained vs. finetuned model
I have a doubt regarding terminology. When dealing with huggingface transformer models, I often read about "using pretrained models for classification" vs. "fine-tuning a pretrained model for classification."
I fail to understand what the exact…
lazarea
- 289
- 1
- 11
1
vote
1 answer
test data is not a good representation of train data
I have predefined train and test sets. On generating some statistics like value_counts and checking the unique values, I feel that there is a 'lot' of difference between the distributions of the variables.
What should be done with this?
Suppose I…
letdatado
- 13
- 3
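One quick way to quantify such a mismatch (a sketch, with placeholder file paths) is a per-variable two-sample Kolmogorov-Smirnov test between the train and test sets:

import pandas as pd
from scipy.stats import ks_2samp

# Predefined splits; placeholder paths, numeric columns only for simplicity.
train_df = pd.read_csv("train.csv")
test_df = pd.read_csv("test.csv")

for col in train_df.select_dtypes("number").columns:
    stat, p_value = ks_2samp(train_df[col].dropna(), test_df[col].dropna())
    print(f"{col}: KS={stat:.3f}, p={p_value:.3g}")  # small p -> distributions likely differ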