Questions tagged [pretraining]

32 questions
5
votes
2 answers

Further Training a pre-trained LLM

My goal is to use the general knowledge and language understanding of a pre-trained LLM and to continue training on a smaller domain-specific corpus to improve the model's knowledge of the domain. What is the best practice approach here without…
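A minimal sketch of what such continued (domain-adaptive) pretraining could look like with the Hugging Face Trainer; the base model name `gpt2`, the file `domain_corpus.txt`, and the hyperparameters are placeholder assumptions, not part of the question:

```python
# Minimal sketch of continued (domain-adaptive) pretraining with Hugging Face
# transformers. The model name and "domain_corpus.txt" are placeholders.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

model_name = "gpt2"  # stand-in for whichever pretrained LLM is being continued
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Plain-text domain corpus, one document per line (hypothetical file).
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

# mlm=False -> ordinary next-token (causal LM) objective, i.e. continued pretraining.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(output_dir="domain-adapted-lm",
                         num_train_epochs=1,
                         per_device_train_batch_size=4,
                         learning_rate=5e-5)
Trainer(model=model, args=args, train_dataset=dataset,
        data_collator=collator).train()
```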
3
votes
1 answer

Are there any objections to using the same (unlabelled) data for pre-training of a BERT-based model and the downstream task?

I'm looking to train an Electra model using unlabelled data in a specific field. Are there any objections to using the same data for unsupervised learning and then using the same data downstream for the supervised learning task?
user103134
  • 31
  • 1
2
votes
1 answer

Fine-tuning pre-trained Word2Vec model with Gensim 4.0

With Gensim < 4.0, we can retrain a word2vec model using the following code: model = Word2Vec.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True) model.train(my_corpus, total_examples=len(my_corpus),…
NST
  • 51
  • 4
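In Gensim 4.x, `load_word2vec_format` only yields `KeyedVectors` with no training state, so further training needs a workaround. One common sketch is to seed a fresh `Word2Vec` model with the pretrained vectors and then train; `my_corpus` below is a placeholder list of tokenised sentences:

```python
# Rough sketch for Gensim 4.x: the GoogleNews file can only be loaded as
# KeyedVectors, so we seed a new Word2Vec model with those vectors and
# continue training on our own corpus. `my_corpus` is a placeholder.
from gensim.models import Word2Vec, KeyedVectors

my_corpus = [["a", "tokenised", "sentence"], ["another", "one"]]  # placeholder

pretrained = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

model = Word2Vec(vector_size=300, min_count=1)
model.build_vocab(my_corpus)

# Copy pretrained vectors for every word that also appears in the new vocab.
for word in model.wv.key_to_index:
    if word in pretrained.key_to_index:
        model.wv.vectors[model.wv.key_to_index[word]] = pretrained[word]

model.train(my_corpus, total_examples=len(my_corpus), epochs=model.epochs)
```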
2
votes
2 answers

Does finetuning BERT involve updating all of the parameters or just the final classification layer?

Currently learning and reading about transformer models, I get that during the pretraining stage the BERT model is trained on a large corpus via MLM and NSP. But during finetuning, for example trying to classify sentiment based on another text, are…
spnc
  • 21
  • 2
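In standard practice all BERT parameters are updated during fine-tuning, but freezing the encoder and training only the head is also possible. A short sketch of the two options, assuming the Hugging Face `BertForSequenceClassification` wrapper:

```python
# Sketch contrasting full fine-tuning with head-only training, assuming the
# Hugging Face BertForSequenceClassification wrapper.
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# Default behaviour: every parameter (encoder + classifier head) receives gradients.
full_finetune_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

# Alternative: freeze the pretrained encoder and train only the classifier head.
for param in model.bert.parameters():
    param.requires_grad = False
head_only_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f"trainable params, full fine-tune: {full_finetune_params:,}")
print(f"trainable params, head only:      {head_only_params:,}")
```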
2
votes
0 answers

How to improve recall by retraining a model on its feedback

I am creating a supervised model using sensitive and scarce data. For the sake of discussion, I've simplified the problem statement by assuming that I'm creating a model for identifying dogs. Let's say I am creating a model to identify dogs in…
1
vote
0 answers

Working on an image classification project (microscopic images), have some doubts

Currently, I am working on an image classification project. The data set contains very high resolution images taken via an electron microscope. Hence, I have only a few instances. I have done EDA and built a deep CNN for the task. The…
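With very few images, a common starting point is transfer learning: freeze an ImageNet-pretrained backbone, train only a new head, and rely on augmentation. A hedged sketch using torchvision; the data path `data/train`, the class count, and the choice of ResNet18 are placeholder assumptions:

```python
# One common route with few images: freeze an ImageNet backbone, replace the
# classifier head, and rely on augmentation. Paths and class count are placeholders.
import torch
from torch import nn
from torchvision import datasets, models, transforms

num_classes = 3  # placeholder
train_tfms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
train_ds = datasets.ImageFolder("data/train", transform=train_tfms)  # hypothetical path
loader = torch.utils.data.DataLoader(train_ds, batch_size=8, shuffle=True)

# Weight-enum name assumes torchvision >= 0.13.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, num_classes)  # only the new head trains

opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for images, labels in loader:
    opt.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    opt.step()
```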
1
vote
1 answer

What are the differences between self-supervised and semi-supervised learning in NLP?

GPT-1 mentions both semi-supervised learning and unsupervised pre-training, but they seem like the same thing to me. Moreover, "Semi-supervised Sequence Learning" by Dai and Le also looks more like self-supervised learning. So what are the key differences between…
1
vote
2 answers

Would there be any reason to pretrain BERT on specific texts?

So the official BERT English model is trained on Wikipedia and BookCorpus (source). Now, for example, let's say I want to use BERT for movie tag recommendation. Is there any reason for me to pretrain a new BERT model from scratch on movie-related…
Moradnejad
  • 265
  • 1
  • 2
  • 6
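A middle ground between reusing the stock model and pretraining from scratch is domain-adaptive pretraining: continue BERT's masked-LM objective on movie text before fine-tuning. A rough sketch with Hugging Face transformers; the corpus file `movie_plots.txt` and the hyperparameters are placeholders:

```python
# Sketch of domain-adaptive pretraining: continue BERT's masked-LM objective on
# movie-related text rather than pretraining from scratch. "movie_plots.txt" is
# a placeholder corpus file.
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

dataset = load_dataset("text", data_files={"train": "movie_plots.txt"})["train"]
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=256),
                      batched=True, remove_columns=["text"])

# mlm=True keeps BERT's original masked-token objective.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)
args = TrainingArguments(output_dir="bert-movies", num_train_epochs=1,
                         per_device_train_batch_size=8)
Trainer(model=model, args=args, train_dataset=dataset, data_collator=collator).train()
```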
1
vote
1 answer

How to access GPT-3, BERT, or the like?

I am interested in accessing NLP models mentioned in scientific papers, to replicate some results and experiment. But I only see waiting lists (https://openai.com/blog/openai-api/) and licenses granted in large commercial deals…
user305883
  • 165
  • 9
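GPT-3 is gated behind the OpenAI API, but BERT and most similar open models can be downloaded directly from the Hugging Face Hub with no waiting list. A minimal example:

```python
# BERT and many similar models are openly downloadable from the Hugging Face Hub.
# Minimal sketch using the fill-mask pipeline; the first call downloads the weights.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
print(fill_mask("Pretraining a language model requires a lot of [MASK]."))
```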
1
vote
2 answers

Deploying multiple pre-trained models (tar.gz files) on SageMaker in a single endpoint

We have followed the following steps: Trained 5 TensorFlow models on a local machine using 5 different training sets. Saved those in .h5 format. Converted those into tar.gz (Model1.tar.gz,...Model5.tar.gz) and uploaded them to the S3…
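One way to serve several `tar.gz` archives behind a single endpoint is a SageMaker multi-model endpoint via `MultiDataModel` in the SageMaker Python SDK v2. A heavily hedged sketch; the S3 prefix, role ARN, framework version, and instance type are placeholders, and details vary with SDK and container versions:

```python
# Heavily hedged sketch of a SageMaker multi-model endpoint (SageMaker Python SDK v2).
# S3 prefix, role ARN, framework version and instance type are placeholders.
import sagemaker
from sagemaker.multidatamodel import MultiDataModel
from sagemaker.tensorflow import TensorFlowModel

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"   # placeholder role ARN
prefix = "s3://my-bucket/models/"                        # holds Model1.tar.gz ... Model5.tar.gz

# One model definition supplies the serving container that all archives share.
container_model = TensorFlowModel(model_data=prefix + "Model1.tar.gz",
                                  role=role, framework_version="2.8",
                                  sagemaker_session=session)

mme = MultiDataModel(name="tf-multi-model",
                     model_data_prefix=prefix,
                     model=container_model,
                     sagemaker_session=session)

predictor = mme.deploy(initial_instance_count=1,
                       instance_type="ml.m5.large",
                       endpoint_name="tf-multi-endpoint")

# At inference time the archive is chosen per request via the TargetModel
# invocation parameter (e.g. "Model2.tar.gz"); how it is passed to predict()
# depends on the predictor class and SDK version.
```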
1
vote
2 answers

Semantic segmentation with greyscale images

I'm trying to reproduce a piece of research with greyscale images instead of colour images. I have found that there are networks pre-trained on ImageNet, like VGG16. But that dataset has colour images, and I can't use it because I'm going to use greyscale…
VansFannel
  • 229
  • 1
  • 11
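Two common ways to reuse an ImageNet-pretrained VGG16 with greyscale input are to repeat the single channel three times, or to replace the first convolution and collapse its pretrained RGB filters. A sketch with torchvision; the weights-enum name assumes torchvision >= 0.13:

```python
# Two common ways to reuse ImageNet-pretrained VGG16 with greyscale input.
import torch
from torchvision import models

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

# Option 1: leave the network untouched and repeat the grey channel 3 times.
grey_batch = torch.rand(4, 1, 224, 224)
out = vgg(grey_batch.repeat(1, 3, 1, 1))

# Option 2: swap the first conv for a 1-channel version and reuse the pretrained
# filters by summing them over the RGB dimension.
first = vgg.features[0]                      # Conv2d(3, 64, kernel_size=3, padding=1)
new_first = torch.nn.Conv2d(1, first.out_channels,
                            kernel_size=first.kernel_size,
                            stride=first.stride, padding=first.padding)
with torch.no_grad():
    new_first.weight.copy_(first.weight.sum(dim=1, keepdim=True))
    new_first.bias.copy_(first.bias)
vgg.features[0] = new_first
out = vgg(grey_batch)
```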
1
vote
0 answers

Is there an established practice for training a language-to-code transformer (multi-modal transformer) using uni-modal pretrained transformers?

Language-to-code transformation/generation requires multiple skills: language and reasoning skills to digest the core problem from the natural language specification, and programming language knowledge. There are separate pre-trained models for the…
TomR
  • 141
  • 4
1
vote
0 answers

Is it possible to "fine-tune" a pre-trained logistic regression model?

Fine-tuning is a concept commonly used in deep learning. We may have a pre-trained model and then fine-tune it to our specific task. Does that apply to simple models, such as logistic regression? For example, let's say I have a dataset with…
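Scikit-learn has two close analogues to fine-tuning a logistic regression: refitting with `warm_start=True`, and incremental updates via `SGDClassifier.partial_fit` with the logistic loss. A sketch on synthetic placeholder data:

```python
# Sketch of the closest scikit-learn analogues to fine-tuning a logistic
# regression. The data below is synthetic placeholder data.
import numpy as np
from sklearn.linear_model import LogisticRegression, SGDClassifier

rng = np.random.default_rng(0)
X_old, y_old = rng.normal(size=(500, 10)), rng.integers(0, 2, 500)
X_new, y_new = rng.normal(size=(100, 10)), rng.integers(0, 2, 100)

# Option 1: warm_start reuses the previous coefficients as the starting point.
clf = LogisticRegression(warm_start=True, max_iter=200)
clf.fit(X_old, y_old)      # "pre-training" on the original data
clf.fit(X_new, y_new)      # refit starts from the learned coefficients

# Option 2: incremental updates with SGD on the logistic loss
# (loss="log_loss" in scikit-learn >= 1.1; older versions use loss="log").
sgd = SGDClassifier(loss="log_loss")
sgd.partial_fit(X_old, y_old, classes=np.array([0, 1]))
sgd.partial_fit(X_new, y_new)  # further updates on the new data
```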
1
vote
1 answer

Pretrained vs. finetuned model

I have a doubt regarding terminology. When dealing with huggingface transformer models, I often read about "using pretrained models for classification" vs. "fine-tuning a pretrained model for classification." I fail to understand what the exact…
lazarea
  • 289
  • 1
  • 11
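Roughly, "using a pretrained model" means taking its frozen outputs as features, while "fine-tuning" means continuing to train its weights on labelled task data. A small sketch of the two usages with Hugging Face transformers:

```python
# Sketch of the two usages the question contrasts, using Hugging Face transformers.
import torch
from transformers import AutoModel, AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# "Using a pretrained model": keep the weights frozen and take its hidden states
# (here the CLS embedding) as fixed features for a separate classifier.
encoder = AutoModel.from_pretrained("bert-base-uncased")
inputs = tokenizer("an example sentence", return_tensors="pt")
with torch.no_grad():
    features = encoder(**inputs).last_hidden_state[:, 0]   # frozen CLS feature vector

# "Fine-tuning a pretrained model": add a classification head and continue
# training all (or most) weights on the labelled task data.
classifier = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
# classifier would then be trained, e.g. with the Trainer API, on labelled examples.
```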
1
vote
1 answer

test data is not a good representation of train data

I have predefined train and test sets. On generating some statistics like value_counts and checking the unique values, I feel that there is a 'lot' of difference between the distributions of the variables. What should be done about this? Suppose I…
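One standard diagnostic here is adversarial validation: train a classifier to distinguish train rows from test rows; an ROC AUC near 0.5 means the two sets look alike, while an AUC near 1.0 confirms the suspected mismatch. A sketch on placeholder data:

```python
# Hedged sketch of adversarial validation: can a classifier tell training rows
# from test rows? X_train / X_test below are placeholders for the real features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 8))          # placeholder for the real train features
X_test = rng.normal(loc=0.5, size=(200, 8))  # placeholder for the real test features

X = np.vstack([X_train, X_test])
is_test = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])

auc = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=0),
                      X, is_test, cv=5, scoring="roc_auc").mean()
print(f"adversarial-validation AUC: {auc:.2f}")  # ~0.5 = similar, ~1.0 = mismatch
```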