Is it fundamentally sound to train a text classification (sentiment analysis) model with pre-trained word vectors in two phases: first with the embedding layer locked, and then train again with the additional layers locked and the embedding layer unlocked?
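To make the two-phase idea concrete, this is roughly what I mean (a minimal sketch with tf.keras; the architecture, sizes, and dummy data are placeholders, not my real setup):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

vocab_size, embed_dim, seq_len = 2000, 100, 50  # placeholder sizes

# Placeholder pre-trained matrix; in practice it is built from GloVe/word2vec
# vectors (see the next sketch).
embedding_matrix = np.random.rand(vocab_size, embed_dim).astype("float32")

# Dummy data so the sketch runs end to end.
x_train = np.random.randint(0, vocab_size, size=(256, seq_len))
y_train = np.random.randint(0, 2, size=(256,))

model = models.Sequential([
    layers.Embedding(
        vocab_size, embed_dim,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        trainable=False),  # phase 1: embedding layer locked
    layers.Bidirectional(layers.LSTM(64)),
    layers.Dense(1, activation="sigmoid"),
])

# Phase 1: train everything except the embeddings.
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=32)

# Phase 2: lock the other layers, unlock the embeddings, and recompile
# (recompiling is required for the new `trainable` flags to take effect).
for layer in model.layers[1:]:
    layer.trainable = False
model.layers[0].trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=32)
```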
As you know, there are many words that don't exist in the pre-trained word vectors. By initializing their rows of the embedding layer weights with zeros, would the model miss those words, treating them the same as out-of-vocabulary words?
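This is how I understand the embedding matrix is usually assembled (a toy, self-contained sketch; in practice `pretrained` would be parsed from a GloVe-style file and `word_index` comes from the tokenizer):

```python
import numpy as np

embed_dim = 4  # tiny for the sketch; typically 100 or 300

# Hypothetical pre-trained vectors and tokenizer word -> id mapping;
# "zxqvt" stands in for a word with no pre-trained vector.
pretrained = {"good": np.ones(embed_dim), "bad": -np.ones(embed_dim)}
word_index = {"good": 1, "bad": 2, "zxqvt": 3}

# np.zeros means every word missing from the pre-trained vectors keeps an
# all-zero row -- exactly the situation I am asking about.
embedding_matrix = np.zeros((len(word_index) + 1, embed_dim))
for word, i in word_index.items():
    vec = pretrained.get(word)
    if vec is not None:
        embedding_matrix[i] = vec

# While the embedding layer stays locked, those zero rows never update, so
# the model treats the missing words as uninformative, like true OOV tokens.
# Unlocking the layer in a second phase (or initializing the missing rows
# randomly instead of with zeros) would at least let them be learned.
```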
Or how about similar variations: defining a callback to lock and unlock the layers at the end of each epoch, alternating between them, or locking only the dense classifier layers while unlocking the embedding layer and the other ones? I sketched the alternating schedule below.
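Here is the alternating schedule I have in mind, written as an explicit loop of one-epoch fit() calls rather than a literal Keras Callback, since changing `trainable` only takes effect after recompiling, and recompiling from inside a running fit() is fragile (this is a sketch; whether the schedule is a good idea at all is part of my question):

```python
# Alternate which part of the model trains: the non-embedding layers on even
# epochs, the embedding layer on odd epochs. Assumes a model like the one in
# the first sketch, with the embedding as model.layers[0].
def fit_alternating(model, x, y, epochs):
    for epoch in range(epochs):
        train_embedding = (epoch % 2 == 1)
        model.layers[0].trainable = train_embedding
        for layer in model.layers[1:]:
            layer.trainable = not train_embedding
        # Recompile so the new `trainable` flags actually apply; note this
        # also resets the optimizer state each time.
        model.compile(optimizer="adam", loss="binary_crossentropy",
                      metrics=["accuracy"])
        model.fit(x, y, epochs=1, batch_size=32)

# e.g. fit_alternating(model, x_train, y_train, epochs=6)
```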
Thanks for your help.