My task is to build a model for question answering (QA). I only have ~200 samples from a specific question domain. Using a pretrained model like DeBERTa without any further changes results in an F1 score of ~35%.
To improve on this, I tried training the model on my additional data. This very quickly results in an unusable model with scores of 1-3%. Freezing all or most layers except the last only delays the decline. I also tried OpenDelta's BitFit, LoRA, and Adapter methods, which modify the original model in a small way and freeze all other parameters (a minimal sketch of both setups is below). Unfortunately, this didn't help performance either.
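For reference, here is roughly what the two setups look like. The checkpoint name and the LoRA target modules are illustrative placeholders, not my exact configuration:

```python
from transformers import AutoModelForQuestionAnswering

model = AutoModelForQuestionAnswering.from_pretrained("microsoft/deberta-v3-base")

# Variant 1: freeze everything except the last encoder layer and the QA head.
for param in model.deberta.parameters():
    param.requires_grad = False   # freeze the whole backbone
for param in model.deberta.encoder.layer[-1].parameters():
    param.requires_grad = True    # un-freeze only the last transformer block
# model.qa_outputs (the span-prediction head) stays trainable by default.

# Variant 2: OpenDelta LoRA on a fresh copy, freezing all non-delta weights.
from opendelta import LoraModel

model = AutoModelForQuestionAnswering.from_pretrained("microsoft/deberta-v3-base")
delta_model = LoraModel(
    backbone_model=model,
    # module names of DeBERTa-v2's attention projections (example choice)
    modified_modules=["query_proj", "value_proj"],
)
delta_model.freeze_module(exclude=["deltas"])  # train only the injected LoRA weights
delta_model.log()                              # show which parameters remain trainable
```

BitFit and Adapter were attached the same way, just with OpenDelta's `BitFitModel` / `AdapterModel` in place of `LoraModel`.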
Any ideas on what else I could try or do?