
My task is to build a model for QA purposes. I only have ~200 samples in a specific domain of questions. Using a pretrained model like DeBERTa without any further changes results in F1 scores of ~35%.
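(For context, the F1 I'm reporting is the usual SQuAD-style token-overlap F1 between the predicted and gold answer spans; a minimal sketch, with the text normalization simplified:)

```python
from collections import Counter

def qa_f1(prediction: str, gold: str) -> float:
    """Token-level F1 between a predicted and a gold answer string,
    as in SQuAD-style QA evaluation (lowercasing only; the official
    script also strips punctuation and articles)."""
    pred_toks = prediction.lower().split()
    gold_toks = gold.lower().split()
    # Multiset intersection counts tokens shared by both answers.
    common = Counter(pred_toks) & Counter(gold_toks)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_toks)
    recall = num_same / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

print(qa_f1("the red car", "a red car"))  # 2 shared tokens of 3 each -> F1 = 2/3
```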

To improve on this, I tried to fine-tune the model on my additional data. This very quickly results in an unusable model with scores of 1-3%. Freezing all or most layers except the last only delays the decrease. I also tried opendelta's BitFit, LoRA, and Adapter models, which modify the original model in a small way and freeze all other parameters. Unfortunately, that didn't help the performance either.
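(To clarify what I mean by these parameter-efficient approaches: the idea behind LoRA is to freeze the pretrained weights and train only a small low-rank update. A minimal plain-PyTorch sketch of that idea, not opendelta's actual API; the class name and the `r`/`alpha` values are just illustrative:)

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B A x, with only A and B trained."""
    def __init__(self, base: nn.Linear, r: int = 2, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        # A is small-random, B is zero-initialized, so the update
        # starts at zero and the wrapped layer initially behaves
        # exactly like the pretrained one.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

base = nn.Linear(16, 16)
lora = LoRALinear(base, r=2)
trainable = sum(p.numel() for p in lora.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in lora.parameters() if not p.requires_grad)
print(trainable, frozen)  # 64 trainable vs. 272 frozen parameters
```

Even with only these few parameters trainable, my scores still collapse, which is what puzzles me.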

Any ideas on what else I could try or do?

  • Either your test and training data are very different, in which case you need to get better training data, or you are doing your fine-tuning wrong, in which case you need to learn more. – Valentas Jun 08 '23 at 19:16

0 Answers