5

I am trying to train custom entities using spaCy. During training I get a number of values such as LOSS, SCORE, etc. What do these values mean?

============================= Training pipeline =============================
ℹ Pipeline: ['tok2vec', 'ner']
ℹ Initial learn rate: 0.001
E    #       LOSS TOK2VEC  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE 
---  ------  ------------  --------  ------  ------  ------  ------
  0       0          0.00    278.46    0.00    0.00    0.00    0.00
 20     200       3647.33  10920.67   91.75   93.68   89.90    0.92
 40     400         92.82    679.78   98.21   99.48   96.97    0.98
 60     600         66.59    274.91   98.98  100.00   97.98    0.99
 80     800         87.59    252.62   98.98   99.49   98.48    0.99
Aniiya0978

1 Answer

3

The values for LOSS TOK2VEC and LOSS NER are the training loss values for the token-to-vector (`tok2vec`) and named entity recognition (`ner`) components in your pipeline. The ENTS_F, ENTS_P, and ENTS_R columns show the F-score, precision, and recall for the named entity recognition task (see also the items under the 'Accuracy Evaluation' block on this link). The SCORE column shows the overall score of the pipeline, which may or may not be weighted more heavily toward specific subtasks.
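
To make the relationship between these columns concrete, here is a minimal sketch in plain Python (it does not call spaCy). It assumes ENTS_F is the usual harmonic mean of precision and recall, and that SCORE is a weighted sum of the metrics on a 0 to 1 scale. The function names and the `assumed_weights` dictionary are hypothetical: the weights put everything on `ents_f`, which is consistent with the log in the question, but the real weights come from your config file.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted sum of the metrics named in `weights`; missing metrics count as 0."""
    return sum(scores.get(name, 0.0) * weight for name, weight in weights.items())


# ENTS_P and ENTS_R from the step-200 row of the log in the question (percentages).
ents_p, ents_r = 93.68, 89.90
ents_f = f_score(ents_p, ents_r)
print(round(ents_f, 2))  # 91.75, matching the ENTS_F column

# ASSUMPTION: all weight on ents_f; check the score weights in your own config.
assumed_weights = {"ents_f": 1.0, "ents_p": 0.0, "ents_r": 0.0}
scores = {"ents_f": ents_f / 100, "ents_p": ents_p / 100, "ents_r": ents_r / 100}
print(round(weighted_score(scores, assumed_weights), 2))  # 0.92, matching SCORE
```

Under these assumptions the other rows of the log work out the same way, for example precision 99.48 and recall 96.97 give an F-score of 98.21. If your config assigns non-zero weights to several metrics or components, SCORE becomes a blend of them rather than a copy of ENTS_F.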

Oxbowerce
  • Good! In this case: "E # LOSS TOK2VEC LOSS MORPH... POS_ACC MORPH_ACC SCORE", what do "POS_ACC" and "MORPH_ACC" mean? What is the difference? Thanks. – user140259 Dec 20 '21 at 20:56
  • `POS_ACC` refers to the accuracy of your model for the [part-of-speech tag](https://spacy.io/usage/linguistic-features#pos-tagging), and `MORPH_ACC` refers to the accuracy of your model on the [morphological features](https://spacy.io/usage/linguistic-features#morphology). – Oxbowerce Dec 21 '21 at 08:27
  • Is there a way to get the losses / scores for a *loaded* model (i.e., for a previously trained model, not one that is training right now)? – Yoav Vollansky Mar 03 '22 at 13:25
  • What are the `E` and `#` columns? Where can I find the docs about these columns? What is the formula for `SCORE`? @oxbowerce – SteveS Sep 14 '22 at 20:19
  • @SteveS The `E` and `#` columns are the number of epochs and the number of steps, respectively (see also [the source code](https://github.com/explosion/spaCy/blob/6723d76f24a55f24ef1632ac8be46567a984d0ef/spacy/training/loggers.py)). With regard to the `SCORE` column, how it is calculated depends on the score weights defined in the config file. For more info on this, see [the documentation on weighted scores](https://spacy.io/usage/training#metrics). – Oxbowerce Sep 15 '22 at 07:53
  • Thanks a lot @Oxbowerce!!!! – SteveS Sep 15 '22 at 08:56