Mention some of the commonly used evaluation metrics proposed by NER evaluation forums and explain them.
Many evaluation schemes have been developed to measure the accuracy and overall performance of NER approaches. Some of the best known were introduced by the following evaluation forums:
a) CoNLL - Conference on Computational Natural Language Learning
b) ACE - Automatic Content Extraction
c) MUC - Message Understanding Conference
d) SemEval - Semantic Evaluation
Conference on Computational Natural Language Learning
The CoNLL evaluation reports all three standard metrics - precision, recall and F1-score - computed by comparing the system output against the gold annotation under the following scenarios (a short sketch after the list shows the computation):
a) Correct: both the entity type and the surface string match the gold annotation
b) Hypothesized: the system predicts an entity that does not appear in the gold annotation (a false positive)
c) Missed: the system fails to predict an entity that appears in the gold annotation (a false negative)
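As a concrete illustration, here is a minimal sketch of this exact-match scoring in Python. The (type, start, end) span tuples and the example entities are a hypothetical convention for this sketch, not part of any official scorer.

    # Minimal sketch of CoNLL-style exact-match scoring.
    # An entity is a hypothetical (type, start, end) span; a prediction
    # counts as correct only when both the type and the span match.
    def conll_scores(gold, predicted):
        gold, predicted = set(gold), set(predicted)
        correct = gold & predicted            # scenario a: full match
        hypothesized = predicted - gold       # scenario b: false positives
        missed = gold - predicted             # scenario c: false negatives
        precision = (len(correct) / (len(correct) + len(hypothesized))
                     if predicted else 0.0)
        recall = (len(correct) / (len(correct) + len(missed))
                  if gold else 0.0)
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        return precision, recall, f1

    gold = [("PER", 0, 2), ("LOC", 5, 6)]
    pred = [("PER", 0, 2), ("ORG", 5, 6)]     # wrong type on the second entity
    print(conll_scores(gold, pred))           # (0.5, 0.5, 0.5)

Note that the second prediction gets no credit at all: under CoNLL scoring a correct surface string with the wrong type is simply an error.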
Message Understanding Conference
The Message Understanding Conference (MUC) evaluation compares each system response against the gold annotation and classifies it into one of the following categories (a sketch after the list shows one way to compute them):
a) Correct (COR): the response and the gold annotation match exactly
b) Incorrect (INC): the response and the gold annotation do not match
c) Partial (PAR): the response and the gold annotation match only partially, e.g. their boundaries overlap
d) Missing (MIS): an entity in the gold annotation has no corresponding response
e) Spurious (SPU): a response has no corresponding entity in the gold annotation
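Below is a minimal sketch of how responses can be binned into these categories, again using the hypothetical (type, start, end) spans from the earlier example; a full MUC scorer is more involved (it scores type and boundaries separately and handles multiple overlapping candidates), so treat this as an illustration only.

    # Minimal sketch of MUC-style response classification.
    def overlaps(a, b):
        return a[1] < b[2] and b[1] < a[2]    # spans share at least one token

    def muc_classify(gold, predicted):
        counts = {"COR": 0, "INC": 0, "PAR": 0, "MIS": 0, "SPU": 0}
        matched = set()
        for p in predicted:
            g = next((g for g in gold if overlaps(g, p)), None)
            if g is None:
                counts["SPU"] += 1            # no gold counterpart at all
                continue
            matched.add(g)
            if g[1:] == p[1:]:                # identical boundaries
                counts["COR" if g[0] == p[0] else "INC"] += 1
            else:
                counts["PAR"] += 1            # boundaries only overlap
        counts["MIS"] = sum(1 for g in gold if g not in matched)
        return counts

    gold = [("PER", 0, 2), ("LOC", 5, 6), ("ORG", 8, 10)]
    pred = [("PER", 0, 2), ("PER", 5, 6), ("ORG", 8, 9), ("LOC", 12, 13)]
    print(muc_classify(gold, pred))
    # {'COR': 1, 'INC': 1, 'PAR': 1, 'MIS': 0, 'SPU': 1}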
Semantic Evaluation (SemEval)
SemEval (Semantic Evaluation) is a series of evaluation exercises on semantic analysis, aimed at exploring the meaning of words in a sentence or document. For NER it introduced four ways of matching a system entity against the gold annotation:
Strict evaluation - exact match of both the surface string (boundaries) and the entity type
Exact evaluation - exact match of the surface string, regardless of entity type
Partial evaluation - partial match of the surface string, regardless of entity type
Type evaluation - some overlap between the system entity and the gold annotation, with matching entity type
To show how the SemEval modes work in practice, the minimal sketch below computes all four for a single gold/predicted pair; the (type, start, end) span representation is the same hypothetical convention used in the earlier examples, not part of the official SemEval scorer.
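    # Minimal sketch of the four SemEval match modes for one
    # gold/predicted pair of hypothetical (type, start, end) spans.
    def semeval_matches(gold, pred):
        same_type = gold[0] == pred[0]
        same_span = gold[1:] == pred[1:]
        overlap = gold[1] < pred[2] and pred[1] < gold[2]
        return {
            "strict": same_span and same_type,  # boundaries and type match
            "exact": same_span,                 # boundaries match, any type
            "partial": overlap,                 # boundaries overlap, any type
            "type": overlap and same_type,      # overlap with the right type
        }

    print(semeval_matches(("LOC", 5, 7), ("LOC", 5, 6)))
    # {'strict': False, 'exact': False, 'partial': True, 'type': True}

In this sketch a strict match also satisfies the other three modes, which reflects how the SemEval criteria nest from the most demanding (strict) to the most lenient (partial).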