What are the issues faced in Named Entity Recognition? How can the problems faced by supervised named entity recognition systems be overcome?
Extracting entities can be difficult because performance depends on factors such as language, text genre, domain, and entity type. In English, NER is comparatively easier because capitalization is a strong clue for spotting entities such as the names of people and places. In languages that lack such cues, for example those whose scripts have no letter case, identifying entities becomes much harder. One way to address this is to develop rule-based approaches tailored to the language in question. Beyond this, other challenges such as spelling variation, non-local dependencies, inconsistent capitalization, and ambiguous text also affect the performance of NER.
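To make the capitalization point concrete, here is a minimal sketch of a naive capitalization heuristic. The function name `capitalized_candidates` and the example sentences are made up for illustration; this is not how a real NER system works, only a way to see why the clue helps in cased English text and vanishes when case information is missing.

```python
def capitalized_candidates(sentence):
    """Return runs of capitalized tokens as entity candidates.

    Naive illustration only: the first token is skipped because it is
    capitalized in English whether or not it is an entity.
    """
    # Whitespace tokenization with surrounding punctuation stripped.
    tokens = [t.strip(".,;:!?\"'()") for t in sentence.split()]
    tokens = [t for t in tokens if t]

    candidates, current = [], []
    for i, tok in enumerate(tokens):
        if i > 0 and tok[0].isupper():
            current.append(tok)  # extend the current capitalized run
        elif current:
            candidates.append(" ".join(current))  # run ended, emit it
            current = []
    if current:
        candidates.append(" ".join(current))
    return candidates


print(capitalized_candidates("Yesterday Barack Obama visited New Delhi."))
# ['Barack Obama', 'New Delhi']
print(capitalized_candidates("yesterday barack obama visited new delhi."))
# []
```

The second call shows the failure mode: once the case cue is gone (lowercased text, or a script without letter case), the heuristic finds nothing, so other evidence such as context words or gazetteers is needed.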
Supervised named entity recognition systems require large annotated corpora to learn how to classify named entities. Annotating the large amount of data needed for training is very time-consuming and requires considerable domain knowledge. To overcome this, semi-supervised labelling can reduce the effort: a small set of labelled seed examples is used as the starting point for labelling and classifying further data.
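The seed-based idea can be sketched as a toy bootstrapping loop. Everything here is an assumption for illustration (the function name `bootstrap`, the use of the single preceding word as context, and the tiny example corpus); real systems use richer patterns and score them to avoid drifting toward wrong entities.

```python
def bootstrap(sentences, seeds, rounds=2):
    """Grow a set of entity names from a few seeds.

    Toy sketch: the word immediately before a known entity is treated
    as a context pattern, and any capitalized word that follows a
    learned pattern is accepted as a new entity.
    """
    known = set(seeds)
    for _ in range(rounds):
        # Step 1: learn context words from occurrences of known entities.
        contexts = set()
        for sent in sentences:
            toks = sent.split()
            for i in range(1, len(toks)):
                if toks[i] in known:
                    contexts.add(toks[i - 1])
        # Step 2: apply the learned contexts to label new entities.
        for sent in sentences:
            toks = sent.split()
            for i in range(1, len(toks)):
                if toks[i - 1] in contexts and toks[i][0].isupper():
                    known.add(toks[i])
    return known


corpus = [
    "She lives in Paris",
    "He works in Berlin",
    "They flew to Berlin",
    "We drove to Madrid",
]
print(sorted(bootstrap(corpus, {"Paris"})))
# ['Berlin', 'Madrid', 'Paris']
```

Starting from the single seed "Paris", the first round learns the context "in" and picks up "Berlin"; the second round learns "to" from "Berlin" and picks up "Madrid". No sentence had to be annotated by hand beyond the seed.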