TY - JOUR
T1 - Named entity recognition in Chinese medical records based on cascaded conditional random field
AU - Yan, Yang
AU - Wen, Dun Wei
AU - Wang, Yun Ji
AU - Wang, Ke
N1 - Publisher Copyright: © 2014, Editorial Board of Jilin University. All rights reserved.
PY - 2014/11/1
Y1 - 2014/11/1
AB - A new method for named entity recognition in Chinese medical records based on cascaded Conditional Random Fields (CRFs) is proposed. The first layer of the cascaded CRFs identifies the basic named entities of body parts and diseases. The identified results are then fed to the second layer, which recognizes nested named entities for complex diseases and clinical symptoms. A new combination feature, composed of part-of-speech features and named entity features, is defined. This new feature, together with the character features, word boundary features and context features in a sentence, forms the feature set of the second layer. In experiments based on CRF++, the proposed method yields a 3% higher F-score than the cascaded CRF without the combination feature. Moreover, compared with the single-layer CRF method, it yields a 7% higher F-score, a significant increase in overall performance.
KW - Cascaded conditional random field
KW - Chinese medical records
KW - Conditional random field
KW - Information processing
KW - Named entity recognition
UR - http://www.scopus.com/inward/record.url?scp=84912527492&partnerID=8YFLogxK
U2 - 10.13229/j.cnki.jdxbgxb201406047
DO - 10.13229/j.cnki.jdxbgxb201406047
M3 - Journal Article
AN - SCOPUS:84912527492
SN - 1671-5497
VL - 44
SP - 1843
EP - 1848
JO - Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition)
JF - Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition)
IS - 6
ER -