Automatic annotation for medical texts based on hidden topic and semantic tree

Bo Li, Dun Wei Wen, Ke Wang, Jing Xin Liu

Research output: Contribution to journal › Journal Article › peer-review


Medical texts lack a quantifiable data structure, so processing methods based on text keyword models are impractical. Building on research into latent semantic associations between words and into keyword tree structures, a semantic analysis model based on a latent semantic tree was constructed for medical text data mining. Hidden topics were then linked to the latent semantic analysis, and a text processing method was designed based on latent Dirichlet allocation and the latent semantic tree model; it generates readable automatic annotations suited to different types of medical texts. These annotations are less subjective, more accurate and more readable than those produced by the keyword model method. The method can also assist doctors with text annotation and classification, reducing their workload. Experimental results show that the method can be applied to medical image views to form diagnostic opinions and to patient medical records to produce symptomatic prescriptions. The semantic matching degree of the annotations is 67.7%, and the readability of the text can reach 60.02%.

Original language: English
Pages (from-to): 234-239
Number of pages: 6
Journal: Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition)
Issue number: 1
Publication status: Published - Jan. 2012


  • Automatic annotation
  • Information processing
  • Latent Dirichlet allocation
  • Latent semantic analysis
  • Medical texts
  • Semantic tree

