TY - GEN
T1 - CUP classification based on a tree structure with MiRNA feature selection
AU - Zhang, Xiaoxue
AU - Wen, Dunwei
AU - Wang, Ke
AU - Yang, Yinan
PY - 2013
Y1 - 2013
N2 - Because previous miRNA-based methods for identifying the tissue of origin of cancers have shown low sensitivity, we adopt a decision tree structure to build a new SVM-based model for identifying a variety of Cancer of Unknown Primary Origin (CUP). We use an information-gain-based feature selection method provided by Weka to select miRNAs and combine them with previously recognized features to determine the most useful miRNAs. Next, we design a layer-by-layer classification tree based on the expression levels of the selected miRNAs. We then use a polynomial-kernel SVM classifier, which is more effective in dealing with binary classification problems, for classification at each node of the decision tree structure. In our experiments, the overall sensitivity on the test set reached 87%, and the sensitivity of identifying the metastatic samples in the test set increased significantly, by 9%. The 10-fold cross-validation of this model shows that the sensitivity on the test set is not less than the sensitivity on the training set, indicating that the model generalizes well. Additionally, the use of general feature selection makes the approach more adaptable and suitable for other areas.
AB - Because previous miRNA-based methods for identifying the tissue of origin of cancers have shown low sensitivity, we adopt a decision tree structure to build a new SVM-based model for identifying a variety of Cancer of Unknown Primary Origin (CUP). We use an information-gain-based feature selection method provided by Weka to select miRNAs and combine them with previously recognized features to determine the most useful miRNAs. Next, we design a layer-by-layer classification tree based on the expression levels of the selected miRNAs. We then use a polynomial-kernel SVM classifier, which is more effective in dealing with binary classification problems, for classification at each node of the decision tree structure. In our experiments, the overall sensitivity on the test set reached 87%, and the sensitivity of identifying the metastatic samples in the test set increased significantly, by 9%. The 10-fold cross-validation of this model shows that the sensitivity on the test set is not less than the sensitivity on the training set, indicating that the model generalizes well. Additionally, the use of general feature selection makes the approach more adaptable and suitable for other areas.
KW - CUP
KW - Feature selection
KW - SVM
KW - Sensitivity
KW - miRNAs
UR - http://www.scopus.com/inward/record.url?scp=84894115864&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-45114-0_38
DO - 10.1007/978-3-642-45114-0_38
M3 - Published Conference contribution
AN - SCOPUS:84894115864
SN - 9783642451133
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 485
EP - 496
BT - Advances in Artificial Intelligence and Its Applications - 12th Mexican International Conference on Artificial Intelligence, MICAI 2013, Proceedings
T2 - 12th Mexican International Conference on Artificial Intelligence, MICAI 2013
Y2 - 24 November 2013 through 30 November 2013
ER -