2006 DecisionTreesForHierarchicalMultilabelClassification


Subject Headings: Multilabel Classification, Bioinformatics

Notes

Cited By

Quotes

Abstract

Hierarchical multilabel classification (HMC) is an extension of binary classification where an instance can be labelled with multiple classes that are organised in a hierarchy. A well-known application of this kind of problem is gene function prediction. A gene can have multiple functions at the same time, and these functions are hierarchically organised: a gene predicted to have a certain class should also be predicted to have all its superclasses, as given by the hierarchy. A straightforward approach to solve this problem would be to learn a binary classifier for each class separately and then to combine the predictions. However, this has several disadvantages: (1) learning is not very efficient, since a separate classifier has to be learned for each class, (2) binary classifiers have known problems with skewed class distributions and (3) the hierarchy constraint, implying that a class should be predicted along with all its superclasses, is not automatically fulfilled. The obvious alternative is to learn a single model that predicts all the different classes at once. In this paper we propose a method for learning decision trees that predicts for each instance a set of classes instead of a single class.
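The hierarchy constraint described in the abstract can be illustrated with a small sketch. This is not the decision-tree method the paper proposes; it only shows why independently combined binary predictions can violate the constraint and how a prediction could be repaired afterwards by adding the missing superclasses. The class names, the `PARENT` mapping, and all function names are hypothetical and chosen purely for illustration.

```python
# Hypothetical toy hierarchy: each class maps to its parent
# (None marks a top-level class). Not taken from the paper.
PARENT = {
    "metabolism": None,
    "metabolism/amino_acid": "metabolism",
    "metabolism/amino_acid/biosynthesis": "metabolism/amino_acid",
    "transport": None,
    "transport/ion": "transport",
}


def ancestors(label):
    """Return all proper superclasses of `label`, nearest first."""
    result = []
    parent = PARENT[label]
    while parent is not None:
        result.append(parent)
        parent = PARENT[parent]
    return result


def violates_hierarchy(predicted):
    """True if some predicted class lacks at least one of its superclasses."""
    return any(a not in predicted for c in predicted for a in ancestors(c))


def close_under_hierarchy(predicted):
    """Return a copy of the prediction with every missing superclass added."""
    closed = set(predicted)
    for c in predicted:
        closed.update(ancestors(c))
    return closed


# Combined output of separate per-class binary classifiers (assumed example):
# a leaf class is predicted, but its superclasses are not.
independent = {"metabolism/amino_acid/biosynthesis", "transport"}
repaired = close_under_hierarchy(independent)

print(violates_hierarchy(independent))
print(sorted(repaired))
```

A single model that predicts the whole class set at once, as the abstract proposes, is meant to avoid needing a repair step of this kind.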

References

  • [1] H. Blockeel, L. De Raedt, and J. Ramon. Top-down induction of clustering trees. In: Proceedings of the 15th International Conference on Machine Learning, pages 55–63, 1998.
  • [2] Hendrik Blockeel, Leander Schietgat, Jan Struyf, Sašo Džeroski, and Amanda Clare. Decision trees for hierarchical multilabel classification: A case study in functional genomics. In: Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases, 2006.
  • [3] Leo Breiman, J.H. Friedman, R.A. Olshen, and C.J. Stone. Classification and Regression Trees. Wadsworth, Belmont, 1984.
  • [4] A. Clare. Machine learning and data mining for yeast functional genomics. PhD thesis, University of Wales, Aberystwyth, 2003.


2006 DecisionTreesForHierarchicalMultilabelClassification

  • Author: Hendrik Blockeel, Leander Schietgat, Jan Struyf, Sašo Džeroski, Amanda Clare
  • title: Decision Trees for Hierarchical Multilabel Classification: A case study in functional genomics
  • journal: Proceedings of 10th European Conference on Principles and Practice of Knowledge Discovery in Databases
  • titleUrl: http://www.cs.kuleuven.be/~jan/papers/HMCBNAIC.pdf
  • doi: 10.1007/11871637
  • year: 2006