2008 ANonParamSemiSupDiscrMeth

From GM-RKB

Subject Headings: Semi-Supervised Algorithm, Discretization Task.

Notes

Cited By

Quotes

Abstract

Semi-supervised classification methods aim to exploit both labelled and unlabelled examples to train a predictive model. Most of these approaches make assumptions about the distribution of the classes. This article first proposes a new semi-supervised discretization method which adopts a very weakly informative prior on the data. The method discretizes the numerical domain of a continuous input variable while preserving the information relevant to class prediction. An in-depth comparison of this semi-supervised method with the original supervised MODL approach is then presented. We demonstrate that the semi-supervised approach is asymptotically equivalent to the supervised approach improved with a post-optimization of the interval bound locations.
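To illustrate what "discretizing while keeping the information relative to the prediction of classes" means in the supervised setting, the sketch below implements a minimal entropy-based binary split in the spirit of Fayyad & Irani (1992, cited below). This is not the MODL criterion (which is a full Bayesian model-selection criterion over interval partitions); all function names here are illustrative assumptions.

```python
import math
from collections import Counter

def class_entropy(labels):
    """Shannon entropy (in bits) of the class labels in an interval."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(points):
    """Return (weighted_entropy, cut_value) for the cut point that minimizes
    the size-weighted class entropy of the two resulting intervals.
    `points` is a list of (value, label) pairs sorted by value."""
    n = len(points)
    best = None
    for i in range(1, n):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot cut between two identical values
        left = [y for _, y in points[:i]]
        right = [y for _, y in points[i:]]
        w = (len(left) * class_entropy(left) + len(right) * class_entropy(right)) / n
        cut = (points[i - 1][0] + points[i][0]) / 2  # midpoint between neighbours
        if best is None or w < best[0]:
            best = (w, cut)
    return best

# Toy data: low values are class 'a', high values are class 'b'.
data = sorted([(0.1, 'a'), (0.2, 'a'), (0.3, 'a'),
               (0.7, 'b'), (0.8, 'b'), (0.9, 'b')])
print(best_split(data))  # the cut at 0.5 separates the classes perfectly
```

Applied recursively with a stopping criterion (an MDL test in Fayyad & Irani's method, a Bayesian criterion in MODL), such splits produce a class-informative interval partition of the variable's domain.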

References

  • Berger J (2006) The case for objective Bayesian analysis. Bayesian Anal 1(3):385–402
  • Blum A, Mitchell T (1998) Combining labeled and unlabeled data with co-training. In: COLT '98: Proceedings of the eleventh annual conference on computational learning theory. ACM Press, New York, pp 92–100
  • Boullé M (2005) A Bayes optimal approach for partitioning the values of categorical attributes. J Mach Learn Res 6:1431–1452
  • Boullé M (2006) MODL: a Bayes optimal discretization method for continuous attributes. Mach Learn 65(1):131–165
  • Catlett J (1991) On changing continuous attributes into ordered discrete attributes. In: EWSL-91: Proceedings of the European working session on learning on machine learning. Springer, New York, pp 164–178
  • Chapelle O, Schölkopf B, Zien A (2007) Semi-supervised learning. MIT Press, Cambridge
  • Dougherty J, Kohavi R, Sahami M (1995) Supervised and unsupervised discretization of continuous features. In: International conference on machine learning, pp 194–202
  • Fawcett T (2003) ROC graphs: notes and practical considerations for data mining researchers. Technical Report HPL-2003-4, HP Labs. http://citeseer.ist.psu.edu/fawcett03roc.html
  • Fayyad U, Irani K (1992) On the handling of continuous-valued attributes in decision tree generation. Mach Learn 8:87–102
  • Fayyad U, Piatetsky-Shapiro G, Smyth P (1996) From data mining to knowledge discovery: an overview. Adv Knowl Discov Data Min 1–34
  • Fujino A, Ueda N, Saito K (2007) A hybrid generative/discriminative approach to text classification with additional information. Inf Process Manage 43:379–392
  • Holte R (1993) Very simple classification rules perform well on most commonly used datasets. Mach Learn 11:63–91
  • Jin R, Breitbart Y, Muoh C (2009) Data discretization unification. Knowl Inf Syst 19(1):1–29
  • Kohavi R, Sahami M (1996) Error-based and entropy-based discretization of continuous features. In: Proceedings of the second International Conference on knowledge discovery and data mining, pp 114–119
  • Langley P, Iba W, Thomas K (1992) An analysis of Bayesian classifiers. In: Press A (ed) Tenth national conference on artificial intelligence, pp 223–228
  • Liu H, Hussain F, Tan C, Dash M (2002) Discretization: an enabling technique. Data Min Knowl Discov 6(4):393–423
  • Maeireizo B, Litman D, Hwa R (2004) Analyzing the effectiveness and applicability of co-training. In: ACL ’04: the companion proceedings of the 42nd annual meeting of the association for computational linguistics
  • Newman DJ, Hettich S, Blake CL, Merz CJ (1998) UCI repository of machine learning databases. Department of Information and Computer Sciences, University of California, Irvine. http://www.ics.uci.edu/~mlearn/MLRepository.html
  • Pyle D (1999) Data preparation for data mining. Morgan Kaufmann, San Francisco, p 19
  • Rissanen J (1978) Modeling by shortest data description. Automatica 14:465–471
  • Rosenberg C, Hebert M, Schneiderman H (2005) Semi-supervised self-training of object detection models. In: Seventh IEEE workshop on applications of computer vision
  • Settles B (2009) Active learning literature survey. Computer Sciences Technical Report 1648, University of Wisconsin–Madison
  • Shannon C (1948) A mathematical theory of communication. Key papers in the development of information theory
  • Sugiyama M, Krauledat M, Müller K (2007) Covariate shift adaptation by importance weighted cross validation. J Mach Learn Res 8:985–1005
  • Sugiyama M, Müller K (2005) Model selection under covariate shift. In: ICANN 2005: International conference on artificial neural networks: formal models and their applications
  • Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Yu PS, Zhou Z, Steinbach M, Hand DJ, Steinberg D (2008) Top 10 algorithms in data mining. Knowl Inf Syst 14(1)
  • Zhou ZH, Li M (2009) Semi-supervised learning by disagreement. Knowl Inf Syst doi:10.1007/s10115-009-0209-z
  • Zighed D, Rakotomalala R (2000) Graphes d’induction. Hermes, France.


 Author: Alexis Bondu, Marc Boullé, Vincent Lemaire
 Title: A Non-parametric Semi-supervised Discretization Method
 Year: 2008
 URL: http://perso.rd.francetelecom.fr/boulle/publications/BonduEtAl09.pdf