1996 UnifyingInstAndRuleBasedInduction

From GM-RKB

Subject Headings: RISE System, Concept Learning, Multi-Strategy Learning, Rule Induction, Instance-based Learning.

Notes

Cited By

Quotes

Author Keywords

Abstract

Several well-developed approaches to inductive learning now exist, but each has specific limitations that are hard to overcome. Multi-strategy learning attempts to tackle this problem by combining multiple methods in one algorithm. This article describes a unification of two widely-used empirical approaches: rule induction and instance-based learning. In the new algorithm, instances are treated as maximally specific rules, and classification is performed using a best-match strategy. Rules are learned by gradually generalizing instances until no improvement in apparent accuracy is obtained. Theoretical analysis shows this approach to be efficient. It is implemented in the RISE 3.1 system. In an extensive empirical study, RISE consistently achieves higher accuracies than state-of-the-art representatives of both its parent approaches (PEBLS and CN2), as well as a decision tree learner (C4.5). Lesion studies show that each of RISE's components is essential to this performance. Most significantly, in 14 of the 30 domains studied, RISE is more accurate than the best of PEBLS and CN2, showing that a significant synergy can be obtained by combining multiple empirical methods.
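
The abstract compactly describes the whole learning loop, so a small sketch may help make it concrete. The following is a minimal, illustrative Python rendering of a RISE-style learner, not the authors' implementation: all names (`Rule`, `rise_train`, etc.) are invented for this example, symbolic attributes use a plain 0/1 mismatch instead of the paper's simplified value-difference metric (SVDM), nearest-rule ties are broken arbitrarily rather than by Laplace accuracy, and the paper's duplicate-rule merging is omitted.

```python
# Illustrative RISE-style learner: instances start as maximally specific rules,
# each rule is repeatedly generalized to the nearest uncovered example of its
# class, and a change is kept whenever apparent accuracy does not decrease.
from copy import deepcopy

class Rule:
    """One condition per attribute: a (lo, hi) interval for numeric attributes,
    a required value for symbolic attributes, or None meaning 'any value'."""
    def __init__(self, instance, label, source):
        self.conds = [(v, v) if isinstance(v, (int, float)) else v
                      for v in instance]
        self.label, self.source, self.generalized = label, source, False

    def distance(self, x, ranges):
        """ranges[i] is (max - min) of numeric attribute i; ignored for symbolic."""
        d = 0.0
        for cond, v, rng in zip(self.conds, x, ranges):
            if isinstance(cond, tuple):                  # numeric interval
                lo, hi = cond
                d += max(lo - v, v - hi, 0.0) / (rng or 1.0)
            elif cond is not None and cond != v:         # symbolic mismatch
                d += 1.0
        return d

    def generalized_to(self, x):
        """Most specific generalization of this rule that also covers x."""
        new = deepcopy(self)
        new.generalized = True
        for i, (cond, v) in enumerate(zip(new.conds, x)):
            if isinstance(cond, tuple):
                new.conds[i] = (min(cond[0], v), max(cond[1], v))
            elif cond is not None and cond != v:
                new.conds[i] = None                      # drop the condition
        return new

def nearest_rule(rules, x, ranges, loo_index=None):
    """Best-match classification: the rule at minimum distance wins. Crude
    leave-one-out: an ungeneralized rule cannot classify its own source."""
    eligible = [r for r in rules if r.generalized or r.source != loo_index]
    return min(eligible, key=lambda r: r.distance(x, ranges))

def apparent_accuracy(rules, data, labels, ranges):
    hits = sum(nearest_rule(rules, x, ranges, loo_index=i).label == y
               for i, (x, y) in enumerate(zip(data, labels)))
    return hits / len(data)

def rise_train(data, labels, ranges):
    rules = [Rule(x, y, i) for i, (x, y) in enumerate(zip(data, labels))]
    best = apparent_accuracy(rules, data, labels, ranges)
    gained = True
    while gained:                      # repeat passes until no accuracy gain
        gained = False
        for i in range(len(rules)):
            r = rules[i]
            # Nearest example of the rule's own class that it does not yet cover.
            cands = [(r.distance(x, ranges), j)
                     for j, (x, y) in enumerate(zip(data, labels))
                     if y == r.label and r.distance(x, ranges) > 0]
            if not cands:
                continue
            _, j = min(cands)
            trial = rules[:i] + [r.generalized_to(data[j])] + rules[i + 1:]
            acc = apparent_accuracy(trial, data, labels, ranges)
            if acc >= best:            # keep ties: the generalized set is simpler
                gained = gained or acc > best
                rules, best = trial, acc
    return rules

def rise_classify(rules, x, ranges):
    return nearest_rule(rules, x, ranges).label
```

Accepting generalizations that merely tie on apparent accuracy is what drives the simplification: under the leave-one-out measure ties are common, and preferring the more general rule set on a tie is what moves the learner away from pure instance-based behavior toward compact rules.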


References

  • Aha, D. W. (1990). A study of instance-based learning algorithms for supervised learning tasks: Mathematical, empirical, and psychological evaluations (Technical Report 90–42). Irvine, CA: University of California at Irvine, Department of Information and Computer Science.
  • Aha, D. W. (Ed.) (in press). Special issue on lazy learning. Artificial Intelligence Review.
  • Aha, D. W., & Bankert, R. L. (1994). Feature selection for case-based classification of cloud types: An empirical comparison. Proceedings of the 1994 AAAI Workshop on Case-based Reasoning (pp. 106–112). Seattle, WA: AAAI.
  • Aha, D. W., & Goldstone, R. L. (1992). Concept learning and flexible weighting. Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society (pp. 534–539). Bloomington, IN: Lawrence Erlbaum.
  • Aha, D. W., Kibler, D., & Albert, M. K. (1991). Instance-based learning algorithms. Machine Learning, 6, 37–66.
  • Atkeson, C. G., Moore, A. W., & Schaal, S. (in press). Locally weighted learning. Artificial Intelligence Review.
  • Belew, R. K., McInerney, J., & Schraudolph, N. N. (1992). Evolving networks: Using the genetic algorithm with connectionist learning. In C. G. Langton, J. Taylor, J. D. Farmer, & S. Rasmussen (Eds.), Artificial Life II. Redwood City, CA: Addison-Wesley.
  • Biberman, Y. (1994). A context similarity measure. Proceedings of the Ninth European Conference on Machine Learning (pp. 49–63). Catania, Italy: Springer-Verlag.
  • Booker, L. B., Goldberg, D. E., & Holland, J. H. (1989). Classifier systems and genetic algorithms. Artificial Intelligence, 40, 235–282.
  • Brodley, C. E. (1995). Recursive automatic bias selection for classifier construction. Machine Learning, 20, 63–94.
  • Buntine, W. (1989). Learning classification rules using Bayes. Proceedings of the Sixth International Workshop on Machine Learning (pp. 94–98). Ithaca, NY: Morgan Kaufmann.
  • Cameron-Jones, R. M. (1992). Minimum description length instance-based learning. Proceedings of the Fifth Australian Joint Conference on Artificial Intelligence (pp. 368–373). Hobart, Australia: World Scientific.
  • Catlett, J. (1991). Megainduction: A test flight. Proceedings of the Eighth International Conference on Machine Learning (pp. 589–604). Evanston, IL: Morgan Kaufmann.
  • Clark, P., & Boswell, R. (1991). Rule induction with CN2: Some recent improvements. Proceedings of the Sixth European Working Session on Learning (pp. 151–163). Porto, Portugal: Springer-Verlag.
  • Clark, P., & Niblett, T. (1989). The CN2 induction algorithm. Machine Learning, 3, 261–283.
  • Cohen, W. W. (1995). Fast effective rule induction. Proceedings of the Twelfth International Conference on Machine Learning (pp. 115–123). Tahoe City, CA: Morgan Kaufmann.
  • Cost, S., & Salzberg, S. (1993). A weighted nearest neighbor algorithm for learning with symbolic features. Machine Learning, 10, 57–78.
  • Cover, T. M., & Hart, P. E. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13, 21–27.
  • DeGroot, M. H. (1986). Probability and statistics (Second Edition). Reading, MA: Addison-Wesley.
  • Pedro Domingos (1994). The RISE system: Conquering without separating. Proceedings of the Sixth IEEE International Conference on Tools with Artificial Intelligence (pp. 704–707). New Orleans, LA: IEEE Computer Society Press.
  • Pedro Domingos (1995a). Rule induction and instance-based learning: A unified approach. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (pp. 1226–1232). Montreal, Canada: Morgan Kaufmann.
  • Pedro Domingos (1995b). The RISE 2.0 system: A case study in multistrategy learning (Technical Report 95–2). Irvine, CA: University of California at Irvine, Department of Information and Computer Science.
  • Pedro Domingos (1995c). Two-way induction. Proceedings of the Seventh IEEE International Conference on Tools with Artificial Intelligence (pp. 182–189). Herndon, VA: IEEE Computer Society Press.
  • Pedro Domingos (in press). Context-sensitive feature selection for lazy learners. Artificial Intelligence Review.
  • Duda, R. O., & Hart, P. E. (1973). Pattern classification and scene analysis. New York, NY: Wiley.
  • Golding, A. R., & Rosenbloom, P. S. (1991). Improving rule-based systems through case-based reasoning. Proceedings of the Ninth National Conference on Artificial Intelligence (pp. 22–27). Anaheim, CA: AAAI Press.
  • Holte, R. C. (1993). Very simple classification rules perform well on most commonly used datasets. Machine Learning, 11, 63–91.
  • Holte, R. C., Acker, L. E., & Porter, B. W. (1989). Concept learning and the problem of small disjuncts. Proceedings of the Eleventh International Joint Conference on Artificial Intelligence (pp. 813–818). Detroit, MI: Morgan Kaufmann.
  • Kelly, J. D., & Davis, L. (1991). A hybrid genetic algorithm for classification. Proceedings of the Twelfth International Joint Conference on Artificial Intelligence (pp. 645–650). Sydney, Australia: Morgan Kaufmann.
  • Ron Kohavi, & Li, C. (1995). Oblivious decision trees, graphs, and top-down pruning. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (pp. 1071–1077). Montreal, Canada: Morgan Kaufmann.
  • Kolodner, J. (1993). Case-based reasoning. San Mateo, CA: Morgan Kaufmann.
  • Michalski, R. S. (1983). A theory and methodology of inductive learning. Artificial Intelligence, 20, 111–161.
  • Michalski, R. S., Mozetic, I., Hong, J., & Lavrac, N. (1986). The multi-purpose incremental learning system AQ15 and its testing application to three medical domains. Proceedings of the Fifth National Conference on Artificial Intelligence (pp. 1041–1045). Philadelphia, PA: AAAI Press.
  • Michalski, R. S., & Tecuci, G. (Eds.), (1993). Proceedings of the Second International Workshop on Multistrategy Learning. Harpers Ferry, VA: Office of Naval Research/George Mason University.
  • Michalski, R. S., & Tecuci, G. (Eds.) (1994). Machine learning: A multistrategy approach. San Mateo, CA: Morgan Kaufmann.
  • Mitchell, T. M. (1980). The need for biases in learning generalizations (Technical Report). New Brunswick, NJ: Rutgers University, Computer Science Department.
  • Mohri, T., & Tanaka, H. (1994). An optimal weighting criterion of case indexing for both numeric and symbolic attributes. Proceedings of the 1994 AAAI Workshop on Case-based Reasoning (pp. 123–127). Seattle, WA: AAAI.
  • Murphy, P. M., & Aha, D. W. (1995). UCI repository of machine learning databases (Machine-readable data repository). Irvine, CA: University of California, Department of Information and Computer Science.
  • Niblett, T. (1987). Constructing decision trees in noisy domains. Proceedings of the Second European Working Session on Learning (pp. 67–78). Bled, Yugoslavia: Sigma.
  • Oliveira, A. L., & Sangiovanni-Vincentelli, A. (1995). Inferring reduced ordered decision graphs of minimum description length. Proceedings of the Twelfth International Conference on Machine Learning (pp. 421–429). Tahoe City, CA: Morgan Kaufmann.
  • Ourston, D., & Mooney, R. J. (1994). Theory refinement combining analytical and empirical methods. Artificial Intelligence, 66, 273–309.
  • Pagallo, G., & Haussler, D. (1990). Boolean feature discovery in empirical learning. Machine Learning, 5, 71–99.
  • Pazzani, M., & Kibler, D. (1992). The utility of knowledge in inductive learning. Machine Learning, 9, 57–94.
  • J. Ross Quinlan (1986). Induction of decision trees. Machine Learning, 1, 81–106.
  • J. Ross Quinlan (1987). Generating production rules from decision trees. Proceedings of the Tenth International Joint Conference on Artificial Intelligence (pp. 304–307). Milan, Italy: Morgan Kaufmann.
  • J. Ross Quinlan (1990). Learning logical definitions from relations. Machine Learning, 5, 239–266.
  • J. Ross Quinlan (1993a). C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann.
  • J. Ross Quinlan (1993b). Combining instance-based and model-based learning. Proceedings of the Tenth International Conference on Machine Learning (pp. 236–243). Amherst, MA: Morgan Kaufmann.
  • Rao, R. B., Gordon, D., & Spears, W. (1995). For every generalization action, is there really an equal and opposite reaction? Analysis of the conservation law for generalization performance. Proceedings of the Twelfth International Conference on Machine Learning (pp. 471–479). Tahoe City, CA: Morgan Kaufmann.
  • Rendell, L. (1986). A general framework for induction and a study of selective induction. Machine Learning, 1, 177–226.
  • Riesbeck, C. K., & Schank, R. C. (1989). Inside case-based reasoning. Hillsdale, NJ: Lawrence Erlbaum.
  • Rivest, R. L. (1987). Learning decision lists. Machine Learning, 2, 229–246.
  • Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In D. E. Rumelhart & James L. McClelland (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 1). Cambridge, MA: MIT Press.
  • Salzberg, S. (1991). A nearest hyperrectangle learning method. Machine Learning, 6, 251–276.
  • Schaffer, C. (1994a). Cross-validation, stacking, and bi-level stacking: Meta-methods for classification learning. In P. Cheeseman & R. W. Oldford (Eds.), Selecting models from data: Artificial intelligence and statistics IV. New York, NY: Springer-Verlag.
  • Schaffer, C. (1994b). A conservation law for generalization performance. Proceedings of the Eleventh International Conference on Machine Learning (pp. 259–265). New Brunswick, NJ: Morgan Kaufmann.
  • Scott, P. D., & Sage, K. H. (1992). Why generalize? Hybrid representations and instance-based learning. Proceedings of the Tenth European Conference on Artificial Intelligence (pp. 484–486). Vienna, Austria: Wiley.
  • Smyth, P., Goodman, R. M., & Higgins, C. (1990). A hybrid rule-based/Bayesian classifier. Proceedings of the Ninth European Conference on Artificial Intelligence (pp. 610–615). Stockholm, Sweden: Springer-Verlag.
  • Smyth, P., Gray, A., & Fayyad, U. (1995). Retrofitting decision tree classifiers using kernel density estimation. Proceedings of the Twelfth International Conference on Machine Learning (pp. 506–514). Tahoe City, CA: Morgan Kaufmann.
  • Stanfill, C., & Waltz, D. (1986). Toward memory-based reasoning. Communications of the ACM, 29, 1213–1228.
  • Ting, K. M. (1994). Discretization of continuous-valued attributes and instance-based learning (Technical Report 491). Sydney, Australia: Basser Department of Computer Science, University of Sydney.
  • Towell, G. G., & Shavlik, J. W. (1994). Knowledge-based artificial neural networks. Artificial Intelligence, 70, 119–165.
  • Townsend-Weber, T., & Kibler, D. (1994). Instance-based prediction of continuous values. Proceedings of the 1994 AAAI Workshop on Case-based Reasoning (pp. 30–35). Seattle, WA: AAAI.
  • Utgoff, P. E. (1989a). Incremental induction of decision trees. Machine Learning, 4, 161–186.
  • Utgoff, P. E. (1989b). Perceptron trees: A case study in hybrid concept representations. Connection Science, 1, 377–391.
  • Wettschereck, D. (1994). A hybrid nearest-neighbor and nearest-hyperrectangle algorithm. Proceedings of the Ninth European Conference on Machine Learning (pp. 323–335). Catania, Italy: Springer-Verlag.
  • Wettschereck, D., & Dietterich, T. (1995). An experimental comparison of the nearest-neighbor and nearest-hyperrectangle algorithms. Machine Learning, 19, 5–27.
  • Wnek, J., & Michalski, R. S. (1994). Hypothesis-driven constructive induction in AQ17-HCI: A method and experiments. Machine Learning, 14, 139–168.
  • Zhang, J. (1990). A method that combines inductive learning with exemplar-based learning. Proceedings of the Second IEEE International Conference on Tools for Artificial Intelligence (pp. 31–37). San Jose, CA: IEEE Computer Society Press.


Pedro Domingos (1996). "Unifying Instance-based and Rule-based Induction." Machine Learning. doi:10.1023/A:1018006431188. http://www.cs.washington.edu/homes/pedrod/papers/mlj96.pdf