Learning from Examples Module (LEM) Rule Induction Algorithm

AKA: LEM Induction Algorithm.
Context:
- It can usually be implemented by a LEM Induction System to solve a LEM Induction Task.
Example(s):
- LEM1 Algorithm,
- LEM2 Algorithm,
- DomLEM Algorithm,
- ELEM Algorithm,
- LERS Algorithm,
- MLEM2 Algorithm,
- MODLEM Algorithm,
- …
Counter-Example(s):
See: Pattern Mining Algorithm, Decision Tree Induction Algorithm, Inductive Logic Programming, If-Then Rule, First-Order Logic Rule.

References

(Grzymala-Busse, 2009) ⇒ Jerzy W. Grzymala-Busse. (2009). “Rule Induction.” In: Maimon O., Rokach L. (eds) Data Mining and Knowledge Discovery Handbook. ISBN:978-0-387-09822-7, 978-0-387-09823-4. doi:10.1007/978-0-387-09823-4_13
QUOTE: In general, rule induction algorithms may be categorized as global and local. In global rule induction algorithms the search space is the set of all attribute values, while in local rule induction algorithms the search space is the set of attribute-value pairs.
There exist many rule induction algorithms, we will discuss only three representative algorithms, all inducing discriminant rule sets. The first is an example of a global rule induction algorithm called LEM1 (Learning from Examples Module version 1).
(...) The algorithm LEM1, a component of the data mining system LERS (Learning from Examples using Rough Sets), is based on some rough set definitions Pawlak (1982)^[1], Pawlak (1991)^[2], Pawlak et al. (1995)^[3].
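
The rough set definitions that LEM1 builds on can be illustrated with a small sketch. The Python below is not LERS code; it only computes the standard lower and upper approximations of a concept from the indiscernibility classes of a toy table. The function name `approximations` and the four example cases are assumptions made for illustration.

```python
from collections import defaultdict

def approximations(cases, attributes, concept):
    """Return (lower, upper) approximations of `concept`, a set of case indices."""
    # Cases with identical values on all attributes are indiscernible.
    classes = defaultdict(set)
    for i, case in enumerate(cases):
        classes[tuple(case[a] for a in attributes)].add(i)
    lower, upper = set(), set()
    for block in classes.values():
        if block <= concept:   # class lies entirely inside the concept
            lower |= block
        if block & concept:    # class overlaps the concept
            upper |= block
    return lower, upper

# Hypothetical toy data: cases 0 and 1 agree on every attribute but differ in
# their decision, so the concept "Flu = yes" = {0, 3} is rough.
CASES = [
    {"Temperature": "high", "Headache": "yes"},       # Flu = yes
    {"Temperature": "high", "Headache": "yes"},       # Flu = no
    {"Temperature": "normal", "Headache": "no"},      # Flu = no
    {"Temperature": "very_high", "Headache": "yes"},  # Flu = yes
]
lower, upper = approximations(CASES, ["Temperature", "Headache"], {0, 3})
print(lower, upper)
```

Rules induced from the lower approximation are certain, while rules induced from the upper approximation are only possible.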

(...) An idea of blocks of attribute-value pairs is used in the rule induction algorithm LEM2 (Learning from Examples Module, version 2), another component of LERS. The option LEM2 of LERS is most frequently used since -- in most cases -- it gives better results. LEM2 explores the search space of attribute-value pairs. (...)
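
As a rough illustration of how blocks of attribute-value pairs drive a local search, the sketch below follows the LEM2 procedure as it is commonly described in the rough set literature: repeatedly pick the pair whose block overlaps the uncovered cases most, stop when the conjunction covers only cases of the concept, then drop redundant conditions and rules. It is not the LERS implementation. The names `av_blocks` and `lem2`, the tie-breaking on sorted pair order, and the toy flu table are assumptions, and the concept is assumed to be definable from the given attributes.

```python
def av_blocks(cases, attributes):
    """Map each attribute-value pair t to its block [t] (set of case indices)."""
    blocks = {}
    for i, case in enumerate(cases):
        for a in attributes:
            blocks.setdefault((a, case[a]), set()).add(i)
    return blocks

def lem2(cases, attributes, concept):
    """Return a list of rules; each rule is a list of attribute-value pairs."""
    blocks = av_blocks(cases, attributes)
    universe = set(range(len(cases)))

    def covered(conditions):
        # Cases matching every condition (intersection of the blocks).
        out = set(universe)
        for t in conditions:
            out &= blocks[t]
        return out

    B = set(concept)
    G = set(B)           # cases of the concept still to be covered
    rules = []
    while G:
        T = []
        while not T or not covered(T) <= B:
            candidates = [t for t in blocks if blocks[t] & G and t not in T]
            if not candidates:
                raise ValueError("concept is not definable by these attributes")
            # Largest overlap with G, then smallest block, then sorted order.
            best = min(candidates,
                       key=lambda t: (-len(blocks[t] & G), len(blocks[t]), t))
            T.append(best)
            G = blocks[best] & G
        for t in list(T):                      # drop redundant conditions
            rest = [u for u in T if u != t]
            if rest and covered(rest) <= B:
                T = rest
        rules.append(T)
        G = B - set().union(*(covered(r) for r in rules))
    for r in list(rules):                      # drop redundant rules
        others = [s for s in rules if s is not r]
        if others and set().union(*(covered(s) for s in others)) == B:
            rules = others
    return rules

# Hypothetical toy table; the concept "Flu = yes" is the case set {0, 1, 3}.
ATTRIBUTES = ["Temperature", "Headache", "Nausea"]
CASES = [
    {"Temperature": "high", "Headache": "yes", "Nausea": "no"},
    {"Temperature": "very_high", "Headache": "yes", "Nausea": "yes"},
    {"Temperature": "high", "Headache": "no", "Nausea": "no"},
    {"Temperature": "high", "Headache": "yes", "Nausea": "yes"},
    {"Temperature": "normal", "Headache": "yes", "Nausea": "no"},
    {"Temperature": "normal", "Headache": "no", "Nausea": "yes"},
]
for rule in lem2(CASES, ATTRIBUTES, {0, 1, 3}):
    print(" & ".join(f"({a}, {v})" for a, v in rule), "-> (Flu, yes)")
```

On this table the sketch yields two rules, one on Headache and Nausea and one on Temperature and Headache, which together cover exactly the three cases of the concept.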

**Table 1.** Classification results obtained with different algorithms.

| No. | Initial discretization | Induction algorithm | Number of rules obtained | Correctly classified examples [%] | Incorrectly classified examples [%] | Non-classified examples [%] |
|---|---|---|---|---|---|---|
| 1 | None | LEM2 | 178 | 24 | 32 | 44 |
| 2 | None | MODLEM | 35 | 87 | 2 | 11 |
| 3 | None | EXPLORE | 5 | 21 | 76 | 3 |
| 4 | Local Method | LEM2 | 56 | 91 | 9 | 0 |
| 5 | Local Method | MODLEM | 46 | 91 | 9 | 0 |
| 6 | Local Method | EXPLORE | 300 | 74 | 26 | 0 |

The results indicate that the MODLEM algorithm is highly effective on non-discretized data: its classification accuracy, estimated with 10-fold cross-validation, was 87%. On the same data, the LEM2 algorithm reached 24% and the EXPLORE algorithm 21%. With initial discretization, the LEM2 and MODLEM algorithms obtained identical results. The lowest accuracy was obtained with the EXPLORE algorithm.

(Grzymala-Busse, 2003) ⇒ Jerzy W. Grzymala-Busse (2003) "A Comparison Of Three Strategies To Rule Induction From Data With Numerical Attributes". Proceedings of the International Workshop on Rough Sets in Knowledge Discovery (RSKD 2003). Electronic Notes in Theoretical Computer Science, 82(4), 132-140. DOI: 10.1016/S1571-0661(04)80712-6
- QUOTE: Our main objective was to compare two discretization techniques, both based on cluster analysis, with a new rule induction algorithm called MLEM2, in which discretization is performed simultaneously with rule induction. The MLEM2 algorithm is an extension of the existing LEM2 rule induction algorithm. The LEM2 algorithm works correctly only for symbolic attributes and is a part of the LERS data mining system. For the two strategies, based on cluster analysis, rules were induced by the LEM2 algorithm. Our results show that MLEM2 outperformed both strategies based on cluster analysis, in terms of complexity (size of rule sets) and, more importantly, error rates.
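
One way to picture discretization happening inside rule induction is to give a numerical attribute its own attribute-value blocks. The sketch below assumes the cutpoint scheme usually attributed to MLEM2: cutpoints are averages of consecutive sorted values, and each cutpoint yields one block of cases with smaller values and one block of cases with larger values. The quoted abstract does not spell this out, so the details, the hypothetical helper `numeric_blocks`, and the toy values are all assumptions.

```python
def numeric_blocks(values, attribute):
    """Build interval blocks for one numerical attribute.

    Keys are (attribute, (low, high)) pairs; values are sets of case indices.
    """
    distinct = sorted(set(values))
    lo, hi = distinct[0], distinct[-1]
    blocks = {}
    for a, b in zip(distinct, distinct[1:]):
        cut = (a + b) / 2  # cutpoint halfway between two consecutive values
        blocks[(attribute, (lo, cut))] = {i for i, v in enumerate(values) if v < cut}
        blocks[(attribute, (cut, hi))] = {i for i, v in enumerate(values) if v > cut}
    return blocks

# Hypothetical temperature readings for four cases.
for pair, block in numeric_blocks([36, 39, 37, 41], "Temperature").items():
    print(pair, sorted(block))
```

Blocks built this way can be searched alongside the blocks of symbolic attributes, so no separate discretization pass is needed before rules are induced.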

(Grzymala-Busse, 2002) ⇒ Jerzy W. Grzymala-Busse (2002). "MLEM2: A New Algorithm For Rule Induction From Imperfect Data". In: Proceedings of the 9th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2002), pages 243–250.

(Grzymala‐Busse & Stefanowski, 2001) ⇒ Jerzy W. Grzymala-Busse, and Jerzy Stefanowski (2001). "Three Discretization Methods For Rule Induction". International Journal of Intelligent Systems, 16(1), 29-38.
- QUOTE: We present a new approach to manipulate numerical data. Numerical attributes are not discretized before performing rule induction. Instead, a modified version of LEM2, called MODLEM, is applied directly to data with numerical attributes. Discretization and rule induction is performed simultaneously. Two versions of MODLEM, using different measures to evaluate elementary conditions: class entropy and Laplacian accuracy, are presented. We evaluated all of these approaches experimentally. Rule sets induced by both versions of MODLEM were compared with rule sets obtained in traditional way, i.e., discretization based on conditional entropy first and then rule induction by the ‘pure’ LEM2. For MODLEM and preliminary discretization plus LEM2 the same system was used for classifying testing data.
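
The two evaluation measures named in the quote can be sketched on a single numerical attribute. The code below is an illustration, not the MODLEM implementation: it scores candidate cutpoints by weighted class entropy and scores the resulting elementary condition by Laplace accuracy, using the textbook formula (n_c + 1) / (n + k). Whether the paper defines the measures exactly this way is an assumption, and all function names and data are hypothetical.

```python
import math
from collections import Counter

def class_entropy(labels):
    """Shannon entropy (in bits) of the class distribution in `labels`."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def laplace_accuracy(labels, target, n_classes):
    """Laplace-corrected share of `target` among the cases a condition covers."""
    hits = sum(1 for y in labels if y == target)
    return (hits + 1) / (len(labels) + n_classes)

def candidate_cuts(values):
    """Midpoints between consecutive distinct values."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

def best_cut_by_entropy(values, labels):
    """Cutpoint whose two sides have the lowest size-weighted class entropy."""
    n = len(labels)

    def weighted_entropy(cut):
        left = [y for v, y in zip(values, labels) if v < cut]
        right = [y for v, y in zip(values, labels) if v >= cut]
        return (len(left) / n) * class_entropy(left) + \
               (len(right) / n) * class_entropy(right)

    return min(candidate_cuts(values), key=weighted_entropy)

# Hypothetical data: body temperature and a flu label for five cases.
temps = [36, 37, 39, 40, 41]
flu = ["no", "no", "yes", "yes", "yes"]
cut = best_cut_by_entropy(temps, flu)
covered = [y for v, y in zip(temps, flu) if v >= cut]
print(f"Temperature >= {cut}:", laplace_accuracy(covered, "yes", n_classes=2))
```

Here the cutpoint 38.0 separates the classes perfectly, and the condition `Temperature >= 38.0` covers three "yes" cases, giving a Laplace accuracy of 4/5.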

(Greco et al., 2001) ⇒ Salvatore Greco, Benedetto Matarazzo, Roman Slowinski, and Jerzy Stefanowski (2001). "An Algorithm for Induction of Decision Rules Consistent with the Dominance Principle" In: Ziarko W., Yao Y. (eds) Rough Sets and Current Trends in Computing. RSCTC 2000. Lecture Notes in Computer Science, vol 2005.
- QUOTE: Induction of decision rules within the dominance-based rough set approach to the multiple-criteria sorting decision problem is discussed in this paper. We introduce an algorithm called DOMLEM that induces a minimal set of generalized decision rules consistent with the dominance principle.
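
The dominance principle that DOMLEM rules must respect can be made concrete with a small check. The sketch below is not DOMLEM itself: it only lists pairs of objects where one object is at least as good as another on every criterion yet is assigned to a worse class. It assumes gain-type criteria (larger is better) and numerically coded ordered classes; the function names and the three example objects are hypothetical.

```python
def dominates(x, y, criteria):
    """True if x is at least as good as y on every criterion."""
    return all(x[c] >= y[c] for c in criteria)

def dominance_violations(objects, criteria, cls="class"):
    """Pairs (i, j) where object i dominates object j but has a worse class."""
    return [
        (i, j)
        for i, x in enumerate(objects)
        for j, y in enumerate(objects)
        if i != j and dominates(x, y, criteria) and x[cls] < y[cls]
    ]

# Hypothetical students graded on two criteria and sorted into classes 0..2.
OBJECTS = [
    {"math": 3, "literature": 3, "class": 2},
    {"math": 2, "literature": 1, "class": 1},
    {"math": 3, "literature": 2, "class": 0},  # dominates object 1, worse class
]
print(dominance_violations(OBJECTS, ["math", "literature"]))
```

Pairs reported by such a check are the inconsistencies that the dominance-based rough set approach handles through lower and upper approximations before rules are induced.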

(Grzymala-Busse, 1992) ⇒ Jerzy W. Grzymala-Busse (1992). "LERS - A System For Learning From Examples Based On Rough Sets". In: Intelligent Decision Support. Handbook of Applications and Advances of the Rough Sets Theory, ed. by R. Slowinski, Kluwer Academic Publishers, Dordrecht, Boston, London, 1992, 3–18. DOI:10.1007/978-94-015-7975-9_1

(Chan & Grzymala-Busse, 1991) ⇒ C.C. Chan, and Jerzy W. Grzymala-Busse (1991). “On The Attribute Redundancy And The Learning Programs ID3, PRISM, and LEM2”. Department of Computer Science, University of Kansas, TR-91-14, December 1991, 20 pp.

[1] (Pawlak, 1982) ⇒ Z. Pawlak (1982). “Rough Sets”. International Journal of Computer and Information Sciences 1982; 11: 341–356.
[2] (Pawlak, 1991) ⇒ Z. Pawlak (1991). “Rough Sets. Theoretical Aspects of Reasoning about Data”. Kluwer Academic Publishers.
[3] (Pawlak et al., 1995) ⇒ Z. Pawlak, J.W. Grzymala-Busse, R. Slowinski and W. Ziarko (1995). “Rough Sets”. Communications of the ACM 1995; 38: 88–95.