2011 LearningRelationalPatterns

From GM-RKB

Subject Headings: Relational Pattern; Relational Pattern Language; Relational Pattern Learning System; Polynomial Time; Recursive Relation; Formal Language; Target Language; Regular Language

Notes

Cited By

Quotes

Abstract

Patterns provide a simple, yet powerful means of describing formal languages. However, for many applications, neither patterns nor their generalized versions of typed patterns are expressive enough. This paper extends the model of (typed) patterns by allowing relations between the variables in a pattern. The resulting formal languages are called Relational Pattern Languages (RPLs). We study the problem of learning RPLs from positive data (text) as well as the membership problem for RPLs. These problems are not solvable or not efficiently solvable in general, but we prove positive results for interesting subproblems. We further introduce a new model of learning from a restricted pool of potential texts. Probabilistic assumptions on the process that generates words from patterns make the appearance of some words in the text more likely than that of other words. We prove that, in our new model, a large subclass of RPLs can be learned with high confidence, by effectively restricting the set of likely candidate patterns to a finite set after processing a single positive example.

1 Introduction

2 Learning (Typed) Pattern Languages

3 Relational Patterns

In order to model interdependencies between the substitutions of variables, we introduce relations between variables into the definition of patterns.

Definition 3 Let [math]\displaystyle{ R }[/math] be a set of relations over [math]\displaystyle{ \Sigma^* }[/math]. Then, for any [math]\displaystyle{ n \in N_+ }[/math], [math]\displaystyle{ R_n }[/math] denotes the set of [math]\displaystyle{ n }[/math]-ary relations in [math]\displaystyle{ R }[/math]. A relational pattern with respect to [math]\displaystyle{ \Sigma }[/math] and [math]\displaystyle{ R }[/math] is a pair [math]\displaystyle{ (p,v_R) }[/math] where [math]\displaystyle{ p }[/math] is a pattern over [math]\displaystyle{ \Sigma }[/math] and [math]\displaystyle{ v_R \subseteq \{(r,y_1,\cdots,y_n) | n \in N_+, r \in R_n, }[/math] and [math]\displaystyle{ y_1,\cdots,y_n }[/math] are variables in [math]\displaystyle{ p\} }[/math]. The set of relational patterns with respect to [math]\displaystyle{ R }[/math] will be denoted by [math]\displaystyle{ Pat_{\Sigma, R} }[/math].

The set of all possible substitutions for [math]\displaystyle{ (p,v_R) }[/math] is denoted [math]\displaystyle{ \Theta_{(p,v_R),\Sigma} }[/math]. It contains all substitutions [math]\displaystyle{ \theta \in \Theta_{\Sigma} }[/math] that fulfill, for all [math]\displaystyle{ n \in N_+ }[/math]:

[math]\displaystyle{ \forall\, r\in R_n \; \forall \,y_1,\dots,y_n \in X \Big[(r, y_1,\dots, y_n) \in v_R \Rightarrow \left(\theta(y_1), \dots, \theta(y_n)\right) \in r\Big] }[/math]

The language of [math]\displaystyle{ (p,v_R) }[/math], denoted by [math]\displaystyle{ L(p,v_R) }[/math], is defined as [math]\displaystyle{ \{w \in \Sigma^* | \exists \theta \in \Theta_{(p,v_R),\Sigma}: \theta(p) = w\} }[/math]. The set of all languages of relational patterns with respect to [math]\displaystyle{ R }[/math] will be denoted by [math]\displaystyle{ \mathcal{L}_{\Sigma, R} }[/math].

For instance, [math]\displaystyle{ r=\{(w_1,w_2)|w_1,w_2 \in \Sigma^* \wedge |w_1|=|w_2| \} }[/math] is a binary relation, which, applied to two variables [math]\displaystyle{ x_1 }[/math] and [math]\displaystyle{ x_2 }[/math] in a relational pattern [math]\displaystyle{ (p,v_R) }[/math], ensures that the substitutions of [math]\displaystyle{ x_1 }[/math] and [math]\displaystyle{ x_2 }[/math] generating words from [math]\displaystyle{ p }[/math] always have the same length. Formally, this is done by including [math]\displaystyle{ (r, x_1, x_2) }[/math] in [math]\displaystyle{ v_R }[/math].
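This semantics can be checked mechanically by brute force. The following is a minimal sketch; the token encoding of patterns, the predicate encoding of [math]\displaystyle{ v_R }[/math], and all names are illustrative assumptions, not the paper's notation. Substitutions are required to be non-empty, matching the non-erasing setting considered later in Theorem 6.

```python
def matches(pattern, relations, word):
    """Brute-force membership test for a relational pattern (p, v_R).

    pattern   -- list of tokens: ('t', s) for a terminal string s,
                 ('v', name) for a variable (each variable occurs once)
    relations -- pairs (check, names): 'check' must hold on the strings
                 substituted for the variables in 'names'
                 (an illustrative encoding of v_R)
    Only non-empty substitutions are tried (non-erasing case).
    """
    def solve(i, pos, subst):
        if i == len(pattern):
            # whole word consumed and every relation in v_R satisfied?
            return pos == len(word) and all(
                check(*(subst[n] for n in names)) for check, names in relations)
        kind, val = pattern[i]
        if kind == 't':  # terminal: must occur literally at the current position
            return word.startswith(val, pos) and solve(i + 1, pos + len(val), subst)
        for end in range(pos + 1, len(word) + 1):  # variable: each non-empty slice
            subst[val] = word[pos:end]
            if solve(i + 1, end, subst):
                return True
            del subst[val]
        return False

    return solve(0, 0, {})

# pattern x1 c x2 with the same-length relation on (x1, x2)
same_len = (lambda w1, w2: len(w1) == len(w2), ['x1', 'x2'])
p = [('v', 'x1'), ('t', 'c'), ('v', 'x2')]
```

With this relation, `matches(p, [same_len], 'aacbb')` succeeds (both substituted parts have length 2), while `'acbb'` is rejected: the only split around the terminal `c` substitutes strings of lengths 1 and 2.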

We assume, without loss of generality, that for every variable [math]\displaystyle{ x }[/math] occurring in a relational pattern [math]\displaystyle{ (p,v_R) }[/math], there is exactly one [math]\displaystyle{ r \in R_1 }[/math] such that [math]\displaystyle{ (r, x) \in v_R }[/math]. In fact, this unary relation [math]\displaystyle{ r }[/math] represents the type of variable [math]\displaystyle{ x }[/math]. If there is no [math]\displaystyle{ r \in R_1 }[/math] with [math]\displaystyle{ (r,x) \in v_R }[/math], we can include [math]\displaystyle{ (r_*, x) }[/math] in [math]\displaystyle{ v_R }[/math], where [math]\displaystyle{ w \in r_* \leftrightarrow w \in \Sigma^* }[/math]. If [math]\displaystyle{ R_1 }[/math] contains several [math]\displaystyle{ r_i }[/math] (for [math]\displaystyle{ i }[/math] in some index set [math]\displaystyle{ I }[/math]) with [math]\displaystyle{ (r_i,x) \in v_R }[/math], we can replace them by the single relation [math]\displaystyle{ (r_{\cap I},x) }[/math] where [math]\displaystyle{ w \in r_{\cap I} \leftrightarrow \forall i \in I [w \in r_i] }[/math]. We will use the terms “type” and “unary relation” interchangeably. Similarly, without loss of generality, each set of [math]\displaystyle{ n }[/math] variables is included in [math]\displaystyle{ v_R }[/math] with at most one [math]\displaystyle{ n }[/math]-ary relation. We further assume that relational patterns do not contain any variable twice. This is no restriction, since repetition of variables can be expressed by an equality relation between two distinct variables.

We are only going to consider the case that [math]\displaystyle{ R }[/math] is finite, which seems sufficient for many practical applications. It is easy to see that [math]\displaystyle{ \mathcal{L}_{\Sigma,NE} }[/math] and [math]\displaystyle{ \mathcal{L}_{\Sigma,E} }[/math], as well as the class of typed pattern languages over finitely many types, are subclasses of [math]\displaystyle{ \mathcal{L}_{\Sigma,R} }[/math], each for a suitably defined finite set [math]\displaystyle{ R }[/math].

The gain in expressiveness shows for example in [math]\displaystyle{ L_1 }[/math], which, by Proposition 2, cannot be expressed as a typed pattern language using only regular type languages. Using relational patterns, regular types are sufficient to describe [math]\displaystyle{ L_1 }[/math].

Proposition 4 There is a finite set [math]\displaystyle{ R }[/math] of relations such that [math]\displaystyle{ R_1 }[/math] contains only regular languages and [math]\displaystyle{ L_1 \in \mathcal{L}_{\Sigma,R} }[/math].

Proof. If [math]\displaystyle{ R = \{r_1,r_2,r\} }[/math], [math]\displaystyle{ r_1 = \{a^i | i \geq 1\} }[/math], [math]\displaystyle{ r_2 = \{b^i | i \geq 1\} }[/math], [math]\displaystyle{ r = \{(w_1,w_2)| |w_1| = |w_2|\} }[/math], and [math]\displaystyle{ v_R = \{(r_1,x_1), (r_2,x_2), (r,x_1,x_2)\} }[/math] then [math]\displaystyle{ L(x_1x_2,v_R) = L_1 }[/math].
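Since the proof exhibits [math]\displaystyle{ L_1 = L(x_1x_2, v_R) }[/math] with [math]\displaystyle{ x_1 }[/math] substituted from [math]\displaystyle{ a^+ }[/math], [math]\displaystyle{ x_2 }[/math] from [math]\displaystyle{ b^+ }[/math], and [math]\displaystyle{ |x_1| = |x_2| }[/math], the language is [math]\displaystyle{ \{a^nb^n \mid n \geq 1\} }[/math] and its membership problem reduces to trying every split of the input. A small sketch (the function name is illustrative):

```python
def in_L1(w):
    """Membership in L(x1 x2, v_R) from Proposition 4: split w as w = x1 x2
    with x1 in a+, x2 in b+, and |x1| = |x2|, i.e. w = a^n b^n for n >= 1."""
    for k in range(1, len(w)):
        x1, x2 = w[:k], w[k:]
        if set(x1) == {'a'} and set(x2) == {'b'} and len(x1) == len(x2):
            return True
    return False
```

Note that the same-length relation is exactly what a typed pattern cannot express with regular types alone, while each of the two type checks here is regular.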

Since erasing pattern languages can be expressed as relational pattern languages, Reidenbach’s non-learnability results for erasing pattern languages [10] immediately yield the following theorem.

Theorem 5 There is a finite alphabet [math]\displaystyle{ \Sigma }[/math] and finite set [math]\displaystyle{ R }[/math] of recursive relations such that [math]\displaystyle{ \mathcal{L}_{\Sigma,R} }[/math] is not learnable in the limit from positive data.

However, if we disallow empty substitutions, we get positive learnability results for any set of recursive relations.

Theorem 6 Let [math]\displaystyle{ R }[/math] be a finite set of recursive relations with [math]\displaystyle{ r\subseteq \Sigma^+ }[/math] for all [math]\displaystyle{ r \in R_1 }[/math]. Then [math]\displaystyle{ \mathcal{L}_{\Sigma,R} }[/math] is learnable in the limit from positive data.

To prove this, we use a well-known result due to Angluin [2], according to which every indexable class of languages that has finite thickness is learnable in the limit from positive data. A class [math]\displaystyle{ \mathcal{L} }[/math] of languages is indexable if there is an enumeration [math]\displaystyle{ (L_i)_{i\in \mathbb{N}} }[/math] with [math]\displaystyle{ \mathcal{L}= \{L_i \mid i \in \mathbb{N}\} }[/math] and an effective procedure [math]\displaystyle{ d }[/math] that decides, given any [math]\displaystyle{ i \in \mathbb{N} }[/math] and [math]\displaystyle{ w \in \Sigma^* }[/math], whether or not [math]\displaystyle{ w \in L_i }[/math] [2]. [math]\displaystyle{ (L_i)_{i\in \mathbb{N}} }[/math] is then called an indexing for [math]\displaystyle{ \mathcal{L} }[/math]. An indexable class [math]\displaystyle{ \mathcal{L} }[/math] has finite thickness if, for every [math]\displaystyle{ w \in \Sigma^* }[/math], the set of languages in [math]\displaystyle{ \mathcal{L} }[/math] that contain [math]\displaystyle{ w }[/math] is finite. To prove Theorem 6, one establishes both indexability and finite thickness. The proofs of both properties use standard techniques and are omitted because of space constraints. It should be noted, though, that our results show in particular that Theorem 6 can be witnessed by a learner that returns relational patterns as its hypotheses -- a desirable feature from an application point of view, since relational patterns provide an intuitive representation of a language.
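The shape of such a learner can be illustrated by identification by enumeration: after each new word, conjecture the least-index language consistent with all data seen so far. The toy three-language indexing below is entirely hypothetical, with membership predicates standing in for the decision procedure [math]\displaystyle{ d }[/math]; the sketch deliberately omits the minimality conditions on the indexing that a full convergence proof requires.

```python
def limit_learner(data, index):
    """Identification-by-enumeration sketch: return the least index whose
    language contains every word seen so far (None if no listed language
    is consistent). Finite thickness guarantees that only finitely many
    languages contain the first word, which bounds the mind changes."""
    for i, member in enumerate(index):
        if all(member(w) for w in data):
            return i
    return None

# toy indexing (hypothetical): a+, {a^n b^n : n >= 1}, (a|b)+
index = [
    lambda w: len(w) > 0 and set(w) <= {'a'},
    lambda w: len(w) >= 2 and len(w) % 2 == 0
              and w == 'a' * (len(w) // 2) + 'b' * (len(w) // 2),
    lambda w: len(w) > 0 and set(w) <= {'a', 'b'},
]
```

For example, on data `['aa']` the learner conjectures index 0; seeing `'ab'` as well forces it to revise to index 2.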

Lemma 7 Let [math]\displaystyle{ R }[/math] be a finite set of recursive relations with [math]\displaystyle{ r \subseteq \Sigma^+ }[/math] for all [math]\displaystyle{ r \in R_1 }[/math]. There exists an effective enumeration [math]\displaystyle{ f : \mathbb{N} \to Pat_{\Sigma,R} }[/math] of all relational patterns over [math]\displaystyle{ R }[/math] such that [math]\displaystyle{ (L(f(i)))_{i\in \mathbb{N}} }[/math] is an indexing for [math]\displaystyle{ \mathcal{L}_{\Sigma, R} }[/math].

Lemma 8 Let [math]\displaystyle{ R }[/math] be a finite set of recursive relations with [math]\displaystyle{ r \subseteq \Sigma^+ }[/math] for all [math]\displaystyle{ r \in R_1 }[/math]. Then [math]\displaystyle{ \mathcal{L}_{\Sigma, R} }[/math] has finite thickness.

In fact, Lemma 8 can be strengthened. An indexable class [math]\displaystyle{ \mathcal{L} }[/math] is said to have recursive finite thickness [8] if there is an indexing [math]\displaystyle{ (L_i)_{i\in\mathbb{N}} }[/math] for [math]\displaystyle{ \mathcal{L} }[/math] and a recursive procedure [math]\displaystyle{ c }[/math] such that, for any [math]\displaystyle{ w \in \Sigma^* }[/math], [math]\displaystyle{ c(w) }[/math] is a finite subset of [math]\displaystyle{ \mathbb{N} }[/math] fulfilling, for all [math]\displaystyle{ i \in \mathbb{N} }[/math], [math]\displaystyle{ w \in L_i \leftrightarrow \exists j \in c(w) [L_j = L_i], }[/math] i.e., for every word [math]\displaystyle{ w }[/math], a finite list of indices for all languages in [math]\displaystyle{ \mathcal{L} }[/math] containing [math]\displaystyle{ w }[/math] can be effectively determined.
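The intuition behind computing [math]\displaystyle{ c(w) }[/math] in the non-erasing case is that a pattern generating [math]\displaystyle{ w }[/math] can contain at most [math]\displaystyle{ |w| }[/math] symbols, so the candidate pool is finite and enumerable. The sketch below is restricted, for illustration, to plain patterns without relations, with variables canonically named x1, x2, ... left to right and assumed distinct (as in the normalization above); all names are assumptions of this sketch.

```python
from itertools import product

def generates(p, w):
    """Does pattern p (a list of terminal characters and variable names)
    generate w under some non-empty substitution? Distinct variables are
    otherwise unconstrained; variable names start with 'x', so the terminal
    alphabet is assumed not to contain 'x'."""
    if not p:
        return not w
    head, rest = p[0], p[1:]
    if head.startswith('x'):  # variable: consume any non-empty prefix
        return any(generates(rest, w[k:]) for k in range(1, len(w) + 1))
    return w.startswith(head) and generates(rest, w[len(head):])

def candidate_patterns(w, alphabet):
    """Sketch of the procedure c(w): enumerate every canonical pattern of
    length at most |w| and keep those whose language contains w."""
    cands = []
    for length in range(1, len(w) + 1):
        for shape in product(list(alphabet) + ['VAR'], repeat=length):
            p, k = [], 0
            for s in shape:  # rename variable slots canonically: x1, x2, ...
                if s == 'VAR':
                    k += 1
                    p.append(f'x{k}')
                else:
                    p.append(s)
            if generates(p, w):
                cands.append(''.join(p))
    return cands
```

For instance, for the word `ab` over [math]\displaystyle{ \Sigma = \{a, b\} }[/math] the candidates are exactly `x1`, `ab`, `ax1`, `x1b`, and `x1x2`: a finite set, as finite thickness demands.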

Theorem 9 Let [math]\displaystyle{ R }[/math] be a finite set of recursive relations with [math]\displaystyle{ r \subseteq \Sigma^+ }[/math] for all [math]\displaystyle{ r \in R_1 }[/math]. Then [math]\displaystyle{ \mathcal{L}_{\Sigma, R} }[/math] has recursive finite thickness.

The proof is omitted due to space constraints. Theorem 9 has some nice consequences, which follow immediately from the literature on recursive finite thickness [8,7]. For the following corollary, note that an iterative learner [14] has no access to previously seen data at any point in time: its hypothesis on a text segment [math]\displaystyle{ (\tau(0), \cdots , \tau(n)) }[/math] is determined only by [math]\displaystyle{ \tau(n) }[/math] and the hypothesis it generated on [math]\displaystyle{ (\tau(0), \cdots , \tau(n-1)) }[/math] (or a dummy hypothesis in case [math]\displaystyle{ n = 0 }[/math]).

Corollary 10 Let [math]\displaystyle{ R }[/math] be a finite set of recursive relations with [math]\displaystyle{ r \subseteq \Sigma^+ }[/math] for all [math]\displaystyle{ r \in R_1 }[/math]. Let [math]\displaystyle{ \kappa \in \mathbb{N} }[/math]. Then the class of all unions of up to [math]\displaystyle{ \kappa }[/math] languages from [math]\displaystyle{ \mathcal{L}_{\Sigma, R} }[/math] is learnable in the limit from positive data using an iterative learner.
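How recursive finite thickness enables an iterative learner can be sketched as follows: the learner's entire memory is its hypothesis, here a finite set of still-viable candidate indices, which each new word prunes via [math]\displaystyle{ c(w) }[/math]. Everything below is a hypothetical toy (the three-language class, the procedure `c`, and the simplification of intersecting index sets, which assumes one canonical index per language).

```python
def iterative_update(state, w, c):
    """One step of an iterative-learner sketch: no past data is stored;
    the state (the hypothesis) is the finite set of candidate indices still
    consistent with everything seen. 'c' plays the role of the
    recursive-finite-thickness procedure: c(w) lists the indices of all
    languages containing w."""
    new = set(c(w))
    return new if state is None else state & new

# toy class with canonical indices 0..2: a+, b+, (a|b)+
langs = [
    lambda w: len(w) > 0 and set(w) <= {'a'},
    lambda w: len(w) > 0 and set(w) <= {'b'},
    lambda w: len(w) > 0 and set(w) <= {'a', 'b'},
]

def c(w):
    return {i for i, m in enumerate(langs) if m(w)}

state = None
for word in ['a', 'aa', 'ab']:
    state = iterative_update(state, word, c)
# after 'a' and 'aa' the state is {0, 2}; 'ab' narrows it to {2}
```

A real learner for Corollary 10 would additionally track unions of up to [math]\displaystyle{ \kappa }[/math] candidates and output a concrete hypothesis from the surviving set, but the state would remain finite for the same reason.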

4 The Membership Problem

5 Probabilistic Relational Patterns

6 Conclusion

References

Michael Geilke, Sandra Zilles (2011). “Learning Relational Patterns.” doi:10.1007/978-3-642-24412-4_10