2013 MiningHighUtilityEpisodesinComp

From GM-RKB
Jump to navigation Jump to search

Subject Headings:

Notes

Cited By

Quotes

Author Keywords

Abstract

Frequent episode mining (FEM) is an interesting research topic in data mining with wide range of applications. However, the traditional framework of FEM treats all events as having the same importance / utility and assumes that a same type of event appears at most once at any time point. These simplifying assumptions do not reflect the characteristics of scenarios in real applications and thus the useful information of episodes in terms of utilities such as profits is lost. Furthermore, most studies on FEM focused on mining episodes in simple event sequences and few considered the scenario of complex event sequences, where different events can occur simultaneously. To address these issues, in this paper, we incorporate the concept of utility into episode mining and address a new problem of mining high utility episodes from complex event sequences, which has not been explored so far. In the proposed framework, the importance / utility of different events is considered and multiple events can appear simultaneously. Several novel features are incorporated into the proposed framework to resolve the challenges raised by this new problem, such as the absence of anti-monotone property and the huge set of candidate episodes. Moreover, an efficient algorithm named UP-Span (Utility ePisodes mining by Spanning prefixes) is proposed for mining high utility episodes with several strategies incorporated for pruning the search space to achieve high efficiency. Experimental results on real and synthetic datasets show that UP-Span has excellent performance and serves as an effective solution to the new problem of mining high utility episodes from complex event sequences.

References

;

 AuthorvolumeDate ValuetitletypejournaltitleUrldoinoteyear
2013 MiningHighUtilityEpisodesinCompPhilip S. Yu
Vincent S. Tseng
Cheng-Wei Wu
Yu-Feng Lin
Mining High Utility Episodes in Complex Event Sequences10.1145/2487575.24876542013