2009 ExtendingAutocompletiontoTolera

From GM-RKB
Jump to navigation Jump to search

Subject Headings: Auto-Completion Task; Q-Gram Based Autocompletion; Edit Distance Matching Algorithm.

Notes

Cited By

Quotes

Keywords

Abstract

Autocompletion is a useful feature when a user is doing a look up from a table of records. With every letter being typed, autocompletion displays strings that are present in the table containing as their prefix the search string typed so far. Just as there is a need for making the lookup operation tolerant to typing errors, we argue that autocompletion also needs to be error-tolerant. In this paper, we take a first step towards addressing this problem. We capture input typing errors via edit distance. We show that a naive approach of invoking an offline edit distance matching algorithm at each step performs poorly and present more efficient algorithms. Our empirical evaluation demonstrates the effectiveness of our algorithms.

1. Introduction

2. Autocompletion: State Of The Art

At its core, autocompletion is about suggesting valid completions of a partially entered lookup string with the intention of minimizing and guiding the users typing. The concept of autocompletion is not new. There are various approaches studied in prior work on autocompletion. There is a vast body of work on predictive autocompletion 4, 8, 16 where the idea is to use information retrieval techniques, language models and learning to suggest potential completions.

In contrast, our focus is on the scenario where there is a table T of strings being looked up and thus completions are suggested based on matches in T. This is the form of autocompletion supported by http://www.amazon.com and Yahoo Finance for instance. The most common form of such autocompletion is based on exact matching. In this section, we formalize some of the key concepts involved in exact autocompletion.

2.1 Autocompletion Interface

2.2 Autocompletion Strategy

2.3 Efficiency Considerations

3. Incorporating Error Tolerance

3.1 Edit Distance Based Matching

3.2 k-Extension

3.3 Properties of k-Extensions

3.3.1 Relating Extensions to Prefixes
3.3.2 Pairwise Extension Distance

3.4 Baseline Algorithm for Edit-Tolerant Autocompletion

4. Q-Gram Based Algorithm

4.1 Review of Offline Q-Gram Based Lookup

4.2 Q-Gram Based Autocompletion

5. Trie-Based Algorithm

5.1 Full k-Extension Strategy

5.2 Buffered k-Extension Strategy

5.3 Analysis

5.3.1 Autocompletion
5.3.2 Edit Distance Matching

5.4 Top-l Semantics

6. Experiments

6.1 Experimental Setting

6.1.1 Implementation Details
6.1.2 Data

6.2 Effectiveness of Error Tolerance in Autocompletion

6.3 Autocompletion Strategy

6.4 Efficiency: Trie Vs NGram

6.5 Per-Character Response Time

6.6 Online Vs Offline

6.7 Overhead of Error Tolerance

6.8 Effect of Data Size

7. Related Work

8. Conclusions

References

;

 AuthorvolumeDate ValuetitletypejournaltitleUrldoinoteyear
2009 ExtendingAutocompletiontoToleraSurajit Chaudhuri
Raghav Kaushik
Extending Autocompletion to Tolerate Errors10.1145/1559845.15599192009