D-Dupe System


The D-Dupe System is an Interactive Entity Record Deduplication System.
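
The project abstract quoted under References describes an iterative loop: a candidate duplicate pair is shown together with its shared relationships, the user either merges the pair or marks it distinct, and each merge can reveal further duplicates. The Python sketch below illustrates that loop on a toy network. It is not D-Dupe's actual algorithm or API. The function names (`score`, `merge`, `resolve`), the `decide` callback standing in for the human user, the `alpha` weight, the `threshold`, and the sample data are all hypothetical choices made for illustration. String similarity comes from the standard-library `difflib.SequenceMatcher`, and relational similarity is a plain Jaccard overlap of neighbour sets.

```python
from difflib import SequenceMatcher
from itertools import combinations


def jaccard(a, b):
    """Fraction of neighbours that two neighbour sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score(names, nbrs, u, v, alpha=0.5):
    """Blend attribute similarity (names) with relational similarity."""
    attr = SequenceMatcher(None, names[u], names[v]).ratio()
    rel = jaccard(nbrs[u] - {v}, nbrs[v] - {u})
    return alpha * attr + (1 - alpha) * rel


def merge(names, nbrs, keep, drop):
    """Fold node `drop` into node `keep` and rewire its edges.

    Assumes an undirected graph without self-loops.
    """
    for n in nbrs.pop(drop):
        nbrs[n].discard(drop)
        if n != keep:
            nbrs[n].add(keep)
            nbrs[keep].add(n)
    del names[drop]


def resolve(names, nbrs, decide, threshold=0.6):
    """Show the best-scoring undecided pair to `decide` until none remain."""
    distinct = set()  # pairs the user has marked as different entities
    while True:
        pairs = [(score(names, nbrs, u, v), u, v)
                 for u, v in combinations(sorted(names), 2)
                 if (u, v) not in distinct]
        pairs = [p for p in pairs if p[0] >= threshold]
        if not pairs:
            return names, nbrs
        _, u, v = max(pairs)
        if decide(u, v):          # user chooses "merge"
            merge(names, nbrs, u, v)
        else:                     # user chooses "mark distinct"
            distinct.add((u, v))


# Toy co-occurrence network (hypothetical data): node id -> name / neighbours.
names = {1: "John Smith", 2: "Jon Smith", 3: "Ann Lee",
         4: "Bo Chen", 5: "Anne Lee"}
nbrs = {1: {3, 4}, 2: {4, 5}, 3: {1}, 4: {1, 2}, 5: {2}}


def same_surname(u, v):
    """Stand-in for the user's judgement: merge only on equal surnames."""
    return names[u].split()[-1] == names[v].split()[-1]


final_names, final_nbrs = resolve(names, nbrs, same_surname)
```

In this toy input the two Lee nodes initially share no neighbour, so their score stays below the threshold. Once the two Smith nodes are merged, both Lee nodes point at the same Smith node, their relational similarity rises, and the pair is surfaced for a decision. This is the chaining effect the abstract refers to.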



References

  • http://www.cs.umd.edu/projects/linqs/ddupe/
    • Visualizing and analyzing social networks is a challenging problem that has been receiving growing attention. An important first step, before analysis can begin, is ensuring that the data is accurate. A common data quality problem is that the data may inadvertently contain several distinct references to the same underlying entity; the process of reconciling these references is called entity resolution. D-Dupe is an interactive tool that combines data mining algorithms for entity resolution with a task-specific network visualization. Users cope with complexity of cleaning large networks by focusing on a small subnetwork containing a potential duplicate pair. The subnetwork highlights relationships in the social network, making the common relationships easy to visually identify. D-Dupe users resolve ambiguities either by merging nodes or by marking them distinct. The entity resolution process is iterative: as pairs of nodes are resolved, additional duplicates may be revealed; therefore, resolution decisions are often chained together. We give examples of how users can flexibly apply sequences of actions to produce a high quality entity resolution result.
  • (Kang et al., 2008) ⇒ Hyunmo Kang, Lise Getoor, Ben Shneiderman, Mustafa Bilgic, and Louis Licamele. (2008). “Interactive Entity Resolution in Relational Data: A Visual Analytic Tool and Its Evaluation.” In: IEEE Transactions on Visualization and Computer Graphics, Volume 14, Number 5, (TVCG 2008).
  • (Kang, Getoor & Singh, 2007) ⇒ Hyunmo Kang, Lise Getoor, and Lisa Singh. (2007). “C-Group: A Visual Analytic Tool for Pairwise Analysis of Dynamic Group Membership.” In: Proceedings of IEEE Symposium on Visual Analytics Science and Technology 2007 (VAST '07).
  • (Kang, Sehgal & Getoor, 2007) ⇒ Hyunmo Kang, Vivek Sehgal, and Lise Getoor. (2007). “GeoDDupe: A Novel Interface for Interactive Entity Resolution in Geospatial Data.” In: Proceedings of Information Visualisation, pp. 489-496, 2007 (IV '07).
  • (Bilgic et al., 2006) ⇒ Mustafa Bilgic, Louis Licamele, Lise Getoor, and Ben Shneiderman. (2006). “D-Dupe: An Interactive Tool for Entity Resolution in Social Networks.” In: Proceedings of IEEE Symposium on Visual Analytics Science and Technology 2006 (VAST '06).