2013 UsingCoVisitationNetworksforDet

From GM-RKB

Subject Headings:

Notes

Cited By

Quotes

Author Keywords

Abstract

Data generated by observing the actions of web browsers across the internet is being used at an ever-increasing rate for both building models and making decisions. In fact, a quarter of the industry-track papers for KDD in 2012 were based on data generated by online actions. The models, analytics and decisions they inform all stem from the assumption that observed data captures the intent of users. However, a large portion of these observed actions are not intentional, and are effectively polluting the models. Much of this observed activity is either generated by robots traversing the internet or the result of unintended actions of real users. These non-intentional actions observed in the web logs severely bias both analytics and the models created from the data. In this paper, we will show examples of how non-intentional traffic that is produced by fraudulent activities adversely affects both general analytics and predictive models, and propose an approach using co-visitation networks to identify sites that have large amounts of this fraudulent traffic. We will then show how this approach, along with a second-stage classifier that identifies non-intentional traffic at the browser level, is deployed in production at Media6Degrees (m6d), a targeting technology company for display advertising. This deployed product acts both to filter out the fraudulent traffic from the input data and to ensure that we don't serve ads during unintended website visits.
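The abstract does not spell out how the co-visitation network is built, so the following is only a minimal sketch of one plausible reading: sites are nodes, and a directed edge runs from site A to site B when a large share of the browsers seen on A were also seen on B. The edge rule, the `threshold`, `min_browsers`, and `min_degree` parameters, the function names, and the toy data are all assumptions made for illustration, not the authors' published method or settings.

```python
from collections import defaultdict


def co_visitation_edges(visits, threshold=0.5, min_browsers=2):
    """Build a directed co-visitation graph from (browser_id, site) pairs.

    Assumed rule (illustrative): add edge A -> B when at least `threshold`
    of the browsers seen on A were also seen on B.
    """
    browsers_by_site = defaultdict(set)
    for browser_id, site in visits:
        browsers_by_site[site].add(browser_id)

    edges = defaultdict(set)
    for a, browsers_a in browsers_by_site.items():
        if len(browsers_a) < min_browsers:
            continue  # too few browsers to estimate overlap
        for b, browsers_b in browsers_by_site.items():
            if a == b:
                continue
            overlap = len(browsers_a & browsers_b) / len(browsers_a)
            if overlap >= threshold:
                edges[a].add(b)
    return dict(edges)


def flag_suspicious_sites(edges, min_degree=3):
    """Return sites with at least `min_degree` outgoing edges (assumed cutoff)."""
    return sorted(s for s, nbrs in edges.items() if len(nbrs) >= min_degree)


# Toy data: three hypothetical bots hit the same four sites, while
# ordinary browsers spread across unrelated sites.
bots = ["bot1", "bot2", "bot3"]
bot_sites = ["f1.example", "f2.example", "f3.example", "f4.example"]
visits = [(b, s) for b in bots for s in bot_sites]
visits += [
    ("u1", "news.example"),
    ("u2", "news.example"),
    ("u2", "blog.example"),
    ("u3", "blog.example"),
    ("u3", "f1.example"),
]

edges = co_visitation_edges(visits)
suspicious = flag_suspicious_sites(edges)
print(suspicious)
```

In this toy data the four sites shared by the bots form a dense cluster and are the only ones flagged, while the ordinary sites stay below the degree cutoff. A browser-level second-stage classifier, as mentioned in the abstract, would be a separate step and is not sketched here.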

References


2013 UsingCoVisitationNetworksforDet
Author: Foster Provost, Brian Dalessandro, Rod Hook, Claudia Perlich, Ori Stitelman, Troy Raeder
Title: Using Co-visitation Networks for Detecting Large Scale Online Display Advertising Exchange Fraud
DOI: 10.1145/2487575.2488207
Year: 2013