2009 ModelingAndDataMiningInBlogosphere


Subject Areas: Statistical Modeling Task, Data Mining Task, Blogosphere.




1. Modeling Blogosphere

  • … The blogosphere provides a conducive platform for building virtual communities around special interests. It reshapes business models [3], assists viral marketing [4], provides trend analysis and sales prediction [5,6], aids counter-terrorism efforts [7], and acts as a grassroots information source [8].
  • The past few years have seen phenomenal growth in the blogosphere. Technorati (http://technorati.com/blogging/state-of-the-blogosphere/) published a report on this growth. The report stated that the blogosphere had consistently doubled in size every 5 months over the preceding 4 years, and estimated its size at approximately 133 million blogs by December 2008. Furthermore, 2 new blogs, or roughly 18.6 new blog posts, are added to the blogosphere every second. …

1.1 Modeling Essentials

  • The blogosphere consists of two main graph structures: a blog network and a post network. A post network is formed by considering the links between blog posts while ignoring the blogs to which they belong. In a post network, the nodes represent individual blog posts and the edges represent the links between them. A post network gives a microscopic view of the blogosphere and helps in discerning “high-resolution” details such as post-level interactions, communication patterns among blog posts, and authoritative blog posts identified by their links. A blog network is formed by collapsing all the nodes of the post network that belong to a single blog into a single node. As a result, links between posts of the same blog disappear, while links between posts of different blogs are agglomerated and weighted accordingly. A blog network gives a macroscopic view of the blogosphere and helps in observing “low-resolution” details such as blog-level interactions, communication patterns among blogs, and authoritative blogs identified by their links. Both post and blog networks are directed graphs.
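
The collapsing step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `collapse_to_blog_network`, the post and blog identifiers, and the choice to weight each blog-to-blog edge by the number of post-level links it absorbs are all assumptions, since the text says only that links are "weighted accordingly".

```python
from collections import defaultdict


def collapse_to_blog_network(post_links, post_to_blog):
    """Collapse a directed post network into a weighted, directed blog network.

    post_links   -- iterable of (source_post, target_post) directed links
    post_to_blog -- dict mapping each post id to the blog it belongs to

    Assumption: an edge weight is the count of post-level links between
    the two blogs. Other weighting schemes are equally possible.
    """
    blog_edges = defaultdict(int)
    for src_post, dst_post in post_links:
        src_blog = post_to_blog[src_post]
        dst_blog = post_to_blog[dst_post]
        if src_blog == dst_blog:
            # Links between posts of the same blog disappear.
            continue
        # Links between posts of different blogs are agglomerated.
        blog_edges[(src_blog, dst_blog)] += 1
    return dict(blog_edges)


# Hypothetical toy data: five posts spread over three blogs.
post_to_blog = {"p1": "A", "p2": "A", "p3": "B", "p4": "B", "p5": "C"}
post_links = [
    ("p1", "p2"),  # A -> A, dropped by the collapse
    ("p1", "p3"),  # A -> B
    ("p2", "p4"),  # A -> B, merged with the link above
    ("p3", "p5"),  # B -> C
    ("p5", "p1"),  # C -> A
]

blog_network = collapse_to_blog_network(post_links, post_to_blog)
for (src, dst), weight in sorted(blog_network.items()):
    print(f"{src} -> {dst} (weight {weight})")
```

In this toy example the two post-level links from blog A to blog B merge into a single edge of weight 2, and the link between two posts of blog A is discarded. Direction is preserved throughout, so the result is still a directed graph.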

Blog Clustering and Community Discovery

Influence and Trust

Spam Filtering in Blogosphere

Data Collection and Evaluation


Author: Nitin Agarwal and Huan Liu
Title: Modeling and Data Mining in Blogosphere
DOI: 10.2200/S00213ED1V01Y200907DMK001
Year: 2009