Social Network

Basic Model

The Forest Fire model was firstly proposed in [1] and [2]. It shows its applications in [3] and [4]. The basic setting of the model is as follows. The model has two parameters, a forward burning probability $latex p$ and a backward burning ratio $latex r$. Consider a node $latex v$ joining the network at time $latex t > 1$, and let $latex G_{t}$ be the graph constructed thus far. $latex G_{1}$ will consist of just a single node. Node $latex v$ forms outlinks to nodes in $latex G_{t}$ according to the following process:

(a) $latex v$ first chooses an ambassador node $latex w$ uniformly at random and forms a link to $latex w$.

(b) Generate two random numbers, $latex x$ and $latex y$, that are geometrically distributed with means $latex \frac{p}{1-p}$ and $latex \frac{rp}{1- rp}$, respectively. Node $latex v$ selects $latex x$ outlinks and $latex y$ in-links of $latex w$ incident to nodes that were not yet visited. Let $latex w_{1}, w_{2},\cdots, w_{x+y}$ denotes the other ends of these selected links. If not enough in or outlinks are available, $latex v$ selects as many as it can.

(c) $latex v$ forms outlinks to $latex w_{1}, w_{2} , \cdots, w_{x+y}$ and then applies step (b) recursively to each of $latex w_{1}, w_{2} , \cdots, w_{x+y}$. As the process continues, nodes cannot be visited a second time, preventing the construction from cycling.

Thus, the burning of links in the Forest Fire model begins at $latex w$, spreads to $latex w_{1}, w_{2} , \cdots, w_{x+y}$ and proceeds recursively until it dies out. The original authors claim that this model can preserve the following three properties of a graph as growing:

• Heavy-tailed in-degree and out-degree distribution
• Densification power law
• Shrinking diameter

Application to Graph Sampling

In [3], the authors described how to apply Forest Fire model to graph sampling as follows. We first choose node $latex v$ uniformly at random. We then generate a random number $latex x$ that is geometrically distributed with mean $latex \frac{p_{f}}{1-p_{f}}$. Node $latex v$ selects $latex x$ out-links incident to nodes that were not yet visited. Let $latex w_{1}, w_{2}, \cdots, w_{x}$ denote the other ends of these selected links. We then apply this step recursively to each of $latex w_{1}, w_{2}, \cdots, w_{x}$ until enough nodes have been burned. As the process continues, nodes cannot be visited a second time, preventing the construction from cycling. If the fire dies, then we restart it, i.e. select new node $latex v$ uniformly at random. We call the parameter $latex p_{f}$ the forward burning probability. In [4], the authors start from a set of “seed” users, which are chosen in three ways: 1) totally random 2) geo-location random 3) randomly according to user activities.

Remarks

Forest Fire model is indeed an ad-hoc model. As stated in [2], it lacks of rigorous analysis of the model itself.

Reference

[1] J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over time: densification laws, shrinking diameters and possible explanations. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, KDD ’05, pages 177–187, New York, NY, USA, 2005. ACM.

[2] J. Leskovec, J. Kleinberg, and C. Faloutsos. Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data (TKDD), 1, March 2007.

[3] J. Leskovec and C. Faloutsos. Sampling from large graphs. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD ’06, pages 631–636, New York, NY, USA, 2006. ACM.

[4] M. De Choudhury, Y.-R. Lin, H. Sundaram, K. S. Candan, L. Xie, and A. Kelliher. How Does the Data Sampling Strategy Impact the Discovery of Information Diffusion in Social Media? In Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, May 2010.