Random Walk Concepts in Networks
The Average First Passage Time
Imagine something is diffusing through a network starting with some seed node \(i\) following a series of discrete time steps. If the graph is connected, the thing will eventually reach every node in the network. However, depending on the connectivity structure of the graph, it will reach some nodes (e.g., those at a smaller graph theoretic distance from the seed node) sooner than others.
Here’s a function called first.pass1 that records the minimum number of steps that it takes for something to get to each node in a graph starting from a given seed node:
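(The listing for first.pass1 is not reproduced here; the version below is a minimal sketch reconstructed from the line-by-line description that follows, so the exact bookkeeping—the loop condition, the default seed of 1, and the argument names—are assumptions.)

first.pass1 <- function(P, seed = 1) {                          #line 1: P is the transition matrix
   fp <- rep(0, nrow(P))                                        #line 2: first passage times (zero = not yet reached)
   k <- 0                                                       #line 3: step counter
   i <- seed                                                    #line 4: current node starts at the seed
   while(sum(fp == 0) > 1) {                                    #line 5: run until only the seed's own entry is left at zero
      j <- sample(1:ncol(P), 1, prob = P[i, ])                  #line 6: sample a random neighbor j of node i
      k <- k + 1                                                #line 7: advance the clock
      if (fp[j] == 0 & j != seed) {fp[j] <- k}                  #line 8: record the first time the walk reaches j
      i <- j                                                    #line 9: j becomes the new sender node
   }                                                            #line 10: end while loop
   names(fp) <- colnames(P)                                     #line 11: label entries with node names
   return(fp)                                                   #line 12: return the first passage vector
}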
The function takes a transition matrix \(\mathbf{P} = \mathbf{D}^{-1}\mathbf{A}\)—where \(\mathbf{D}^{-1}\) is a matrix containing the inverse of the degree of each node along the diagonal and zeros everywhere else—as the primary input (of the sort we discussed when talking about status and prestige) and returns a vector recording the time step at which the thing diffusing through the network first reached the \(j^{th}\) node. It works like this:
Line 2 initializes the “first pass” vector fp to all zeros. Line 3 initializes the step counter k, and line 4 initializes the scalar i to whatever node was specified as the seed node in the seed argument of the function. Lines 5-10 implement a little while loop that runs until there are no more zeros in the fp vector (aside from the seed’s own entry, which stays at zero). Line 6 samples a random neighbor j of node i with probability given by the relevant entry of the transition matrix (\(p_{ij} = \frac{1}{d_i}\)); that node j then becomes the new sender node i, and the current value of the step counter k is recorded as the time at which the thing diffusing through the network first got to j. Line 11 names the fp vector using the column names of the transition matrix, and line 12 returns the vector.
Let’s test the first.pass1 function out using the undirected version of the friendship network from the Krackhardt high-tech managers data:
library(networkdata)
library(igraph)
g <- as_undirected(ht_friends, mode = "collapse") #undirected graph
A <- as.matrix(as_adjacency_matrix(g)) #adjacency matrix
D <- diag(rowSums(A)^-1) #inverse of degree matrix
P <- D %*% A #transition matrix
set.seed(456)
first.pass1(P)
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21 
  0   2 137  24  14  28  43  36   8   9  16  29  15  26   7   1  25  33   6  32  12 
We can see for instance that the rumor got to node 16 in just one time step, but that it took 26 time steps to get to node 14 and 137 (!) to get to node 3.
So this quantity, the time it takes for something that starts with me to first reach you (Fouss, Pirotte, and Saerens 2004), seems like a good measure of the proximity between me and you in the graph.
However, we wouldn’t want to use just one run of the diffusion process to calculate this proximity measure. Instead, a better approach is to use the average time it takes for something to get to the other nodes when it starts from a given node, across a large number of realizations of the diffusion process.
To do that, we can just replicate the first.pass1 function some number of times (e.g., \(n = 100\)) and take an average:
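The exact call isn’t shown above, but it amounts to something like the following (the seed value and object names are assumptions):

set.seed(456) #the specific seed used for this run is an assumption
fp.runs <- replicate(100, first.pass1(P, seed = 1)) #each column is one realization of the diffusion
round(rowMeans(fp.runs), 2) #average first passage time from node 1 to every other node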
1 2 3 4 5 6 7 8 9 10 11 12 13
0.00 16.14 32.88 21.24 17.17 24.35 59.98 27.83 25.85 17.75 10.16 15.78 86.78
14 15 16 17 18 19 20 21
36.32 15.43 32.94 8.65 44.59 19.17 31.47 33.85
We can see that according to this measure, called the average first passage time (Fouss, Saerens, and Shimbo 2016, 36), when we start with node 1, things get relatively quickly to node 17 (\(\hat{t} = 8.7\)) but they take forever to get to node 13 (\(\hat{t} = 86.8\)).
We can of course, write a wrapper around the first.pass1 function to compute the average first passage time from one node to another using every node in the graph as the initial seed, and put all of the results in a matrix \(\mathbf{M}\) where each entry \(m_{ij}\) is the average first passage time between nodes \(i\) and \(j\):
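The wrapper itself isn’t reproduced above; a sketch along these lines (the name first.pass2 and the default of n = 100 replications are assumptions) would do the job:

first.pass2 <- function(P, n = 100) {
   N <- ncol(P)
   M <- matrix(0, N, N, dimnames = list(colnames(P), colnames(P)))
   for (s in 1:N) {
      runs <- replicate(n, first.pass1(P, seed = s)) #n realizations of the diffusion seeded at node s
      M[s, ] <- rowMeans(runs) #row s holds the average first passage times from s to every other node
   }
   return(M)
}
M <- first.pass2(P)
round(M[1:10, 1:10], 1) #first ten rows and columns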
And the result (showing the first ten rows and columns of \(\mathbf{M}\)) is:
1 2 3 4 5 6 7 8 9 10
1 0.0 16.1 32.9 21.2 17.2 24.4 60.0 27.8 25.9 17.8
2 18.1 0.0 23.4 24.0 14.6 20.5 60.6 32.6 32.6 16.6
3 20.6 22.2 0.0 19.9 17.1 29.1 63.1 29.8 26.8 19.0
4 14.1 14.1 27.4 0.0 18.1 21.7 57.8 28.9 22.7 21.8
5 16.9 16.7 30.0 25.1 0.0 25.0 52.9 36.5 19.9 24.1
6 19.5 18.1 24.4 28.2 15.2 0.0 41.5 38.6 19.5 25.2
7 20.7 15.9 26.8 24.8 16.3 14.2 0.0 33.0 26.1 23.7
8 14.1 14.9 27.2 16.6 17.2 29.2 73.3 0.0 23.8 19.1
9 18.1 16.1 27.5 23.9 16.3 20.7 52.1 32.7 0.0 20.1
10 20.0 18.7 26.2 23.3 16.2 23.8 64.7 27.0 21.2 0.0
Note that the \(\mathbf{M}\) matrix is asymmetric. That is, the average number of steps that it takes for something to get from me to you (\(m_{ij}\)), is not necessarily the same as the average number of steps that it takes for something to get from you to me (\(m_{ji}\)).
The accuracy of the average first passage time estimates depends on the number of replications we use to compute the average: the more replications, the more accurate the estimates. However, increasing n to a very large number can take a lot of computing time.
Computing the AFPT by Iterating Until Convergence
There’s a more efficient approach to computing the entries in the average first passage time matrix \(\mathbf{M}\) that involves iterating through a matrix using the information in the transition matrix \(\mathbf{P}\). The basic idea is that the average first passage time for a random walker that starts at node \(i\) and ends at node \(j\) is given by:
\[ m_{ij} = 1 + \sum^N_{k \neq j}p_{ik}m_{kj} \tag{1}\]
With the proviso that \(m_{ii}\) is always equal to zero. What this tells us is that the average first passage time between any two nodes \(i\) and \(j\) in the graph is given by one plus the sum, across every intermediary node \(k\), of the product of the probability that something transitions from the sender node \(i\) to that intermediary node \(k\)—given by \(p_{ik}\)—and \(m_{kj}\), the average first passage time from the intermediary node \(k\) to the destination node \(j\).
Note that since we need to know \(m_{kj}\) to get \(m_{ij}\), then this opens up a chicken or the egg problem that we can solve through iteration like we did for the status scores in the prestige lesson.
Here’s how to proceed. We first start with a null value for all the entries of the \(\mathbf{M}\) matrix (e.g., \(m_{ij} = 0\) for all \(i\) and \(j\)), compute an initial round of \(m_{ij}\) estimates for all pairs of nodes \(i\) and \(j\) using Equation 1, recompute \(m_{ij}\) using those new values, rinse, repeat, and stop after we don’t get any big changes between successive versions of the \(\mathbf{M}\) matrix.
A function that implements this idea goes like this:
first.pass3 <- function(w) {
n <- nrow(w)
nodes <- 1:n
m <- matrix(0, n, n) #initializing M matrix values
d <- 1 #initialize delta
while(d > 1e-06) { #do while delta is bigger than a tiny number
old.m <- m
for (i in nodes) { #loop through every node
for (j in nodes[-i]) { #loop through every node except i
m[i,j] <- 1
for (k in nodes[-c(i,j)]) { #loop through every node except i and j
m[i,j] <- m[i,j] + (w[i,k]*old.m[k,j]) #update M matrix values
}
}
}
d <- abs(sum(abs(m) - abs(old.m))) #delta equals difference between current and previous M matrix
} #end while loop
return(m)
}

This function takes the probability transition matrix \(\mathbf{P}\) as input and returns the \(\mathbf{M}\) matrix of average first passage times between every pair of nodes in the network.
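The call that produces the table below isn’t reproduced above; it amounts to something like this (the object name M.it is an assumption):

M.it <- first.pass3(P) #average first passage times via iteration until convergence
round(M.it[1:10, 1:10], 1) #first ten rows and columns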
And the result is:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0.0 15.0 28.5 21.1 17.7 25.4 59.7 29.9 28.5 21.2
[2,] 17.0 0.0 29.0 22.3 16.1 23.2 59.0 33.8 28.2 21.8
[3,] 19.1 17.7 0.0 25.7 16.6 25.4 57.6 34.1 27.4 18.6
[4,] 15.1 14.4 29.1 0.0 18.0 25.6 59.9 28.6 28.7 20.9
[5,] 19.4 15.8 27.6 25.7 0.0 24.8 58.4 34.5 25.1 19.9
[6,] 19.3 15.0 28.6 25.3 16.9 0.0 50.5 35.0 24.2 21.8
[7,] 19.9 17.3 27.1 26.1 17.0 16.9 0.0 35.2 27.1 22.2
[8,] 14.9 16.8 28.3 19.5 17.8 26.1 59.9 0.0 28.0 17.7
[9,] 19.4 17.2 27.7 25.6 14.4 21.3 57.8 34.1 0.0 18.4
[10,] 18.7 17.3 25.4 24.3 15.7 25.4 59.4 30.2 24.9 0.0
Which are the exact (up to rounding error) average first passage times between each pair of nodes in the network.
Computing the AFPT Using the Laplacian Matrix
Finally, there is a way to use matrix algebra magic to compute the average first passage time in the limit of the number of replications (\(n \rightarrow \infty\)) in closed form, without averaging or iteration.
To do that, first we need to compute a matrix called the graph Laplacian, which is defined as:
\[ \mathbf{L} = \mathbf{D} - \mathbf{A} \tag{2}\]
Where \(\mathbf{D}\) is a diagonal matrix containing the degrees of each node in the graph along the diagonals and zeroes everywhere else.
In R we can compute \(\mathbf{L}\) like this:
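The corresponding code isn’t shown above; a minimal version (the object name L is an assumption) is:

L <- diag(rowSums(A)) - A #degree matrix minus adjacency matrix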
Once we have \(\mathbf{L}\), we need to compute a variation known by the less than memorable name of the Moore-Penrose Pseudo-Inverse of the Laplacian, written as \(\mathbf{L}^+\), which is given by:
\[ \mathbf{L}^+ = \left(\mathbf{L} - \frac{\mathbf{E}}{N}\right) ^{-1} + \frac{\mathbf{E}}{N} \tag{3}\]
Where \(\mathbf{E}\) is the all ones matrix of the same dimensions as \(\mathbf{L}\).
So despite the terrible name, \(\mathbf{L}^+\) is easy to compute in R:
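The code itself isn’t reproduced above; a sketch consistent with the description that follows (L.p is the name used later in this lesson, and E here already holds \(\frac{1}{N}\) in every cell) is:

N <- nrow(A) #number of nodes
E <- matrix(1/N, N, N) #a matrix with 1/N in every cell (this plays the role of E/N in Equation 3)
L.p <- solve(L - E) + E #the Moore-Penrose pseudo-inverse of the Laplacian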
As we can see, E is just a matrix containing the inverse of the number of nodes in the network in every cell (\(\frac{1}{N}\)). We use the function solve to obtain the matrix inverse of the expression in parentheses in Equation 3.
What’s in the \(\mathbf{L}^+\) matrix? Let’s see the first ten rows and columns:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0.112 0.004 -0.011 0.015 -0.012 -0.013 -0.025 0.015 -0.013 -0.008
[2,] 0.004 0.099 -0.015 0.007 -0.002 0.001 -0.021 -0.010 -0.012 -0.012
[3,] -0.011 -0.015 0.167 -0.017 -0.007 -0.015 -0.014 -0.014 -0.009 0.006
[4,] 0.015 0.007 -0.017 0.147 -0.015 -0.015 -0.028 0.022 -0.016 -0.007
[5,] -0.012 -0.002 -0.007 -0.015 0.099 -0.010 -0.019 -0.016 0.007 -0.001
[6,] -0.013 0.001 -0.015 -0.015 -0.010 0.145 0.030 -0.020 0.011 -0.014
[7,] -0.025 -0.021 -0.014 -0.028 -0.019 0.030 0.341 -0.030 -0.016 -0.025
[8,] 0.015 -0.010 -0.014 0.022 -0.016 -0.020 -0.030 0.200 -0.014 0.011
[9,] -0.013 -0.012 -0.009 -0.016 0.007 0.011 -0.016 -0.014 0.164 0.007
[10,] -0.008 -0.012 0.006 -0.007 -0.001 -0.014 -0.025 0.011 0.007 0.125
As we can see, some of the entries of \(\mathbf{L}^+\) are positive and others are negative. More interestingly, we can see that \(\mathbf{L}^+\) is a symmetric matrix: \(l^+_{ij} = l^+_{ji}\).
Even more interestingly, \(\mathbf{L}^+\) is doubly centered—similar to the modularity matrix considered here. That means both the row and column sums of the matrix are equal to the all-zeros vector:
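The check producing the two vectors of zeros below isn’t shown above; it amounts to:

round(rowSums(L.p), 6) #row sums are all zero (up to numerical precision)
round(colSums(L.p), 6) #and so are the column sums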
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
The average first passage time between every pair of nodes in the network \(m_{ij}\) expressed in terms of the entries of \(\mathbf{L}^+\) and the degree vector \(\mathbf{d}\), is then given by (Fouss, Saerens, and Shimbo 2016, 158):
\[ m_{ij} = \sum_{k=1}^n \left(l^+_{jj} - l^+_{ij} + l^+_{ik} - l^+_{kj}\right)d_k \tag{4}\]
Where \(d_k\) is the degree of node \(k\).
So let’s try it out:
nodes <- 1:nrow(A) #node id vector
d <- rowSums(A) #degree vector
m <- matrix(0, nrow(A), nrow(A)) #initializing M to all zeros
for (i in nodes) {
for (j in nodes) {
for (k in nodes) {
m[i,j] <- m[i,j] + ((L.p[j,j] - L.p[i,j] + L.p[i,k] - L.p[k,j])*d[k])
}
}
}
round(m[1:10, 1:10], 1)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0.0 15.0 28.5 21.1 17.7 25.4 59.7 29.9 28.5 21.2
[2,] 17.0 0.0 29.0 22.3 16.1 23.2 59.0 33.8 28.2 21.8
[3,] 19.1 17.7 0.0 25.7 16.6 25.4 57.6 34.1 27.4 18.6
[4,] 15.1 14.4 29.1 0.0 18.0 25.6 59.9 28.6 28.7 20.9
[5,] 19.4 15.8 27.6 25.7 0.0 24.8 58.4 34.5 25.1 19.9
[6,] 19.3 15.0 28.6 25.3 16.9 0.0 50.5 35.0 24.2 21.8
[7,] 19.9 17.3 27.1 26.1 17.0 16.9 0.0 35.2 27.1 22.2
[8,] 14.9 16.8 28.3 19.5 17.8 26.1 59.9 0.0 28.0 17.7
[9,] 19.4 17.2 27.7 25.6 14.4 21.3 57.8 34.1 0.0 18.4
[10,] 18.7 17.3 25.4 24.3 15.7 25.4 59.4 30.2 24.9 0.0
Which produces exactly the same results we obtained earlier via iteration.
Computing the AFPT Using a Big Matrix Formula
There is an even more straightforward way to compute the average first passage time that does not involve looping at all, but simple matrix operations. We can do that because we can express Equation 4 in matrix form like this:
\[ \mathbf{M} = \left[vol(A)\mathbf{e}(\mathbf{l}^+_{d})^T - vol(A)\mathbf{L}^+\right] + \left[(\mathbf{L}^+\mathbf{d})\mathbf{e}^T -\mathbf{e}(\mathbf{d}^T\mathbf{L}^+)\right] \]
Now, this formula looks long and monstrous but it is composed of simple quantities we know and love. We have already been introduced to \(\mathbf{L}^+\), while \(\mathbf{e}\) is a column vector of ones with as many rows as the number of nodes in the graph, \(\mathbf{d}\) is a vector of the same length as the number of nodes in the graph containing the degrees of each node at each position, \(vol(A) = \sum_{i} d_{i}\) is just the sum of the non-zero entries in the adjacency matrix, and \(\mathbf{l}^+_{d}\) is a vector containing the diagonal entries of \(\mathbf{L}^+\) at each position.
The following R code constructs \(\mathbf{M}\) step by step:
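That step-by-step code isn’t reproduced above; a sketch of what it might look like (the intermediate object names e, d, vol, and l.d are assumptions) is:

N <- nrow(A) #number of nodes
e <- matrix(1, N, 1) #all-ones column vector
d <- matrix(rowSums(A), N, 1) #degree vector as a column
vol <- sum(d) #vol(A): the sum of all degrees
l.d <- matrix(diag(L.p), N, 1) #column vector with the diagonal entries of L+
M <- (vol * e %*% t(l.d) - vol * L.p) + ((L.p %*% d) %*% t(e) - e %*% (t(d) %*% L.p)) #the formula above
round(M[1:10, 1:10], 1) #first ten rows and columns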
And now for the big reveal:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0.0 15.0 28.5 21.1 17.7 25.4 59.7 29.9 28.5 21.2
[2,] 17.0 0.0 29.0 22.3 16.1 23.2 59.0 33.8 28.2 21.8
[3,] 19.1 17.7 0.0 25.7 16.6 25.4 57.6 34.1 27.4 18.6
[4,] 15.1 14.4 29.1 0.0 18.0 25.6 59.9 28.6 28.7 20.9
[5,] 19.4 15.8 27.6 25.7 0.0 24.8 58.4 34.5 25.1 19.9
[6,] 19.3 15.0 28.6 25.3 16.9 0.0 50.5 35.0 24.2 21.8
[7,] 19.9 17.3 27.1 26.1 17.0 16.9 0.0 35.2 27.1 22.2
[8,] 14.9 16.8 28.3 19.5 17.8 26.1 59.9 0.0 28.0 17.7
[9,] 19.4 17.2 27.7 25.6 14.4 21.3 57.8 34.1 0.0 18.4
[10,] 18.7 17.3 25.4 24.3 15.7 25.4 59.4 30.2 24.9 0.0
Which once again, gives us the results we seek.
As we saw earlier, the interpretation of the entries of the \(\mathbf{M}\) matrix is also straightforward. For instance, if we start a message from node 2, we should expect it to take an average of \(m_{2, 6} =\) 23.2 steps to get to node 6 by following random walks in the network.
The Average Commute Distance
Once we have the average first passage time, we can compute another important quantity called the average commute distance between two nodes \(i\) and \(j\) (\(n_{ij}\)). This is the number of steps it takes for a random walker to start at node \(i\), reach another specific node \(j\) and then get back to the original node \(i\) (hence commuting, like going from home to work and then back from work to home again).
It turns out that \(n_{ij}\) is pretty simple to compute, once we know the average first passage time between every pair of nodes \(m_{ij}\), since it is given by:
\[ n_{ij} = m_{ij} + m_{ji} \]
So the entries of the Average Commute Distance matrix \(\mathbf{N}\) are just the entries of \(\mathbf{M}\) on the upper triangle added to the corresponding entries in the lower triangle. In R we can obtain that by just adding \(\mathbf{M}\) to its transpose \(\mathbf{M}^T\):
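The corresponding one-liner isn’t shown above; something like this (the object name N.m is an assumption) produces the table below:

N.m <- M + t(M) #average commute distances: m_ij + m_ji
round(N.m[1:10, 1:10], 1) #first ten rows and columns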
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0.0 32.0 47.6 36.2 37.1 44.7 79.6 44.8 47.9 39.9
[2,] 32.0 0.0 46.7 36.7 31.9 38.2 76.3 50.6 45.5 39.0
[3,] 47.6 46.7 0.0 54.8 44.2 54.0 84.7 62.4 55.0 44.0
[4,] 36.2 36.7 54.8 0.0 43.7 50.9 86.0 48.1 54.4 45.2
[5,] 37.1 31.9 44.2 43.7 0.0 41.8 75.4 52.3 39.5 35.6
[6,] 44.7 38.2 54.0 50.9 41.8 0.0 67.4 61.1 45.5 47.2
[7,] 79.6 76.3 84.7 86.0 75.4 67.4 0.0 95.1 84.9 81.6
[8,] 44.8 50.6 62.4 48.1 52.3 61.1 95.1 0.0 62.1 47.9
[9,] 47.9 45.5 55.0 54.4 39.5 45.5 84.9 62.1 0.0 43.3
[10,] 39.9 39.0 44.0 45.2 35.6 47.2 81.6 47.9 43.3 0.0
Note that the average commute distance matrix is symmetric, which makes sense: on average, it takes as long for something to go from you to me and back to you as it does to go from me to you and back to me. The average commute distance between pairs of nodes thus functions as a similarity metric; the lower the average commute distance, the closer or more similar two nodes are in the network.
Of course, we can also use math to compute \(\mathbf{N}\) directly from the entries of \(\mathbf{L}^+\) using a recombination of the same ingredients we used before to compute the average first passage time. It goes like this:
\[ n_{ij} = vol(A) \left(l^+_{ii} + l^+_{jj} - 2l^+_{ij}\right) \]
Which in (more complicated) matrix form looks like:
\[ \mathbf{N} = vol(A)\left[\mathbf{l}^+_{d}\mathbf{e}^T + \mathbf{e}(\mathbf{l}^+_{d})^T - 2\mathbf{L}^+\right] \]
But which is actually a much less monstrous and simpler expression than before.
The following R code constructs the \(\mathbf{N}\) matrix of average commute distances between each pair of nodes in the network step by step:
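That code isn’t reproduced above; a sketch (the object name N.c is an assumption, and vol, e, and l.d are built as before) is:

vol <- sum(A) #vol(A): the sum of all degrees
e <- matrix(1, nrow(A), 1) #all-ones column vector
l.d <- matrix(diag(L.p), nrow(A), 1) #column vector with the diagonal entries of L+
N.c <- vol * (l.d %*% t(e) + e %*% t(l.d) - 2 * L.p) #the matrix formula above
round(N.c[1:10, 1:10], 1) #first ten rows and columns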
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0.0 32.0 47.6 36.2 37.1 44.7 79.6 44.8 47.9 39.9
[2,] 32.0 0.0 46.7 36.7 31.9 38.2 76.3 50.6 45.5 39.0
[3,] 47.6 46.7 0.0 54.8 44.2 54.0 84.7 62.4 55.0 44.0
[4,] 36.2 36.7 54.8 0.0 43.7 50.9 86.0 48.1 54.4 45.2
[5,] 37.1 31.9 44.2 43.7 0.0 41.8 75.4 52.3 39.5 35.6
[6,] 44.7 38.2 54.0 50.9 41.8 0.0 67.4 61.1 45.5 47.2
[7,] 79.6 76.3 84.7 86.0 75.4 67.4 0.0 95.1 84.9 81.6
[8,] 44.8 50.6 62.4 48.1 52.3 61.1 95.1 0.0 62.1 47.9
[9,] 47.9 45.5 55.0 54.4 39.5 45.5 84.9 62.1 0.0 43.3
[10,] 39.9 39.0 44.0 45.2 35.6 47.2 81.6 47.9 43.3 0.0
Which as you can see, gives us the results we seek.
The interpretation of the entries of the \(\mathbf{N}\) matrix is also straightforward. For instance, if a message starts at node 4, it would take an average of 54.4 steps for it to get to node 9 and then back to the original node 4 by following a random walk in the network governed by the node-to-node probabilities stored in the transition matrix \(\mathbf{P}\).
Interestingly, we can obtain the average commute time distance between any pair of nodes in yet another way. For instance, from the above matrix, we know the average commute time distance between nodes 3 and 8 is 62.38.
Let’s construct two vectors full of zeros of the same length as the number of nodes in the graph, except they have a one in the third and eighth spot respectively:
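That code isn’t shown above; a minimal version (the names u3 and u8 are assumptions) is:

u3 <- rep(0, nrow(A)) #a vector of zeros...
u3[3] <- 1 #...with a one in the third spot
u8 <- rep(0, nrow(A)) #a vector of zeros...
u8[8] <- 1 #...with a one in the eighth spot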
Fouss, Pirotte, and Saerens (2004) show that the average commute time distance between nodes 3 and 8 is also given by:
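The expression itself is missing above; a reconstruction of the intended result, written as a quadratic form in \(\mathbf{L}^+\) using the two indicator vectors we just built, is:

\[ n_{38} = vol(A)\left(\mathbf{u}_3 - \mathbf{u}_8\right)^T\mathbf{L}^+\left(\mathbf{u}_3 - \mathbf{u}_8\right) \]

Which in R amounts to:

vol <- sum(A) #vol(A): the sum of all degrees
vol * t(u3 - u8) %*% L.p %*% (u3 - u8) #should recover the 62.38 figure from the N matrix above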
Neat!