Random Walk Concepts in Networks

The Average First Passage Time

Imagine something is diffusing through a network starting with some seed node \(i\) following a series of discrete time steps. If the graph is connected, the thing will eventually reach every node in the network. However, depending on the connectivity structure of the graph, it will reach some nodes (e.g., those at a smaller graph theoretic distance from the seed node) sooner than others.

Here’s a function called first.pass1 that records the minimum number of steps that it takes for something to get to each node in a graph starting from a given seed node:

   first.pass1 <- function(w, seed = 1) {
      fp <- rep(0, ncol(w))
      k <- 1
      i <- seed
      while(sum(fp[-seed] == 0) != 0) {
         j <- sample(c(1:ncol(w)), 1, prob = w[i, ]) 
         if (fp[j] == 0 & j != seed) {fp[j] <- k}
         i <- j
         k <- k + 1
         }
   names(fp) <- 1:ncol(w)
   return(fp)
   }

The function takes a transition matrix \(\mathbf{P}\) as the primary input (of the sort we discussed when talking about status and prestige) and returns a vector containing the time step at which the thing that diffused through the network got to the \(j^{th}\) node.

Let’s test it out using the friendship network from the Krackhardt high-tech managers data:

   library(networkdata)
   library(igraph)
   g <- as_undirected(ht_friends, mode = "collapse")
   A <- as.matrix(as_adjacency_matrix(g))
   P <- diag(1/rowSums(A)) %*% A    
   set.seed(456)
   first.pass1(P)
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20 
  0   2 137  24  14  28  43  36   8   9  16  29  15  26   7   1  25  33   6  32 
 21 
 12 

We can see for instance that the rumor got to node 16 in just one time step, but that it took 24 time steps to get to node 3 and 137 (!) to get to node 3.

So it seems like this, the minimum time it takes for something that starts with me to get to you (Fouss, Pirotte, and Saerens 2004), is a good measure of the proximity between me and you in the graph.

Averaging Across Realizations

However, we wouldn’t want to use just one run of the diffusion process to calculate this proximity measure. Instead a better approach is to use the average time it takes for something to get to the other nodes when it starts from a given node.

To do that, we can just replicate the first.pass1 function some number of times (e.g., \(n = 100\)) and take an average:

   set.seed(456)
   first.res <- replicate(100, first.pass1(P))
   round(rowMeans(first.res), 2)
    1     2     3     4     5     6     7     8     9    10    11    12    13 
 0.00 16.14 32.88 21.24 17.17 24.35 59.98 27.83 25.85 17.75 10.16 15.78 86.78 
   14    15    16    17    18    19    20    21 
36.32 15.43 32.94  8.65 44.59 19.17 31.47 33.85 

We can see that according to this measure, called the average first passage time (Fouss, Saerens, and Shimbo 2016, 36), when we start with node 1, things get relatively quickly to node 17 (\(\hat{t} = 8.7\)) but they take forever to get to node 13 (\(\hat{t} = 86.8\)).

We can of course, write a wrapper around the first.pass1 function to compute the average first passage time from one node to another using every node in the graph as the initial seed:

   first.pass2 <- function(w, n) {
      m <- rowMeans(replicate(n, first.pass1(w)))
      for (i in 2:nrow(w)) {
         m <- rbind(m, rowMeans(replicate(n, first.pass1(w, seed = i))))
      }
   rownames(m) <- 1:nrow(w)
   return(m)
   }

And the result (showing the first ten rows and columns) is:

   set.seed(456)
   round(first.pass2(P, 100), 1)[1:10, 1:10]
      1    2    3    4    5    6    7    8    9   10
1   0.0 16.1 32.9 21.2 17.2 24.4 60.0 27.8 25.9 17.8
2  18.1  0.0 23.4 24.0 14.6 20.5 60.6 32.6 32.6 16.6
3  20.6 22.2  0.0 19.9 17.1 29.1 63.1 29.8 26.8 19.0
4  14.1 14.1 27.4  0.0 18.1 21.7 57.8 28.9 22.7 21.8
5  16.9 16.7 30.0 25.1  0.0 25.0 52.9 36.5 19.9 24.1
6  19.5 18.1 24.4 28.2 15.2  0.0 41.5 38.6 19.5 25.2
7  20.7 15.9 26.8 24.8 16.3 14.2  0.0 33.0 26.1 23.7
8  14.1 14.9 27.2 16.6 17.2 29.2 73.3  0.0 23.8 19.1
9  18.1 16.1 27.5 23.9 16.3 20.7 52.1 32.7  0.0 20.1
10 20.0 18.7 26.2 23.3 16.2 23.8 64.7 27.0 21.2  0.0

The accuracy of the average first passage time estimate we get depends on the number of replications we use to get the average. The bigger, the more accurate. However, it can take a lot of computing time if we increased n to a giant number.

Iterating Until Convergence

There’s another approach to computing the average first passage time that involves iterating through a matrix using the information in the transition matrix. The basic idea is that the average first passage time for a random walker that starts at node \(i\) and ends at node \(j\) is given by:

\[ m_{ij} = 1 + \sum^N_{j \neq k}p_{ik}m_{k,j} \]

With the proviso that \(m_{ii}\) is always equal to zero. What this tells us is that the average first passage time between any two nodes in the graph \(i\) and \(j\), is given by one plus the product probability that something can transition from a sender node \(i\) to an intermediary node \(k\)—given by \(p_{ik}\)—and \(m_{kj}\) which is the average first passage time from that intermediary node \(k\) to the destination node \(j\).

Note that since we need to know \(m_{kj}\) to get \(m_{ij}\), then this opens up a chicken or the egg problem that we can solve through iteration like we did for the status scores in the prestige lesson. That is, we first start with a null value for all the entries of the \(\mathbf{M}\) matrix (e.g., \(m_{ij} = 0\) for all \(i\) and \(j\)), compute an initial round of \(m_{ij}\) estimates for all pairs of nodes \(i\) and \(j\) using the equation above, recompute \(m_{ij}\) using those new values, rinse, repeat, and stop after we don’t get any changes between successive versions of the \(\mathbf{M}\) matrix.

A function that implement this idea goes like this:

   first.pass3 <- function(w) {
      n <- nrow(w)
      nodes <- 1:n
      m <- matrix(0, n, n) #initializing M matrix values
      colnames(m) <- 1:n
      rownames(m) <- 1:n
      d <- 1
      z <- 1
      while(d > 1e-05) {
      old.m <- m
      for (i in nodes) { #loop through every node
         for (j in nodes[-i]) { #loop through every node except i
            m[i,j] <- 1
            for (k in nodes[-j]) { #loop through every node except i and j
               m[i,j] <- m[i,j] + (w[i,k]*old.m[k,j]) #update M matrix values
               }
            }
         }
      z <- z + 1
      d <- abs(sum(abs(m) - abs(old.m))) #difference between current and previous M matrix
      } #end while loop
      return(m)
   }

This function takes the probability transition matrix \(\mathbf{P}\) as input and returns \(\mathbf{M}\) a matrix of average first passage times between every pair of nodes in the network.

And the result is:

   round(first.pass3(P), 1)[1:10, 1:10]
      1    2    3    4    5    6    7    8    9   10
1   0.0 15.0 28.5 21.1 17.7 25.4 59.7 29.9 28.5 21.2
2  17.0  0.0 29.0 22.3 16.1 23.2 59.0 33.8 28.2 21.8
3  19.1 17.7  0.0 25.7 16.6 25.4 57.6 34.1 27.4 18.6
4  15.1 14.4 29.1  0.0 18.0 25.6 59.9 28.6 28.7 20.9
5  19.4 15.8 27.6 25.7  0.0 24.8 58.4 34.5 25.1 19.9
6  19.3 15.0 28.6 25.3 16.9  0.0 50.5 35.0 24.2 21.8
7  19.9 17.3 27.1 26.1 17.0 16.9  0.0 35.2 27.1 22.2
8  14.9 16.8 28.3 19.5 17.8 26.1 59.9  0.0 28.0 17.7
9  19.4 17.2 27.7 25.6 14.4 21.3 57.8 34.1  0.0 18.4
10 18.7 17.3 25.4 24.3 15.7 25.4 59.4 30.2 24.9  0.0

Which are values pretty close to the ones we obtained by averaging earlier.

Using the Laplacian Matrix

Finally, there is a way to use matrix algebra magic to compute the average first pass at the limit (\(n \approx \infty\)) in closed form without averaging or iterations.

To do that, first we need to compute a matrix called the graph Laplacian, which is defined as:

\[ \mathbf{L} = \mathbf{D} - \mathbf{A} \]

Where \(\mathbf{D}\) is a diagonal matrix containing the degrees of each node in the graph along the diagonals and zeroes everywhere else.

In R we can compute \(\mathbf{L}\) like this:

   D <- diag(rowSums(A))
   L <- D - A

Once we have \(\mathbf{L}\), we need to compute a variation known by the less than memorable name of the Moore-Penrose Pseudo-Inverse of the Laplacian, written as \(\mathbf{L}^+\).

Despite the terrible name, \(\mathbf{L}^+\) is easy to compute:

   E <- matrix(1/nrow(A), nrow(A), nrow(A))
   L.plus <- solve(L - E) + E

As we can see, E is just a matrix containing the inverse of the number of nodes in the network in every cell.

Now, the average first passage time between every pair of nodes in the network, contained in the matrix \(\mathbf{M}\) is given by:

\[ \mathbf{M} = vol(A)(\mathbf{e}\mathbf{L}^+_{diag})^T - vol(A)\mathbf{L}^+ + (\mathbf{L}^+\mathbf{d})\mathbf{e}^T -\mathbf{e}(\mathbf{d}^T\mathbf{L}^+) \]

Now, this formula looks long and monstrous but it is composed of simple quantities we know and love. We have already been introduced to \(\mathbf{L}^+\), while \(vol(A) = \sum_{ij} a_{ij}\) is just the sum of the non-zero entries in the adjacency matrix, \(\mathbf{e}\) is a column vector of ones with as many rows as the number of nodes in the graph, \(\mathbf{d}\) is a vector of the same length as the number of nodes in the graph containing the degrees of each node at each position, and \(\mathbf{L}^+_{diag}\) is a vector containing the diagonal entries of \(\mathbf{L}^+\) at each position.

The following R code constructs \(\mathbf{M}\) step by step:

   e <- matrix(1, nrow(A), 1)
   d <- rowSums(A)
   M <- sum(A) * t(e %*% diag(L.plus))
   M <- M - (sum(A) * L.plus)
   M <- M + (L.plus %*% d) %*% t(e)
   M <- M - (e %*% (t(d) %*% L.plus))
   rownames(M) <- 1:nrow(A)
   colnames(M) <- 1:nrow(A)
   M <- t(M)

And now for the big reveal:

   round(M, 1)[1:10, 1:10]
      1    2    3    4    5    6    7    8    9   10
1   0.0 14.9 27.7 20.6 17.3 24.5 56.1 28.8 27.6 20.6
2  17.1  0.0 28.4 22.0 15.8 22.3 55.5 32.8 27.6 21.3
3  19.9 18.3  0.0 26.0 16.9 25.3 54.7 33.7 27.3 18.8
4  15.6 14.8 28.8  0.0 18.1 25.1 56.8 27.9 28.4 20.8
5  19.9 16.1 27.3 25.6  0.0 24.3 55.3 33.8 24.7 19.8
6  20.2 15.8 28.8 25.8 17.5  0.0 47.8 34.8 24.3 22.1
7  23.6 20.8 30.0 29.2 20.2 19.6  0.0 37.7 29.9 25.2
8  16.0 17.8 28.7 20.1 18.5 26.3 57.4  0.0 28.4 18.2
9  20.2 17.9 27.7 26.0 14.7 21.2 55.0 33.7  0.0 18.6
10 19.3 17.8 25.3 24.4 15.9 25.1 56.4 29.7 24.7  0.0

Which shows entries pretty close in value to the ones we obtained by iteration and averaging.

The Average Commute Distance

Once we have the average first passage time, we can compute another important quantity called the average commute distance between two nodes \(i\) and \(j\) (\(n_{ij}\)). This is the number of steps it takes for a random walker to start at node \(i\), reach another specific node \(j\) and then get back to the original node \(i\) (hence commuting, like going fromm to work and back home again).

It turns out that \(n_{ij}\) is pretty simple to compute, once we know the average first passage time between every pair of nodes \(m_{ij}\), since it is given by:

\[ n_{ij} = m_{ij} + m_{ji} \]

So the Average Commute Distance is just the entries of \(\mathbf{M}\) on the upper triangle added to the corresponding entries in the lower triangle:

   N <- M + t(M)
   round(N[1:10, 1:10], 1)
      1    2    3    4    5    6    7    8    9   10
1   0.0 32.0 47.6 36.2 37.1 44.7 79.6 44.8 47.9 39.9
2  32.0  0.0 46.7 36.7 31.9 38.2 76.3 50.6 45.5 39.0
3  47.6 46.7  0.0 54.8 44.2 54.0 84.7 62.4 55.0 44.0
4  36.2 36.7 54.8  0.0 43.7 50.9 86.0 48.1 54.4 45.2
5  37.1 31.9 44.2 43.7  0.0 41.8 75.4 52.3 39.5 35.6
6  44.7 38.2 54.0 50.9 41.8  0.0 67.4 61.1 45.5 47.2
7  79.6 76.3 84.7 86.0 75.4 67.4  0.0 95.1 84.9 81.6
8  44.8 50.6 62.4 48.1 52.3 61.1 95.1  0.0 62.1 47.9
9  47.9 45.5 55.0 54.4 39.5 45.5 84.9 62.1  0.0 43.3
10 39.9 39.0 44.0 45.2 35.6 47.2 81.6 47.9 43.3  0.0

Note that the entries in this matrix are symmetric (it takes as long for something to go from to you to me and back to you as from me to you and back to me). They thus function as a similarity metric between nodes; the lower the average commute distance, the closer or more similar the two nodes are.

Of course, there’s also math to compute \(\mathbf{N}\) directly from \(\mathbf{L}^+\) using the same ingredients we used before. It goes like this:

\[ \mathbf{N} = (\mathbf{L}^+_{diag}\mathbf{e}^T + \mathbf{e}\mathbf{L}^+_{diag} - 2\mathbf{L}^+)vol(A) \]

Which is actually a much less monstrous and simpler expression than before.

The following R code constructs \(\mathbf{N}\) step by step:

   N <- diag(L.plus) %*% t(e)
   N <- N + (e %*% diag(L.plus))
   N <- N - (2 * L.plus)
   N <- N * sum(A)
   round(N, 1)[1:10, 1:10]
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]  0.0 32.0 47.6 36.2 37.1 44.7 79.6 44.8 47.9  39.9
 [2,] 32.0  0.0 46.7 36.7 31.9 38.2 76.3 50.6 45.5  39.0
 [3,] 47.6 46.7  0.0 54.8 44.2 54.0 84.7 62.4 55.0  44.0
 [4,] 36.2 36.7 54.8  0.0 43.7 50.9 86.0 48.1 54.4  45.2
 [5,] 37.1 31.9 44.2 43.7  0.0 41.8 75.4 52.3 39.5  35.6
 [6,] 44.7 38.2 54.0 50.9 41.8  0.0 67.4 61.1 45.5  47.2
 [7,] 79.6 76.3 84.7 86.0 75.4 67.4  0.0 95.1 84.9  81.6
 [8,] 44.8 50.6 62.4 48.1 52.3 61.1 95.1  0.0 62.1  47.9
 [9,] 47.9 45.5 55.0 54.4 39.5 45.5 84.9 62.1  0.0  43.3
[10,] 39.9 39.0 44.0 45.2 35.6 47.2 81.6 47.9 43.3   0.0

Which as you can see, gives us the results we seek.

Interestingly, we can obtain the average commute time distance between any pair nodes yet another way. For instance from the above matrix, we know the average commute time distances between nodes 3 and 8 is 62.38.

Let’s construct two vectors full of zeros of the same length as the number of nodes in the graph, except they have a one in the third and eighth spot respectively:

   n <- ncol(A)
   i <- rep(0, n)
   j <- rep(0, n)
   i[3] <- 1
   j[8] <- 1

Fouss, Pirotte, and Saerens (2004) show that the average commute time distance between nodes 3 and 8 is also given by:

   round(sum(A) * (t((i - j)) %*% L.plus %*% (i - j)), 2)
      [,1]
[1,] 62.38

Neat!

References

Fouss, François, Alain Pirotte, and Marco Saerens. 2004. “The Application of New Concepts of Dissimilarities Between Nodes of a Graph to Collaborative Filtering.” In Proc. Workshop on Statistical Approaches for Web Mining, 26–37.
Fouss, François, Marco Saerens, and Masashi Shimbo. 2016. Algorithms and Models for Network Data and Link Analysis. Cambridge University Press.