Status and Prestige
In the the centrality lecture notes, we saw how to compute the most popular centrality measures. Freeman’s “big three” have strong graph-theoretic foundation and do a good job of formalizing and quantifying the idea that a node is central if it is “well-placed” in the network, where being well-placed resolves into either being able to reach others (directly as with degree or indirectly as with closeness) or being able to intermediate between others (as with betweenness).
Networks as Prisms
There is, however, another strong and well-motivated intuition as to what it means to be “well-placed” in a network. Here the ties in the network are seen less as “pipes” that transmit stuff and more like “prisms” that reflect on you (Podolny 2001).
One way to think about this second version of well-placedness is that what is transmitted through the network is the network itself, or more accurately, the importance, status, and prestige of the people you are connected to, preferably flowing from them (high status people) to you.
Under this interpretation, actors get status and prestige in the network from being connected to prestigious and high status others. Those others, in turn, get their status from being connected to high status others, and so on ad infinitum.
One way of quantifying this idea goes like this. If \(\mathbf{x}\) is a vector containing the desired status scores, then the status of actor \(i\) should be equal to:
\[ x_i = \sum_{j} a_{ij}x_j \tag{1}\]
Where \(a_{ij} = 1\) if \(i\) is adjacent to \(j\) in the network. Note that this formula just sums up the status scores of all the others each actor is connected to.
In matrix notation, if \(\mathbf{x}\) is a column vector of status scores then:
\[ \mathbf{x} = A\mathbf{x} \]
Because \(\mathbf{A}\) is an \(n \times n\) matrix and \(\mathbf{x}\) is \(n \times 1\) column vector, the multiplication \(A\mathbf{x}\) will return another column vector of dimensions \(n \times 1\), in this case \(\mathbf{x}\) itself!
Note the problem that this formulation poses: \(\mathbf{x}\) appears on both sides of the equation, which means that in order to know the status of any one node we would need to know the status of the others, but calculating the status of the others depends on knowing the status of the focal node, and so on. There’s a chicken and the egg problem here.
Now, there is an obvious (to the math majors) mathematical solution to this problem, because there’s a class of solvable (under some mild conditions imposed on the matrix \(\mathbf{A}\)) linear algebra problems that take the form:
\[ \lambda\mathbf{x} = A\mathbf{x} \]
Where \(\lambda\) is just a plain old number (a scalar). Once again conditional of the aforementioned mild conditions being met, we can iteratively search for a value \(\lambda\), fix it, then fill up the \(\mathbf{x}\) vector with another set of values, fix those, search for a new \(\lambda\), and continue until we have values of \(\lambda\) and \(\mathbf{x}\) that make the above equality true.
When we do that successfully, we say that the value of \(\lambda\) we hit upon is an eigenvalue of the matrix \(\mathbf{A}\) and the values of the vector \(\mathbf{x}\) we came up with are an eigenvector of the same matrix (technically in the above equation a right eigenvector).
Eigenvalues and eigenvectors, like Don Quixote and Sancho Panza, come in pairs, because you need a unique combination of both to solve the equation. Typically, a given matrix (like an adjacency matrix) will have multiple \(\lambda/\mathbf{x}\) pairs that will solve the equation. Together the whole set \(\lambda/\mathbf{x}\) pairs that make the equation true are the eigenvalues and eigenvectors of the matrix.
Eigenvalues, Eigenvectors, Oh My!
Note that all of this obscure talk about eigenvalues and eigenvectors is just matrix linear algebra stuff. It has nothing to do with networks and social structure.
In contrast, because the big three centrality measures have a direct foundation in graph theory, and graph theory is an isomorphic model of social structures (points map to actors/people and lines map to relations) the “math” we do with graph theory is directly meaningful as a model of networks (the counts of the number of edges incident to a node is the count of other actors they someone is directly connected to).
Eigenvalues and eigenvectors are not a model of social structure in the way graph theory is (their first scientific application was in Chemistry and Physics). They are just a mechanical math fix to a circular equation problem.
This is why it’s a mistake to introduce network measures of status and prestige by jumping directly to the machinery of linear algebra (or worse talk about the idea of eigenvector centrality which means nothing to most people, and combines two obscure terms into one even more obscure compound term).
A better approach is to see if we can motivate the use of measures like the ones above using the simple model of the distribution of status and prestige we started with earlier. We will see that we can, and that doing that leads us back to solutions that are the mathematical equivalent of all the eigenvector stuff.
Distributing Status to Others
Let’s start with the simplest model of how people can get their status from the status of others in a network. It is the simplest because it is based on degree.
Imagine everyone has the same “quantum” of status to begin with (this can be stored in a vector containing the same number of length equals to number of actors in the network). Then, at each step, people “send” the same amount of status to all their alters in the network. Then we repeat, with everyone sending the amount of status they now have after receiving it from others. We repeat this many times. At the end of each step, we compute people’s new status scores using Equation 1. We stop doing this after the status scores of people stop changing across each iteration.
Let us see a real-life example at work.
We will use a data set collected by David Krackhardt on the friendships of 21 managers in a high tech company in the West coast (see the description here). The data are reported as directed ties (\(i\) nominates \(j\) as a friend) but we will constrain ties to be undirected:
The undirected friendship network is shown in Figure 1.
We then extract the adjacency matrix corresponding to this network:
And here’s a simple custom function using a while
loop that exemplifies the process of status distribution through the network we just talked about:
status1 <- function(w) {
x <- rep(1, nrow(w)) #initial status vector set to all ones of length equal to the number of nodes
d <- 1 #initial delta
k <- 0 #initializing counter
while (d > 1e-10) {
o.x <- x #old status scores
x <- w %*% o.x #new scores a function of old scores and adjacency matrix
x <- x/norm(x, type = "E") #normalizing new status scoress
d <- abs(sum(abs(x) - abs(o.x))) #delta between new and old scores
k <- k + 1 #incrementing while counter
}
return(as.vector(x))
}
Lines 2-4 initialize various quantities, most importantly the initial status vector for each node to just a series of ones:
Then lines 5-12 implement a little loop of how status is distributed through the network, with the most important piece of code being line 7 where the current status scores for each node are just the sum of the status scores of its neighbors computed one iteration earlier. The program stops when the difference between the old and the new scores is negligible (\(\delta < 10^{-10}\)) as checked in line 9.
Note the normalization step on line 8, where we divide the each status score by a normalized sum of all of the scores. This is required in order to prevent the sum of status scores from getting bigger and bigger indefinitely (in mathese, this is referred to as the sum “diverging”). In base R
, the type = "E"
normalization implements the Euclidean vector norm (also sometimes confusingly called the Frobenieus norm), by which we divide each value of the status scores by after each update.1
And here’s the resulting (row) vector of status scores for each node:
[1] 0.619 0.635 0.446 0.489 0.629 0.430 0.205 0.380 0.444 0.468 0.814 0.549
[13] 0.162 0.401 0.613 0.360 1.000 0.247 0.680 0.360 0.392
What if I told you that this vector is the same as that given by the leading (first) eigenvector of the adjacency matrix?
s.eig <- abs(eigen(A)$vector[, 1]) #computing the first eigenvector
s.eig <- s.eig/max(s.eig) #normalizing by maximum
round(s.eig, 3)
[1] 0.619 0.635 0.446 0.489 0.629 0.430 0.205 0.380 0.444 0.468 0.814 0.549
[13] 0.162 0.401 0.613 0.360 1.000 0.247 0.680 0.360 0.392
Which is of course what is computed by the eigen_centrality
function in igraph
:
[1] 0.619 0.635 0.446 0.489 0.629 0.430 0.205 0.380 0.444 0.468 0.814 0.549
[13] 0.162 0.401 0.613 0.360 1.000 0.247 0.680 0.360 0.392
So, the “eigenvector centralities” are just the limit scores produced by the status distribution process implemented in the status1
function!
When treated as a structural index of connectivity in a graph (i.e., a centrality measure) the eigenvector status scores induce an ordering of the nodes which we may be interested in looking at:
nodes <- 1:vcount(g)
eig.dat <- data.frame(Nodes = nodes, Eigen.Cent = s, Deg.Cent = degree(g))
eig.dat <- eig.dat[order(eig.dat$Eigen.Cent, decreasing = TRUE), ]
library(kableExtra)
kbl(eig.dat[1:10, ],
format = "html", align = "c", row.names = FALSE,
caption = "Top Ten Eigenvector Scores.",
digits = 3) %>%
kable_styling(bootstrap_options =
c("hover", "condensed", "responsive"))
Nodes | Eigen.Cent | Deg.Cent |
---|---|---|
17 | 1.000 | 18 |
11 | 0.814 | 14 |
19 | 0.680 | 10 |
2 | 0.635 | 10 |
5 | 0.629 | 10 |
1 | 0.619 | 9 |
15 | 0.613 | 9 |
12 | 0.549 | 8 |
4 | 0.489 | 7 |
10 | 0.468 | 8 |
As we can see, for instance, there is a strong correlation between degree and the eigenvector score. But they are not the same. For instance, nodes 19, 2, 5 are tied according to degree, but 19 is highest in the eigenvector scoring, indicating that even though they all have the same number of friends, 19 is connected to better-connected others.
As we will see, most other measures of prestige/status rank for nodes in networks are constructed using similar principles as the ones just described. What changes is the model assumptions of how status is distributed in the system. That’s why scary and non-intuitive stuff about eigenvectors or whatever is misleading.
Other measures of node status and prestige are designed such that they either change the quantum of status that is distributed through the network by making it dependent on some node characteristic (like degree) or differentiate between different routes of distribution in directed graphs, by for instance, differentiating status derived from outgoing links from that derived from incoming links.
Let’s see some examples of these alternative cases.
Combining Prestige and Similarity
In a neat paper, Alvarez-Socorro, Herrera-Almarza, and González-Dı́az (2015) combine ideas of node similarity analysis with prestige ranking to derive an interesting twist on the usual Eigenvector-based prestige score.
Their idea is simple and intuitive, linking the “getting status from others” idea we just talked about, but weighting those others by their similarity to ego. Their point is that linking to others who are not similar to you should give you more status than linking to others that are similar, which is the principle behind such fundamental network theories as Strength of Weak Ties (Granovetter 1973) and Structural Holes (Burt 1992).
How do we do it? First, we just need to compute one of the many similarity metrics, to come up with a dissimilarity matrix between nodes. We choose the Jaccard similarity, which can be computed using the following function:
The dissimilarity matrix for the Krackhardt managers friendship network is just:
Then Alvarez-Socorro, Herrera-Almarza, and González-Dı́az (2015) create a new matrix W
which equals the element wise product between the adjacency matrix and the distance matrix:
And the (dis)similarity weighted status scores are obtained by playing our status game on the W
matrix:
And here are the top ten nodes by this dissimilarity weighted prestige measure:
Nodes | Eigen.Cent | Deg.Cent |
---|---|---|
17 | 1.000 | 18 |
11 | 0.821 | 14 |
5 | 0.667 | 10 |
2 | 0.656 | 10 |
19 | 0.637 | 10 |
12 | 0.619 | 8 |
10 | 0.608 | 8 |
15 | 0.579 | 9 |
1 | 0.563 | 9 |
6 | 0.514 | 7 |
Note that while the top two nodes (17 and 11) does not change, taking dissimilarity into account alters the rankings. Node 19 drops from third to fifth place and node 1 from fifth to ninth, and node 5 goes from sixth to third, indicating that node 5 connects to well-connected others who are themselves not connected to their neighbors.
Of course, it is possible that in some settings, it might make sense to say that people receive more status points by being connected to others who are themselves connected to the people they are connected to (similar others). This situation would apply when status is generated not via brokerage, but by belonging to well-delimited cohesive groups.
To accommodate this possibility, we can generalize the above approach as follows. Recall that for Alvarez-Socorro, Herrera-Almarza, and González-Dı́az (2015) the W
matrix is given by:
\[ \mathbf{W} = \mathbf{A} \odot (1 - \mathbf{S}) \]
Where \(\mathbf{A}\) is the adjacency matrix and \(\mathbf{S}\) is a similarity matrix (making (\(1-\mathbf{S}\) a dissimilarity matrix)) and \(\odot\) is the element-wise matrix multiplication operator.
A generalization that accommodates a wider range of dependencies between status and similarity goes like this:
\[ \mathbf{W} = \mathbf{A} \odot \left[\mathbf{S}^\delta \odot (1 - \mathbf{S})^\gamma\right] \]
With the restriction that \(0 \geq \delta \leq 1\) and \(0 \geq \gamma \leq 1\).
When \(\delta = \gamma = 0\), then \(\mathbf{W} = \mathbf{A}\) and we recover the standard Eigenvector scores we computed earlier. When \(\delta = 0\) and \(\gamma > 0\), then we recover the dissimilarity-weighted eigenvector scores of Alvarez-Socorro, Herrera-Almarza, and González-Dı́az (2015). When \(\delta > 0\) and \(\gamma = 0\) then we compute scores that weigh the eigenvector scores by similarity rather than dissimilarity.
Here is a function that packages everything above into the generalized version:
And now we can run through the various possibilities.
Top-ten regular eigenvector:
17 11 19 2 5 1 15 12 4 10
1.000 0.814 0.680 0.635 0.629 0.619 0.613 0.549 0.489 0.468
Top-ten dissimilarity-weighted eigenvector:
s <- prestige.sim(g, gamma = 1)
names(s) <- 1:vcount(g)
round(sort(s/max(s), decreasing = TRUE)[1:10], 3)
17 11 5 2 19 12 10 15 1 6
1.000 0.821 0.667 0.656 0.637 0.619 0.608 0.579 0.563 0.514
Top-ten similarity-weighted eigenvector:
s <- prestige.sim(g, delta = 1)
names(s) <- 1:vcount(g)
round(sort(s/max(s), decreasing = TRUE)[1:10], 3)
17 11 19 1 15 2 5 4 12 3
1.000 0.830 0.768 0.714 0.660 0.581 0.565 0.526 0.412 0.375
Neat!
Bonacich Prestige in Directed Graphs
In a classic paper, Philip Bonacich (1972) noted the above connection between different ways people conceptualized status and prestige in networks and the leading eigenvector of the adjacency matrix. He then noted that we can extend similar ideas to the directed case.
Here, people get status from receiving nominations from high status others (i.e., those who receive a lot of nominations), whose partners also get status from receiving a lot of nominations from high status others, and so forth.
This means that in a directed system of relations, status distribution operates primarily via the indegree of each node, so that if \(\mathbf{A}\) is the asymmetric adjacency matrix corresponding to the directed graph, then if we play our status game on the transpose of this matrix \(\mathbf{A}^T\) we will get the scores we seek (Fouss, Saerens, and Shimbo 2016, 204).
Recall that in transposing the matrix of a directed graph, we change it from being a from/to matrix (nodes in the rows send ties to nodes in the columns) to a to/from matrix: Nodes in the rows receive ties from nodes in the columns. So we want to play our status game in this matrix, because we want to rank nodes according to their receipt of ties from high-status others.
Let’s see a real-life example, this time using the directed version of the Krackhardt friendship nomination network among the high-tech managers:
[1] 0.922 1.000 0.379 0.639 0.396 0.190 0.240 0.531 0.411 0.117 0.413 0.769
[13] 0.082 0.428 0.368 0.450 0.592 0.440 0.425 0.225 0.583
Which are the same scores we would have gotten using the eigen_centrality
function in igraph
with the argument directed
set to TRUE
:
[1] 0.922 1.000 0.379 0.639 0.396 0.190 0.240 0.531 0.411 0.117 0.413 0.769
[13] 0.082 0.428 0.368 0.450 0.592 0.440 0.425 0.225 0.583
And, like before, we can treat these scores as centrality measures and rank the nodes in the graph according to them.
Here are the top ten nodes:
Nodes | Eigen.Cent | In.Deg.Cent |
---|---|---|
2 | 1.000 | 10 |
1 | 0.922 | 8 |
12 | 0.769 | 8 |
4 | 0.639 | 5 |
17 | 0.592 | 6 |
21 | 0.583 | 5 |
8 | 0.531 | 5 |
16 | 0.450 | 4 |
18 | 0.440 | 4 |
14 | 0.428 | 5 |
While the top indegree centrality node (2) also gets the top Eigenvector Centrality scores, we see many cases of nodes with equal indegree centrality that get substantively different Eigenvector scores. So who you are connected matters in addition to how many incoming connections you have.
References
Footnotes
For a vector of numbers \(\mathbf{x}\) the Euclidean vector norm \(||\mathbf{x}||_2\) is given by: \(\sqrt{\sum x^2}\).↩︎