Introduction
In an insightful, and somewhat overlooked paper published in Social Networks, Kovacs (2010) introduced a generalized relational similarity measure based on iterated correlations, applicable to two-mode and one-mode network data. This post provides a brief description of Kovacs’s approach, and introduces an implementation of his proposed algorithm as a function for the R statistical computing environment.
According to Kovacs, a desirable generalized similarity measure must have two desirable properties.
- First, it should respect the principle of equivalence, such that it classifies actors as similar if they have similar relations to other objects who are themselves similar.
- Second, the similarity measure should respect the principle of duality (Breiger, 1974), such that it classifies actors as similar if they have similar relations to similar objects, and objects as similar if they have similar relations to similar actors within the same system.
The Correlation Distance as a Function of Row and Column Profiles
Kovacs’s main observation is that such a generalized relational similarity measure could be obtained by transforming the usual correlation distance measure. Particularly, Kovacs begins by noting that in the two-mode case, the correlation distance between any pair of actors
Where
The same approach can be used to find the correlation distance between any two column objects
Overall, increasingly positive values of the correlation distance indicate actor similarity, while negative values indicate actor dissimilarity, with values bounded in the
Kovacs noted that the correlation distance classifies actors as similar if they have similar relations to other objects, but fails to incorporate the inter-object similarities. That is, actors should receive more similarity ``points’’ if they connect to objects that are themselves similar. It thus fails to respect the principle of equivalence.
Generalized Similarities
Kovacs proposed that a generalized measure of relational similiarity could be obtained by tweaking the standard correlation distance formula. To this end, consider a matrix
Kovacs notes that if we have access to an analogous matrix of similarities between actors
Typically, people only have access to the network information and not exogenous indication of pre-existing similarities between actors or objects, required to compute Equation 2, and Equation 3. Kovacs’s (ingenious) solution is to use the duality property and compute “reflective similarities” by obtaining initial object similarities plugging
It is easy to see that Equation 2 and Equation 3 both respect the principle of equivalence. Actors count as similar to the extent they connect to similar objects (
Because the similarities are defined according to the dual projection method, in which the similarity of actors is based on the objects they choose, and the similarity of objects is based on the actors who choose them, Equation 2 and Equation 3 also respect the principle of duality.
E1 | E2 | E3 | E4 | E5 | E6 | E7 | E8 | E9 | E10 | E11 | E12 | E13 | E14 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
EVELYN | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
LAURA | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
THERESA | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
BRENDA | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
CHARLOTTE | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
FRANCES | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
ELEANOR | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
RUTH | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
VERNE | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
MYRA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 |
KATHERINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |
SYLVIA | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |
NORA | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
HELEN | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 |
OLIVIA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
FLORA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
PEARL | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
DOROTHY | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
Analysis of Southern Women Data
I implemented Kovacs’s algorithm as an R function called gen.sim.corr.abs
, the latest version of which can be found here. To show that it works as intended, I replicate Kovacs’s (2010) analysis of Davis et al.’s (1941) classic Southern Women two-mode network, shown in Table 1.
Figure 1 closely replicates Kovacs’s (2010, Figure 12b) using the current R implementation.The algorithm converged (using cmdscale
from the package stats
in R version 4.2.2.1
As Kovacs noted in the original paper, the distribution of points in the two-dimensional metric MDS space obtained using the generalized relational similarities corresponds to the generalized blockmodeling solution reported by Doreian et al. (2004, Table 4), with two large sets of similar actors separated from one another and from the more peripheral Dorothy and Pearl.
Figure 2 reports the corresponding analysis for the object (events) relational similarities. In the original paper, Kovacs (2010, p. 206), described these results verbally—no plot was provided—as follows:
The generalized similarity model provides a grouping for the events as well (not shown here). This grouping differs slightly from Doreian et al.’s grouping: although the (1, 2, 3, 4, 5) and (10, 11, 12, 13, 14) clusters emerge in the generalized similarity solution as well, the picture differs for events 6, 7, 8, and 9. Event 6 here is clustered together with (1, 2, 3, 4, 5), while events 7, 8 and 9 do not fall into any group but stand separately .
We can see that Figure 2 replicates Kocacs’s verbal description of these results to a tee.
Conclusion
Overall, this suggests that the current implementation of Kovacs’s generalized model of relational similarity works and produces results as intended. Hopefully having it available will increase the use of this relatively under-used relational technique for two-mode (and one-mode) data.
References
Breiger, R. L. (1974). The duality of persons and groups. Social forces, 53(2), 181-190.
Davis, A., Gardner, B. B., & Gardner, M. R. (1941). Deep South A Social Anthropological Study of Caste and Class. University of Chicago Press.
Doreian, P., Batagelj, V., & Ferligoj, A. (2004). Generalized blockmodeling of two-mode network data. Social networks, 26(1), 29-53.
Kovács, B. (2010). A generalized model of relational similarity. Social Networks, 32(3), 197-211.