Statistical Models of Networks IV

Two-Mode ERGMs

In the two-mode case, ERGMs allow us to test multivariate hypotheses regarding patterns of affiliation between persons and groups, net of other network characteristics.

Let’s load some data. This time we will use Daniel McFarland’s data on high school students’ affiliations with specific activity clubs, collected in 1997:

   url1 <- "https://github.com/JeffreyAlanSmith/Integrated_Network_Science/raw/master/data/affiliations_1997.txt"

   A <- read.delim(file = url1, check.names = FALSE)
   dim(A)
[1] 1295   91

The bi-adjacency matrix is of dimensions \(1295 \times 91\). The clubs are:

   colnames(A)
 [1] "Academic decathalon"                  
 [2] "Art Club"                             
 [3] "Asian Club"                           
 [4] "Band, 8th"                            
 [5] "Band, Jazz"                           
 [6] "Band, Marching (Symphonic)"           
 [7] "Baseball, JV (10th)"                  
 [8] "Baseball, V"                          
 [9] "Basketball, boys 8th"                 
[10] "Basketball, boys 9th"                 
[11] "Basketball, boys JV"                  
[12] "Basketball, boys V"                   
[13] "Basketball, girls 8th"                
[14] "Basketball, girls 9th"                
[15] "Basketball, girls JV"                 
[16] "Basketball, girls V"                  
[17] "Cheerleaders, 8th"                    
[18] "Cheerleaders, 9th"                    
[19] "Cheerleaders, JV"                     
[20] "Cheerleaders, Spirit Squad"           
[21] "Cheerleaders, V"                      
[22] "Chess Club"                           
[23] "Choir, a capella"                     
[24] "Choir, barbershop quartet (4 men)"    
[25] "Choir, chamber singers"               
[26] "Choir, concert"                       
[27] "Choir, treble"                        
[28] "Choir, vocal ensemble (4 women)"      
[29] "Choir, women's ensemble"              
[30] "Close-up"                             
[31] "Cross Country, boys 8th"              
[32] "Cross Country, boys V"                
[33] "Cross Country, girls 8th"             
[34] "Cross Country, girls V"               
[35] "Debate"                               
[36] "Drill Team"                           
[37] "Drunk Driving"                        
[38] "Drunk Driving Officers"               
[39] "Football, 8th"                        
[40] "Football, 9th"                        
[41] "Football, V"                          
[42] "Forensics"                            
[43] "Forensics (National Forensics League)"
[44] "French Club (high)"                   
[45] "French Club (low)"                    
[46] "French NHS"                           
[47] "Full IB Diploma Students (12th)"      
[48] "German Club"                          
[49] "German NHS"                           
[50] "Golf, boys V"                         
[51] "Hispanic Club"                        
[52] "Internships"                          
[53] "Junior Class Board"                   
[54] "Key Club"                             
[55] "Latin Club"                           
[56] "Newspaper Staff"                      
[57] "NHS"                                  
[58] "Orchestra, 8th"                       
[59] "Orchestra, Full Concert"              
[60] "Orchestra, Symphonic"                 
[61] "PEER"                                 
[62] "Pep Club"                             
[63] "Pep Club Officers"                    
[64] "Quiz-Bowl (all)"                      
[65] "Science Olympiad"                     
[66] "Soccer, V"                            
[67] "Softball, JV (10th)"                  
[68] "Softball, V"                          
[69] "Spanish Club"                         
[70] "Spanish Club (high)"                  
[71] "Spanish NHS"                          
[72] "STUCO"                                
[73] "Swim & Dive Team, boys"               
[74] "Swim & Dive Team, girls"              
[75] "Teachers of Tomorrow"                 
[76] "Tennis girls V"                       
[77] "Tennis, boys V"                       
[78] "Theatre Productions"                  
[79] "Thespian Society (ITS)"               
[80] "Track, boys 8th"                      
[81] "Track, boys V"                        
[82] "Track, girls 8th"                     
[83] "Track, girls V"                       
[84] "Volleyball, 8th"                      
[85] "Volleyball, 9th"                      
[86] "Volleyball, JV"                       
[87] "Volleyball, V"                        
[88] "Wrestling, 8th"                       
[89] "Wrestling, V"                         
[90] "Yearbook Contributors"                
[91] "Yearbook Editors"                     
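To make the structure of a bi-adjacency matrix concrete, here is a minimal sketch on a made-up 4 x 3 matrix (the names `B_toy`, `p1`, `c1`, and so on are hypothetical and not part of the McFarland data). Row sums give each person's number of memberships and column sums give each club's size:

```r
# Toy bi-adjacency matrix: 4 people (rows) by 3 clubs (columns); made-up data
B_toy <- matrix(c(1, 1, 0,
                  1, 1, 1,
                  0, 1, 1,
                  1, 0, 0),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("p", 1:4), paste0("c", 1:3)))

person_degree <- rowSums(B_toy)  # memberships held by each person
club_degree   <- colSums(B_toy)  # members belonging to each club
n_edges       <- sum(B_toy)      # total number of affiliations

person_degree
club_degree
n_edges
```

The same three calls can be applied to `as.matrix(A)` to inspect the real data before building the network object.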

We also have attributes for students and clubs:

   url2 <- "https://github.com/JeffreyAlanSmith/Integrated_Network_Science/raw/master/data/attributes_students.txt"
   
   sa <- read.delim(file = url2, stringsAsFactors = FALSE)
   
   url3 <- "https://github.com/JeffreyAlanSmith/Integrated_Network_Science/raw/master/data/attributes_clubs.txt"
   
   ca <- read.delim(file = url3, stringsAsFactors = FALSE)

We then use the approach outlined here to construct a vertex attribute data frame that igraph can use.

The resulting two-mode network has the following characteristics:

   library(network)
   table(V(g)$type)

FALSE  TRUE 
  922    91 
   attribute_list <- do.call(list, sca)
   n <- network(as.matrix(A), bipartite = TRUE, vertex.attr = attribute_list)
   n
 Network attributes:
  vertices = 1013 
  directed = FALSE 
  hyper = FALSE 
  loops = FALSE 
  multiple = FALSE 
  bipartite = 922 
  total edges= 2423 
    missing edges= 0 
    non-missing edges= 2423 

 Vertex attribute names: 
    club_feeder club_profile club_type_detailed club_type_gender club_type_grade gender grade96 grade97 ids missing96 missing97 race type vertex.names 

 Edge attribute names not shown 

We have a network with 2423 edges, 922 people nodes and 91 club nodes. Both the people and the clubs have vertex attributes.
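The construction step can be sketched on toy data. The example below assumes, as in the call above, that `network()` treats a non-square binary matrix passed with `bipartite = TRUE` as a bi-adjacency matrix and that `vertex.attr` takes a named list with one value per vertex, people first and clubs second. The matrix and the attribute values are made up for illustration:

```r
library(network)

# Toy bi-adjacency matrix: 4 people (rows) by 3 clubs (columns); made-up data
B_toy <- matrix(c(1, 1, 0,
                  1, 1, 1,
                  0, 1, 1,
                  1, 0, 0),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("p", 1:4), paste0("c", 1:3)))

# One vector per attribute, ordered people first, then clubs.
# NA marks an attribute that does not apply to that mode (hypothetical values).
attrs_toy <- list(
  gender    = c("female", "male", "female", "male", NA, NA, NA),
  club_type = c(NA, NA, NA, NA, "Team Sports", "Team Sports", "Service")
)

toy <- network(B_toy, bipartite = TRUE, vertex.attr = attrs_toy)

network.size(toy)                        # people + clubs
get.network.attribute(toy, "bipartite")  # size of the first mode (people)
network.edgecount(toy)                   # number of affiliations
```

The `bipartite` network attribute stores the size of the first mode, which is why the printout above reports `bipartite = 922` rather than `TRUE`.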

And now let’s fit bipartite ERGMs!

As always, we begin with the baseline Erdős-Rényi random graph model that fits just the density:

   library(ergm)
   m1 <- ergm(n ~ edges)
   summary(m1)
Call:
ergm(formula = n ~ edges)

Maximum Likelihood Results:

      Estimate Std. Error MCMC % z value Pr(>|z|)    
edges -3.51534    0.02061      0  -170.5   <1e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

     Null Deviance: 116313  on 83902  degrees of freedom
 Residual Deviance:  21953  on 83901  degrees of freedom
 
AIC: 21955  BIC: 21964  (Smaller is better. MC Std. Err. = 0)

Which we can check is indeed the density:

   exp(m1$coefficients[1])/(1 + exp(m1$coefficients[1]))
     edges 
0.02887893 
   2423/(922 * 91)
[1] 0.02887893
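The check also works in the other direction: in an edges-only model the coefficient is the log-odds of a tie, so it can be recovered from the edge and dyad counts alone. A small sketch using only the counts reported above and base R's `qlogis()` and `plogis()`:

```r
# Counts taken from the network printout above
n_edges <- 2423
n_dyads <- 922 * 91          # every person-club pair is a possible tie

density <- n_edges / n_dyads
theta   <- qlogis(density)   # log(density / (1 - density))

theta            # should agree with the edges estimate in summary(m1)
plogis(theta)    # inverse logit, maps the coefficient back to the density
```

Note that the dyad count, 83902, is also the number of degrees of freedom reported for the null deviance.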

Great. Now for more interesting stuff. Let’s look at affiliation heterogeneity based on gender and race.

Affiliation Heterogeneity Based on Node Attributes

For two-mode ERGMs, this works just as before, except that now we have to specify which node set the attribute belongs to. So instead of nodefactor, we use the b1factor and b2factor ergm terms.

In this case, b1factor refers to the people node set, so that’s what we will use:

   library(ergm)
   m2 <- ergm(n ~ edges 
              + b1factor("gender", base = 2)
              + b1factor("race", base = 5)
              )
   summary(m2)
Call:
ergm(formula = n ~ edges + b1factor("gender", base = 2) + b1factor("race", 
    base = 5))

Maximum Likelihood Results:

                              Estimate Std. Error MCMC % z value Pr(>|z|)    
edges                         -3.71411    0.03870      0 -95.960   <1e-04 ***
b1factor.gender.female         0.33093    0.04349      0   7.609   <1e-04 ***
b1factor.race.Asian            0.06197    0.08848      0   0.700   0.4837    
b1factor.race.black           -0.04807    0.04497      0  -1.069   0.2850    
b1factor.race.Hispanic         0.26870    0.13035      0   2.061   0.0393 *  
b1factor.race.Native American  0.18215    0.21879      0   0.833   0.4051    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

     Null Deviance: 116313  on 83902  degrees of freedom
 Residual Deviance:  21888  on 83896  degrees of freedom
 
AIC: 21900  BIC: 21956  (Smaller is better. MC Std. Err. = 0)

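We find that women are more active than men. In the same way, compared to white people, Hispanic people tend to have more memberships.

We can also add a quantitative person attribute, such as grade level in 1997, using the b1cov term. The gender and race estimates barely change, and the positive grade97 coefficient indicates that students in higher grades hold more memberships: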

   library(ergm)
   m2 <- ergm(n ~ edges 
              + b1factor("gender", base = 2) 
              + b1factor("race", base = 5)
              + b1cov("grade97") 
              )
   summary(m2)
Call:
ergm(formula = n ~ edges + b1factor("gender", base = 2) + b1factor("race", 
    base = 5) + b1cov("grade97"))

Maximum Likelihood Results:

                              Estimate Std. Error MCMC % z value Pr(>|z|)    
edges                         -5.11734    0.16218      0 -31.554   <1e-04 ***
b1factor.gender.female         0.33608    0.04349      0   7.728   <1e-04 ***
b1factor.race.Asian            0.03073    0.08859      0   0.347   0.7287    
b1factor.race.black           -0.03454    0.04499      0  -0.768   0.4427    
b1factor.race.Hispanic         0.26985    0.13040      0   2.069   0.0385 *  
b1factor.race.Native American  0.12759    0.21892      0   0.583   0.5600    
b1cov.grade97                  0.13864    0.01537      0   9.020   <1e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

     Null Deviance: 116313  on 83902  degrees of freedom
 Residual Deviance:  21806  on 83895  degrees of freedom
 
AIC: 21820  BIC: 21885  (Smaller is better. MC Std. Err. = 0)
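Because these models contain only dyad-independent terms, their coefficients can be read like logistic regression coefficients, that is, as changes in the log-odds of a person-club tie. The sketch below exponentiates a few estimates copied by hand from the summary directly above; the vector `est` is a hypothetical helper, and with the fitted object available one would use `coef()` on the model instead:

```r
# Estimates copied from the summary of the model with b1cov("grade97")
est <- c(female   = 0.33608,
         Hispanic = 0.26985,
         grade97  = 0.13864)

odds_ratios <- exp(est)
round(odds_ratios, 2)

# Multiplicative change in the odds of a tie for a four-grade difference
four_grades <- exp(4 * est["grade97"])
four_grades
```

Under this reading, the odds of any given affiliation are roughly 40 percent higher for women than for men, holding race and grade constant.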

Of course, it could be that some types of clubs are more (or less) attractive foci for affiliation than others. In the two-mode ERGM we can model attribute-based popularity differences on the club side using the b2factor term:

   library(ergm)
   m3 <- ergm(n ~ edges + b1factor("gender", base = 2) 
              + b1factor("race", base = 5)
              + b2factor("club_type_detailed", base = 7)
              )
   summary(m3)
Call:
ergm(formula = n ~ edges + b1factor("gender", base = 2) + b1factor("race", 
    base = 5) + b2factor("club_type_detailed", base = 7))

Maximum Likelihood Results:

                                                 Estimate Std. Error MCMC %
edges                                            -3.59928    0.05590      0
b1factor.gender.female                            0.33443    0.04372      0
b1factor.race.Asian                               0.06266    0.08899      0
b1factor.race.black                              -0.04860    0.04522      0
b1factor.race.Hispanic                            0.27204    0.13126      0
b1factor.race.Native American                     0.18435    0.22022      0
b2factor.club_type_detailed.Academic Competition  0.10037    0.08095      0
b2factor.club_type_detailed.Academic Interest     0.39981    0.06235      0
b2factor.club_type_detailed.Ethnic Interest      -0.73864    0.19258      0
b2factor.club_type_detailed.Individual Sports    -0.87056    0.08573      0
b2factor.club_type_detailed.Leadership           -0.51845    0.17432      0
b2factor.club_type_detailed.Media                -0.46324    0.14129      0
b2factor.club_type_detailed.Service               0.84615    0.06601      0
b2factor.club_type_detailed.Team Sports          -0.80973    0.07102      0
                                                 z value Pr(>|z|)    
edges                                            -64.386  < 1e-04 ***
b1factor.gender.female                             7.649  < 1e-04 ***
b1factor.race.Asian                                0.704 0.481349    
b1factor.race.black                               -1.075 0.282422    
b1factor.race.Hispanic                             2.073 0.038212 *  
b1factor.race.Native American                      0.837 0.402524    
b2factor.club_type_detailed.Academic Competition   1.240 0.215012    
b2factor.club_type_detailed.Academic Interest      6.412  < 1e-04 ***
b2factor.club_type_detailed.Ethnic Interest       -3.835 0.000125 ***
b2factor.club_type_detailed.Individual Sports    -10.155  < 1e-04 ***
b2factor.club_type_detailed.Leadership            -2.974 0.002938 ** 
b2factor.club_type_detailed.Media                 -3.279 0.001043 ** 
b2factor.club_type_detailed.Service               12.818  < 1e-04 ***
b2factor.club_type_detailed.Team Sports          -11.402  < 1e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

     Null Deviance: 116313  on 83902  degrees of freedom
 Residual Deviance:  21072  on 83888  degrees of freedom
 
AIC: 21100  BIC: 21231  (Smaller is better. MC Std. Err. = 0)

Here we can see that there is statistically significant heterogeneity in the attractiveness of clubs as sources of affiliation. In comparison to the base category of art kid clubs (“Performance Arts”), Service and Academic Interest clubs are significantly more popular, Academic Competition clubs do not differ significantly, and all the remaining club types are significantly less popular.
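Since the edges-only model, the gender and race model (without grade97), and the club-type model are nested and fit by maximum likelihood, their reported residual deviances can be compared with a likelihood-ratio test. This is a sketch that types in the deviances and degrees of freedom from the summaries above; treating the deviance drop as chi-squared is the usual logistic regression approximation and is assumed here rather than taken from the source:

```r
# Residual deviances and degrees of freedom copied from the summaries above
dev      <- c(edges_only = 21953, gender_race = 21888, club_type = 21072)
df_resid <- c(edges_only = 83901, gender_race = 83896, club_type = 83888)

lr_stat <- -diff(dev)        # drop in deviance between successive models
lr_df   <- -diff(df_resid)   # number of parameters added at each step
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

data.frame(lr_stat, lr_df, p_value)
```

Both steps improve fit by far more than their added parameters would lead us to expect by chance, which matches the falling AIC and BIC values.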

Homophily in the Two-Mode Case

Obviously, in the two-mode case, we can’t have “homophily” in the standard sense, since people can only connect to groups, and groups and people are different kinds of things, so they can’t technically share attributes.

However, we can look at a different type of homophily: whether people tend to make affiliations with the same kind of groups, based on a group attribute.

That’s what the b1nodematch and b2nodematch ergm terms do. The first checks whether people of the same kind tend to join the same group: it counts the two-stars in which a group is the focal node and is connected to two people who share an attribute. The second checks whether people tend to pick groups of the same kind: it counts the two-stars in which a person is the focal node and is connected to two groups that share an attribute.
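The two counts just described can be computed by hand on a toy example. The sketch below uses base R only and assumes that, under their default settings, the two terms reduce to these plain two-star counts; the matrix and the attribute vectors `club_type_toy` and `person_group_toy` are made up:

```r
# Toy bi-adjacency matrix: 4 people (rows) by 3 clubs (columns); made-up data
B_toy <- matrix(c(1, 1, 0,
                  1, 1, 1,
                  0, 1, 1,
                  1, 0, 0),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("p", 1:4), paste0("c", 1:3)))

club_type_toy    <- c("sports", "sports", "arts")  # hypothetical club attribute
person_group_toy <- c("a", "a", "a", "b")          # hypothetical person attribute

# b2nodematch-style count: person-centered two-stars whose two clubs share a type.
# rowsum() adds up club rows by type, giving memberships per type for each person.
type_by_person <- rowsum(t(B_toy), club_type_toy)
b2_match <- sum(choose(type_by_person, 2))

# b1nodematch-style count: club-centered two-stars whose two people share a group.
group_by_club <- rowsum(B_toy, person_group_toy)
b1_match <- sum(choose(group_by_club, 2))

b2_match
b1_match
```

Each cell holds how many same-category partners a focal node has, and `choose(k, 2)` turns that into the number of matching pairs.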

Let’s look at b2nodematch first:

   library(ergm)
   m4 <- ergm(n ~ edges + b1factor("gender", base = 2) 
              + b1factor("race", base = 5)
              + b2factor("club_type_detailed", base = 7)
              + b2nodematch("club_type_detailed"), 
              estimate ="MPLE"
              )
   summary(m4)
Call:
ergm(formula = n ~ edges + b1factor("gender", base = 2) + b1factor("race", 
    base = 5) + b2factor("club_type_detailed", base = 7) + b2nodematch("club_type_detailed"), 
    estimate = "MPLE")

Maximum Pseudolikelihood Results:

                                                 Estimate Std. Error MCMC %
edges                                            -3.82279    0.05862      0
b1factor.gender.female                            0.26310    0.04404      0
b1factor.race.Asian                               0.07010    0.08929      0
b1factor.race.black                              -0.03046    0.04534      0
b1factor.race.Hispanic                            0.26585    0.13179      0
b1factor.race.Native American                     0.15316    0.22157      0
b2factor.club_type_detailed.Academic Competition  0.22802    0.08150      0
b2factor.club_type_detailed.Academic Interest     0.37327    0.06256      0
b2factor.club_type_detailed.Ethnic Interest      -0.48387    0.19343      0
b2factor.club_type_detailed.Individual Sports    -0.70224    0.08660      0
b2factor.club_type_detailed.Leadership           -0.26526    0.17527      0
b2factor.club_type_detailed.Media                -0.22181    0.14238      0
b2factor.club_type_detailed.Service               0.87144    0.06631      0
b2factor.club_type_detailed.Team Sports          -0.72231    0.07135      0
b2nodematch.club_type_detailed                    0.41356    0.02513      0
                                                 z value Pr(>|z|)    
edges                                            -65.217  < 1e-04 ***
b1factor.gender.female                             5.974  < 1e-04 ***
b1factor.race.Asian                                0.785  0.43238    
b1factor.race.black                               -0.672  0.50169    
b1factor.race.Hispanic                             2.017  0.04368 *  
b1factor.race.Native American                      0.691  0.48943    
b2factor.club_type_detailed.Academic Competition   2.798  0.00514 ** 
b2factor.club_type_detailed.Academic Interest      5.967  < 1e-04 ***
b2factor.club_type_detailed.Ethnic Interest       -2.501  0.01237 *  
b2factor.club_type_detailed.Individual Sports     -8.109  < 1e-04 ***
b2factor.club_type_detailed.Leadership            -1.513  0.13017    
b2factor.club_type_detailed.Media                 -1.558  0.11926    
b2factor.club_type_detailed.Service               13.141  < 1e-04 ***
b2factor.club_type_detailed.Team Sports          -10.124  < 1e-04 ***
b2nodematch.club_type_detailed                    16.457  < 1e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Warning:  The standard errors are based on naive pseudolikelihood and are suspect. Set control.ergm$MPLE.covariance.method='Godambe' for a simulation-based approximation of the standard errors.

     Null Pseudo-deviance: 116313  on 83902  degrees of freedom
 Residual Pseudo-deviance:  20832  on 83887  degrees of freedom
 
AIC: 20862  BIC: 21002  (Smaller is better. MC Std. Err. = 0)

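This tells us that, indeed, when people select groups, they tend to select from within the buckets defined by club type (keeping in mind that, as the warning notes, the naive pseudolikelihood standard errors are suspect).

As before, we can check whether this tendency is the same across club types by specifying the argument diff = TRUE. The levels argument drops the seventh category (Performance Art), so it gets no separate matching term: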

   library(ergm)
   m5 <- ergm(n ~ edges + b1factor("gender", base = 2) 
              + b1factor("race", base = 5)
              + b2factor("club_type_detailed", base = 7)
              + b2nodematch("club_type_detailed", diff = TRUE, 
                            levels = c(1:6, 8:9)), 
              estimate ="MPLE"
              )
   summary(m5)
Call:
ergm(formula = n ~ edges + b1factor("gender", base = 2) + b1factor("race", 
    base = 5) + b2factor("club_type_detailed", base = 7) + b2nodematch("club_type_detailed", 
    diff = TRUE, levels = c(1:6, 8:9)), estimate = "MPLE")

Maximum Pseudolikelihood Results:

                                                    Estimate Std. Error MCMC %
edges                                               -3.56597    0.05607      0
b1factor.gender.female                               0.27014    0.04474      0
b1factor.race.Asian                                  0.07597    0.09009      0
b1factor.race.black                                 -0.02116    0.04575      0
b1factor.race.Hispanic                               0.20642    0.13395      0
b1factor.race.Native American                        0.14636    0.22256      0
b2factor.club_type_detailed.Academic Competition    -0.66730    0.10792      0
b2factor.club_type_detailed.Academic Interest        0.38531    0.07094      0
b2factor.club_type_detailed.Ethnic Interest         -0.79386    0.19912      0
b2factor.club_type_detailed.Individual Sports       -0.98294    0.09399      0
b2factor.club_type_detailed.Leadership              -0.49734    0.17433      0
b2factor.club_type_detailed.Media                   -0.49065    0.14568      0
b2factor.club_type_detailed.Service                  0.44840    0.07811      0
b2factor.club_type_detailed.Team Sports             -0.90151    0.07930      0
b2nodematch.club_type_detailed.Academic Competition  1.20303    0.05720      0
b2nodematch.club_type_detailed.Academic Interest     0.02433    0.05680      0
b2nodematch.club_type_detailed.Ethnic Interest       1.52377    0.76157      0
b2nodematch.club_type_detailed.Individual Sports     0.47399    0.12795      0
b2nodematch.club_type_detailed.Leadership           -9.64287   86.14685      0
b2nodematch.club_type_detailed.Media                 0.49713    0.51260      0
b2nodematch.club_type_detailed.Service               0.62030    0.04885      0
b2nodematch.club_type_detailed.Team Sports           0.23752    0.08092      0
                                                    z value Pr(>|z|)    
edges                                               -63.596  < 1e-04 ***
b1factor.gender.female                                6.038  < 1e-04 ***
b1factor.race.Asian                                   0.843 0.399046    
b1factor.race.black                                  -0.463 0.643663    
b1factor.race.Hispanic                                1.541 0.123317    
b1factor.race.Native American                         0.658 0.510771    
b2factor.club_type_detailed.Academic Competition     -6.183  < 1e-04 ***
b2factor.club_type_detailed.Academic Interest         5.431  < 1e-04 ***
b2factor.club_type_detailed.Ethnic Interest          -3.987  < 1e-04 ***
b2factor.club_type_detailed.Individual Sports       -10.458  < 1e-04 ***
b2factor.club_type_detailed.Leadership               -2.853 0.004333 ** 
b2factor.club_type_detailed.Media                    -3.368 0.000757 ***
b2factor.club_type_detailed.Service                   5.741  < 1e-04 ***
b2factor.club_type_detailed.Team Sports             -11.368  < 1e-04 ***
b2nodematch.club_type_detailed.Academic Competition  21.032  < 1e-04 ***
b2nodematch.club_type_detailed.Academic Interest      0.428 0.668427    
b2nodematch.club_type_detailed.Ethnic Interest        2.001 0.045410 *  
b2nodematch.club_type_detailed.Individual Sports      3.704 0.000212 ***
b2nodematch.club_type_detailed.Leadership            -0.112 0.910875    
b2nodematch.club_type_detailed.Media                  0.970 0.332143    
b2nodematch.club_type_detailed.Service               12.698  < 1e-04 ***
b2nodematch.club_type_detailed.Team Sports            2.935 0.003334 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Warning:  The standard errors are based on naive pseudolikelihood and are suspect. Set control.ergm$MPLE.covariance.method='Godambe' for a simulation-based approximation of the standard errors.

     Null Pseudo-deviance: 116313  on 83902  degrees of freedom
 Residual Pseudo-deviance:  20500  on 83880  degrees of freedom
 
AIC: 20544  BIC: 20750  (Smaller is better. MC Std. Err. = 0)

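The results show that, with art kid clubs as the omitted category, “homophily” in club selection is strongest and most clearly significant for Academic Competition, Service, and Individual Sports clubs. Team Sports and Ethnic Interest clubs show smaller or less precisely estimated positive effects, and the Leadership estimate comes with a very large standard error (about 86), so it should not be interpreted.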

We can do the same to check out whether there is homophily on the people side. This tests hypotheses of the type: Do people with the same attributes tend to join the same clubs?

Here’s an example using racial identification:

   library(ergm)
   m6 <- ergm(n ~ edges + b1factor("gender", base = 2) 
              + b1factor("race", base = 5)
              + b2factor("club_type_detailed", base = 7)
              + b1nodematch("race", diff = TRUE, levels = 1:4),
              estimate ="MPLE"
              )
   summary(m6)
Call:
ergm(formula = n ~ edges + b1factor("gender", base = 2) + b1factor("race", 
    base = 5) + b2factor("club_type_detailed", base = 7) + b1nodematch("race", 
    diff = TRUE, levels = 1:4), estimate = "MPLE")

Maximum Pseudolikelihood Results:

                                                  Estimate Std. Error MCMC %
edges                                            -3.544174   0.055893      0
b1factor.gender.female                            0.341908   0.044110      0
b1factor.race.Asian                              -0.679353   0.124135      0
b1factor.race.black                              -0.553479   0.056490      0
b1factor.race.Hispanic                           -0.373579   0.175540      0
b1factor.race.Native American                    -0.185026   0.270507      0
b2factor.club_type_detailed.Academic Competition  0.123044   0.081117      0
b2factor.club_type_detailed.Academic Interest     0.292581   0.063349      0
b2factor.club_type_detailed.Ethnic Interest      -1.288069   0.225703      0
b2factor.club_type_detailed.Individual Sports    -0.788961   0.085917      0
b2factor.club_type_detailed.Leadership           -0.474640   0.174383      0
b2factor.club_type_detailed.Media                -0.397085   0.141393      0
b2factor.club_type_detailed.Service               0.513993   0.070547      0
b2factor.club_type_detailed.Team Sports          -0.724093   0.071205      0
b1nodematch.race.Asian                            0.212673   0.015707      0
b1nodematch.race.black                            0.033657   0.001801      0
b1nodematch.race.Hispanic                         0.520407   0.061182      0
b1nodematch.race.Native American                  0.726083   0.200975      0
                                                 z value Pr(>|z|)    
edges                                            -63.410  < 1e-04 ***
b1factor.gender.female                             7.751  < 1e-04 ***
b1factor.race.Asian                               -5.473  < 1e-04 ***
b1factor.race.black                               -9.798  < 1e-04 ***
b1factor.race.Hispanic                            -2.128 0.033323 *  
b1factor.race.Native American                     -0.684 0.493978    
b2factor.club_type_detailed.Academic Competition   1.517 0.129297    
b2factor.club_type_detailed.Academic Interest      4.619  < 1e-04 ***
b2factor.club_type_detailed.Ethnic Interest       -5.707  < 1e-04 ***
b2factor.club_type_detailed.Individual Sports     -9.183  < 1e-04 ***
b2factor.club_type_detailed.Leadership            -2.722 0.006492 ** 
b2factor.club_type_detailed.Media                 -2.808 0.004979 ** 
b2factor.club_type_detailed.Service                7.286  < 1e-04 ***
b2factor.club_type_detailed.Team Sports          -10.169  < 1e-04 ***
b1nodematch.race.Asian                            13.540  < 1e-04 ***
b1nodematch.race.black                            18.687  < 1e-04 ***
b1nodematch.race.Hispanic                          8.506  < 1e-04 ***
b1nodematch.race.Native American                   3.613 0.000303 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Warning:  The standard errors are based on naive pseudolikelihood and are suspect. Set control.ergm$MPLE.covariance.method='Godambe' for a simulation-based approximation of the standard errors.

     Null Pseudo-deviance: 116313  on 83902  degrees of freedom
 Residual Pseudo-deviance:  20565  on 83884  degrees of freedom
 
AIC: 20601  BIC: 20769  (Smaller is better. MC Std. Err. = 0)

Which shows that compared to white students, members of the same racialized minority category tend to be more likely to join the same group.
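One way to read the size of these matching coefficients is per co-member. Assuming the plain two-star form of b1nodematch, adding a tie between a student and a club raises the statistic by the number of current members of that club who share the student's race, so each such member multiplies the odds of the tie by the exponentiated coefficient. A sketch using the estimate for black students copied from the summary above, with a hypothetical range of member counts:

```r
# b1nodematch.race.black estimate copied from summary(m6)
theta_black <- 0.033657

# Hypothetical numbers of same-race students already in a club
same_race_members <- c(0, 1, 5, 10, 20)

# Multiplicative change in the odds of joining, relative to zero such members
odds_multiplier <- exp(theta_black * same_race_members)
data.frame(same_race_members, odds_multiplier = round(odds_multiplier, 3))
```

Because the effect accumulates with each co-member, a small coefficient for a large group can matter as much as a large coefficient for a small group, so the four estimates are not directly comparable in magnitude.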

Conditioning on Degree

Note that the models above only condition on the number of edges in the network (a two-mode version of the Erdős-Rényi model). So we may want to check whether any results regarding activity and homophily hold up against a more sophisticated null model, perhaps one that conditions on degree.

Note that in the two-mode case we have two sets of degrees to worry about, those of the people and those of the groups, corresponding to the gwb1degree and gwb2degree geometrically weighted degree terms:

   library(ergm)
   m7 <- ergm(n ~ edges 
              + gwb1degree(decay = 0.25, fixed = TRUE)
              + gwb2degree(decay = 0.25, fixed = TRUE)
              + b1factor("gender", base = 2) 
              + b1factor("race", base = 5)
              + b2factor("club_type_detailed", base = 7)
              + b1nodematch("race", diff = TRUE, levels = 1:4),
              estimate ="MPLE"
              )
   summary(m7)
Call:
ergm(formula = n ~ edges + gwb1degree(decay = 0.25, fixed = TRUE) + 
    gwb2degree(decay = 0.25, fixed = TRUE) + b1factor("gender", 
    base = 2) + b1factor("race", base = 5) + b2factor("club_type_detailed", 
    base = 7) + b1nodematch("race", diff = TRUE, levels = 1:4), 
    estimate = "MPLE")

Maximum Pseudolikelihood Results:

                                                   Estimate Std. Error MCMC %
edges                                             -3.241289   0.058868      0
gwb1deg.fixed.0.25                                -1.419093   0.087824      0
gwb2deg.fixed.0.25                               -53.587909  55.867644      0
b1factor.gender.female                             0.240393   0.044445      0
b1factor.race.Asian                               -0.688534   0.124691      0
b1factor.race.black                               -0.541557   0.056688      0
b1factor.race.Hispanic                            -0.488329   0.176411      0
b1factor.race.Native American                     -0.168338   0.272995      0
b2factor.club_type_detailed.Academic Competition   0.108630   0.082188      0
b2factor.club_type_detailed.Academic Interest      0.282413   0.064644      0
b2factor.club_type_detailed.Ethnic Interest       -1.318605   0.227371      0
b2factor.club_type_detailed.Individual Sports     -0.795388   0.086082      0
b2factor.club_type_detailed.Leadership            -0.492344   0.175090      0
b2factor.club_type_detailed.Media                 -0.413327   0.142185      0
b2factor.club_type_detailed.Service                0.501474   0.071752      0
b2factor.club_type_detailed.Team Sports           -0.738812   0.072287      0
b1nodematch.race.Asian                             0.218717   0.015995      0
b1nodematch.race.black                             0.034394   0.001829      0
b1nodematch.race.Hispanic                          0.531126   0.061868      0
b1nodematch.race.Native American                   0.801330   0.206123      0
                                                 z value Pr(>|z|)    
edges                                            -55.060  < 1e-04 ***
gwb1deg.fixed.0.25                               -16.158  < 1e-04 ***
gwb2deg.fixed.0.25                                -0.959 0.337461    
b1factor.gender.female                             5.409  < 1e-04 ***
b1factor.race.Asian                               -5.522  < 1e-04 ***
b1factor.race.black                               -9.553  < 1e-04 ***
b1factor.race.Hispanic                            -2.768 0.005638 ** 
b1factor.race.Native American                     -0.617 0.537477    
b2factor.club_type_detailed.Academic Competition   1.322 0.186264    
b2factor.club_type_detailed.Academic Interest      4.369  < 1e-04 ***
b2factor.club_type_detailed.Ethnic Interest       -5.799  < 1e-04 ***
b2factor.club_type_detailed.Individual Sports     -9.240  < 1e-04 ***
b2factor.club_type_detailed.Leadership            -2.812 0.004924 ** 
b2factor.club_type_detailed.Media                 -2.907 0.003650 ** 
b2factor.club_type_detailed.Service                6.989  < 1e-04 ***
b2factor.club_type_detailed.Team Sports          -10.221  < 1e-04 ***
b1nodematch.race.Asian                            13.674  < 1e-04 ***
b1nodematch.race.black                            18.806  < 1e-04 ***
b1nodematch.race.Hispanic                          8.585  < 1e-04 ***
b1nodematch.race.Native American                   3.888 0.000101 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Warning:  The standard errors are based on naive pseudolikelihood and are suspect. Set control.ergm$MPLE.covariance.method='Godambe' for a simulation-based approximation of the standard errors.

     Null Pseudo-deviance: 116313  on 83902  degrees of freedom
 Residual Pseudo-deviance:  20198  on 83882  degrees of freedom
 
AIC: 20238  BIC: 20425  (Smaller is better. MC Std. Err. = 0)

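This suggests that our results hold up even after accounting for degree heterogeneity at the graph level: the gender, club-type, and racial matching estimates keep their signs and remain significant. Note, however, that the gwb2degree estimate is extremely imprecise (-53.6 with a standard error of 55.9), so the club-side degree term should be read with caution.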
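For reference, the statistic behind the geometrically weighted degree terms can be sketched in a few lines of base R. The formula used here, exp(decay) times the sum over nodes of 1 - (1 - exp(-decay))^degree, is the standard geometrically weighted degree form and is assumed (not confirmed by the source) to be what gwb1degree and gwb2degree compute on each mode. The function name `gw_degree` and the toy matrix are hypothetical:

```r
# Geometrically weighted degree statistic for one mode (assumed standard form)
gw_degree <- function(deg, decay) {
  # Nodes of degree 0 contribute nothing; each extra tie adds a shrinking amount
  exp(decay) * sum(1 - (1 - exp(-decay))^deg)
}

# Toy bi-adjacency matrix: 4 people (rows) by 3 clubs (columns); made-up data
B_toy <- matrix(c(1, 1, 0,
                  1, 1, 1,
                  0, 1, 1,
                  1, 0, 0),
                nrow = 4, byrow = TRUE)

gw_people <- gw_degree(rowSums(B_toy), decay = 0.25)  # analogue of gwb1degree
gw_clubs  <- gw_degree(colSums(B_toy), decay = 0.25)  # analogue of gwb2degree

gw_people
gw_clubs
```

With a decay of 0 the statistic simply counts the non-isolated nodes, and larger decay values give more weight to higher degrees.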