[[1]]
IGRAPH 08a202a DNW- 497 1213 -- ATP Season 1968
+ attr: name (g/c), name (v/c), age (v/n), hand (v/c), country (v/c),
| surface (e/c), weight (e/n)
+ edges from 08a202a (vertex names):
[1] U Unknown ->Jose Mandarino Alfredo Acuna->Alan Fox
[3] Andres Gimeno->Ken Rosewall Andres Gimeno->Raymond Moore
[5] Juan Gisbert ->Tom Okker Juan Gisbert ->Zeljko Franulovic
[7] Onny Parun ->Jan Kodes Peter Curtis ->Tom Okker
[9] Premjit Lall ->Clark Graebner Rod Laver ->Ken Rosewall
[11] Thomas Lejus ->Nicola Pietrangeli Tom Okker ->Arthur Ashe
[13] U Unknown ->Jaidip Mukherjea U Unknown ->Jose Luis Arillla
+ ... omitted several edges
[[2]]
IGRAPH c895c57 DNW- 446 1418 -- ATP Season 1969
+ attr: name (g/c), name (v/c), age (v/n), hand (v/c), country (v/c),
| surface (e/c), weight (e/n)
+ edges from c895c57 (vertex names):
[1] U Unknown ->U Unknown
[2] Alejandro Olmedo ->Ron Holmberg
[3] Arthur Ashe ->Rod Laver
[4] Bob Carmichael ->Jim Osborne
[5] Cliff Richey ->Zeljko Franulovic
[6] Francois Jauffret ->Martin Mulligan
[7] Fred Stolle ->John Newcombe
+ ... omitted several edges
[[3]]
IGRAPH d6709af DNW- 451 1650 -- ATP Season 1970
+ attr: name (g/c), name (v/c), age (v/n), hand (v/c), country (v/c),
| surface (e/c), weight (e/n)
+ edges from d6709af (vertex names):
[1] Mark Cox ->Jan Kodes
[2] Charlie Pasarell ->Stan Smith
[3] Cliff Richey ->Arthur Ashe
[4] Francois Jauffret ->Manuel Orantes
[5] Georges Goven ->Jan Kodes
[6] Harald Elschenbroich->Zeljko Franulovic
[7] Ilie Nastase ->Zeljko Franulovic
+ ... omitted several edges
[[4]]
IGRAPH a73a020 DNW- 459 2580 -- ATP Season 1971
+ attr: name (g/c), name (v/c), age (v/n), hand (v/c), country (v/c),
| surface (e/c), weight (e/n)
+ edges from a73a020 (vertex names):
[1] Andres Gimeno ->Ken Rosewall Arthur Ashe ->Rod Laver
[3] Charlie Pasarell ->Cliff Drysdale Frank Froehling ->Clark Graebner
[5] Joaquin Loyo Mayo->Thomaz Koch John Alexander ->John Newcombe
[7] John Newcombe ->Marty Riessen Nikola Pilic ->Cliff Drysdale
[9] Owen Davidson ->Cliff Drysdale Robert Maud ->Cliff Drysdale
[11] Roger Taylor ->Marty Riessen Roy Emerson ->Rod Laver
[13] Tom Okker ->John Newcombe Allan Stone ->Bob Carmichael
+ ... omitted several edges
[[5]]
IGRAPH 761ed05 DNW- 504 2767 -- ATP Season 1972
+ attr: name (g/c), name (v/c), age (v/n), hand (v/c), country (v/c),
| surface (e/c), weight (e/n)
+ edges from 761ed05 (vertex names):
[1] Jean Loup Rouyer->Stan Smith Marty Riessen ->Cliff Drysdale
[3] Roy Emerson ->Arthur Ashe Roy Emerson ->Arthur Ashe
[5] Tom Leonard ->John Newcombe Tom Okker ->Arthur Ashe
[7] Tom Okker ->Rod Laver Adriano Panatta ->Andres Gimeno
[9] Adriano Panatta ->Ilie Nastase Allan Stone ->John Alexander
[11] Allan Stone ->Marty Riessen Andres Gimeno ->Jan Kodes
[13] Andres Gimeno ->Stan Smith Andrew Pattison ->Ilie Nastase
+ ... omitted several edges
[[6]]
IGRAPH 92ff576 DNW- 592 3653 -- ATP Season 1973
+ attr: name (g/c), name (v/c), age (v/n), hand (v/c), country (v/c),
| surface (e/c), weight (e/n)
+ edges from 92ff576 (vertex names):
[1] Harold Solomon ->Stan Smith John Alexander ->Stan Smith
[3] Patrice Dominguez ->Paolo Bertolucci Paul Gerken ->Jimmy Connors
[5] Roy Emerson ->Rod Laver Adriano Panatta ->Ilie Nastase
[7] Bjorn Borg ->Adriano Panatta Brian Gottfried ->Cliff Richey
[9] Charlie Pasarell ->John Alexander Cliff Richey ->Stan Smith
[11] Corrado Barazzutti->Bjorn Borg Francois Jauffret ->Ilie Nastase
[13] Georges Goven ->Manuel Orantes Manuel Orantes ->Ilie Nastase
+ ... omitted several edges
Handling Graph Objects in Lists
Sometimes network data comes pre-stored as an R list. This is typical if you have a network with multiple kinds of ties recorded on the same set of actors (and thus multiple networks), or longitudinal network data, where we collect multiple “snapshots” of the same system (containing the same or, more typically, a different set of actors per time slice).
The networkdata package contains one such data set called atp. It’s a network of tennis players who played in Grand Slam or official matches of the Association of Tennis Professionals (hence ATP) covering the years 1968-2021 (Radicchi 2011).
In the directed graph representing each network, a tie goes from the loser to the winner of each match. Accordingly, it can be interpreted as a directed “deference” network (it would be a dominance network if it were the other way around), where actor i “defers” to actor j by getting their ass kicked by them.
Let’s see how this list of networks works:
We create a list object, g, and then examine its contents, which we can see is a set of graph objects. In unnamed R lists, each of the objects inside is indexed by a number in double brackets. So [[6]] just means the sixth network in the list object (corresponding to the year 1973).
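To get a feel for what “a list of graph objects” means, here is a minimal sketch using a toy list of three small igraph graphs. The toy graphs are our own stand-ins (an assumption made so the example runs with igraph alone), not the ATP data.

```r
library(igraph)

# Toy stand-in for the tutorial's list: three made-up directed graphs.
# With the real data, presumably: library(networkdata); g <- atp
g <- list(
  make_ring(4, directed = TRUE),
  make_star(5, mode = "in"),
  make_full_graph(3, directed = TRUE)
)

class(g)        # the container is a plain R list
length(g)       # number of networks stored in it
class(g[[2]])   # each element is an igraph object
g[[2]]          # print only the second network
```

The same inspection calls should work on any list of graphs, whatever its length.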
Now let’s say we wanted to compute a network statistic like density. One way to proceed would be:
Error in `ensure_igraph()`:
! Must provide a graph object (provided wrong object type).
Which gives us a weird error about the wrong object type. The reason is that edge_density expects an igraph graph object as input, but g is not a graph object; it is a list of such objects. For it to work, you have to reference a particular element inside the list, not the whole list.
To do that, we use the double bracket notation:
Which gives us the density for the 1973 network.
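The difference between passing the whole list and passing one element can be sketched as follows, again with toy graphs standing in for the ATP seasons (an assumption, so the numbers will not match the ones reported here). The try() wrapper is there only so the script keeps running after the error.

```r
library(igraph)

# Toy stand-in for the list of seasons (made-up graphs, not the ATP data)
g <- list(
  make_ring(4, directed = TRUE),
  make_full_graph(3, directed = TRUE)
)

# Passing the whole list fails: edge_density() wants a single igraph object
res <- try(edge_density(g), silent = TRUE)
inherits(res, "try-error")   # TRUE

# Double brackets pull out one graph, so this works
d2 <- edge_density(g[[2]])
d2                           # 1, since every possible directed tie is present
```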
Looping Through Lists
But what if we wanted a table of network statistics for all the years or some subset of years? Of course, we could just type a million versions of the edge_density command or whatever, but that would be tedious. We could also write a for loop or something like that (less tedious). Even less tedious is to use the many apply functions in R that are designed to work with lists, which is a subject unto itself in R programming.
But here we can just use the simple version. Let’s say we wanted a vector of densities (or any other whole network statistic) for all 54 years. In that case, our friend sapply can do the job:
[1] 0.004920653 0.007144657 0.008130081 0.012272740 0.010914671 0.010440961
[7] 0.010567864 0.013315132 0.012088214 0.014019237 0.014135328 0.011649909
[13] 0.011172821 0.011261426 0.012703925 0.012177336 0.012648755 0.012445937
[19] 0.012034362 0.012351377 0.010174271 0.009772014 0.019526953 0.012236462
[25] 0.014050245 0.015054181 0.013872832 0.014727924 0.014329906 0.013935502
[31] 0.013962809 0.013870042 0.013665097 0.013818887 0.012551113 0.011571679
[37] 0.012329090 0.012923683 0.011402945 0.012677988 0.012256963 0.013512884
[43] 0.012543025 0.013661748 0.013786518 0.013679697 0.015052857 0.015075622
[49] 0.015081206 0.014346468 0.015764351 0.020169225 0.011889114 0.016935400
sapply is kind of a “meta” function that takes two inputs: a list, and the name of a function (which could be built-in, from a package, or user-defined); sapply then “applies” that function to each element inside the list. Here we asked R to apply the function edge_density to each element of the list of networks g and it obliged, creating a vector of length 54 containing the info.
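As a rough sketch of what sapply is doing, here it is next to the equivalent for loop. The list is a toy one made of three small graphs (our assumption, so the example needs only igraph), so the densities are not those of the ATP seasons.

```r
library(igraph)

# Toy stand-in for the list of seasons (made-up graphs, not the ATP data)
g <- list(
  make_ring(4, directed = TRUE),
  make_star(5, mode = "in"),
  make_full_graph(3, directed = TRUE)
)

# One call: apply edge_density to every element and collect a vector
dens <- sapply(g, edge_density)

# The same thing written out as a loop over list positions
dens_loop <- numeric(length(g))
for (i in seq_along(g)) {
  dens_loop[i] <- edge_density(g[[i]])
}

dens
all.equal(dens, dens_loop)   # TRUE
```

Both versions return one number per network; sapply just saves us the bookkeeping.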
We could use any igraph function, like the number of nodes in the graph:
[1] 497 446 451 459 504 592 595 535 553 524 509 572 582 573 554 532 495 513 510
[20] 523 596 597 405 542 509 498 520 496 502 499 497 480 486 479 497 517 505 492
[39] 524 488 493 464 482 459 457 453 428 430 431 438 419 364 345 393
We could also select a subset of elements inside the list. For instance, this counts the number of nodes for the first five years:
Or for years 2, 6, 8, and 12:
Note the single bracket notation here to refer to subsets of elements in the list. Inside the brackets we could put any arbitrary vector, as long as the numbers in the vector do not exceed the length of the list.
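The single versus double bracket distinction can be checked on a toy list of four ring graphs (made up for illustration; the node counts are not the ATP ones). Here vcount is the igraph function that counts nodes.

```r
library(igraph)

# Toy stand-in: four ring graphs with 3, 4, 5 and 6 nodes
g <- list(make_ring(3), make_ring(4), make_ring(5), make_ring(6))

first_two <- sapply(g[1:2], vcount)       # node counts for elements 1 and 2
picked    <- sapply(g[c(2, 4)], vcount)   # node counts for elements 2 and 4

first_two   # 3 4
picked      # 4 6

class(g[2])     # "list": single brackets return a shorter list
class(g[[2]])   # "igraph": double brackets return the graph itself
```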
Of course, sometimes the functions we apply to elements of the list don’t return single numbers but vectors or other igraph objects. In that case it would be better to use lapply, which is just like sapply but returns another list with the set of answers inside it.
For instance, let’s say we wanted the top five players for each year. In this deference network, a “top” player is one who beats many others, which means they have high indegree (lots of losers pointing at them).
First we create a custom function to compute the indegree and return an ordered named vector of top 5 players:
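One way such a function could be written is sketched below. The name top5 and the toy match data are our own assumptions for illustration; the function computes indegree with igraph’s degree, sorts it in decreasing order, and keeps the first five entries.

```r
library(igraph)

# Hypothetical helper: top five players by indegree (number of wins)
top5 <- function(x) {
  d <- degree(x, mode = "in")           # named vector: wins per player
  head(sort(d, decreasing = TRUE), 5)   # five largest values, names kept
}

# Toy season (made-up data): each row is one match, tie goes loser -> winner
matches <- data.frame(
  loser  = c("A", "B", "C", "D", "A", "B", "C", "A", "B", "A"),
  winner = c("G", "G", "G", "G", "F", "F", "F", "E", "E", "D")
)
toy <- graph_from_data_frame(matches, directed = TRUE)

res <- top5(toy)
res   # G has 4 wins, F has 3, E has 2, D has 1, then a player with 0
```

Using head rather than indexing with [1:5] is a small design choice: it avoids NA entries if a graph happens to have fewer than five nodes.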
Now, we can just feed that function to lapply:
[[1]]
Arthur Ashe Rod Laver Clark Graebner Ken Rosewall Tom Okker
33 27 25 23 22
[[2]]
John Newcombe Tom Okker Rod Laver Tony Roche Arthur Ashe
45 41 40 40 33
[[3]]
Arthur Ashe Cliff Richey Rod Laver Stan Smith
51 49 48 45
Zeljko Franulovic
42
[[4]]
Ilie Nastase Tom Okker Marty Riessen Stan Smith
69 63 61 61
Zeljko Franulovic
60
[[5]]
Ilie Nastase Stan Smith Manuel Orantes Jimmy Connors Arthur Ashe
99 72 68 65 55
[[6]]
Ilie Nastase Tom Okker Jimmy Connors Arthur Ashe Stan Smith
96 81 68 63 63
Which is a list of named vectors containing the number of victories of the top five players each year.
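The two steps above (a custom function, then lapply) can be sketched end to end on a toy list of two made-up seasons. The helper top5 and the match data are assumptions for illustration; only the name top.list comes from the text.

```r
library(igraph)

# Hypothetical helper: top five players by indegree (number of wins)
top5 <- function(x) {
  d <- degree(x, mode = "in")
  head(sort(d, decreasing = TRUE), 5)
}

# Toy stand-in for the list of seasons; ties go loser -> winner
g <- list(
  graph_from_data_frame(data.frame(
    loser  = c("A", "B", "C"),
    winner = c("D", "D", "E")
  ), directed = TRUE),
  graph_from_data_frame(data.frame(
    loser  = c("A", "B", "C", "D"),
    winner = c("E", "E", "E", "A")
  ), directed = TRUE)
)

# lapply returns a list with one named vector per season
top.list <- lapply(g, top5)
top.list[[1]]   # D leads the first toy season with 2 wins
top.list[[2]]   # E leads the second toy season with 3 wins
```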
Because the object top.list is just a list, we can subset it just like before. Let’s say we wanted to see the top players for more recent years:
[[1]]
Andy Murray Dominic Thiem Kei Nishikori Novak Djokovic David Goffin
63 55 53 50 47
[[2]]
Rafael Nadal David Goffin Alexander Zverev
58 55 52
Roberto Bautista Agut Dominic Thiem
45 43
[[3]]
Dominic Thiem Alexander Zverev Novak Djokovic Fabio Fognini
51 50 46 45
Roger Federer
44
[[4]]
Daniil Medvedev Novak Djokovic Rafael Nadal Stefanos Tsitsipas
55 52 52 49
Roger Federer
47
[[5]]
Andrey Rublev Novak Djokovic Stefanos Tsitsipas Rafael Nadal
40 36 27 26
Daniil Medvedev
24
[[6]]
Daniil Medvedev Stefanos Tsitsipas Casper Ruud Alexander Zverev
54 52 52 51
Novak Djokovic
49
A series of names which will make sense to you if you follow tennis.
Naming Lists
Finally, sometimes it is useful to name the elements of a list. In this case, for instance, labeling each element with its year would make it easier to remember what’s what. For this, you can use the names command, which works via standard R assignment:
$`1968`
IGRAPH 08a202a DNW- 497 1213 -- ATP Season 1968
+ attr: name (g/c), name (v/c), age (v/n), hand (v/c), country (v/c),
| surface (e/c), weight (e/n)
+ edges from 08a202a (vertex names):
[1] U Unknown ->Jose Mandarino Alfredo Acuna->Alan Fox
[3] Andres Gimeno->Ken Rosewall Andres Gimeno->Raymond Moore
[5] Juan Gisbert ->Tom Okker Juan Gisbert ->Zeljko Franulovic
[7] Onny Parun ->Jan Kodes Peter Curtis ->Tom Okker
[9] Premjit Lall ->Clark Graebner Rod Laver ->Ken Rosewall
[11] Thomas Lejus ->Nicola Pietrangeli Tom Okker ->Arthur Ashe
[13] U Unknown ->Jaidip Mukherjea U Unknown ->Jose Luis Arillla
+ ... omitted several edges
$`1969`
IGRAPH c895c57 DNW- 446 1418 -- ATP Season 1969
+ attr: name (g/c), name (v/c), age (v/n), hand (v/c), country (v/c),
| surface (e/c), weight (e/n)
+ edges from c895c57 (vertex names):
[1] U Unknown ->U Unknown
[2] Alejandro Olmedo ->Ron Holmberg
[3] Arthur Ashe ->Rod Laver
[4] Bob Carmichael ->Jim Osborne
[5] Cliff Richey ->Zeljko Franulovic
[6] Francois Jauffret ->Martin Mulligan
[7] Fred Stolle ->John Newcombe
+ ... omitted several edges
$`1970`
IGRAPH d6709af DNW- 451 1650 -- ATP Season 1970
+ attr: name (g/c), name (v/c), age (v/n), hand (v/c), country (v/c),
| surface (e/c), weight (e/n)
+ edges from d6709af (vertex names):
[1] Mark Cox ->Jan Kodes
[2] Charlie Pasarell ->Stan Smith
[3] Cliff Richey ->Arthur Ashe
[4] Francois Jauffret ->Manuel Orantes
[5] Georges Goven ->Jan Kodes
[6] Harald Elschenbroich->Zeljko Franulovic
[7] Ilie Nastase ->Zeljko Franulovic
+ ... omitted several edges
$`1971`
IGRAPH a73a020 DNW- 459 2580 -- ATP Season 1971
+ attr: name (g/c), name (v/c), age (v/n), hand (v/c), country (v/c),
| surface (e/c), weight (e/n)
+ edges from a73a020 (vertex names):
[1] Andres Gimeno ->Ken Rosewall Arthur Ashe ->Rod Laver
[3] Charlie Pasarell ->Cliff Drysdale Frank Froehling ->Clark Graebner
[5] Joaquin Loyo Mayo->Thomaz Koch John Alexander ->John Newcombe
[7] John Newcombe ->Marty Riessen Nikola Pilic ->Cliff Drysdale
[9] Owen Davidson ->Cliff Drysdale Robert Maud ->Cliff Drysdale
[11] Roger Taylor ->Marty Riessen Roy Emerson ->Rod Laver
[13] Tom Okker ->John Newcombe Allan Stone ->Bob Carmichael
+ ... omitted several edges
$`1972`
IGRAPH 761ed05 DNW- 504 2767 -- ATP Season 1972
+ attr: name (g/c), name (v/c), age (v/n), hand (v/c), country (v/c),
| surface (e/c), weight (e/n)
+ edges from 761ed05 (vertex names):
[1] Jean Loup Rouyer->Stan Smith Marty Riessen ->Cliff Drysdale
[3] Roy Emerson ->Arthur Ashe Roy Emerson ->Arthur Ashe
[5] Tom Leonard ->John Newcombe Tom Okker ->Arthur Ashe
[7] Tom Okker ->Rod Laver Adriano Panatta ->Andres Gimeno
[9] Adriano Panatta ->Ilie Nastase Allan Stone ->John Alexander
[11] Allan Stone ->Marty Riessen Andres Gimeno ->Jan Kodes
[13] Andres Gimeno ->Stan Smith Andrew Pattison ->Ilie Nastase
+ ... omitted several edges
$`1973`
IGRAPH 92ff576 DNW- 592 3653 -- ATP Season 1973
+ attr: name (g/c), name (v/c), age (v/n), hand (v/c), country (v/c),
| surface (e/c), weight (e/n)
+ edges from 92ff576 (vertex names):
[1] Harold Solomon ->Stan Smith John Alexander ->Stan Smith
[3] Patrice Dominguez ->Paolo Bertolucci Paul Gerken ->Jimmy Connors
[5] Roy Emerson ->Rod Laver Adriano Panatta ->Ilie Nastase
[7] Bjorn Borg ->Adriano Panatta Brian Gottfried ->Cliff Richey
[9] Charlie Pasarell ->John Alexander Cliff Richey ->Stan Smith
[11] Corrado Barazzutti->Bjorn Borg Francois Jauffret ->Ilie Nastase
[13] Georges Goven ->Manuel Orantes Manuel Orantes ->Ilie Nastase
+ ... omitted several edges
Now instead of the useless one, two, three, etc. names, we have the actual year numbers as the names of the elements in the list.
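The naming step can be tried on a toy list of three graphs (made up for illustration, with three years standing in for the full range). Note that names always stores character strings, even when we assign numbers.

```r
library(igraph)

# Toy stand-in: three ring graphs playing the role of three seasons
g <- list(make_ring(3), make_ring(4), make_ring(5))

names(g)                # NULL: an unnamed list
names(g) <- 1968:1970   # the numbers are converted to character names
names(g)                # "1968" "1969" "1970"

# Functions like sapply carry the names over to their results
n_nodes <- sapply(g, vcount)
n_nodes
```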
So if we wanted to know the top five players for 1988 we could just type:
Stefan Edberg Andre Agassi Boris Becker Mats Wilander
63 59 52 49
Aaron Krickstein
48
Note the double bracket notation and the fact that the name of the list element is a character string, not a number (hence the quotes).
If we don’t want to remember the bracket business, we could also use the $ operator to refer to particular list elements:
Stefan Edberg Andre Agassi Boris Becker Mats Wilander
63 59 52 49
Aaron Krickstein
48
Of course, we can also use the names to subset the list. Let’s say we wanted the top five players for 1970, 1980, 1990, 2000, 2010, and 2020.
All we have to do is type:
$`1970`
Arthur Ashe Cliff Richey Rod Laver Stan Smith
51 49 48 45
Zeljko Franulovic
42
$`1980`
Ivan Lendl John Mcenroe Brian Gottfried Bjorn Borg Eliot Teltscher
97 76 63 62 62
$`1990`
Boris Becker Stefan Edberg Ivan Lendl Pete Sampras Emilio Sanchez
62 57 50 47 44
$`2000`
Yevgeny Kafelnikov Marat Safin Gustavo Kuerten Magnus Norman
63 61 59 58
Lleyton Hewitt
53
$`2010`
Rafael Nadal Roger Federer David Ferrer Robin Soderling Jurgen Melzer
63 54 53 53 51
$`2020`
Andrey Rublev Novak Djokovic Stefanos Tsitsipas Rafael Nadal
40 36 27 26
Daniil Medvedev
24
Note that we are back to the single bracket notation.
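The three ways of using names discussed above can be compared side by side. This is a sketch on a toy named list of graphs (our assumption); the same notation should work on any named list, including one of top-five vectors.

```r
library(igraph)

# Toy stand-in: a named list of three ring graphs
g <- list(make_ring(3), make_ring(4), make_ring(5))
names(g) <- 1968:1970

a <- g[["1969"]]   # double brackets + character name: one element
b <- g$`1969`      # same element; backticks because the name starts with a digit
sub <- g[c("1968", "1970")]   # single brackets + character vector: a shorter list

vcount(a)    # 4
vcount(b)    # 4
names(sub)   # "1968" "1970"

# A bare number is read as a position, not a name, so this fails here
bad <- try(g[[1969]], silent = TRUE)
inherits(bad, "try-error")   # TRUE
```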
With a bit of practice, lists will become your friends!