208A Syllabus
THIS IS A LIVE DOCUMENT CHECK REGULARLY FOR CHANGES
Course Information
- Department: Sociology (SOCIOL)
- Name: SOCIOL 208A - Social Networks Methods
- Meeting Days: Wednesdays
- Meeting Time: 9am-11:50am
- Meeting Location: Haines A37
Instructor Information
- Name: Omar Lizardo
- Position: Professor, UCLA Sociology
- Email: olizardo@g.ucla.edu
- Zoom (by appointment only): https://ucla.zoom.us/my/olizardo
- Office Hours: Wednesdays and Thursdays, 1p-2p
- Office Location: 290 Haines Hall
Class Description
This class is an introductory graduate-level seminar focused on techniques in Social Network Analysis (SNA). The seminar covers the most common data analytic tasks that people engage in when analyzing “network data.” What is network data? What counts as network data is itself a point of contention—as we will see, for some people all data is network data—but let us say for the sake of this class that network data is data in which the unit of analysis is the relation or the interaction between at least two actors or objects, and the data come typically arranged in this “dyadic” form. At the end of the course, you will be familiar with (and will have acquired some practice) the basic techniques used to analyze social network data.
Course Content
Basic SNA
So, what are the things that people usually do when they have network data? Well, they typically want to figure out basic statistics about the interaction system formed by the set of dyads in the data, where a dyad is any two pairs of actors (whether they are connected or not). This task requires computing basic network quantities like the number of nodes and the number of links between entities as well as more advanced statistics based on representing the network as a graph (like the average path length, number of components, etc., all notions we will cover in the first week of class).
Centrality and Prestige
Then come the various things that almost everyone is interested in computing when using network data to answer social science questions. Primarily, this includes measures and indices of a node’s position in the network (e.g., differentiating between more or less central or more or less prestigious nodes), which we will cover in weeks 2 and 3.
Classes and Communities
After taking a break in Week 4, we move to the common case of people wanting to see if the nodes in the network fall into definable clusters or classes, where the criterion for being in the same cluster is based on how they connect to other nodes. Here, we want to find clusters of nodes that are similar to one another by some graph theoretic criterion and partition the graph into clusters based on that criterion.
Week 6 is dedicated to the next thing we may want to do, and that is to see if we can uncover clumps of densely connected nodes in the network indicating some natural partition into subgroups or communities, defined as nodes that interact more among themselves than they do among those outside the group, leading to the myriad of group and community detection techniques designed to partition a graph into clusters based on the underlying connectivity structure.
Two-Mode and Ego Networks
The next two weeks are dedicated to the analysis of some pretty common “non-standard” types of network data (e.g., data that doesn’t use the dyadic relation between objects of the same type as the analytic unit). The first is ego networks, where we first sample a set of units (egos), and then within each ego, we sample a subset of their contacts (e.g., by asking the people who are their most important friends or figuring out the most frequent interaction partners). These data come closest to the traditional data in social science (a rectangular matrix of cases by variables), so various standard techniques—like regression—apply (with some twists).
The second type of non-standard network data comes in a two-mode network form, in which some sets of objects are linked to objects of a different set, but there is no data on the links between objects of the same set. Standard cases by variables data in surveys are two-mode data (people connect to variables), as is any web or archival data collecting memberships or interactions between persons and objects (like attendance at events or people buying books on Amazon). We will see that due to a neat mathematical trick, we can transform two-mode into standard dyadic network data and thus deploy the whole panoply of techniques we learned in weeks 1-6 (which means that we can do SNA on all types of data, not just network data, and therefore all data is network data).
Probabilistic Models of Networks
The bulk of SNA assumes that the ties exist as recorded in the data. Recently (e.g., over the last two decades or so) developed approaches to social network analysis make the ties the dependent variable and thus see the observed network data as a realization of some stochastic process governing the probability that two objects will be linked and thus one that can be modeled statistically. We analyze the theory and methods behind this approach to thinking of network structure from the bottom up and also cover some models designed to treat networks as composed of “relational events” and thus model how events that link entities in networks evolve.
Requirements
There are three main requirements in the class. Participation (mainly attendance and contributions made during our seminar meetings), a short weekly data exercise, and a longer data analysis paper due at the end of the quarter.
Class Attendance and Class Discussion (25% of grade)
Attendance is required not optional. If you need to miss a class meeting please let me know beforehand. It is part of your professional socialization to commit to attending class meetings and to inform me when that’s not possible (if only as a point of courtesy). The informal part of participation will be gauged by your contribution to our class discussion in the form of questions, comments, suggestions, wonderings, problems.
Weekly Data Analysis Exercises (25% of grade)
These will be short weekly assignments in which I will ask you to take a (small) social network data set of your choice and compute some of the basic statistics or implement some of the techniques that we covered the week before. They will be due on Sunday at the end of each week. What you submit will take the form of a file containing the code and results from your analysis (typically an R Markdown file). These will not be graded, but will just be counted as submitted or not submitted.
Final Data Analysis Paper (50% of grade)
The basic goal here is for you to end up with something that could be useful to you at the end of the day. Hopefully a basic data exercise that can be the basis of a longer substantive paper or as a standalone research note.
This will be a 2500 to 5000 word (single-spaced, Times New Roman Font, 12pt, 1in margins) write-up of a data analysis, including some type of network data and/or some kind of network analytic technique (to be discussed and cleared by me). The data source can be obtained from a public repository of network data, or it could be network data that you already have access to or collected yourself.
In the paper, you will describe the data, provide key network metrics, describe the data-analytic approach that you will use, and provide a summary of the key empirical patterns that you found, along with a brief conclusion. The paper should be written in the style of a “research note” focused on key empirical findings (not long theory windup or literature review).
You will submit an extended abstract of your final project, outlining your main research idea (e.g., data source and type of analysis) due on Sunday of week 6. This will be a one-page, single-spaced document with 12pt Times New Roman Font and 1in margins.