208A Syllabus
THIS IS A LIVE DOCUMENT. CHECK REGULARLY FOR CHANGES
Course Information
- Department: Sociology (SOCIOL)
- Name: SOCIOL 208A - Social Networks Methods
- Meeting Days: Wednesdays
- Meeting Time: 9am-11:50am
- Meeting Location: Haines A37
Instructor Information
- Name: Omar Lizardo
- Position: Professor, UCLA Sociology
- Email: olizardo@soc.ucla.edu
- Zoom (by appointment only): https://ucla.zoom.us/my/olizardo
- Office Hours: Wednesdays and Thursdays, 1pm-2pm
- Office Location: 290 Haines Hall
Class Description
This class is an introductory graduate-level seminar focused on techniques in Social Network Analysis (SNA). The seminar covers the most common data-analytic tasks people perform when analyzing “network data.” What is network data? What counts as network data is itself a point of contention (as we will see, for some people all data is network data), but let us say for the sake of this class that network data is data in which the unit of analysis is the relation or interaction between at least two actors or objects, typically arranged in this “dyadic” form. By the end of the course, you will be familiar with, and will have had some practice applying, the basic techniques used to analyze social network data.
Course Content
Basic SNA
So, what are the things that people usually do when they have network data? Well, they typically want to figure out basic statistics about the interaction system formed by the set of dyads in the data, where a dyad is any pair of actors (whether or not they are connected). This task requires computing basic network quantities like the number of nodes and the number of links between entities, as well as more advanced statistics based on representing the network as a graph (like the average path length, number of components, etc., all notions we will cover in the first week of class).
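As a rough illustration of these basic quantities, here is a minimal sketch in plain Python (used only for illustration). The toy edge list `edges` and all function names are hypothetical, and the average path length is taken over reachable pairs only, which is one common convention among several.

```python
from collections import deque

# Hypothetical toy data: an undirected edge list with no duplicate edges.
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("E", "F")]

def build_adjacency(edge_list):
    """Turn a dyadic edge list into a dict mapping each node to its neighbors."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_distances(adj, source):
    """Shortest-path lengths from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def connected_components(adj):
    """List of node sets, one per connected component."""
    seen, components = set(), []
    for node in adj:
        if node not in seen:
            component = set(bfs_distances(adj, node))
            seen |= component
            components.append(component)
    return components

def average_path_length(adj):
    """Mean shortest-path length over ordered pairs of distinct, reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0

adj = build_adjacency(edges)
n_nodes = len(adj)
n_edges = len(edges)
density = 2 * n_edges / (n_nodes * (n_nodes - 1))  # share of possible ties present
components = connected_components(adj)
apl = average_path_length(adj)

print(n_nodes, n_edges, round(density, 3), len(components), round(apl, 3))
```

Note that nodes with no ties at all never appear in an edge list, so a real analysis would need a separate node list to count them.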
Centrality and Prestige
Then come the various things that almost everyone is interested in computing when using network data to answer social science questions. Primarily, this includes measures and indices of a node’s position in the network (e.g., differentiating between more or less central or more or less prestigious nodes), which we will cover in weeks 2 and 3.
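To make the idea of a positional index concrete, the sketch below computes two of the simplest centrality measures, normalized degree and closeness, on a hypothetical five-node network. The data and function names are made up for illustration, and the closeness formula used here assumes a connected, undirected graph.

```python
from collections import deque

# Hypothetical toy network: A is a hub, E hangs off the periphery via D.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def bfs_distances(adj, source):
    """Shortest-path lengths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def degree_centrality(adj):
    """Number of neighbors divided by the maximum possible (n - 1)."""
    n = len(adj)
    return {node: len(neighbors) / (n - 1) for node, neighbors in adj.items()}

def closeness_centrality(adj):
    """(n - 1) divided by the sum of distances to all other nodes.

    Assumes the graph is connected, so every other node is reachable.
    """
    n = len(adj)
    return {
        node: (n - 1) / sum(bfs_distances(adj, node).values())
        for node in adj
    }

degree = degree_centrality(adj)
closeness = closeness_centrality(adj)

for node in sorted(adj):
    print(node, round(degree[node], 3), round(closeness[node], 3))
```

In this toy example the hub A ranks highest on both indices, while E, which reaches everyone else only through D, has the lowest closeness.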
Ego Networks
During the class, we will learn how to analyze some pretty common “non-standard” types of network data (i.e., data that doesn’t use the dyadic relation between objects of the same type as the analytic unit). The first is ego networks, where we first sample a set of units (egos), and then, for each ego, we sample a subset of their contacts (e.g., by asking for their most important friends or identifying their most frequent interaction partners). These data come closest to the traditional data in social science (a rectangular matrix of cases by variables), so various standard techniques, like regression, apply (with some twists).
Classes and Communities
After taking a break in Week 4, we move to the common case of people wanting to see if the nodes in the network fall into definable clusters or classes, where the criterion for being in the same cluster is based on how they connect to other nodes. Here, we want to find sets of nodes that are similar to one another by some graph-theoretic criterion and partition the graph into clusters based on that criterion.
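One strict version of such a criterion is structural equivalence: two nodes belong to the same class when they are tied to exactly the same others. The sketch below checks this pairwise on a hypothetical toy network. The node names and the helper `structurally_equivalent` are assumptions made for illustration, and real analyses typically relax exact equivalence into a similarity score.

```python
from itertools import combinations

# Hypothetical toy network: w1 and w2 both report to "boss" and serve "client".
edges = [
    ("boss", "w1"), ("boss", "w2"), ("boss", "w3"),
    ("w1", "client"), ("w2", "client"),
]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def structurally_equivalent(adj, u, v):
    """True if u and v have identical ties to all third parties.

    Each node is removed from the other's neighbor set so that a tie
    between u and v themselves does not break the comparison.
    """
    return adj[u] - {v} == adj[v] - {u}

equivalent_pairs = [
    (u, v)
    for u, v in combinations(sorted(adj), 2)
    if structurally_equivalent(adj, u, v)
]

print(equivalent_pairs)
```

Here only w1 and w2 occupy the same position: w3 also reports to the boss but has no tie to the client.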
Week 6 is dedicated to the next thing we may want to do: see if we can uncover clumps of densely connected nodes in the network, indicating some natural partition into subgroups or communities. Communities are defined as sets of nodes that interact more among themselves than they do with those outside the group. This leads to the myriad group and community detection techniques designed to partition a graph into clusters based on the underlying connectivity structure.
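One widely used way to score how well a given partition captures "more ties inside than outside" is Newman-Girvan modularity. It is shown here only as an example of such a quality function, not necessarily the one emphasized in class. The sketch below evaluates two candidate partitions of a hypothetical network made of two triangles joined by a single bridge.

```python
# Hypothetical toy network: triangles {1, 2, 3} and {4, 5, 6} joined by edge 3-4.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]

def modularity(edge_list, communities):
    """Newman-Girvan modularity of a partition of an undirected graph.

    Q = sum over communities of (L_c / m) - (d_c / (2m))^2, where m is the
    total number of edges, L_c the number of edges inside community c, and
    d_c the sum of the degrees of the nodes in c.
    """
    m = len(edge_list)
    membership = {node: i for i, group in enumerate(communities) for node in group}
    internal = [0] * len(communities)
    degree_sum = [0] * len(communities)
    for u, v in edge_list:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in range(len(communities))
    )

q_two_groups = modularity(edges, [{1, 2, 3}, {4, 5, 6}])
q_one_group = modularity(edges, [{1, 2, 3, 4, 5, 6}])

print(round(q_two_groups, 4), round(q_one_group, 4))
```

Splitting along the bridge scores higher than lumping everyone together, which matches the intuition that the two triangles are distinct communities. Community detection algorithms differ mainly in how they search for partitions that do well on a criterion like this.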
Two-Mode Networks
The second type of non-standard network data is a two-mode network, in which objects in one set are linked to objects in a different set, but there is no data on the links between objects within the same set. Standard survey data are two-mode data (people connect to variables), as are web or archival data that collect memberships or interactions between persons and objects (such as event attendance or people buying books on Amazon). We will see that due to a neat mathematical trick, we can transform two-mode into standard dyadic network data and thus deploy the whole panoply of techniques we learned in weeks 1-6 (which means that we can do SNA on all types of data, not just network data, and therefore all data is network data).
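A common way to carry out this kind of transformation is the one-mode projection: multiply the person-by-event affiliation matrix by its own transpose. The sketch below assumes that this is the trick in question and uses a hypothetical attendance matrix; the names `people`, `events`, and `A` are made up for illustration.

```python
# Hypothetical two-mode data: rows are people, columns are events,
# and a 1 means the person attended the event.
people = ["Ann", "Bob", "Cat"]
events = ["e1", "e2", "e3"]
A = [
    [1, 1, 0],  # Ann attended e1 and e2
    [1, 1, 1],  # Bob attended all three
    [0, 0, 1],  # Cat attended e3 only
]

def transpose(matrix):
    """Swap rows and columns."""
    return [list(column) for column in zip(*matrix)]

def matmul(X, Y):
    """Plain matrix product of two lists-of-lists."""
    Y_columns = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Y_columns] for row in X]

# Person-by-person projection: entry (i, j) counts events shared by i and j;
# the diagonal counts how many events each person attended.
person_net = matmul(A, transpose(A))

# Event-by-event projection: entry (k, l) counts people attending both events.
event_net = matmul(transpose(A), A)

for name, row in zip(people, person_net):
    print(name, row)
```

The off-diagonal cells of `person_net` can then be treated as (weighted) ties between people, at which point the one-mode techniques from earlier weeks apply. Note that the projection discards information about which specific events two people share.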
Probabilistic Models of Networks
The bulk of SNA assumes that the ties exist as recorded in the data. More recently developed approaches to social network analysis (i.e., those from the last two decades or so) treat ties as the dependent variable and thus view observed network data as a realization of a stochastic process governing the probability that two objects will be linked, a process that can be specified as a statistical model. We analyze the theory and methods behind this approach to thinking about network structure from the bottom up, and we also cover models that treat networks as composed of “relational events,” thereby modeling how events linking entities in networks evolve.
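The simplest possible model of this kind is a Bernoulli (Erdős-Rényi) random graph, in which every dyad is tied independently with the same probability p. It is offered here only as a baseline sketch, far simpler than the models the course covers. The toy node and edge lists are hypothetical.

```python
import math

# Hypothetical toy data: five actors (one is an isolate) and four undirected ties.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]

n = len(nodes)
n_dyads = n * (n - 1) // 2  # number of unordered pairs of actors
n_ties = len(edges)

# Maximum-likelihood estimate of the tie probability: the observed density.
p_hat = n_ties / n_dyads

# The same estimate on the log-odds scale, as in a logistic regression
# with only an intercept predicting whether each dyad is tied.
log_odds = math.log(p_hat / (1 - p_hat))

# Log-likelihood of the observed network under the fitted model.
log_likelihood = n_ties * math.log(p_hat) + (n_dyads - n_ties) * math.log(1 - p_hat)

print(round(p_hat, 3), round(log_odds, 3), round(log_likelihood, 3))
```

More realistic models drop the assumption that dyads are independent, which is exactly what makes them both more interesting and harder to estimate.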
Requirements
There are three main requirements in the class: participation (mainly attendance and contributions made during our seminar meetings), a short weekly data exercise, and a longer data analysis paper due at the end of the quarter.
Class Attendance and Class Discussion (25% of grade)
Attendance is required, not optional. If you need to miss a class meeting, please let me know beforehand. It is part of your professional socialization to commit to attending class meetings and to inform me when that’s not possible (if only as a point of courtesy). The informal aspect of participation will be gauged by your contributions to our class discussion, in the form of questions, comments, suggestions, wonderings, and problems.
Weekly Data Analysis Exercises (25% of grade)
These will be short weekly assignments in which I will ask you to take a (small) social network dataset of your choice and compute some basic statistics or implement some of the techniques we covered the previous week. They will be due on Sunday at the end of each week. What you submit will be a file containing the code and results of your analysis (typically an R Markdown file). These will not be graded; they will simply be counted as submitted or not submitted.
Final Data Analysis Paper (50% of grade)
The basic goal here is for you to end up with something that could be useful to you at the end of the day: ideally, a basic data exercise that can serve as the basis for a longer substantive paper or as a standalone research note.
This will be a 2,500- to 5,000-word (single-spaced, Times New Roman Font, 12pt, 1in margins) write-up of a data analysis involving some type of network data and/or some kind of network analytic technique (to be discussed with and cleared by me). The data can come from a public network data repository, from network data you already have access to, or from data that you collected yourself.
In the paper, you will describe the data, provide key network metrics, describe the data-analytic approach that you will use, and provide a summary of the key empirical patterns that you found, along with a brief conclusion. The paper should be written in the style of a “research note” focused on key empirical findings (not a long theory windup or literature review).
You will submit an extended abstract of your final project, outlining your main research idea (e.g., data source and type of analysis), by the Sunday of week 6. This will be a one-page, single-spaced document with 12pt Times New Roman Font and 1in margins.