Peer effects, knowledge transfer and social influence

The structural approach to social networks is inherently beautiful as a representational approach. I am always in awe of the fact that we can learn so much about how human beings act or their outcomes based merely on the pattern of their social ties. The idea is both simple and profound.

The structural approach is built on assumptions regarding information transfer across a simpler unit of analysis: the dyad. In the world of dyads, new complications arise and different theories must be developed and tested.

Let us take the Professionals data we have been analyzing as an example. Here is the advice network among these professionals.

[Figure: the advice network among the professionals]

In the prior analyses, we have focused on analyzing the structure of each node’s connections. For example, each node has a specific number of incoming connections, its indegree:

[Figure: number of incoming connections for each node in the advice network]

The beauty of the structural approach to social networks is that we can learn a lot about the outcomes of individuals and organizations by merely looking at the pattern of their relationships. Recall our prior analysis. There is information in indegree. We were able to explain 6.5% of the variation in our measure of whether a person has the “knowledge to succeed” just by looking at the count of their incoming connections! While indegree may capture or reflect other processes and might not be causal, it is nevertheless information rich.

However, an Ego’s alters (i.e., the people that a focal node is connected to) are not all the same, as we sometimes implicitly assume in our models. To be clear, I don’t think researchers actually believe that all the people we are connected to are the same. Indeed, betweenness, closeness, and eigenvector centrality all assume, by their very construction, that not all connections are the same. However, the heterogeneity in alter characteristics is implicit rather than explicit, because we never specify in our theories or models exactly how these individuals vary.

The peer effects framework, on the other hand, often ignores variation in structure but emphasizes variation in the characteristics of connections.

Below, I walk through some examples of this approach.

A simple model of peer effects

The “peer effects” framework gets its name from a line of research in the economics of education, where scholars were attempting to understand the impact of classroom peers on academic outcomes. Hence, peer effects.

Let us start with a simple setup. Let us assume there are 100 students in a classroom. The teacher has decided that everyone in the class will have a study partner, so she asks the students to pair up into groups of two. There are now 50 pairs, each with two people. The teacher wonders whether having a smart peer (i.e., Alter) increases the performance of a focal student (i.e., Ego). Visually, she is interested in understanding this influence process:

[Figure: influence flowing from Alter to Ego]

At the end of the class, all of the students take a standardized exam. This exam is scored on a 100-point scale, so students can score anywhere from 0 to 100. The teacher takes these scores and runs the following regression with 100 observations, one for each student. She’s also good with standard errors, so she clusters them at the level of the dyad:

score_{i} = \beta_{0} + \beta_{1} score_{j} + \epsilon 
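To make the setup concrete, here is a minimal sketch of the data layout behind this regression, written in Python using only the standard library. The scores are made-up random numbers, and the names rows, dyad, and ols_slope are illustrative assumptions rather than anything from the teacher's actual analysis. Each of the 50 pairs contributes two rows, one with each student as Ego, which is why the two rows of a dyad are not independent observations.

```python
import random

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)
# Hypothetical exam scores for 50 study pairs (invented numbers, 0-100 scale).
pairs = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(50)]

# Each student appears once as Ego (i), with the partner as Alter (j).
rows = []
for dyad_id, (a, b) in enumerate(pairs):
    rows.append({"dyad": dyad_id, "score_i": a, "score_j": b})
    rows.append({"dyad": dyad_id, "score_i": b, "score_j": a})

beta_1 = ols_slope([r["score_j"] for r in rows], [r["score_i"] for r in rows])
print(len(rows), "observations; estimated beta_1 =", round(beta_1, 3))
```

Because every score appears once as score_i and once as score_j, the slope in this symmetric layout is simply the correlation between partners' scores. The dyad column is what one would cluster the standard errors on.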

After running the regression, she finds a large and statistically significant coefficient for \beta_{1}. How should she interpret it?

A naive causal interpretation is: for every unit increase in score_{j} there is a corresponding \beta_{1} increase in score_{i}. Put differently, having a study partner with a higher or lower score produces a corresponding increase or decrease in the performance of the focal student. This interpretation is called naive for a reason: it is probably (though not definitely) wrong.

But before we dive into why it is probably wrong, it is useful to reiterate that this “peer effects” representation is quite general. For example, the following outcomes might be determined in part by the influence of peers (however defined).


  • Finance: Putting money away into a retirement savings account, adopting a microfinance product, etc.
  • Health behaviors: Obesity, Happiness, use of HIV/AIDS test, etc.
  • Academic performance: Getting good grades, choosing a major.
  • Entrepreneurship: Becoming an entrepreneur; deciding against becoming an entrepreneur.
  • Careers: Quitting; moving to a new company.
  • Adoption of products: Prescribing a drug, buying a car.
  • Adoption of behaviors: Smoking, drinking, sexual events.
  • Adoption of ideas: Learning from patents.
  • Organizational behavior:  Adoption of corporate practices and policies.

The basic idea is simple: We observe some level or change in the behavior or characteristics of an alter (or alters) and we see whether these are correlated with the behaviors or outcomes of Ego.


This apparently simple process is much more nuanced and complicated than it appears. There are dozens of “mechanisms” that can lead to the correlation we might observe (or that the teacher observes). Here are a few examples of why we might observe a correlation, either positive or negative. Consider the case of product adoption.



  #  | Name | Definition
  1  | Direct transfer of specific information | Alter tells me about a product, but nothing more.
  2  | Persuasion effects | Alter tells me about the product, and forcefully persuades me to adopt it.
  3  | Direct transfer of general information | Alter tells me about a website that reviews products, and on this site the product that I adopt is listed first.
  4  | Role-modeling / Imitation | I see Alter doing something, and I copy it.
  5  | Install base effects | I see many Alters adopting a product (e.g., buying an iPad), so I adopt the iPad.
  6  | Threshold effects | I only buy an iPad if at least 10 people I know own one; once the 10th person adopts, I decide to adopt.
  7  | Snob effects | I see an Alter (or Alters) doing something, so I avoid doing it myself.
  8  | Simultaneous | Alter helps me out and I help her out, and together we perform better than either one would alone because, by talking through a problem for example, we figure it out together.
  9  | Reverse causality | The Alter does not affect Ego; rather, the Ego affects the Alter.
  10 | Contextual effects | We are both in the same neighborhood, so we are exposed to the same billboard, see the same advertisement for a product, and thus both adopt it.
  11 | Induced environmental effects | Having a high-achieving peer results in a teacher who teaches at a higher level; thus the student learns more not because of greater transfer of information from her peer, but because teaching quality improves.
  12 | Selection bias | I become friends with people who already own iPads. I become friends with people who like technology, and because they like technology, they also own iPads.
  13 | Homophily effects | I like iPads, and because I do, I become friends with other people who like iPads.
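Mechanisms 5 and 6 are easy to confuse, so here is a small Python sketch of how they differ. The functional forms and default numbers (a 5% bump per adopter, a threshold of 10) are assumptions of mine for illustration, not estimates of anything. With install base effects, Ego's response rises gradually with the number of adopters; with threshold effects, nothing happens until the count crosses a cutoff.

```python
def install_base_probability(n_adopters, per_alter_effect=0.05):
    """Adoption probability that rises with each additional adopter (capped at 1)."""
    return min(1.0, per_alter_effect * n_adopters)

def threshold_adopts(n_adopters, threshold=10):
    """Ego adopts only once at least `threshold` alters have adopted."""
    return n_adopters >= threshold

# Compare the two responses as the number of adopting alters grows.
for n in (0, 5, 9, 10, 15):
    print(n, install_base_probability(n), threshold_adopts(n))
```

In observational data both would show up as a positive correlation between Alters' adoption and Ego's adoption, which is part of why they are hard to tell apart.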

Can you think of more mechanisms?


Which mechanism is actually at play in a specific context?

This question is a hard one. Because there are several potential mechanisms to work with, how do we rule out some of them? Some mechanisms are easier to rule out than others, but most are actually quite difficult to conclusively confirm or deny.

To deal with this issue (which is VERY common during the review process), I have come up with a two-part classification. The first set of mechanisms are what I call “pseudo-mechanisms.” Pseudo-mechanisms are alternative explanations of the correlation that have nothing to do with social influence of the type we care about: influence flowing from the peer to the focal individual. Charles Manski, in a famous paper, defined these as the reflection problem and the selection problem.

Reflection problem: The reflection problem asks you to imagine a mirror. You see two objects moving. If it is unclear to you that you are looking at a mirror, then you can’t tell which one is the actual person who is moving and which one is the mirror image. More formally, imagine that we have two sets of variables; let us call them x and y. Let x be the measurement of the characteristics of the focal individual’s peers at time t, and let y be the measurement of the focal individual’s characteristics at time t. Now, because of the simultaneous measurement, we are unable to tell whether the change in x’s characteristics has caused a change in y’s characteristics, or vice versa. And this indeterminacy exists for each observation.

Furthermore, we are unable to tell whether each of these actors was exposed to some environmental shock (advertising, etc.) at the same time, which would make their adoption correlated. The only way that we can ensure that the reflection problem is not an issue is by measuring the traits and characteristics of the x’s prior to measuring those of y.
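The environmental-shock concern can be illustrated with a small simulation, sketched here in Python with the standard library only. Everything in it is an assumption made for illustration: the shared billboard, the adoption probabilities of 0.6 and 0.1, and the helper name correlation. In this made-up world neither person influences the other, yet adoption is still correlated within pairs.

```python
import math
import random

def correlation(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)
ego, alter = [], []
for _ in range(5000):
    # Shared shock: both members of the pair live near the same billboard or not.
    saw_billboard = rng.random() < 0.5
    p_adopt = 0.6 if saw_billboard else 0.1
    # Each person decides independently; there is no influence between them.
    ego.append(1 if rng.random() < p_adopt else 0)
    alter.append(1 if rng.random() < p_adopt else 0)

r = correlation(alter, ego)
print("ego-alter adoption correlation with zero influence:", round(r, 3))
```

With these invented probabilities the implied population correlation is roughly 0.27, all of it coming from the common shock.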

However, doing this does not resolve the issue of causality. Thus, it is a necessary but not sufficient condition.

Another important, and much more difficult, condition now has to be met in order for the effect to earn the title “Causal.” This is the selection problem. The selection problem is solved when one of two conditions holds:

  1. Either you know all the reasons why two people were paired together (i.e., why person y is friends with, shares a room with, or enters college with person x).
  2. OR the two individuals are randomly assigned, thus breaking the correlation between the characteristics of x and y.
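To see why random assignment matters, here is a rough Python sketch (standard library only) that compares self-selected pairing with random pairing. The ability distribution, the extreme "pair up with your nearest neighbor in ability" selection rule, and the helper names are all assumptions for illustration. No one influences anyone in this simulation; the only thing that changes is how pairs are formed.

```python
import math
import random

def correlation(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pair_up(values):
    """Pair consecutive entries: (0, 1), (2, 3), ..."""
    return [(values[k], values[k + 1]) for k in range(0, len(values), 2)]

def ego_alter_correlation(pairs):
    """Correlation between Ego and Alter, counting each person once as Ego."""
    ego = [a for a, b in pairs] + [b for a, b in pairs]
    alter = [b for a, b in pairs] + [a for a, b in pairs]
    return correlation(alter, ego)

rng = random.Random(2)
ability = [rng.gauss(70, 10) for _ in range(2000)]  # hypothetical latent ability

# Self-selection (assumed, extreme case): similar students choose each other.
r_selected = ego_alter_correlation(pair_up(sorted(ability)))

# Random assignment: shuffle first, then pair.
shuffled = ability[:]
rng.shuffle(shuffled)
r_random = ego_alter_correlation(pair_up(shuffled))

print("self-selected pairs:", round(r_selected, 3))
print("randomly assigned pairs:", round(r_random, 3))
```

Under selection the Ego-Alter correlation is large even though influence is zero by construction; under random assignment it is close to zero, so any correlation that remains in real data is harder to attribute to who chose whom.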

Assume for a moment that we have ruled out reflection and selection effects because (1) we use a lagged measure of peer consumption or action, and (2) the ego and alter are randomly paired. Even then, we have only ruled out a handful of the possible “mechanisms” producing the peer effects. We can rule out the “pseudo-mechanisms” #8 – #13 (except for #11), but that leaves us with 8 possible mechanisms: #1 – #7 plus #11.

Imagine a doctor telling you that “Yes, we’ve ruled out the fact that you are faking your symptoms, but there are 8 or more possible viruses that could be causing your infection!”

So, we need to now try and distinguish between these.

This is hard, even harder than resolving the reflection and selection problems. The reflection and selection problems are interesting in that they are hard problems to solve, but we know how to solve them. Not to make too many medical analogies, but this is like separating conjoined twins: hard, but someone can do it and has done it.

So how do we distinguish between different mechanisms, say #1 – #7?

This will depend a lot on context, and a lot on the data that you have available.

Let us examine a very simple situation where we have two students. Let us call the first student “Ego” and let us call the second student “Alter.” Assume for a moment that we have completely alleviated the problems of reflection and selection.


[Figure: the Ego and Alter dyad]

Let us say that really there are two contender mechanisms.  (This is probably not true; but, for a moment assume that it is true.)

Mechanism 1: A student learns general study habits from his/her peer (alter) and this is why his/her performance increases.

Mechanism 2: A student interacts a lot with his/her peer (alter) and they study together, and the peer helps the student learn the material.

How would we go about designing a test that would distinguish between these two mechanisms?

  1. For instance, if what the student is getting from her peer is general (better study habits or increased motivation), that should have a positive effect across various subjects.
  2. On the other hand, if the student is learning something rather specific (like how to do an integral), then the effects should be subject specific.
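The logic of this test can be sketched in Python (standard library only). The two simulated "worlds" below, the effect size of 0.3, the choice of math and history as subjects, and the function names are all hypothetical; the sketch only shows what pattern of subject-by-subject coefficients each world would produce, assuming reflection and selection have already been dealt with.

```python
import random

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def simulate(n, effect_on_math, effect_on_history, rng):
    """Return (peer, ego_math, ego_history) for n randomly paired students."""
    peer = [rng.gauss(0, 1) for _ in range(n)]  # peer's standardized prior score
    ego_math = [effect_on_math * p + rng.gauss(0, 1) for p in peer]
    ego_history = [effect_on_history * p + rng.gauss(0, 1) for p in peer]
    return peer, ego_math, ego_history

rng = random.Random(4)
# World A (assumed): the peer passes on something general, so both subjects move.
peer_a, math_a, hist_a = simulate(4000, 0.3, 0.3, rng)
# World B (assumed): the peer only helps with math, so history is unaffected.
peer_b, math_b, hist_b = simulate(4000, 0.3, 0.0, rng)

general = (ols_slope(peer_a, math_a), ols_slope(peer_a, hist_a))
specific = (ols_slope(peer_b, math_b), ols_slope(peer_b, hist_b))
print("general world  (math, history):", [round(b, 3) for b in general])
print("specific world (math, history):", [round(b, 3) for b in specific])
```

The test is therefore a comparison of the peer coefficient across subjects rather than a single regression.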

Assume you do this test, and you find out that there are effects across subjects, what can you say about the mechanisms? Can you say anything?

How to conduct the estimation in R

Standard peer effects estimations are quite straightforward. This is especially true when you have randomization in the pairing of focal individuals to peers and longitudinal data so you can lag the characteristics of the peer.

score_{i,t+1} = \beta_{0} + \beta_{1} score_{j,t} + \epsilon 

Here is a synthetic peer effects dataset in which 2000 individuals have been randomly paired: peer_effects.csv.

Let us examine the extent to which there are peer effects.

The model we want to estimate is:

postself_{i,t+1} = \beta_{0} + \beta_{1} prepeer_{j,t} + \epsilon 

Estimating this equation in R with this data results in:

[Screenshot: R output for this regression]
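Since the original output is only available as a screenshot, here is a rough Python sketch of the same kind of estimation, using only the standard library. It does not read peer_effects.csv; instead it generates its own fake data, and the "true" peer effect of 0.2, the score distributions, and the function name simple_ols are assumptions for illustration. The variable names postself and prepeer follow the equation above.

```python
import math
import random

def simple_ols(x, y):
    """Return (intercept, slope, standard error of slope) for y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, then the usual homoskedastic standard error.
    ssr = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se_slope = math.sqrt(ssr / (n - 2) / sxx)
    return intercept, slope, se_slope

rng = random.Random(3)
n = 2000
TRUE_PEER_EFFECT = 0.2  # invented value, used only to generate the fake data

# Random pairing: the peer's pre-period score is drawn independently of Ego.
prepeer = [rng.gauss(50, 10) for _ in range(n)]
postself = [40 + TRUE_PEER_EFFECT * p + rng.gauss(0, 10) for p in prepeer]

intercept, beta_1, se = simple_ols(prepeer, postself)
print("beta_1 =", round(beta_1, 3), "se =", round(se, 3), "t =", round(beta_1 / se, 2))
```

These are plain standard errors; they are not clustered at the dyad level.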

If the randomization is proper, this coefficient should be stable when we control for the focal individual’s own pretreatment score.

[Screenshot: R output with the focal individual’s own pretreatment score added as a control]

Another worry we have is whether this effect of the peer (captured by the pre-treatment characteristics) is homogeneous or heterogeneous. That is, does it depend on the characteristics of the focal individual or does it apply to everyone? To test this, we include a main effect of the characteristics of the focal individual (self_char) and an interaction term (pre_peer * self_char).

[Screenshot: R output for the model with the interaction term]

Here, we see that the peer effect depends on the characteristic of the focal individual. If the focal individual has this characteristic (e.g., willingness to listen), the peer effect is larger.
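The three specifications in this section (peer only, peer plus own pretreatment score, and the interaction model) can be sketched in Python with the standard library only. As before, the data are simulated rather than read from peer_effects.csv, and every number used to generate them (0.5, 0.1, 0.3), the name preself, and the small ols helper are assumptions for illustration. The binary self_char stands in for a trait such as willingness to listen.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(row[p] * row[q] for row in X) for q in range(k)]
        + [sum(row[p] * yi for row, yi in zip(X, y))]
        for p in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[p][k] / aug[p][p] for p in range(k)]

rng = random.Random(5)
n = 4000
# All data and "true" coefficients below are invented for illustration.
prepeer = [rng.gauss(0, 1) for _ in range(n)]  # peer's standardized pre-score
preself = [rng.gauss(0, 1) for _ in range(n)]  # Ego's own standardized pre-score
self_char = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
postself = [
    0.5 * s + 0.1 * p + 0.3 * p * c + rng.gauss(0, 1)
    for p, s, c in zip(prepeer, preself, self_char)
]

# Model 1: peer only. Model 2: add own pre-score. Model 3: add interaction.
m1 = ols([[1, p] for p in prepeer], postself)
m2 = ols([[1, p, s] for p, s in zip(prepeer, preself)], postself)
m3 = ols([[1, p, s, c, p * c] for p, s, c in zip(prepeer, preself, self_char)], postself)

print("peer only:          ", round(m1[1], 3))
print("plus own pre-score: ", round(m2[1], 3))
print("interaction model:  ", [round(b, 3) for b in m3])
```

Because the peer is drawn independently of Ego in this fake data, the peer coefficient barely moves between the first two models, and the last coefficient of the third model picks up how much larger the peer effect is for individuals with the characteristic.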

This is only a simple demonstration of the complexity of peer effects. There are likely to be many interactional factors that turn peer effects “on” or “off” or modulate them in some important way. One could imagine the following contingencies, where peer effects depend on characteristics of:

  • the focal individual
  • the environment
  • the alter/peer
  • personalities of both

