MRQAP tests associations across the dyads of a network. Questions
generally ask whether one relation is associated with another relation
in multi-relational networks (e.g., do marriage ties correlate with
business ties?), or whether similarity on dyadic features predicts
tie formation (e.g., are same-race pairs more likely to be
friends? More generally, what shared characteristics predict
friendship nominations among adolescents in an American high
school?). MRQAP is an extension of the Mantel test that uses node
permutation to accommodate issues of non-independence that bias
traditional regression analysis estimates using (sociocentric) network
data. To illustrate how to use MRQAP with ideanet, we’ll
use the Faux Mesa High dataset native to the package.
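
To make the idea of node permutation concrete, here is a minimal base R sketch of a QAP-style correlation test between two relations. The toy matrices (`business`, `marriage`), the helper `make_sym`, and the choice of 500 draws are illustrative assumptions. This is not ideanet’s implementation, only a sketch of the general logic.

```r
# Toy illustration of node permutation (QAP); not ideanet's implementation.
set.seed(42)
n <- 20

# Hypothetical helper: random symmetric 0/1 matrix with an empty diagonal
make_sym <- function(n, p) {
  m <- matrix(0, n, n)
  m[upper.tri(m)] <- rbinom(n * (n - 1) / 2, 1, p)
  m + t(m)
}

# Two made-up relations on the same nodes; marriage keeps many business ties
business <- make_sym(n, 0.2)
marriage <- pmin(business * make_sym(n, 0.7) + make_sym(n, 0.05), 1)

# Undirected network: compare each dyad once (upper triangle only)
dyads <- upper.tri(business)
obs_cor <- cor(business[dyads], marriage[dyads])

# Permute node labels: rows and columns of one matrix are shuffled together
perm_cor <- replicate(500, {
  p <- sample(n)
  cor(business[p, p][dyads], marriage[dyads])
})

# Two-sided permutation p-value
p_value <- mean(abs(perm_cor) >= abs(obs_cor))
c(observed = obs_cor, p_value = p_value)
```

Because rows and columns are permuted together, each permuted matrix keeps the structure of the original network while breaking its alignment with the other relation.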
Preparing to use ideanet’s MRQAP module is easy: all we
need is the igraph object produced by
netwrite, as this object contains the node-level
attributes from the user’s original nodelist as well as the
node-level measurements produced by netwrite.

    library(ideanet)

    # Build the network objects from the Faux Mesa nodelist and edgelist
    nw_fauxmesa <- netwrite(nodelist = fauxmesa_nodes,
                            node_id = "id",
                            i_elements = fauxmesa_edges$from,
                            j_elements = fauxmesa_edges$to,
                            directed = FALSE,
                            net_name = "faux_mesa",
                            shiny = TRUE)

When we look at the original nodelist passed to netwrite
(fauxmesa_nodes), we see that we have information about
each student’s grade level, their race/ethnicity, and their sex. This
information exists at the individual level in that it pertains
to individual students. However, when using MRQAP analysis, we are
generally interested in how dyadic measures predict outcomes.
In other words, we’re interested in whether similarities or differences
between nodes lead to outcomes, both of which are understood at the
level of ego-alter relationships. While some users may have dyadic
measures stored in their edgelist ahead of time, typically these
measures are something one has to generate. The qap_setup
function in ideanet allows users to quickly take node-level
attributes and generate dyad-level comparisons with only a few
arguments.
These arguments are:

- net: An igraph object, preferably one generated using netwrite. qap_setup also supports pre-generated network objects should they be available to users, but users must ensure that such objects are properly sorted to match node-level features.
- variables: A character vector of variable names that are available in the igraph/network object. These should match the names of node/vertex attributes in the object passed to net.
- methods: A character vector specifying the methods that should be applied to each item listed in variables. Values in methods must be specified in the correct order such that the first item in methods applies to the first item in variables, the second item in methods to the second item in variables, and so on. Each item in methods must be one of the following:
  - "multi_category": Applies to categorical variables only. Creates as many variables as there are unique values; each variable signals if both ego and alter have the given value.
  - "reduced_category": Applies to categorical variables only. Creates a single variable that signals if alter and ego have the same value.
  - "both": Applies to categorical variables only. Computes both the "multi_category" and "reduced_category" methods.
  - "difference": Computes the difference in input value between ego and alter. This method will produce two measures for each variable to which it is applied: the difference between ego and alter’s respective values and the absolute value of this difference. This is generally used to estimate dyadic difference effects, such as whether differences in popularity correlate with being friends.
- directed: Specify if edges in the network should be interpreted as directed or undirected. Expects TRUE or FALSE logical.

For the Faux Mesa High network, let’s imagine one wants to know whether sharing the same sex, race, or grade level affects the likelihood that adolescents nominate each other as friends.
For similarity in sex, we’ll want to apply the
reduced_category method to the sex variable.
For combinations of race, we’ll apply the
multi_category method. And for grade, which
we’ll treat as a continuous variable, we’ll use the
difference method.
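
To see what these three methods compute, it can help to build the same kinds of dyadic measures by hand. The sketch below uses base R on a small hypothetical nodelist and edgelist (the data and the object names `nodes`, `edges`, `ego`, and `alter` are made up for illustration). It mirrors the descriptions above but is not how qap_setup is implemented.

```r
# Hypothetical toy data; column names mimic the Faux Mesa example
nodes <- data.frame(id = 1:4,
                    sex = c("F", "F", "M", "M"),
                    race = c("Hisp", "White", "White", "Hisp"),
                    grade = c(7, 7, 9, 12),
                    stringsAsFactors = FALSE)
edges <- data.frame(from = c(1, 1, 2, 3), to = c(2, 4, 3, 4))

# Look up ego and alter attributes for each dyad
ego <- nodes[match(edges$from, nodes$id), ]
alter <- nodes[match(edges$to, nodes$id), ]

# "reduced_category": one indicator, 1 if ego and alter share a value
edges$same_sex <- as.integer(ego$sex == alter$sex)

# "multi_category": one indicator per category, 1 if both have that value
for (r in unique(nodes$race)) {
  edges[[paste0("both_race_", r)]] <- as.integer(ego$race == r & alter$race == r)
}

# "difference": signed and absolute difference between ego and alter
edges$diff_grade <- ego$grade - alter$grade
edges$abs_diff_grade <- abs(edges$diff_grade)

edges
```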
Given the importance of ensuring that each element in the
variables argument corresponds to the correct element in
the methods argument, users may find it helpful to store
both vectors in a data frame prior to running qap_setup.
This allows us to double-check that all elements appear in the correct
order.

    var_methods <- data.frame(variable = c("sex", "race", "grade"),
                              method = c("reduced_category", "multi_category", "difference"))
    var_methods

| variable | method |
|---|---|
| sex | reduced_category |
| race | multi_category |
| grade | difference |
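
Beyond eyeballing the table, a few programmatic checks can catch typos before calling qap_setup. The following is a minimal sketch: `valid_methods` simply restates the method names listed above, and `node_attrs` is a hypothetical stand-in for the vertex attribute names of your own network object.

```r
var_methods <- data.frame(
  variable = c("sex", "race", "grade"),
  method = c("reduced_category", "multi_category", "difference")
)

# Method names as listed in this vignette
valid_methods <- c("multi_category", "reduced_category", "both", "difference")

# Stop early if any method name is misspelled
stopifnot(all(var_methods$method %in% valid_methods))

# Hypothetical attribute names; with a real igraph object these could be
# retrieved with igraph::vertex_attr_names()
node_attrs <- c("id", "grade", "race", "sex")

# Variables requested but not present among the node attributes
missing_vars <- setdiff(var_methods$variable, node_attrs)
missing_vars
```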
Now that we’ve ensured everything’s in the right order, let’s use
qap_setup:

    faux_qap_setup <- qap_setup(net = nw_fauxmesa$faux_mesa,
                                variables = var_methods$variable,
                                methods = var_methods$method,
                                directed = FALSE)

qap_setup produces a list object containing a nodelist,
an edgelist containing newly-calculated dyadic measures, and an
igraph object with these dyadic measures. Let’s quickly
inspect our new edgelist to see the kinds of variables we’ve just
created:
| from | to | weight | sex_ego | sex_alter | same_sex | race_ego | race_alter | both_race_Hisp | both_race_NatAm | both_race_White | both_race_Other | both_race_Black | grade_ego | grade_alter | diff_grade | abs_diff_grade |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 24 | 1 | F | F | 1 | Hisp | White | 0 | 0 | 0 | 0 | 0 | 7 | 7 | 0 | 0 |
| 0 | 51 | 1 | F | F | 1 | Hisp | NatAm | 0 | 0 | 0 | 0 | 0 | 7 | 7 | 0 | 0 |
| 0 | 57 | 1 | F | F | 1 | Hisp | Hisp | 1 | 0 | 0 | 0 | 0 | 7 | 12 | -5 | 5 |
| 0 | 69 | 1 | F | F | 1 | Hisp | NatAm | 0 | 0 | 0 | 0 | 0 | 7 | 7 | 0 | 0 |
| 0 | 86 | 1 | F | F | 1 | Hisp | White | 0 | 0 | 0 | 0 | 0 | 7 | 7 | 0 | 0 |
| 0 | 91 | 1 | F | F | 1 | Hisp | NatAm | 0 | 0 | 0 | 0 | 0 | 7 | 7 | 0 | 0 |
We now have several new variables. Variables appended with
_ego and _alter represent the original values
for each edge’s ego and alter, respectively, as determined in our
original nodelist. Our same_sex and both_race
variables are binary indicators of whether ego and alter have the same
value for a particular attribute. By contrast, diff_grade
is a continuous measure showing how many grade levels ego and alter are
apart from each other. Note that values in diff_grade may
be positive or negative depending on whether or not the node designated
as ego in the edgelist is in a higher grade level than the node
designated as alter. Signed differences of this sort can be useful when
working with directed networks — you can imagine younger students being
more likely to nominate older students as friends than vice versa. Given
that ties in our network are undirected, however, the order in which
egos and alters appear in our edgelist has no real meaning, and whether
values in diff_grade are positive or negative is largely a
matter of chance. Consequently, we’re better off using the absolute
value of the difference between ego’s and alter’s grade levels in our
analysis, as this measure is
agnostic to the order in which nodes are presented in our edgelist.
qap_setup provides us with this absolute value
automatically — here it is stored in abs_diff_grade.
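
The order-dependence of the signed difference is easy to verify with a single hypothetical tie. In this small sketch, the node names `a` and `b` and their grade levels are made up for illustration.

```r
# Hypothetical undirected tie between a 7th grader and a 12th grader
grade <- c(a = 7, b = 12)

# The same tie can be listed in either order in an undirected edgelist
diff_ab <- grade[["a"]] - grade[["b"]]   # ego = a, alter = b
diff_ba <- grade[["b"]] - grade[["a"]]   # ego = b, alter = a

c(diff_ab = diff_ab, diff_ba = diff_ba)           # signs disagree
c(abs_ab = abs(diff_ab), abs_ba = abs(diff_ba))   # absolute values agree
```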
With our variables of interest in hand, we turn to the MRQAP analysis
itself. ideanet’s qap_run function seamlessly
integrates output from netwrite and qap_setup.
However, in its current iteration users must select variables
produced by qap_setup. Arguments for
qap_run include:

- net: An igraph object containing the variables of interest. This igraph object should be one produced by qap_setup.
- dependent: A single categorical or continuous variable name to use as a dependent variable. This variable must be produced by qap_setup. If dependent is set to NULL (default), the function will predict the existence of a non-weighted tie.
- variables: A vector of variable names to use as independent variables. These variables must be produced by qap_setup.
- directed: Specify if the edges should be interpreted as directed or undirected. Expects TRUE or FALSE logical.
- reps: The number of permutations. Relevant to null-hypothesis testing only. Default is 500.
- family: The functional form the model should follow. Currently available are "linear" and "binomial".

NOTE: If the input network is multi-relational,
qap_run will automatically merge duplicated rows. This is
necessary given that, at this time, the MRQAP wrapper does not elegantly
handle repeated observations. When merging rows, qap_run takes the
sum of numeric edge attributes and a randomly selected value of
character edge attributes. If the user is interested in the association
between two types of ties (e.g., marriage ties predicting business
ties), we recommend that they create a set of binary edge attributes
using qap_setup.
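
To illustrate the data shape this note describes, here is a base R sketch that collapses a typed edgelist into one row per dyad with binary indicators for each relation. The toy edgelist and its column names (`type`, `marriage`, `business`) are hypothetical. This is neither ideanet’s merging code nor a substitute for qap_setup, only an illustration of why binary indicators survive merging better than a character column would.

```r
# Hypothetical multi-relational edgelist: one row per tie, typed by relation
edges <- data.frame(
  from = c(1, 1, 2, 3),
  to = c(2, 2, 3, 4),
  type = c("marriage", "business", "business", "marriage"),
  weight = c(1, 1, 1, 1)
)

# One binary indicator per relation
edges$marriage <- as.integer(edges$type == "marriage")
edges$business <- as.integer(edges$type == "business")

# Collapse to one row per dyad: max() keeps the indicators binary,
# sum() accumulates the weights of the merged rows
ind <- aggregate(cbind(marriage, business) ~ from + to, data = edges, FUN = max)
wts <- aggregate(weight ~ from + to, data = edges, FUN = sum)
dyads <- merge(ind, wts, by = c("from", "to"))
dyads
```

Here the dyad between nodes 1 and 2 carries both a marriage and a business tie, so both indicators equal 1 after merging, whereas a single character `type` column could only retain one of the two labels.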
Let’s run qap_run with the default 500 permutations.
While we can substantially decrease the number of permutations to
reduce computation time, doing so may make the p-values in our
results less reliable. In this example, we’ll include same_sex,
both_race_White, and abs_diff_grade as independent variables.

    faux_qap <- qap_run(net = faux_qap_setup$graph,
                        dependent = NULL,
                        variables = c("same_sex", "both_race_White", "abs_diff_grade"),
                        directed = FALSE,
                        family = "linear")

qap_run returns a list of two objects. The first
(covs_df) is a data frame summarizing model results in a
way resembling a traditional regression output. The second
(mods_df) is a data frame providing the number of dyadic
observations on which the model is computed.
| covars | estimate | pvalue |
|---|---|---|
| intercept | 0.0142453 | 0.006 |
| same_sex | 0.0048255 | 0.044 |
| both_race_White | 0.0014419 | 0.692 |
| abs_diff_grade | 0.0004715 | 0.632 |

| num_obs |
|---|
| 10878 |
If we treat a p-value of .1 or less as indicating some statistical significance, our results suggest that students are more likely to nominate students of the same sex as friends (estimate = 0.0048, p = 0.044), holding all else constant. Neither both_race_White nor abs_diff_grade reaches this threshold. We recommend re-running the same model with different numbers of permutations to confirm that the p-values associated with each variable are stable.
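
The sensitivity of permutation results to the number of draws can be explored with a toy example. The sketch below is base R only and rests on several assumptions: the network and the `sex` attribute are simulated, `ols_slope` and `qap_p` are hypothetical helpers, and the simple Y-permutation scheme shown may differ from the procedure qap_run uses internally.

```r
set.seed(1)
n <- 30

# Hypothetical node attribute and the resulting same-sex dyad indicator
sex <- sample(c("F", "M"), n, replace = TRUE)
same_sex <- outer(sex, sex, "==") * 1
dyads <- upper.tri(same_sex)

# Simulated undirected ties, somewhat more likely within same-sex dyads
prob <- 0.05 + 0.10 * same_sex
tie <- matrix(0, n, n)
tie[dyads] <- rbinom(sum(dyads), 1, prob[dyads])
tie <- tie + t(tie)

# OLS slope for a single predictor
ols_slope <- function(y, x) cov(x, y) / var(x)

# Permutation p-value: shuffle the node labels of the outcome matrix
qap_p <- function(y, x, reps) {
  obs <- ols_slope(y[dyads], x[dyads])
  null <- replicate(reps, {
    p <- sample(nrow(y))
    ols_slope(y[p, p][dyads], x[dyads])
  })
  mean(abs(null) >= abs(obs))
}

# Repeat the same test five times at two permutation counts
p_small <- replicate(5, qap_p(tie, same_sex, reps = 50))
p_large <- replicate(5, qap_p(tie, same_sex, reps = 500))
range(p_small)
range(p_large)
```

With few permutations, repeated runs of the same model will typically return noticeably different p-values; the spread tends to shrink as the number of permutations grows.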
The above example shows that setting up and using MRQAP in
ideanet is fast and easy, allowing users to quickly explore
a variety of model specifications with their own data.