psychmeta is a comprehensive and open-source R package for computing psychometric meta-analyses, conducting follow-up analyses of meta-analytic results (e.g., cumulative meta-analyses, leave-one-out meta-analyses, bootstrapped meta-analyses, and meta-regressions [with help from metafor]), simulating data impacted by psychometric artifacts, and much more! This tutorial is meant to introduce you to some of the most critical features of psychmeta and familiarize you with workflows within the package. We’ll start with the basics (installation and data import) and work our way up to the more techncial topics.

Installing and loading psychmeta

Install the official CRAN version of the package:

install.packages("psychmeta")

Or install the GitHub verson (with incremental updates between CRAN releases):

devtools::install_github("psychmeta/psychmeta")

Load the package library:

library(psychmeta)

Data import

Recommended packages for data import:

Other packages for data import:

psychmeta requires data to be in a “long” format. This means that each effect size is coded in its own row of a matrix. For example, a long format looks like this:

Data conversion and cleaning

Not all data come in a long format, so data manipulation tools can be helpful to convert it to a long format.

Recommended packages for data manipulation:

Other packages for data manipulation:

psychmeta can also help with reshaping data!

Sometimes it can be quick to code data in a journal table-like format:

dat_matrix

This matrix-style format can be easily converted to the long-format data psychmeta requires with the reshape_mat2dat() function:

reshape_mat2dat(
     var_names = "var_names",              # Column of variable names
     cor_data = c("X", "Y", "Z"),          # Names of correlation columns
     common_data = "n",                    # Names of columns shared among relationships
     unique_data = c("mean", "sd", "rel"), # Names of columns unique to relationships
     data = dat_matrix)

Sometimes it can also be quick to code data with multiple correlations in one row (i.e., a wide format):

dat_wide

This wide format can be easily converted to the long format psychmeta requires:

common_vars <- c("sample_id")           # Column names for variables common to all
                                        # relationships
var_names <- c("X", "Y", "Z")
es_design = matrix(NA, 3, 3)            # Matrix containing the column names
es_design[lower.tri(es_design)] <-      # for the intercorrelations among variables
  c("rxyi_X_Y", "rxyi_X_Z", "rxyi_Y_Z") # in the lower triangle of the matrix
rownames(es_design) <-
  colnames(es_design) <-
  var_names
n_design <- "ni"                        # Sample size column name or es_design-like
                                        # matrix
other_design <-                         # Matrix with variable names as row names,
  cbind(rel = c("rel_X",                # names of long-format variables as column names,
                 "rel_Y",               # and column names of dat_wide as elements
                 "rel_Z"))
rownames(other_design) <- var_names

reshape_wide2long(common_vars = common_vars,
                  es_design = es_design,
                  n_design = n_design,
                  other_design = other_design,
                  es_name = "rxyi",              # Type of effect size in dat_wide
                  data = dat_wide)

The convert_es() function can be used to convert a wide variety of statistics to the r or d metric.

convert_es(es = 1,    input_es = "d",       output_es = "r", n1 = 50,  n2 = 50)
convert_es(es = -1.3, input_es = "t",       output_es = "r", n1 = 100, n2 = 140)
convert_es(es = 10.3, input_es = "F",       output_es = "r", n1 = 100, n2 = 150)
convert_es(es = 1.3,  input_es = "chisq",   output_es = "r", n1 = 100, n2 = 100)
convert_es(es = .021, input_es = "p.chisq", output_es = "r", n1 = 100, n2 = 100)
convert_es(es = 4.37, input_es = "or",      output_es = "r", n1 = 100, n2 = 100)
convert_es(es = 1.47, input_es = "lor",     output_es = "r", n1 = 100, n2 = 100)

convert_es(es = .2,   input_es = "r",       output_es = "d", n1 = 50,  n2 = 50)
convert_es(es = -1.3, input_es = "t",       output_es = "d", n1 = 100, n2 = 140)
convert_es(es = 10.3, input_es = "F",       output_es = "d", n1 = 100, n2 = 150)
convert_es(es = 1.3,  input_es = "chisq",   output_es = "d", n1 = 100, n2 = 100)
convert_es(es = .021, input_es = "p.chisq", output_es = "d", n1 = 100, n2 = 100)
convert_es(es = 4.37, input_es = "or",      output_es = "d", n1 = 100, n2 = 100)
convert_es(es = 1.47, input_es = "lor",     output_es = "d", n1 = 100, n2 = 100)

To convert effect sizes for inclusion in a meta-analysis, pull out the meta_input object:

convert_es(es = c(.4, .3, .25),
           input_es = "r", output_es = "d",
           n1 = c(50, 110, 65), n2 = c(50, 70, 65)
           )$meta_input

The convert_es() function can also be used to simply compute confidence intervals for r, d, and A:

convert_es(es = .3, input_es = "r", output_es = "r", n1 = 100)
convert_es(es = .8, input_es = "d", output_es = "d", n1 = 64, n2 = 36)
convert_es(es = .8, input_es = "A", output_es = "A", n1 = 64, n2 = 36)

Corrections for measurement error and range restriction can be applied automatically within meta-analyses.

Other corrections can be made, but they need to be applied sporadically (i.e., applied to individual effect sizes before including them in a meta-analysis).

correct_r_dich(r = c(.3, .5), px = .5, py = .5, n = 100)
correct_r_split(r = c(.3, .5), pi = .2, n = 100)

Computing meta-analyses

ma_r() is the general-purpose meta-analysis function for correlations

The arguments to this function are:

ma_r(
#### Commonly used arguments: ####
# Which data set would you like to use?
     data = NULL,

# Specify essential effect-size information.
     rxyi, n, n_adj = NULL,

# Differentiate sources of data.
     sample_id = NULL, citekey = NULL,

# Specify methodological parameters.
     ma_method = "bb", ad_type = "tsa",
     correction_method = "auto",

# Specify constructs and measures of constructs.
     construct_x = NULL, construct_y = NULL,
     measure_x = NULL, measure_y = NULL,

# Weighting method
     wt_type = "sample_size",

# Correct for small-sample bias?
     correct_bias = TRUE,
# Correct for measurement error?
     correct_rxx = TRUE, correct_ryy = TRUE,
# Correct for range restriction?
     correct_rr_x = TRUE, correct_rr_y = TRUE,
# Is the range restriction indirect (vs. direct) in nature?
     indirect_rr_x = TRUE, indirect_rr_y = TRUE,

# What are your reliability coefficients?
     rxx = NULL, ryy = NULL,
# Are your reliability coefficients range restricted?
     rxx_restricted = TRUE, ryy_restricted = TRUE,
# What types of reliability estimates are you using?
     rxx_type = "alpha", ryy_type = "alpha",

# What are your range-restriction u (SD) ratios?
     ux = NULL, uy = NULL,
# Are your u ratios computed from observed (vs. true-score) SDs?
     ux_observed = TRUE, uy_observed = TRUE,

# What are your moderators and which ones are categorical?
     moderators = NULL, cat_moderators = TRUE,

# What type of moderator analysis do you want (simple vs. hierarchical)?
     moderator_type = "simple",

# If correcting for bivariate indirect RR, how do constructs correlate
# with selection mechanisms? (only need to specify the +1 or -1 sign of the relationships)
     sign_rxz = 1, sign_ryz = 1,

# Do you have any artifact distributions to include that are not in your database?
     supplemental_ads = NULL,

#### Other arguments to know about: ####
# Specify the order in which constructs should be displayed in output.
     construct_order = NULL,
# If analyzing multiple relationships, how should each constructs's measurement
# error be handled?
     correct_rel = NULL,
# If analyzing multiple relationships, how should each construct's
# range restriction be handled?
     correct_rr = NULL,
# If analyzing multiple relationships, which constructs are affected by indirect
# range restriction?
     indirect_rr = NULL,
# If analyzing multiple relationships, how does each construct correlate with
# the selection mechanism?
     sign_rz = NULL,

# Additional methological parameters can be modified using the "control" argument.
     control =  control_psychmeta()
)

In addition to the arguments used with ma_r() directly, the control argument allows users to modify methological parameters of their analyses and interact with some of the more specialized features of psychmeta’s meta-analysis functions. The control argument in meta-analysis functions accepts the output of the control_psychmeta() function.

control_psychmeta(
     # Should the mean or the sample-specific effect sizes be used to estimate error variance?
     error_type = c("mean", "sample"),
     # What proportion of the distribution of means should be included in confidence intervals?
     conf_level = .95,
     # What proportion of the residual distribution of observations should be included in credibility intervals?
     cred_level = .8,
     # How should confidence and credibility intervals be computed? With the t or normal distribution?
     conf_method = c("t", "norm"),
     cred_method = c("t", "norm"),
     # Should weighted variance estimates be computed as unbiased (i.e., multiplied by k / [k-1])?
     var_unbiased = TRUE,
     # Should overall artifaction distributions be computed for each construct (pairwise_ads == FALSE) or should
     # artifact distributions be computed separately for each construct pair (pairwise_ads == TRUE)?
     pairwise_ads = FALSE,
     # Should artifact distributions be computed by collapsing across moderator
     # levels (moderated_ads == FALSE) or should artifact distributions be computed
     # separately for each moderator combination (moderated_ads == TRUE)?
     moderated_ads = FALSE,
     # Should artifact-distribution corrections be computed using artifact distributions
     # that have had sampling error removed (residual_ads == TRUE) or should the observed
     # distributions be used (residual_ads == TRUE)?
     residual_ads = TRUE,
     # Should dependent observations (i.e., multiple observations of the same relationship in a single sample) be
     # consolidated so that there is just one independent observation per sample?
     check_dependence = TRUE,
     # If check_dependence is TRUE, how should dependency be resolved? Options are to compute a composite
     # effect size (default), compute the average of the effect sizes, or to stop with an error message.
     collapse_method = c("composite", "average", "stop"),
     # The intercor argument uses the control_intercor() function to control how the intercorrelations
     # among variables are handled with collapse_method == "composite"
     intercor = control_intercor(),
     # Should artifact information be cleaned to resolve discrepancies among values recorded for multiple
     # relationsips involving a given construct in a given sample?
     clean_artifacts = TRUE,
     # Should missing artifact information be imputed? (For use with individual-correction method only)
     impute_artifacts = TRUE,
     # If impute_artifacts is TRUE, how should imputation be performed? See the documentation for the
     # control_psychmeta() function for descriptions of the available options.
     impute_method = c("bootstrap_mod", "bootstrap_full", "simulate_mod", "simulate_full",
                       "wt_mean_mod", "wt_mean_full", "unwt_mean_mod", "unwt_mean_full",
                       "replace_unity", "stop"),
     # What seed value should be set for the imputation process? (This makes the imputation reproducible)
     seed = 42,
     # Should artifact information from observations not included in meta-analyses be harvested from "data"
     # and included in artifact distributions?
     use_all_arts = TRUE,
     # For meta-analyses of d values, should the proportionality of membership in the unrestricted sample
     # be estimated from the range-restricted proportions and the range-restriction correction factor?
     estimate_pa = FALSE,
     # To what number of decimal places should interactive artifact distributions be rounded prior to use?
     # Rounding reduces the computational burden of creating multi-dimensional arrays of artifact information.
     decimals = 2,
     # Should the "Hunter-Schmidt" override settings be used? When TRUE, this will override settings for:
     # - wt_type will set to "sample_size"
     # - error_type will set to "mean"
     # - correct_bias will set to TRUE
     # - conf_method will set to "norm"
     # - cred_method will set to "norm"
     # - var_unbiased will set to FALSE
     # - residual_ads will be set to FALSE
     # - use_all_arts will set to FALSE
     hs_override = FALSE
)

Bare-bones meta-analyses

At a minimum, simply specify correlations and sample sizes. You can specify vectors containing correlations, sample sizes, and other input data for each study in three different ways (and any combination of the three):

  1. As vectors, with a vector manually passed to each argument.
ma_r(rxyi = data_r_meas_multi$rxyi,
     n = data_r_meas_multi$n)
  1. As string-format (quoted) column names corresponding to the columns of data.
ma_r(rxyi = "rxyi", n = "n",
     data = data_r_meas_multi)
  1. As unquoted column names corresponding to the columns of data.
ma_r(rxyi = rxyi, n = n,
     data = data_r_meas_multi)

Moderator analyses

Moderator analyses are easy—just identify a column or matrix of moderator information.

ma_r(rxyi = rxyi, n = n,
     moderators = moderator,
     data = data_r_meas_multi)

To use multiple moderators, supply the column names as a string vector.

ma_r(rxyi = rxyi, n = n,
     moderators = c("Rating source", "Published", "Complexity"),
     data = data_r_gonzalezmule_2014)

Multi-construct analyses

On top of moderators, psychmeta can also meta-analyze multiple construct relationships in one function call.

ma_r(rxyi = rxyi, n = n,
     construct_x = x_name,
     construct_y = y_name,
     data = data_r_meas_multi)
ma_r(rxyi = rxyi, n = n,
     construct_x = x_name,
     construct_y = y_name,
     moderators = moderator,
     data = data_r_meas_multi)

Individual-correction meta-analyses

psychmeta can automatically make corrections for reliability and range restriction as it computes meta-analyses. To do so, specify vectors of data for each artifact your want to correct. To apply corrections individually to each study, specify ma_method = "ic".

The clean_artifacts and impute_artifacts arguments are used to impute missing artifact data and check for various errors or inconsistencies in the artifacts. You should generally leave these set to TRUE; they are only FALSE here to speed up computations during this tutorial, as our data are clean.

ma_obj <- ma_r(ma_method = "ic",
               rxyi = rxyi, n = n,
               construct_x = x_name,
               construct_y = y_name,
               rxx = rxxi,
               ryy = ryyi,
               moderators = moderator,
               clean_artifacts = FALSE,
               impute_artifacts = FALSE,
               data = data_r_meas_multi)

Artifact-distribution meta-analyses

To apply corrections to meta-analysis results overall, rather than to individual studies, specify ma_method = "ad". Artifact distribution methods are useful when you have missing data (they make imputating missing values unnecessary). They can also be used to supplement observed artifacts with values from other sources (reducing artifact sampling error) and for applying corrections to artifact values to yield more accurate meta-analysis results (e.g., Dahlke & Wiernik, 2018a).

psychmeta can apply a variety of artifact correction models. You can specify which method to use with the correction_method argument. The default is “auto”, and psychmeta will choose the best method based on available artifact data and settings for these arguments:

  • correct_bias
  • correct_rxx, correct_ryy
  • correct_rr_x, correct_rr_y
  • indirect_rr_x, indirect_rr_y
  • correct_rel, correct_rr, indirect_rr
ma_r(ma_method = "ad",
     rxyi = rxyi, n = n,
     construct_x = x_name,
     construct_y = y_name,
     rxx = rxxi,
     ryy = ryyi,
     correct_rr_x = FALSE,
     correct_rr_y = FALSE,
     moderators = moderator,
     data = data_r_meas_multi)

You can also convert a meta-analysis previously calculated using barebones or individual-correction methods to an artifact-distribution meta-analysis:

ma_r_ad(ma_obj, correct_rr_x = FALSE, correct_rr_y = FALSE)

Handling dependency

Psychometric meta-analysis assumes that each effect size for a particular construct relationship comes from an independent sample. If one sample provides multiple effect sizes for the same relationship, psychmeta can automatically consolidate them as either composites (recommended) or averages.

ma_r(rxyi = rxyi, n = n, sample_id = sample_id,
     collapse_method = "composite",
     data = data_r_meas_multi)

ma_r(rxyi = rxyi, n = n, sample_id = sample_id,
     collapse_method = "average",
     data = data_r_meas_multi)

Follow-up analyses

ma_r() will automatically compute the primary results that should be reported for meta-analyses, including total sample size (N), number of studies (k), average effect sizes (mean_r and mean_rho), observed (sd_r and sd_r_c) and residual (sd_res and sd_rho) standard deviations, and confidence intervals and credibility intervals. These results are sufficient for evaluating average effect sizes, heterogeneity of effects, and precision or uncertainty in the results.

psychmeta also provides a variety of follow-up analyses to supplement the primary meta-analysis results.

sensitivity()

sensitivity() computes several different types of sensitivity analyses for meta-analytic results:

  • bootstrap (calculate bootstrap confidence intervals on all meta-analysis results)
  • cumulative (calculate cumulative meta-analyses—a form of publication bias analysis)
  • leave1out (compute meta-anlayses leaving out 1 sample at a time—for outlier/influential point detection)

You can either call all sensitivity analyses at once (sensitivity()) or call each type of analysis individually (sensitivity_bootstrap(), sensitivity_cumulative(), sensitivity_leave1out()):

ma_obj <- sensitivity(ma_obj,
                      leave1out = TRUE,
                      bootstrap = TRUE,
                      cumulative = TRUE,

                      sort_method = "weight",

                      boot_iter = 100,
                      boot_conf_level = 0.95,
                      boot_ci_type = "norm")

metareg()

metareg() computes meta-regression results using both categorical and continuous moderators via the rma() function from the metafor package.

ma_obj_gm <- ma_r(ma_method = "ic",
                  rxyi = rxyi, n = n,
                  rxx = rxxi,
                  ryy = ryyi,
                  moderators = c("Rating source", "Complexity"),
                  data = data_r_gonzalezmule_2014)

ma_obj_gm <- metareg(ma_obj_gm)
get_metareg(ma_obj = ma_obj_gm)[[1]]$individual_correction$true_score

ma_obj_mc <- ma_r(ma_method = "ic",
                  rxyi = r, n = N,
                  moderators = c("Parent", "Age"),
                  cat_moderators = c(TRUE, FALSE),
                  data = data_r_mcleod_2007)

ma_obj_mc <- metareg(ma_obj_mc, formula_list = list("Parent*Age" = yi ~ Parent*Age))
get_metareg(ma_obj_mc)[[1]]$individual_correction$true_score

Note that the relevant sample size for meta-regression is the number of studies (k). Meta-regression results should be interpreted cautiously when k is small or when there is low variability in moderator levels across studies (Schmidt, 2017). In these and other cases, hierarhical subgroup moderator analyses can yield more reliable results:

ma_obj_gm <- ma_r(ma_method = "ic",
                  rxyi = rxyi, n = n,
                  rxx = rxxi,
                  ryy = ryyi,
                  moderators = c("Rating source", "Complexity"),
                  moderator_type = "hierarchical",
                  data = data_r_gonzalezmule_2014)

ma_obj_gm

heterogeneity()

psychmeta automatically estimates residual standard deviations (SDres and SD\(\rho\)) and credibility intervals. These statistics are the best way to examine heterogeneity in meta-anlayses (Wiernik et al., 2017). The heterogeneity() function computes additional heterogeneity statistics (e.g., % Variance Accounted For, Q, I2) that may be requested by some journals or reviewers; we don’t recommend these statistics for regular use.

ma_obj <- heterogeneity(ma_obj)

Results presentation

psychmeta has several functions to access meta-analysis results and quickly prepare them for publication. The most basic way to access results is to print the ma_obj results object:

ma_obj <- ma_r(ma_method = "ic", sample_id = sample_id,
               rxyi = rxyi, n = n,
               construct_x = x_name,
               construct_y = y_name,
               rxx = rxxi,
               ryy = ryyi,
               moderators = moderator,
               clean_artifacts = FALSE,
               impute_artifacts = FALSE,
               data = data_r_meas_multi)

ma_obj

You can limit printing to specific constructs and/or construct pairs using the filter_ma() function:

filter_ma(ma_obj, analyses = list(construct = "Y") )
filter_ma(ma_obj, analyses = list(construct_pair = list(c("X", "Z"))) )

Filtering on other analysis features, such as moderator levels or minimum N and/or k, is also possible.

psychmeta also includes a variety of get_stuff() functions to retrive specific results from an ma_obj object. For example:

# Heterogeneity statistics
ma_obj <- heterogeneity(ma_obj)
out_heterogeneity <- get_heterogeneity(ma_obj)
out_heterogeneity$`analysis id: 1`$barebones

# Sensitivity analyses
ma_obj <- sensitivity(ma_obj, bootstrap = FALSE)
out_leave1out <- get_leave1out(ma_obj)
out_leave1out$`analysis id: 1`$barebones

out_cumulative <- get_cumulative(ma_obj)
out_cumulative$`analysis id: 1`$barebones

# Meta-regression analyses
ma_obj <- metareg(ma_obj)
out_metareg <- get_metareg(ma_obj)
out_metareg$`analysis id: 1`$barebones

# A summary of artifact distributions
get_ad(ma_obj, ma_method = "ic")

Writing formatted results tables to Word, PDF, Markdown, or other formats

psychmeta can automatically produce publication-ready tables of meta-analysis results that can be copied into your word processor or inserted into an RMarkdown document.

# Write meta-analysis tables to a Word file called "Meta-analysis table.docx"
metabulate(ma_obj, file = "Meta-analysis table.docx", 
           output_format = "word", output_dir = tempdir())

Generating plots

psychmeta can produce both funnel and forest plots for visually presenting meta-analytic results.

# Create funnel and forest plots
ma_obj <- plot_funnel(ma_obj = ma_obj)
ma_obj <- plot_forest(ma_obj = ma_obj, ma_facetname = "MA")

# Extract plots for viewing
out_plots <- get_plots(ma_obj)
out_plots$funnel$`analysis id: 1`$barebones
out_plots$forest$`analysis id: 1`$moderated$barebones

# The sensitivity_cumulative() and sensitivity_leave1out() functions also produce plots.
out_plots$leave1out$`analysis id: 1`$barebones$plots
out_plots$cumulative$`analysis id: 1`$barebones$plots
## Funnel plots have been added to 'ma_obj' - use get_plots() to retrieve them.

Generating reference lists

psychmeta can automatically generate a formatted reference list using a BibTeX or Zotero library.

First, specify the citekey argument when running ma_r(). Then use the generate_bib() function:

ma_obj <- ma_r(ma_method = "ic",
               rxyi = rxyi, n = n,
               construct_x = x_name,
               construct_y = y_name,
               rxx = rxxi,
               ryy = ryyi,
               moderators = moderator,
               clean_artifacts = FALSE,
               impute_artifacts = FALSE,
               citekey = citekey,
               data = data_r_meas_multi)

generate_bib(ma_obj,

             # The location of your .bib file (normally should be in the same directory
             # as the analysis script).
             bib = system.file("templates/sample_bibliography.bib", package="psychmeta"),

             # Your citation style. Must be the style ID for a style hosted at
             # http://zotero.org/styles/
             style = "apa",

             # What format to write output as (options are: "word", "html", "pdf", "text",
             # "Rmd", "biblatex", "citekeys")?
             output_format = "word",

             # What filename to output to (use "console" to print to the R console).
             file = "sources.docx"
             )

You can also generate a bibliography containing only sources contributing to a specific subset of the meta-analyses:

generate_bib(ma_obj,

             bib = system.file("templates/sample_bibliography.bib",
                               package="psychmeta"),

             analyses = list(construct_pair = list(c("X", "Y")))

             )

Supplemental tools

Computing meta-analyses with pre-existing artifact distributions

Even if no studies in your meta-analytic database provide artifact information, pre-specified artifact distributions from previous meta-analyses can still be used in psychmeta! There are several ways to use pre-specified distributions, ranging from using them on their on their own to combining them with observed artifact information. Below are four potential methods, but psychmeta allows many other ways to use artifact distributions.

  1. You can use the create_ad() function to create artifact distribution objects, which can then be used with the ma_r_ad() and ma_d_ad() functions to correct bare-bones meta-analyses.
# Create artifact-distribution objects for X, Y, and Z.
ad_x <- create_ad(mean_qxi = 0.8927818, var_qxi = 0.0008095520, k_qxi = 40,
                  mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha")
ad_y <- create_ad(mean_qxi = 0.8927818, var_qxi = 0.0008095520, k_qxi = 40,
                  mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha")
ad_z <- create_ad(mean_qxi = 0.8962108, var_qxi = 0.0007840593, k_qxi = 40,
                  mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha")

# Compute a meta-analysis of X and Y.
ma_bb_xy <- ma_r_bb(r = rxyi, n = n,
                    data = filter(data_r_meas_multi, x_name == "X", y_name == "Y"))

# Correct the meta-analysis for measurement error.
ma_r_ad(ma_obj = ma_bb_xy, ad_obj_x = ad_x, ad_obj_y = ad_y,
        correct_rr_x = FALSE, correct_rr_y = FALSE)
  1. Rather than applying a two-step process (first compute bare-bones, then apply corrections), you can use the ma_r() function to automate your workflow! Just supply the artifact distributions as a list with the supplemental_ads argument.
ma_r(ma_method = "ad", rxyi = rxyi, n = n,
     correct_rr_x = FALSE, correct_rr_y = FALSE,
     construct_x = x_name, construct_y = y_name, sample_id = sample_id,
     clean_artifacts = FALSE, impute_artifacts = FALSE,
     moderators = moderator, data = data_r_meas_multi,
     # The "supplemental_ads" argument can take a list of artifact-distribution objects.
     supplemental_ads =
          list(X = ad_x,
               Y = ad_y,
               Z = ad_z))
  1. You can bypass the manual creation of artifact distribution objects and have psychmeta do it for you. Instead of running create_ad(), you can create a list of artifact information to use as the supplemental_ads argument in ma_r() or ma_d(). This list contains artifact information for each construct you want to supply information for. This is a “list of lists” - each construct for which you provide information gets its own list labeled with the construct name. Each construct’s list should contain entries that are named like the arguments to the create_ad() function.
ma_r(ma_method = "ad", rxyi = rxyi, n = n,
     correct_rr_x = FALSE, correct_rr_y = FALSE,
     construct_x = x_name, construct_y = y_name, sample_id = sample_id,
     clean_artifacts = FALSE, impute_artifacts = FALSE,
     moderators = moderator, data = data_r_meas_multi,
     supplemental_ads =
          # The "supplemental_ads" argument can also take raw artifact values.
          # (These values are unrounded so our examples all produce identical results.)
          list(X = list(mean_qxi = 0.8927818, var_qxi = 0.0008095520, k_qxi = 40,
                        mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha"),
               Y = list(mean_qxi = 0.8941266, var_qxi = 0.0009367234, k_qxi = 40,
                        mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha"),
               Z = list(mean_qxi = 0.8962108, var_qxi = 0.0007840593, k_qxi = 40,
                        mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha")))
  1. You can also create a list of artifact distributions using the create_ad_list() function. This list can then be passed to ma_r() as before.
ad_list <- create_ad_list(n = n, rxx = rxxi, ryy = ryyi,
                          construct_x = x_name, construct_y = y_name,
                          sample_id = sample_id,
                          data = data_r_meas_multi)

ma_r(ma_method = "ad", rxyi = rxyi, n = n,
     correct_rr_x = FALSE, correct_rr_y = FALSE,
     construct_x = x_name, construct_y = y_name, sample_id = sample_id,
     clean_artifacts = FALSE, impute_artifacts = FALSE,
     moderators = moderator, data = data_r_meas_multi,
     # The "supplemental_ads" argument can take the output of create_ad_list().
     supplemental_ads = ad_list)
  1. It’s possible to mix-and-match any of the above methods, as well as use artifacts from your database.
# For purposes of illustration, delete all reliability values not associated with "Z":
dat_zarts <- data_r_meas_multi
dat_zarts[,"rxxi"] <- dat_zarts[dat_zarts$y_name != "Z","ryyi"] <- NA

# Compute a meta-analysis using three different types of artifact formats:
ma_r(ma_method = "ad", rxyi = rxyi, n = n,
     # Observed artifacts can be provided along with the "supplemental_ads" argument.
     ryy = ryyi,
     correct_rr_x = FALSE, correct_rr_y = FALSE,
     construct_x = x_name, construct_y = y_name, sample_id = sample_id,
     clean_artifacts = FALSE, impute_artifacts = FALSE,
     moderators = moderator, data = dat_zarts,
     supplemental_ads =
          list(X = ad_x,
               Y = list(mean_qxi = 0.8941266, var_qxi = 0.0009367234, k_qxi = 40,
                        mean_n_qxi = 11927 / 40, qxi_dist_type = "alpha")))

Simulations

In addition to its collection of tools for computing meta-analyses, psychmeta also boasts a robust suite of simulation functions. These functions are intended to support high-quality meta-analytic simulations by correctly introducing sampling error, measurement error, and range restriction into a set of parameter values. These functions can be valuable for both research and pedagogical applications.

Psychometric data sets

You can generate data impacted by psychometric artifacts for a single sample by using the simulate_psych() function. This function reports three datasets per sample: Observed scores, true scores, and error scores. It can also perform selection and indicate (1) which cases would be selected on the basis of observed scores (i.e., actual selection), (2) which cases would be selected on the basis of true scores (i.e., error-free selection), and (3) which cases would be selected on the basis of error scores (i.e., random selection).

sim_dat <- simulate_psych(n = 1000,
                          rho_mat = reshape_vec2mat(.5),
                          sigma_vec = c(1, 1),
                          sr_vec = c(1, .5),
                          rel_vec = c(.8, .8), var_names = c("Y", "X"))

Psychometric correlations

You can automate the process of analyzing a simulated sample when you are interested in effect sizes and artifact estimates. For example, you can generate correlations impacted by psychometric artifacts for a single sample of data using the simulate_r_sample() function. This function generates sample statistics when the n argument is a finite value, but it can also analyze parameters when n = Inf!

simulate_r_sample(n = 1000,
                  # Correlation parameter matrix
                  rho_mat = reshape_vec2mat(.5, order = 5),
                  # Reliability parameter vector
                  rel_vec = rep(.8, 5),
                  # Selection ratio vector
                  sr_vec = c(1, 1, 1, 1, .5),
                  # Number of items in each scale
                  k_items_vec = 1:5,
                  # Matrix of weights to use in creating composites
                  wt_mat = cbind(c(0, 0, 0, .3, 1),
                                 c(1, .3, 0, 0, 0)),
                  # Selection ratios for composites
                  sr_composites = c(.7, .5))

simulate_r_sample(n = Inf,
                  rho_mat = reshape_vec2mat(.5, order = 5),
                  rel_vec = rep(.8, 5),
                  sr_vec = c(1, 1, 1, 1, .5),
                  k_items_vec = 1:5,
                  wt_mat = cbind(c(0, 0, 0, .3, 1),
                                 c(1, .3, 0, 0, 0)),
                  sr_composites = c(.7, .5))

One of the most powerful simulation functions in psychmeta is simulate_r_database(), which can generate entire databases of correlations for use in meta-analysis simulations. This function is a wrapper for simulate_r_sample() and it can sample parameter values from whichever distributions you specify.

# Note the varying methods for defining parameters:
simulate_r_database(k = 10,

                    # Sample-size parameters
                    # Parameters can be defined as functions.
                    n_params = function(n) rgamma(n, shape = 100),

                    # Correlation parameters (one parameter distribution per correlation)
                    # Parameters can also be defined as vectors...
                    rho_params = list(c(.1, .3, .5),
                                      # ...as a mean + a standard deviation...
                                      c(mean = .3, sd = .05),
                                      # ...or as a matrix of values and weights.
                                      rbind(value = c(.1, .3, .5),
                                            weight = c(1, 2, 1))),

                    # Reliability parameters
                    rel_params = list(c(.7, .8, .9),
                                      c(mean = .8, sd = .05),
                                      rbind(value = c(.7, .8, .9),
                                            weight = c(1, 2, 1))),

                    # Selection-ratio parameters
                    sr_params = list(1, 1, c(.5, .7)),

                    # Measure-length parameters
                    k_items_params = list(5, 8, 10),

                     # Composite weight parameters
                    wt_params = list(list(1, 1, 0)),

                    # Selection-ratio parameters for composites
                    sr_composite_params = list(1),

                    # Variable names
                    var_names = c("X", "Y", "Z"))

Psychometric d values

psychmeta includes d-value simulation functions that parallel the correlation simulation functions described above. These functions automate the simulation of subgroup comparisons and can introduce sampling error, measurement error, and range-restriction artifacts into d-value parameters.

To simulate a single sample of results, use the simulate_d_sample() function. This function can generate data for any number of variables with any number of groups and it can form composite variables and perform selection.

## Simulate statistics by providing integers as "n_vec":
simulate_d_sample(n_vec = c(200, 100),

                  # List of rho matrices - one per group
                  rho_mat_list = list(reshape_vec2mat(.5),
                                      reshape_vec2mat(.4)),

                  # Matrix of group means (groups on rows, variables on columns)
                  mu_mat = rbind(c(1, .5),
                                 c(0, 0)),

                  # Matrix of group SDs (groups on rows, variables on columns)
                  sigma_mat = rbind(c(1, 1),
                                    c(1, 1)),

                  # Matrix of group reliabilities (groups on rows, variables on columns)
                  rel_mat = rbind(c(.8, .7),
                                  c(.7, .7)),

                  # Vector of selection ratios
                  sr_vec = c(1, .5),

                  # Number of items in each scale
                  k_items_vec = c(5, 10),

                  # Group names
                  group_names = c("A", "B"))

Like simulate_r_sample(), simulate_d_sample() can also analyze parameters. To get parameter estimates, simply define the sample sizes as proportions (with the n_vec argument) rather than as integers.

simulate_d_sample(n_vec = c(2/3, 1/3),
                  rho_mat_list = list(reshape_vec2mat(.5),
                                      reshape_vec2mat(.4)),
                  mu_mat = rbind(c(1, .5),
                                 c(0, 0)),
                  sigma_mat = rbind(c(1, 1),
                                    c(1, 1)),
                  rel_mat = rbind(c(.8, .7),
                                  c(.7, .7)),
                  sr_vec = c(1, .5),
                  k_items_vec = c(5, 10),
                  group_names = c("A", "B"))

The simulate_d_database() function is the d-value counterpart to simulate_r_database(). However, each parameter argument now requires a set of parameters for each group (e.g., a distribution of correlations, means, reliabilities, etc. must be specified for each group).

simulate_d_database(k = 5,
                    # "Group1" and "Group2" labels are not required for parameter arguments:
                    # They are shown only for enhanced interpretability.

                    # Sample-size parameter distributions
                    n_params = list(Group1 = c(mean = 200, sd = 20),
                                    Group2 = c(mean = 100, sd = 20)),

                    # Correlation parameter distributions
                    rho_params = list(Group1 = list(c(.3, .4, .5)),
                                      Group2 = list(c(.3, .4, .5))),

                    # Mean parameter distributions
                    mu_params = list(Group1 = list(c(mean = .5, sd = .5),
                                                   c(-.5, 0, .5)),
                                     Group2 = list(c(mean = 0, sd = .5),
                                                   c(-.2, 0, .2))),

                    # Reliability parameter distributions
                    rel_params = list(Group1 = list(.8, .8),
                                      Group2 = list(c(.7, .8, .8),
                                                    c(.7, .8, .8))),

                    # Group names
                    group_names = c("Group1", "Group2"),

                    # Variable names
                    var_names = c("Y", "Z"))

Multivariate range-restriction corrections

In addition to its array of univariate and bivariate range-restriction corrections, psychmeta includes functions to correct data for multivariate range restriction. The correct_matrix_mvrr() corrects covariance matrices for range restriction and the correct_means_mvrr() function corrects vectors of means for range restriction.

# Create example matrices with different variances and covariances:
Sigma_i <- reshape_vec2mat(cov = .2, var = .8, order = 4)
Sigma_xx_a <- reshape_vec2mat(cov = .5, order = 2)

# Correct "Sigma_i" for range restriction in variables 1 & 2:
correct_matrix_mvrr(Sigma_i = Sigma_i,
                    Sigma_xx_a = Sigma_xx_a,
                    x_col = 1:2)

# Correct the means of variables 1 & 2 for range restriction:
correct_means_mvrr(Sigma = Sigma_i,
                   means_i = c(2, 2, 1, 1),
                   means_x_a = c(1, 1),
                   x_col = 1:2)

Composites

Spearman-Brown reliability estimates

The Spearman-Brown formula can be used to estimate the reliability of a measure that has been lengthened (or shortened) k times. This formula is implemented in the estimate_rel_sb() function and its inverse (which estimates the number of times a measure needs to be lengthened to achieve a desired reliability) is implemented in the estimate_length_sb() function.

estimate_rel_sb(rel_initial = .5, k = 4)

estimate_length_sb(rel_initial = .5, rel_desired = .8)

Correlations between a variable and a composite or between two composites

The composite_r_matrix() and composite_r_scalar() functions can compute composite correlations using matrix and scalar inputs, respectively. Both functions are capable of computing correlations between a composite and a single variable or between two composites.

# Correlation between a variable and a composite
composite_r_matrix(r_mat = reshape_vec2mat(.4, order = 5), x_col = 2:5, y_col = 1)

composite_r_scalar(mean_rxy = .4, k_vars_x = 4, mean_intercor_x = .4)


# Correlation between two composites
composite_r_matrix(r_mat = reshape_vec2mat(.3, order = 5), x_col = 1:3, y_col = 4:5)

composite_r_scalar(mean_rxy = .3,
                   k_vars_x = 3, mean_intercor_x = .3, k_vars_y = 2, mean_intercor_y = .3)

d values of composites

The composite_d_matrix() and composite_d_scalar() functions can compute composite d values using matrix and scalar inputs, respectively.

composite_d_matrix(d_vec = c(1, 1),
                   r_mat = reshape_vec2mat(.7),
                   wt_vec = c(1, 1),
                   p = .5)

composite_d_scalar(mean_d = 1, mean_intercor = .7, k_vars = 2, p = .5)

Reliability of composites

The composite_rel_matrix() and composite_rel_scalar() functions can compute composite reliabilities (i.e., Mosier reliabilities) using matrix and scalar inputs, respectively.

composite_rel_matrix(rel_vec = c(.8, .8), r_mat = reshape_vec2mat(.4), sd_vec = c(1, 1))

composite_rel_scalar(mean_rel = .8, mean_intercor = .4, k_vars = 2)

Matrix regression

The lm_mat() function mimics the structure and appearance of the lm() function from the stats package, but uses correlation/covariance matrices to compute regression models instead of raw data sets.

# For purposes of demonstration, generate a dataset from the following matrix:
S <- reshape_vec2mat(cov = c(.3 * 2 * 3,
                             .4 * 2 * 4,
                             .5 * 3 * 4),
                     var = c(2, 3, 4)^2,
                     var_names = c("X", "Y", "Z"))
mean_vec <- setNames(c(1, 2, 3), colnames(S))
dat <- data.frame(MASS::mvrnorm(n = 100, mu = mean_vec, Sigma = S, empirical = TRUE))

The same formulas can be used with lm_mat() as with lm(), you just have to provide cov_mat and n arguments (and an optional mean_vec argument) instead of data. However, lm_mat() can only compute moderated regressions if the product term is already in cov_mat.

# Compute regression models using one predictor:
lm_out1 <- lm(formula = Y ~ X, data = dat)
lm_mat_out1 <- lm_mat(formula = Y ~ X, cov_mat = S, mean_vec = mean_vec, n = nrow(dat))

# Compute regression models using two predictors:
lm_out2 <- lm(formula = Y ~ X + Z, data = dat)
lm_mat_out2 <- lm_mat(formula = Y ~ X + Z, cov_mat = S, mean_vec = mean_vec, n = nrow(dat))

The summary() method works for both lm() and lm_mat() results and the printouts look identical.

summary(lm_out1)
summary(lm_mat_out1)

lm_mat() results can also be used with functions such as anova(), predict(), and confint() to yield output comparable to what one would get with lm() results.

anova(lm_out1, lm_out2)
anova(lm_mat_out1, lm_mat_out2)

Matrix regression applied to meta-analytic correlations

The get_matrix() function can extract correlation matrices from meta-analysis objects for use in subsequent multivariate analyses.

mat_array <- get_matrix(ma_obj = ma_obj)
R <- mat_array$individual_correction$`moderator_comb: 1`$true_score$mean_rho
n <- mat_array$individual_correction$`moderator_comb: 1`$true_score$N[1,3]

The lm_mat() function can be used to compute regressions from meta-analytic matrices, using the mean N or harmonic mean N as a sample size.

lm_x <- lm_mat(formula = Y ~ X, cov_mat = R, n = n)
lm_xz <- lm_mat(formula = Y ~ X + Z, cov_mat = R, n = n)

Meta-analytic incremental validity analyses are simple with the anova() method applied to lm_mat() results.

summary(lm_x)
summary(lm_xz)
anova(lm_x, lm_xz)

Best practices for publishing with psychmeta:

  1. Include your R script as an appendix or online supplement.
  2. Include your data table as a CSV in an online supplement.
  3. Cite psychmeta.
    • To cite psychmeta, please reference our in-press software announcement in Applied Psychological Measurement (Dahlke & Wiernik, 2018b, see the bibliography below for reference information).

Bibliography

Dahlke, J. A., & Wiernik, B. M. (2018a). One of these artifacts is not like the others: Accounting for indirect range restriction in organizational and psychological research [Manuscript submitted for publication].

Dahlke, J. A., & Wiernik, B. M. (2018b). psychmeta: An R package for psychometric meta-analysis. Applied Psychological Measurement. https://doi.org/10.1177/0146621618795933

Gonzalez-Mulé, E., Mount, M. K., & Oh, I.-S. (2014). A meta-analysis of the relationship between general mental ability and nontask performance. Journal of Applied Psychology, 99(6), 1222–1243. https://doi.org/10.1037/a0037547

Schmidt, F. L. (2017). Statistical and measurement pitfalls in the use of meta-regression in meta-analysis. Career Development International, 22(5), 469–476. https://doi.org/10/gcnjf4

Wiernik, B. M., Kostal, J. W., Wilmot, M. P., Dilchert, S., & Ones, D. S. (2017). Empirical benchmarks for interpreting effect size variability in meta-analysis. Industrial and Organizational Psychology, 10(3), 472–479. https://doi.org/10/ccnv