This document is a companion tutorial to a pair of presentations given by Katie Von Holzen at the colloquium of the Department of English Language and Literature at Chosun University on March 15th, 2024.

Any questions should be directed to Katie Von Holzen.

The first presentation, “Meta-analysis and mispronunciations: An introduction to meta-analyses”, can be downloaded in the project folder. During this presentation, the audience was introduced to the meta-analysis of Von Holzen & Bergmann (2021), which examined infants’ sensitivity to mispronunciations, and to what we can gain from meta-analyses more generally. Meta-analyses can be used for experimental planning: aggregating across studies yields an estimate of the effect size of a phenomenon, which in turn can be used to determine the typical statistical power of studies investigating that effect as well as the sample size required for 80% power to detect it. Meta-analyses can also offer theoretical insights, namely through investigations of variables that may moderate the effect size. Finally, although only briefly mentioned, meta-analyses can be used to investigate the potential publication bias of an effect as well as to uncover unforeseen insights, that is, things discovered along the way of conducting a meta-analysis.

The manuscript of Von Holzen & Bergmann (2021) as well as the accompanying code which can recreate the manuscript including the meta-analysis can be found in our Open Science Framework Repository:

The manuscript has been published in Developmental Psychology:

Von Holzen, K., & Bergmann, C. (2021). The development of infants’ responses to mispronunciations: A meta-analysis. *Developmental Psychology, 57*(1), 1–18. https://doi.org/10.1037/dev0001141

The second presentation, “How-to Meta-Analysis”, can be downloaded in the project folder. During this presentation, the audience was introduced to the different parts that go into creating a meta-analytic database (Parts A and B). The majority of the current document is devoted to Part C: Calculate the meta-analytic effect.

Assembling the studies for a meta-analysis involves narrowing down the topic of focus, conducting a literature search for relevant studies, and converging on final constraints for which studies and variables will be analyzed. This step is too long a process to be covered in this tutorial, but here are some helpful links:

The next step is to enter the relevant variables from each study. In the presentation, we looked at these variables in the study of Von Holzen and colleagues (Von Holzen et al., 2023).

Finally, we calculate the effect size Hedges’ *g* (Hedges,
1981) for each record in our database and compute the meta-analytic
effect. We then use this to compute statistical power and for sample
size planning. Part C is covered in this document.
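As a brief aside, Hedges’ *g* is Cohen’s *d* multiplied by a small-sample correction factor. A minimal sketch, using the common approximation from Hedges (1981) — the exact formula used in `calculateES.R` may differ slightly:

```
# Hedges' g = Cohen's d * J, where J corrects for small-sample bias;
# J is approximated as 1 - 3 / (4 * df - 1)  (Hedges, 1981)
hedges_g <- function(d, df) {
  J <- 1 - 3 / (4 * df - 1)
  d * J
}

hedges_g(d = 0.5, df = 19)  # 0.5 * (1 - 3/75) = 0.48
```

Because J is always less than 1, *g* is slightly smaller than *d*, with the difference shrinking as sample size grows.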

If you haven’t already, download and unzip the project folder and download RStudio. Find the location of the project folder on your computer and open the folder. Inside you will find this RMarkdown file “Intro_to_Meta_Analysis.Rmd” and several folders. These folders are:

`data`

: where our database is located

`figures`

: for the pretty pictures we’ll produce

`literature`

: contains a pdf of the Von Holzen et al. (2023) paper with the relevant information highlighted

`scripts`

: different `.R` files containing the code needed for our analysis

Double click on the RMarkdown file `Intro_to_Meta_Analysis.Rmd`. You can now run each of the following steps on your own computer.

We’ll need several packages for this analysis, so the next step is to install and load them.

```
### Load libraries
library(metafor)
library(meta)
library(pwr)
library(knitr)
library(schoRsch)
library(tidyverse)
```

The .csv file with the updated values from Von Holzen and colleagues (Von Holzen et al., 2023) is `data/MA_ET.csv`. We want to load this .csv file into R using the `read.csv()` function and assign it to the variable `db_ET`.

`db_ET <- read.csv("data/MA_ET.csv")`

It’s a good idea to take a look at the database and make sure it loaded correctly.

`View(db_ET)`

In the `scripts` folder are a series of .R script files that this RMarkdown file uses to prepare the data. We’ll read these .R script files into our RMarkdown file using the function `source()`.

The file `starter_functions.R` is a collection of functions that I have written to make small actions in R easier. Instead of writing out complicated code over and over again, I made my own small functions to take care of it. If you want to learn more about creating your own functions, read this.
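As a purely hypothetical illustration (not the actual contents of `starter_functions.R`), such a helper might look like this:

```
# Hypothetical helper: standard error of the mean, ignoring missing values
se <- function(x) {
  sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))
}

se(c(2, 4, 6, NA))  # sd = 2, n = 3, so 2 / sqrt(3) ≈ 1.155
```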

`source("scripts/starter_functions.R")`

The file `calculateES.R` is a script adapted from MetaLab to compute effect sizes, specifically Cohen’s *d* and Hedges’ *g*. A description is given here.

`source("scripts/calculateES.R")`

The resulting object is still called `db_ET`. It now has several new columns that contain values we will need for our meta-analysis.

| corr_imputed | d_calc | d_var_calc | es_method | imputed_corr | g_calc | g_var_calc | weights_g | weights_d |
|---|---|---|---|---|---|---|---|---|
| 0.2439964 | 0.5740402 | 0.0941200 | t_one | yes | 0.5535387 | 0.0875172 | 130.5608 | 112.8849 |
| 0.0611375 | 0.6624503 | 0.0964965 | t_one | yes | 0.6387914 | 0.0897270 | 124.2092 | 107.3932 |
| 0.0673125 | 0.4499741 | 0.0913582 | t_one | yes | 0.4339036 | 0.0849491 | 138.5741 | 119.8133 |
| -0.5758838 | 0.5786275 | 0.0942350 | t_one | yes | 0.5579622 | 0.0876241 | 130.2425 | 112.6096 |
| 0.1127877 | 0.5933376 | 0.0989102 | t_one | yes | 0.5710874 | 0.0916310 | 119.1008 | 102.2157 |

The file `Preprocess.R` further takes the information produced by `calculateES.R` and prepares it for analysis, for example by removing outliers. This will give us the object `dat`, which we will use in the rest of the code.

`source("scripts/Preprocess.R")`

If you’re working through this within RMarkdown, type `noES` into the console and it will give you the list of papers that don’t have enough information to compute an effect size.

```
## # A tibble: 3 × 2
## short_cite n_records
## <chr> <int>
## 1 Ballem & Plunkett (2005) 4
## 2 Renner (2017) 6
## 3 Skoruppa et al. (2013) 1
```

If you’re working through this within RMarkdown, type `n_outlier` into the console and it will give you the number of records (rows) that were removed because Hedges’ *g* was 3 standard deviations above or below the mean of the entire sample:

```
## # A tibble: 4 × 2
## same_participant_calc n_records
## <chr> <int>
## 1 24_4_ 2
## 2 42_1_ 2
## 3 42_2_ 1
## 4 45_1_ 1
```

Now we finally come to the meat of the subject, the meta-analysis! As in Von Holzen & Bergmann (2021), we first examine object recognition for both correctly pronounced (`is_correct == 1`) and mispronounced words (`is_mp == 1`). Here, we want to see if our effect size Hedges’ *g* is significantly different from zero, which would indicate object recognition.

Basically, we are asking R to evaluate whether our effect size Hedges’ *g*, considering its variance, differs significantly from 0 across all of the papers/records in our database. That’s what we’re saying when we refer to `g_calc` and `g_var_calc`. Then, the bit of code after `random =` specifies our random effects structure. Since some of these records might come from the same participants in different conditions, they are likely to be correlated with one another. We want to take that into account! Most likely, you won’t have to change anything here, but it’s useful to know what it means.

`rma_model_correct = rma.mv(g_calc, g_var_calc, data = subset(dat, is_correct == 1), random = list(~ same_participant_calc | short_cite, ~ unique_row | short_cite.same_participant_calc))`

We can look at the results and inspect the model by calling `summary()`.

`summary(rma_model_correct)`

```
##
## Multivariate Meta-Analysis Model (k = 111; method: REML)
##
## logLik Deviance AIC BIC AICc
## -93.5074 187.0147 197.0147 210.5171 197.5917
##
## Variance Components:
##
## outer factor: short_cite (nlvls = 33)
## inner factor: same_participant_calc (nlvls = 52)
##
## estim sqrt fixed
## tau^2 0.4225 0.6500 no
## rho 0.9838 no
##
## outer factor: short_cite.same_participant_calc (nlvls = 52)
## inner factor: unique_row (nlvls = 111)
##
## estim sqrt fixed
## gamma^2 0.0500 0.2235 no
## phi -0.5687 no
##
## Test for Heterogeneity:
## Q(df = 110) = 553.3077, p-val < .0001
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## 0.8953 0.1192 7.5082 <.0001 0.6616 1.1290 ***
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

It may seem overwhelming at first, but we can focus on the specific things that we really need. First, the value under `estimate`, 0.895, is actually our estimated effect size Hedges’ *g*, according to our model! We can also get its standard error, or `SE`, which is 0.119, as well as the 95% confidence interval [0.662, 1.129]. Don’t know what a confidence interval is?

*A confidence interval gives an estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data.* (Definition taken from Valerie J. Easton and John H. McColl’s Statistics Glossary v1.1)
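To connect this definition to the output above: the 95% confidence interval is (approximately) the estimate plus or minus 1.96 standard errors, which we can check by hand:

```
# Reconstruct the 95% CI from the model output above
estimate <- 0.8953
se_est <- 0.1192
round(estimate + c(-1, 1) * 1.96 * se_est, 3)  # 0.662 1.129, matching ci.lb and ci.ub
```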

And finally, is our estimated effect size Hedges’ *g* significantly different from 0? We can see this when we look at `pval`, which gives our *p*-value, here reported as < .0001. If it is below .05, then our estimated effect size Hedges’ *g* is significantly different from 0.

We can do the same for mispronounced word object recognition, running the model on just mispronounced words and then calling `summary()` to evaluate the model and determine whether objects were still recognized in this condition. Below, I’ll use the wording from Von Holzen & Bergmann (2021) to give these values.

```
rma_model_misp = rma.mv(g_calc, g_var_calc, data = subset(dat, is_mp == 1), random = list(~ same_participant_calc | short_cite, ~ unique_row | short_cite.same_participant_calc))
summary(rma_model_misp)
```

```
##
## Multivariate Meta-Analysis Model (k = 159; method: REML)
##
## logLik Deviance AIC BIC AICc
## -115.3333 230.6667 240.6667 255.9797 241.0614
##
## Variance Components:
##
## outer factor: short_cite (nlvls = 33)
## inner factor: same_participant_calc (nlvls = 51)
##
## estim sqrt fixed
## tau^2 0.0778 0.2790 no
## rho 0.6470 no
##
## outer factor: short_cite.same_participant_calc (nlvls = 51)
## inner factor: unique_row (nlvls = 159)
##
## estim sqrt fixed
## gamma^2 0.0835 0.2889 no
## phi 0.1508 no
##
## Test for Heterogeneity:
## Q(df = 158) = 572.3957, p-val < .0001
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## 0.2486 0.0595 4.1781 <.0001 0.1320 0.3653 ***
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

We then calculated the meta-analytic effect for object identification
in response to mispronounced words. In this case, the variance-weighted
meta-analytic effect size was 0.249 (SE = 0.06), a small effect, which
was also significantly different from zero (CI [0.132, 0.365],
*p* < .0001).

In each of the above models, a `Test for Heterogeneity` is included in the `summary()` output, and it was significant for both models. This indicates that the sample contains unexplained variance leading to significant differences between studies beyond what is to be expected based on random sampling error. One way to account for this is with moderator analyses, which we investigate extensively in our manuscript.

From our manuscript: “The above two analyses considered the data from mispronounced and correctly pronounced words separately. To evaluate mispronunciation sensitivity, we compared the effect size Hedges’ g for correct pronunciations with mispronunciations directly.” This means we include “condition” (correct vs. mispronounced) as a moderator. The resulting model looks like this:

```
dat$condition <- ifelse(dat$is_correct == 1, "correct", "mispronunciation")
rma_MPeffect <- rma.mv(g_calc, g_var_calc, mods = ~condition, data = dat, random = list(~ same_participant_calc | short_cite, ~ unique_row | short_cite.same_participant_calc))
summary(rma_MPeffect)
```

```
##
## Multivariate Meta-Analysis Model (k = 270; method: REML)
##
## logLik Deviance AIC BIC AICc
## -227.4766 454.9533 466.9533 488.4992 467.2751
##
## Variance Components:
##
## outer factor: short_cite (nlvls = 33)
## inner factor: same_participant_calc (nlvls = 52)
##
## estim sqrt fixed
## tau^2 0.1194 0.3455 no
## rho 0.7666 no
##
## outer factor: short_cite.same_participant_calc (nlvls = 52)
## inner factor: unique_row (nlvls = 270)
##
## estim sqrt fixed
## gamma^2 0.1358 0.3685 no
## phi 0.0714 no
##
## Test for Residual Heterogeneity:
## QE(df = 268) = 1125.7034, p-val < .0001
##
## Test of Moderators (coefficient 2):
## QM(df = 1) = 98.7897, p-val < .0001
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt 0.8410 0.0763 11.0276 <.0001 0.6915 0.9905
## conditionmispronunciation -0.5789 0.0582 -9.9393 <.0001 -0.6930 -0.4647
##
## intrcpt ***
## conditionmispronunciation ***
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

`sum_eff <- coef(summary(rma_MPeffect))[2,]`

Again, the wording from Von Holzen & Bergmann (2021):

When condition was included (correct, mispronounced), the moderator
test was significant (QM(1) = 98.79, *p* < .001). The estimate
for mispronunciation sensitivity was -0.579 (SE = 0.058), and infants’
looking behavior across conditions was significantly different (CI
[-0.693, -0.465], *p* < .001). This confirms that although
infants fixate the correct object for both correct pronunciations and
mispronunciations, the observed fixations to target (as measured by the
effect sizes) were significantly greater for correct pronunciations,
suggesting sensitivity to mispronunciations.
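As a quick sanity check on the moderator model, we can recover the implied effect for mispronounced words from the rounded values in the output above: the intercept (the correct-pronunciation effect) plus the condition coefficient.

```
# Implied meta-analytic effect for mispronounced words
0.8410 + (-0.5789)  # 0.2621, close to the 0.249 from the mispronunciation-only model
```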

Power analysis helps us determine the statistical power of the studies included in our meta-analysis, as well as plan future experiments by determining how many participants need to be tested in order to find this effect. We’ll focus here on the “Mispronunciation sensitivity meta-analytic effect” described in the previous section.

Ultimately, the actual analyses will depend on whether you are analyzing a within- or between-participants design. The current analysis will focus on power analysis for within-participant designs. To learn more, check out the pwr vignette.

First we extract the effect size estimate from our model `rma_MPeffect`. We need this to be positive, so we use the `abs()` function to get the absolute value.

`effect_size <- abs(r2(coef(summary(rma_MPeffect))[2,1]))`

Next, we get the median sample size by looking at the *n*’s of the different records. This is a within-subjects design, so we only look at the values in column `n_1`. If it were between-subjects, we would also be interested in the column `n_2`.

```
si <- dat %>%
select(short_cite.same_participant_calc, expt_num, n_1, n_2) %>%
distinct()
median_sample <- median(si$n_1, na.rm = T)
```

Based on the above calculations, the mispronunciation sensitivity effect size is 0.579 and the median sample size is 26 infants. We can use this information to determine how well powered mispronunciation sensitivity studies typically are, given that mispronunciation sensitivity is usually established using a t-test. As a result, we choose the function `pwr.t.test`.

```
ttest_power <- pwr.t.test(n = median_sample, d = effect_size, sig.level = 0.05, power = NULL, type = "paired", alternative = "two.sided")
ttest_power
```

```
##
## Paired t test power calculation
##
## n = 26
## d = 0.579
## sig.level = 0.05
## power = 0.8098428
## alternative = two.sided
##
## NOTE: n is number of *pairs*
```

This analysis shows us that these studies are well powered, with approximately 81% power to detect the effect of mispronunciation sensitivity. 26 infants is still a reasonable sample, and this makes us more sure of the results!

Since we also know our effect size, we can use this to determine what sample size is needed to have 80% power to detect the effect of mispronunciation sensitivity (if we hadn’t gotten a satisfactory value in the previous step).

```
ttest_80 <- pwr.t.test(n = NULL, d = effect_size, sig.level = 0.05, power = .8, type = "paired", alternative = "two.sided")
ttest_80
```

```
##
## Paired t test power calculation
##
## n = 25.40023
## d = 0.579
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
##
## NOTE: n is number of *pairs*
```

We would need 26 infants (rounding 25.4 up) to have 80% statistical power to detect the effect of mispronunciation sensitivity. Not bad!
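For comparison, if a future study used a between-participants design (two independent groups of infants), the analogous call would use `type = "two.sample"`. This is a hypothetical extension, since the studies analyzed here are within-participant:

```
library(pwr)

# Hypothetical between-participants version of the same power question;
# the returned n is the number of infants needed *per group*
ttest_80_between <- pwr.t.test(n = NULL, d = 0.579, sig.level = 0.05,
                               power = .8, type = "two.sample",
                               alternative = "two.sided")
ttest_80_between
```

Between-participants designs require considerably more infants per group than the paired design does for the same effect size.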

A really nice way to illustrate your results is with a figure, which we can easily create here in R! For example, the `forest()` function will create a forest plot of your data. This gives the effect size estimate and confidence interval for each record, as well as the overall meta-analytic estimate. Values at 0 in this plot indicate no difference between conditions. The code below will also save the forest plot in the `figures` folder using the function `png()`.

`forest(rma_MPeffect)`

```
# the code below will save this plot in the figures folder.
png(file = 'figures/forestplot.png')
forest(rma_MPeffect)
dev.off()
```

```
## png
## 2
```
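Although only briefly mentioned in the first presentation, the publication bias question can also be explored graphically. metafor’s `funnel()` function accepts the fitted model objects used above (a sketch; interpreting funnel plots for multilevel models requires some care):

```
# Funnel plot: effect sizes plotted against their standard errors;
# asymmetry around the summary estimate can hint at publication bias
funnel(rma_model_misp)

# save it alongside the forest plot
png(file = 'figures/funnelplot.png')
funnel(rma_model_misp)
dev.off()
```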