Table 1
Participant characteristics for Experiment 1: average score on the L2 English LexTALE test, average score on the AX-CPT test, self-reported age of acquisition (‘AoA’, in years of age) and proficiency (‘SRP’; based on a 1–7 Likert scale) with regard to listening, speaking, reading, and writing in both L1 Dutch and L2 English, and self-reported average hours of use per day of L2 English. Compare with Table 6.
| MEASURE | AVERAGE | SD |
|---|---|---|
| LexTALE | 79.9 | 12.37 |
| AX-CPT | –0.03 | 0.05 |
| L1 Listening AoA | 0.0 | 0.00 |
| L1 Speaking AoA | 0.5 | 0.90 |
| L1 Reading AoA | 1.7 | 2.39 |
| L1 Writing AoA | 1.9 | 2.65 |
| L1 Listening SRP | 7.0 | 0.00 |
| L1 Speaking SRP | 7.0 | 0.00 |
| L1 Reading SRP | 7.0 | 0.00 |
| L1 Writing SRP | 7.0 | 0.14 |
| L2 Listening AoA | 7.9 | 3.58 |
| L2 Speaking AoA | 9.4 | 3.25 |
| L2 Reading AoA | 10.0 | 2.23 |
| L2 Writing AoA | 10.2 | 2.59 |
| L2 Listening SRP | 6.1 | 0.85 |
| L2 Speaking SRP | 5.3 | 1.17 |
| L2 Reading SRP | 5.9 | 1.16 |
| L2 Writing SRP | 5.3 | 1.29 |
| L2 Hours of use per day | 5.6 | 3.73 |

Figure 1
Example of one trial in the Price Memorization Block. This block familiarized participants with the prices of each of the items to facilitate sentence production in subsequent blocks.

Figure 2
A: Two example trials from the Language A Practice Block. For half of the participants, Language A corresponded to English; these participants named each presented item in English before pressing the space bar on a keyboard to move to the next trial. B: Two example trials from the corresponding Language B Practice Block for the same participant. Because Language B corresponded to Dutch for this participant, they named each of the presented items in Dutch. Note that this block also familiarized participants with the link between color cues and the to-be-used languages.

Figure 3
Trial structure in Language A Baseline Block and Language B Baseline Block. Each of these blocks required the use of only one language. A: If yellow and blue color cues referred to Dutch for a participant, they produced sentences in Dutch in response to the presented cue and item. B: For that same participant, purple and pink color cues then required sentence production in English. The relation between to-be-used language and corresponding color cues was counterbalanced across participants.

Figure 4
A: Language switch sequence in case the yellow and purple color cues corresponded to different languages. B: Language repeat sequence in case the purple and pink color cues corresponded to the same language.
Table 2
Linear mixed effects models that we pre-registered to use in the analyses of the response time data collected in Experiment 1 (Switch Cost Model and Mixing Cost Model).
| MODEL | FORMULA |
|---|---|
| RT Switch Cost Model Experiment 1 | ReactionTime ~ Language*TrialType + (1 + Language*TrialType \| Subject) + (1 + Language*TrialType \| Item) |
| RT Mixing Cost Model Experiment 1 | ReactionTime ~ Language*TrialType + (1 + Language*TrialType \| Subject) + (1 + Language*TrialType \| Item) |
[i] Note: RT Switch Cost Model: Language (Dutch = –0.5; English = 0.5) and TrialType (Language Repeat = –0.5 and Language Switch = 0.5) were sum-coded. RT Mixing Cost Model: Language (Dutch = –0.5; English = 0.5) and TrialType (Language Repeat = –0.5 and Single-Language = 0.5) were sum-coded.
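To make the ±0.5 sum coding in the note concrete, the following Python sketch builds the four fixed-effect predictors for one trial and shows how the coefficients map onto cell means. The function names and the coefficient values in `EXAMPLE_BETAS` are hypothetical, chosen only for illustration; they are not estimates from this study.

```python
# Sum-coded contrasts (+/-0.5) as described in the table note.
LANGUAGE_CODES = {"Dutch": -0.5, "English": 0.5}
TRIALTYPE_CODES = {"Language Repeat": -0.5, "Language Switch": 0.5}

# Hypothetical coefficients (ms), for illustration only:
# intercept, Language, TrialType, Language x TrialType.
EXAMPLE_BETAS = (1000.0, -80.0, 50.0, 20.0)


def design_row(language, trial_type):
    """Return the fixed-effect predictors for one trial."""
    lang = LANGUAGE_CODES[language]
    tt = TRIALTYPE_CODES[trial_type]
    return (1.0, lang, tt, lang * tt)


def predict(betas, language, trial_type):
    """Predicted cell mean: dot product of coefficients and predictors."""
    return sum(b * x for b, x in zip(betas, design_row(language, trial_type)))


if __name__ == "__main__":
    for language in LANGUAGE_CODES:
        for trial_type in TRIALTYPE_CODES:
            print(language, trial_type, predict(EXAMPLE_BETAS, language, trial_type))
```

Under this coding, the intercept equals the unweighted mean of the four cell means, the TrialType coefficient equals the switch cost averaged over the two languages, and the interaction coefficient equals the English switch cost minus the Dutch switch cost.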
Table 3
Average reaction times (in ms) and error rate (in proportions) per condition in Experiment 1. Values within parentheses are standard deviations.
| CONDITION | RT | ERROR RATE |
|---|---|---|
| L1 (Dutch) Baseline | 998 (351) | .036 (.186) |
| L2 (English) Baseline | 913 (314) | .023 (.150) |
| L1 (Dutch) Language Switch | 1354 (431) | .105 (.307) |
| L1 (Dutch) Language Repeat | 1310 (433) | .072 (.259) |
| L2 (English) Language Switch | 1284 (419) | .065 (.246) |
| L2 (English) Language Repeat | 1214 (401) | .033 (.180) |
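The descriptive switch costs, mixing costs, and reversed language dominance can be read off the condition means above. The Python sketch below does this arithmetic, assuming switch cost = switch minus repeat, mixing cost = repeat minus single-language baseline (the contrast used in the Mixing Cost Model), and reversed language dominance = unweighted mixed-block L1 mean minus L2 mean. The function names are hypothetical, and values derived from condition means will not exactly match the mixed-model estimates.

```python
# Mean RTs (ms) per condition, copied from Table 3.
RT_MEANS = {
    ("L1", "baseline"): 998,
    ("L2", "baseline"): 913,
    ("L1", "switch"): 1354,
    ("L1", "repeat"): 1310,
    ("L2", "switch"): 1284,
    ("L2", "repeat"): 1214,
}


def switch_cost(means, language):
    """Switch trials minus repeat trials within the mixed block."""
    return means[(language, "switch")] - means[(language, "repeat")]


def mixing_cost(means, language):
    """Repeat trials in the mixed block minus single-language baseline trials."""
    return means[(language, "repeat")] - means[(language, "baseline")]


def reversed_language_dominance(means):
    """Unweighted mixed-block L1 mean minus mixed-block L2 mean (an assumption)."""
    l1 = (means[("L1", "switch")] + means[("L1", "repeat")]) / 2
    l2 = (means[("L2", "switch")] + means[("L2", "repeat")]) / 2
    return l1 - l2


if __name__ == "__main__":
    for language in ("L1", "L2"):
        print(language, "switch cost:", switch_cost(RT_MEANS, language))
        print(language, "mixing cost:", mixing_cost(RT_MEANS, language))
    print("RLD:", reversed_language_dominance(RT_MEANS))
```

A positive reversed language dominance value here means that L1 Dutch responses were slower than L2 English responses in the mixed block.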
Table 4
Outcome of the linear mixed effects models performed on the RT data from Experiment 1. Model structure reflects the model fit by maximum likelihood as indicated by the buildmer package. Significant p values are indicated in boldface.
| 1. Mixed Block comparison | | | | |
|---|---|---|---|---|
| RT Switch Cost Model: ReactionTime ~ Language*TrialType + (1 + Language*TrialType \| Subject) + (1 + Language*TrialType \| Item) | | | | |
| | Estimate | SE | t value | p value |
| Language | –91.31 | 15.74 | –5.80 | 1.62e-06 |
| TrialType | 57.37 | 8.06 | 7.12 | 8.62e-07 |
| Language × TrialType | 23.52 | 17.50 | 1.34 | 0.19 |
| 2. Mixing Cost analysis | | | | |
| RT Mixing Cost Model: ReactionTime ~ Language*TrialType + (1 + Language*TrialType \| Subject) + (1 + Language*TrialType \| Item) | | | | |
| | Estimate | SE | t value | p value |
| Language | –90.31 | 16.98 | –5.32 | 3.76e-06 |
| TrialType | –303.95 | 24.97 | –12.17 | 4.00e-16 |
| Language × TrialType | 8.51 | 19.96 | 0.43 | 0.67 |
[i] Note: RT Switch Cost Model: Language (Dutch = –0.5; English = 0.5) and TrialType (Language Repeat = –0.5 and Language Switch = 0.5) were sum-coded. RT Mixing Cost Model: Language (Dutch = –0.5; English = 0.5) and TrialType (Language Repeat = –0.5 and Single-Language = 0.5) were sum-coded.

Figure 5
Violin plots depicting the RT data on correct trials in each of the four conditions in the mixed block in Experiment 1. Filled diamonds indicate the average RT for each condition.

Figure 6
Violin plots depicting the RT data on correct trials in each of the four conditions in the mixing cost analysis in Experiment 1. Filled diamonds indicate the average RT for each condition.
Table 5
Outcome of the logistic mixed effects models performed on the error rate data from Experiment 1. Model structure reflects the model fit by maximum likelihood as indicated by the buildmer package. Significant p values are indicated in boldface.
| 1. Mixed Block comparison | | | | |
|---|---|---|---|---|
| Accuracy Switch Cost Model: ErrorRate ~ 1 + Language*TrialType + (1 + TrialType \| Subject) + (1 + Language \| Item) | | | | |
| | Estimate | SE | z value | p value |
| Language | –.70 | .12 | –5.92 | 3.25e-09 |
| TrialType | .59 | .10 | 5.69 | 1.26e-08 |
| Language × TrialType | .26 | .16 | 1.61 | .11 |
| 2. Mixing Cost analysis | | | | |
| Accuracy Mixing Cost Model: ErrorRate ~ 1 + Language*TrialType + (1 + TrialType \| Subject) + (1 + Language \| Item) | | | | |
| | Estimate | SE | z value | p value |
| Language | –.21 | .15 | –1.44 | .15 |
| TrialType | –.54 | .15 | –3.48 | .0005 |
| Language × TrialType | 1.31 | .20 | 6.41 | 1.44e-10 |
[i] Note: Accuracy Switch Cost Model: Language (Dutch = –0.5; English = 0.5) and TrialType (Language Repeat = –0.5 and Language Switch = 0.5) were sum-coded. Accuracy Mixing Cost Model: Language (Dutch = –0.5; English = 0.5) and TrialType (Language Repeat = –0.5 and Single-Language = 0.5) were sum-coded.
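Estimates from a logistic mixed effects model are on the log-odds scale, so exponentiating them gives odds ratios that are easier to interpret. The Python sketch below does this for the Accuracy Switch Cost Model estimates in Table 5. The helper name `odds_ratio` is hypothetical, and the interpretation assumes the ±0.5 coding in the note, under which a one-unit change in a predictor spans the two levels of that factor.

```python
import math

# Fixed-effect estimates (log-odds) copied from the Accuracy Switch Cost Model in Table 5.
ESTIMATES = {
    "Language": -0.70,
    "TrialType": 0.59,
    "Language x TrialType": 0.26,
}


def odds_ratio(log_odds):
    """Convert a log-odds coefficient into an odds ratio."""
    return math.exp(log_odds)


if __name__ == "__main__":
    for term, estimate in ESTIMATES.items():
        print(f"{term}: odds ratio = {odds_ratio(estimate):.2f}")
```

For TrialType, the resulting odds ratio of roughly 1.8 suggests that the odds of an error on language switch trials were about 1.8 times the odds on language repeat trials, conditional on the random effects and averaged over the two languages.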

Figure 7
Illustration of the CAVE setup with its three connected screens. Participants were immersed in a virtual marketplace and acted as the owner of a fruit and vegetable stand. Virtual agents (one at a time, as depicted here) visited the stand. A subset of the infrared cameras used for motion tracking is visible as red circles in the picture.

Figure 8
In the Price Memorization Block in Experiment 2, participants practiced the link between the fruit and vegetable items and their prices. They indicated via button press what price they thought an object had, selecting either the left or the right option presented at the bottom of the virtual computer screen (as depicted here). As in Experiment 1, they received on-screen feedback after their response.

Figure 9
In the Language Practice Blocks in Experiment 2, participants named each of the pictures once in Dutch (in the Dutch block) and once in English (in the English block). Per practice block, two virtual agents were presented (one at a time) to further familiarize participants with the link between language cue (i.e., a virtual agent) and the language they should respond with (Dutch for two virtual agents, English for two other virtual agents).
Table 6
Participant characteristics for Experiment 2: average score on the L2 English LexTALE test, average score on the AX-CPT test, self-reported age of acquisition (‘AoA’, in years of age) and proficiency (‘SRP’; based on a 1–7 Likert scale) with regard to listening, speaking, reading, and writing in both L1 Dutch and L2 English, and self-reported average hours of use per day of L2 English. Compare with Table 1.
| MEASURE | AVERAGE | SD |
|---|---|---|
| LexTALE | 80.3 | 13.50 |
| AX-CPT | –0.03 | 0.05 |
| L1 Listening AoA | 0.2 | 0.57 |
| L1 Speaking AoA | 0.5 | 0.85 |
| L1 Reading AoA | 1.2 | 2.01 |
| L1 Writing AoA | 1.4 | 2.29 |
| L1 Listening SRP | 7.0 | 0.00 |
| L1 Speaking SRP | 7.0 | 0.00 |
| L1 Reading SRP | 7.0 | 0.00 |
| L1 Writing SRP | 7.0 | 0.00 |
| L2 Listening AoA | 9.5 | 4.39 |
| L2 Speaking AoA | 10.8 | 3.63 |
| L2 Reading AoA | 10.5 | 3.31 |
| L2 Writing AoA | 11.3 | 3.16 |
| L2 Listening SRP | 6.1 | 0.87 |
| L2 Speaking SRP | 5.0 | 1.10 |
| L2 Reading SRP | 5.8 | 1.10 |
| L2 Writing SRP | 5.1 | 1.29 |
| L2 Hours of use per day | 5.2 | 3.57 |
Table 7
Linear mixed effects models that were pre-registered for the analyses of the response time data collected in Experiment 2 (Switch Cost Model and Mixing Cost Model) and the dataset combining the response time data from both experiments (Switch Cost Model Overall and Mixing Cost Model Overall).
| MODEL | FORMULA |
|---|---|
| RT Switch Cost Model Experiment 2 | ReactionTime ~ Language*TrialType + (1 + Language*TrialType \| Subject) + (1 + Language*TrialType \| Item) |
| RT Mixing Cost Model Experiment 2 | ReactionTime ~ Language*TrialType + (1 + Language*TrialType \| Subject) + (1 + Language*TrialType \| Item) |
| RT Switch Cost Model Overall | ReactionTime ~ Language*TrialType*Experiment + (1 + Language*TrialType \| Subject) + (1 + Language*TrialType \| Item) |
| RT Mixing Cost Model Overall | ReactionTime ~ Language*TrialType*Experiment + (1 + Language*TrialType \| Subject) + (1 + Language*TrialType \| Item) |
[i] Note: RT Switch Cost Model: Language (Dutch = –0.5; English = 0.5) and TrialType (Language Repeat = –0.5 and Language Switch = 0.5) were sum-coded. RT Mixing Cost Model: Language (Dutch = –0.5; English = 0.5) and TrialType (Language Repeat = –0.5 and Single-Language = 0.5) were sum-coded. Overall models: Experiment (Experiment 1 = –0.5; Experiment 2 = 0.5) was also sum-coded and added to the models fit to the combined data from both experiments.
Table 8
Average reaction times (in ms) and error rate (in proportions) per condition in Experiment 2. Values within parentheses are standard deviations.
| CONDITION | RT | ERROR RATE |
|---|---|---|
| L1 (Dutch) Baseline | 1196 (392) | .026 (.160) |
| L2 (English) Baseline | 1137 (367) | .045 (.208) |
| L1 (Dutch) Language Switch | 1502 (409) | .075 (.264) |
| L1 (Dutch) Language Repeat | 1512 (431) | .050 (.219) |
| L2 (English) Language Switch | 1439 (393) | .061 (.239) |
| L2 (English) Language Repeat | 1397 (382) | .042 (.201) |
Table 9
Outcome of the linear mixed effects models performed on the RT data from Experiment 2. Model structure reflects the model fit by maximum likelihood as indicated by the buildmer package. Significant p values are indicated in boldface.
| 1. Mixed Block comparison | | | | |
|---|---|---|---|---|
| RT Switch Cost Model: ReactionTime ~ 1 + Language*TrialType + (1 + Language + TrialType \| Subject) + (1 + Language + TrialType \| Item) | | | | |
| | Estimate | SE | t value | p value |
| Language | –91.04 | 16.41 | –5.55 | 2.07e-06 |
| TrialType | 17.88 | 9.41 | 1.90 | 0.07 |
| Language × TrialType | 48.01 | 12.16 | 3.95 | 7.91e-05 |
| 2. Mixing Cost analysis | | | | |
| RT Mixing Cost Model: ReactionTime ~ Language*TrialType + (1 + Language*TrialType \| Subject) + (1 + Language*TrialType \| Item) | | | | |
| | Estimate | SE | t value | p value |
| Language | –82.37 | 17.89 | –4.61 | 3.46e-05 |
| TrialType | –275.36 | 18.90 | –14.57 | <2e-16 |
| Language × TrialType | 55.52 | 20.11 | 2.76 | 0.01 |
[i] Note: RT Switch Cost Model: Language (Dutch = –0.5; English = 0.5) and TrialType (Language Repeat = –0.5 and Language Switch = 0.5) were sum-coded. RT Mixing Cost Model: Language (Dutch = –0.5; English = 0.5) and TrialType (Language Repeat = –0.5 and Single-Language = 0.5) were sum-coded.

Figure 10
Violin plots depicting the RT data on correct trials in each of the four conditions in the mixed block in Experiment 2. Filled diamonds indicate the average RT for each condition.

Figure 11
Violin plots depicting the RT data on correct trials in each of the four conditions in the mixing cost analysis in Experiment 2. Filled diamonds indicate the average RT for each condition.
Table 10
Outcome of the logistic mixed effects models performed on the error rate data from Experiment 2. Model structure reflects the model fit by maximum likelihood as indicated by the buildmer package. Significant p values are indicated in boldface.
| 1. Mixed Block comparison | | | | |
|---|---|---|---|---|
| Accuracy Switch Cost Model: ErrorRate ~ 1 + Language*TrialType + (1 + Language \| Subject) + (1 + Language \| Item) | | | | |
| | Estimate | SE | z value | p value |
| Language | –.32 | .21 | –1.51 | .13 |
| TrialType | .41 | .08 | 4.96 | 7.19e-07 |
| Language × TrialType | –.05 | .16 | –.29 | .77 |
| 2. Mixing Cost analysis | | | | |
| Accuracy Mixing Cost Model: ErrorRate ~ 1 + Language*TrialType + (1 + Language \| Subject) + (1 + Language \| Item) | | | | |
| | Estimate | SE | z value | p value |
| Language | –.16 | .27 | –.59 | .55 |
| TrialType | –.34 | .10 | –3.46 | .001 |
| Language × TrialType | .84 | .19 | 4.33 | 1.48e-05 |
[i] Note: Accuracy Switch Cost Model: Language (Dutch = –0.5; English = 0.5) and TrialType (Language Repeat = –0.5 and Language Switch = 0.5) were sum-coded. Accuracy Mixing Cost Model: Language (Dutch = –0.5; English = 0.5) and TrialType (Language Repeat = –0.5 and Single-Language = 0.5) were sum-coded.

Figure 12
Line graphs depicting the average RT data per condition for Experiment 1 (top panel) and Experiment 2 (bottom panel). Shaded ribbons indicate one standard deviation above and below the mean.
Table 11
Outcome of the linear mixed effects models performed on the RT data from Experiments 1 and 2 combined. Model structure reflects the model fit by maximum likelihood as indicated by the buildmer package. Significant p values are indicated in boldface.
| 1. Mixed Block comparison | | | | |
|---|---|---|---|---|
| RT Switch Cost Model: ReactionTime ~ TrialType*Experiment*Language + (1 + Language*TrialType \| Subject) + (1 + Language*TrialType \| Item) | | | | |
| | Estimate | SE | t value | p value |
| Experiment | 173.68 | 33.80 | 5.14 | 1.37e-05 |
| Language | –103.09 | 15.03 | –6.86 | 8.59e-09 |
| TrialType | 36.61 | 6.54 | 5.60 | 2.00e-06 |
| Experiment × TrialType | –44.54 | 12.13 | –3.67 | .001 |
| Language × TrialType | 34.90 | 11.81 | 2.96 | .004 |
| Experiment × Language | –9.31 | 25.02 | –.37 | .71 |
| Experiment × Language × TrialType | 16.36 | 23.19 | .71 | .48 |
| 2. Mixing Cost analysis | | | | |
| RT Mixing Cost Model: ReactionTime ~ TrialType*Experiment*Language + (1 + Language*TrialType \| Subject) + (1 + Language*TrialType \| Item) | | | | |
| | Estimate | SE | t value | p value |
| Experiment | 206.09 | 31.96 | 6.45 | 3.15e-07 |
| Language | –99.00 | 14.23 | –6.96 | 3.77e-09 |
| TrialType | –307.68 | 17.45 | –17.64 | <2e-16 |
| Experiment × TrialType | 17.69 | 11.69 | 1.51 | .13 |
| Language × TrialType | 42.74 | 17.98 | 2.38 | .02 |
| Experiment × Language | 5.86 | 22.13 | .27 | .79 |
| Experiment × Language × TrialType | 46.82 | 23.34 | 2.01 | .049 |
[i] Note: RT Switch Cost Model: Language (Dutch = –0.5; English = 0.5), TrialType (Language Repeat = –0.5 and Language Switch = 0.5), and Experiment (Experiment 1 = –0.5; Experiment 2 = 0.5) were sum-coded. RT Mixing Cost Model: Language (Dutch = –0.5; English = 0.5), TrialType (Language Repeat = –0.5 and Single-Language = 0.5), and Experiment (Experiment 1 = –0.5; Experiment 2 = 0.5) were sum-coded.
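Because all three factors are coded ±0.5, the switch cost implied by the combined model for each experiment can be recovered from the TrialType estimate and the Experiment × TrialType interaction. The Python sketch below shows that arithmetic with the Switch Cost Model estimates from Table 11. The function name `switch_cost_in` is hypothetical, and the calculation assumes Language is held at 0, that is, averaged over Dutch and English.

```python
# Estimates (ms) copied from the RT Switch Cost Model in Table 11.
B_TRIALTYPE = 36.61
B_EXPERIMENT_X_TRIALTYPE = -44.54

# Sum coding for Experiment, as stated in the table note.
EXPERIMENT_CODES = {"Experiment 1": -0.5, "Experiment 2": 0.5}


def switch_cost_in(experiment):
    """Model-implied switch cost (ms) in one experiment, averaged over languages.

    With TrialType coded -0.5/+0.5, the switch minus repeat difference equals
    the TrialType slope, which shifts by the interaction term times the
    Experiment code.
    """
    return B_TRIALTYPE + EXPERIMENT_CODES[experiment] * B_EXPERIMENT_X_TRIALTYPE


if __name__ == "__main__":
    for experiment in EXPERIMENT_CODES:
        print(f"{experiment}: {switch_cost_in(experiment):.2f} ms")
```

The resulting values, about 59 ms for Experiment 1 and about 14 ms for Experiment 2, are close to the TrialType estimates of the separate models in Tables 4 and 9 (57.37 and 17.88). They need not match exactly, since the combined model is fit to the pooled data.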
Table 12
Outcome of the logistic mixed effects models performed on the error rate data from the two experiments combined. Significant p values are indicated in boldface.
| 1. Mixed Block comparison | | | | |
|---|---|---|---|---|
| Accuracy Switch Cost Model: ErrorRate ~ 1 + Language*TrialType*Experiment + (1 + Language \| Subject) + (1 + Language \| Item) | | | | |
| | Estimate | SE | z value | p value |
| Experiment | –.19 | .17 | –1.15 | .25 |
| Language | –.58 | .12 | –4.75 | 2.03e-06 |
| TrialType | .49 | .06 | 8.65 | <2e-16 |
| Experiment × TrialType | –.14 | .11 | –1.22 | .22 |
| Language × TrialType | .15 | .11 | 1.35 | .18 |
| Experiment × Language | .43 | .22 | 2.00 | .046 |
| Experiment × Language × TrialType | –.27 | .23 | –1.18 | .24 |
| 2. Mixing Cost analysis | | | | |
| Accuracy Mixing Cost Model: ErrorRate ~ 1 + Language*TrialType*Experiment + (1 + TrialType + Language \| Subject) + (1 + Language \| Item) | | | | |
| | Estimate | SE | z value | p value |
| Experiment | –.00 | .23 | –.01 | .99 |
| Language | –.22 | .16 | –1.40 | .16 |
| TrialType | –.46 | .10 | –4.62 | 3.81e-06 |
| Experiment × TrialType | .27 | .14 | 1.94 | .053 |
| Language × TrialType | 1.09 | .15 | 7.43 | 1.07e-13 |
| Experiment × Language | .37 | .25 | 1.47 | .14 |
| Experiment × Language × TrialType | –.53 | .28 | –1.89 | .06 |
[i] Note: Accuracy Switch Cost Model: Language (Dutch = –0.5; English = 0.5), TrialType (Language Repeat = –0.5 and Language Switch = 0.5), and Experiment (Experiment 1 = –0.5; Experiment 2 = 0.5) were sum-coded. Accuracy Mixing Cost Model: Language (Dutch = –0.5; English = 0.5), TrialType (Language Repeat = –0.5 and Single-Language = 0.5), and Experiment (Experiment 1 = –0.5; Experiment 2 = 0.5) were sum-coded.
Table 13
Numerical size of the RT switch cost in ms in L1 (L1 switch cost: L1 switch – L1 repeat), the RT switch cost in L2 (L2 switch cost: L2 switch – L2 repeat) and the RT reversed language dominance (RLD: L1 Dutch – L2 English) as observed in the mixed block in seven experiments on different samples from the same unbalanced Dutch-English bilingual population.
