Table 1
Examples of Singapore English Concepts.
| HIGHEST | | | LOWEST | | |
|---|---|---|---|---|---|
| WORD | M | SD | WORD | M | SD |
| Valence | |||||
| ang bao/hong bao | 8.29 | 1.07 | chee ko pek/ti ko pek | 2.33 | 1.60 |
| huat | 8.00 | 1.31 | Ah Long | 2.61 | 1.52 |
| shiok | 7.88 | 1.43 | saman | 2.80 | 1.54 |
| makan | 7.87 | 1.42 | O$P$ | 3.11 | 2.16 |
| pasar malam | 7.54 | 1.26 | hao lian | 3.14 | 1.55 |
| Arousal | |||||
| bo ta bo lan pa | 7.12 | 1.58 | lepak | 3.51 | 2.18 |
| chee bai | 7.01 | 1.78 | meh | 3.59 | 1.92 |
| huat | 6.99 | 1.84 | abit | 3.69 | 1.82 |
| si mi lan jiao | 6.85 | 1.91 | helication | 3.92 | 1.73 |
| hong gan | 6.83 | 1.83 | handphone | 4.01 | 1.97 |
| Concreteness | |||||
| kopi | 4.68 | 0.78 | mah | 1.41 | 0.74 |
| handphone | 4.65 | 0.83 | ah | 1.52 | 0.97 |
| ang bao/hong bao | 4.56 | 0.88 | lah | 1.52 | 0.94 |
| kopitiam | 4.52 | 1.00 | ar | 1.53 | 0.88 |
| makan | 4.52 | 0.90 | nia | 1.56 | 0.86 |
| Humor | |||||
| your head | 3.95 | 0.97 | handphone | 1.56 | 0.85 |
| pak chiu cheng | 3.89 | 1.18 | kopitiam | 1.77 | 0.88 |
| sod | 3.86 | 0.88 | kampung | 1.93 | 0.91 |
| bo geh | 3.85 | 1.10 | da bao | 2.00 | 0.98 |
| piak piak | 3.84 | 1.27 | kopi | 2.05 | 1.03 |
Table 2
Correlation of lexical-semantic and affective measures for 282 Singapore English items.
| | VALENCE | AROUSAL | CONCRETENESS | HUMOR |
|---|---|---|---|---|
| Valence | 1 | –0.34*** | 0.12* | –0.41*** |
| Arousal | –0.34*** | 1 | –0.01 | 0.55*** |
| Concreteness | 0.12* | –0.01 | 1 | –0.26*** |
| Humor | –0.41*** | 0.55*** | –0.26*** | 1 |
Note: * p < .05, *** p < .001.

Figure 1
Density plots for valence, arousal, concreteness, and humor ratings provided by human raters.
Table 3
Descriptive information for raw human ratings.
| RATING | MEAN | SD | MEDIAN | MIN | MAX | RANGE | SKEW | KURTOSIS |
|---|---|---|---|---|---|---|---|---|
| Valence | 4.94 | 1.04 | 4.75 | 2.33 | 8.29 | 5.96 | 0.62 | 0.26 |
| Arousal | 5.42 | 0.68 | 5.45 | 3.51 | 7.12 | 3.61 | –0.19 | –0.18 |
| Concreteness | 2.83 | 0.73 | 2.72 | 1.41 | 4.68 | 3.27 | 0.47 | –0.38 |
| Humor | 3.00 | 0.42 | 3.03 | 1.56 | 3.95 | 2.39 | –0.35 | 0.13 |
Table 4
Correlations of human ratings with ChatGPT ratings.
| MEASURE | GENERAL (RAW) | GENERAL (WEIGHTED) | SPECIFIC (RAW) | SPECIFIC (WEIGHTED) |
|---|---|---|---|---|
| Valence | 0.40 | 0.42 | 0.76 | 0.78 |
| Arousal | 0.26 | 0.27 | 0.57 | 0.59 |
| Concreteness | 0.29 | 0.31 | 0.66 | 0.69 |
| Humor | 0.18 | 0.19 | 0.33 | 0.39 |
Note: All correlations were statistically significant, all ps < .01.

Figure 2
Density plots for valence, arousal, concreteness, and humor ratings generated by GPT4 (specific, weighted condition) overlaid on the corresponding human ratings.
Table 5
Descriptive information for LLM ratings (specific, weighted).
| RATING | MEAN | SD | MEDIAN | MIN | MAX | RANGE | SKEW | KURTOSIS |
|---|---|---|---|---|---|---|---|---|
| Valence | 5.30 | 1.55 | 5.08 | 1.00 | 9.00 | 8.00 | –0.19 | –0.01 |
| Arousal | 4.36 | 1.45 | 4.33 | 1.03 | 8.07 | 7.04 | 0.51 | –0.50 |
| Concreteness | 2.40 | 1.09 | 2.06 | 1.00 | 5.00 | 4.00 | 1.12 | 0.70 |
| Humor | 2.93 | 0.40 | 3.00 | 1.00 | 4.17 | 3.17 | –0.78 | 5.06 |
Table 6
Descriptive statistics for 135 word items.
| STATISTIC | N | MEAN | ST. DEV. | MIN | MAX |
|---|---|---|---|---|---|
| no. of letters | 135 | 4.87 | 1.67 | 2 | 12 |
| orthographic neighborhood size | 135 | 4.01 | 5.74 | 0 | 26 |
| mean bigram frequency | 135 | 2,907.95 | 1,485.98 | 226.00 | 6,993.67 |
| log frequency | 135 | 2.80 | 1.60 | 0.00 | 6.76 |
| valence | 135 | 0.18 | 0.61 | –1.26 | 1.90 |
| arousal | 135 | –0.13 | 0.36 | –1.00 | 1.00 |
| concreteness | 135 | –0.02 | 0.74 | –1.51 | 1.92 |
| humor | 135 | –0.13 | 0.43 | –1.64 | 0.80 |
Table 7
Correlation of lexical-semantic and affective measures for 135 word items.
| | NO. OF LETTERS | ORTHOGRAPHIC NEIGHBORHOOD SIZE | MEAN BIGRAM FREQUENCY | LOG FREQUENCY | VALENCE | AROUSAL | CONCRETENESS | HUMOR |
|---|---|---|---|---|---|---|---|---|
| no. of letters | 1 | –0.62*** | 0.21* | –0.44*** | –0.11 | –0.03 | 0.30*** | 0.06 |
| orthographic neighborhood size | –0.62*** | 1 | 0.07 | 0.39*** | 0.10 | –0.12 | –0.29*** | –0.04 |
| mean bigram frequency | 0.21* | 0.07 | 1 | –0.03 | –0.02 | –0.08 | 0.11 | –0.01 |
| log frequency | –0.44*** | 0.39*** | –0.03 | 1 | 0.35*** | –0.18* | –0.10 | –0.29*** |
| valence | –0.11 | 0.10 | –0.02 | 0.35*** | 1 | –0.28*** | 0.08 | –0.34*** |
| arousal | –0.03 | –0.12 | –0.08 | –0.18* | –0.28*** | 1 | 0.001 | 0.52*** |
| concreteness | 0.30*** | –0.29*** | 0.11 | –0.10 | 0.08 | 0.001 | 1 | –0.25** |
| humor | 0.06 | –0.04 | –0.01 | –0.29*** | –0.34*** | 0.52*** | –0.25** | 1 |
Note: * p < .05, ** p < .01, *** p < .001.
Table 8
Visual Lexical Decision Models.
| | DEPENDENT VARIABLE | | | |
|---|---|---|---|---|
| | RT | | ACC | |
| | LINEAR MIXED-EFFECTS | | GENERALIZED LINEAR MIXED-EFFECTS | |
| | BASE | BASE+NORMS | BASE | BASE+NORMS |
| | (1) | (2) | (3) | (4) |
| no. of letters | 54.121*** (11.434) | 52.229*** (11.246) | 0.006 (0.170) | 0.041 (0.156) |
| orthographic neighborhood size | 64.943*** (12.387) | 56.943*** (12.024) | –0.473** (0.167) | –0.397** (0.152) |
| mean bigram frequency | 2.057 (9.596) | 3.701 (9.181) | –0.265* (0.131) | –0.249* (0.118) |
| log frequency | –88.212*** (9.943) | –93.193*** (10.627) | 1.664*** (0.147) | 1.668*** (0.146) |
| valence | | –20.299* (10.013) | | 0.456*** (0.132) |
| arousal | | –27.718* (11.355) | | 0.487*** (0.144) |
| concreteness | | –24.448* (10.405) | | 0.211 (0.128) |
| humor | | –13.877 (11.662) | | 0.192 (0.146) |
| Constant | 779.558*** (17.712) | 776.354*** (17.488) | 2.028*** (0.185) | 2.023*** (0.177) |
| Observations | 5,554 | 5,554 | 7,510 | 7,510 |
| Log Likelihood | –37,982.720 | –37,960.160 | –2,684.140 | –2,670.626 |
| Akaike Inf. Crit. | 75,981.440 | 75,944.310 | 5,382.280 | 5,363.251 |
| Bayesian Inf. Crit. | 76,034.420 | 76,023.780 | 5,430.748 | 5,439.415 |
Note: *p < 0.05; **p < 0.01; ***p < 0.001.
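The information criteria in Table 8 can be checked against the reported log likelihoods. The sketch below is illustrative only: it uses the standard definitions AIC = 2k − 2·logLik and BIC = k·ln(n) − 2·logLik, and the parameter counts `k` are assumptions obtained by back-solving from the reported AIC values, not figures stated in the table. The names `aic`, `bic`, and `MODELS` are hypothetical.

```python
import math


def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * k - 2 * log_lik


def bic(log_lik: float, k: int, n: int) -> float:
    """Bayesian information criterion: k * ln(n) - 2 * log-likelihood."""
    return k * math.log(n) - 2 * log_lik


# model label -> (log likelihood, assumed k, observations n)
# Log likelihoods and n come from Table 8; k is an assumption
# inferred from the reported AIC values.
MODELS = {
    "(1) RT base": (-37982.720, 8, 5554),
    "(2) RT base+norms": (-37960.160, 12, 5554),
    "(3) ACC base": (-2684.140, 7, 7510),
    "(4) ACC base+norms": (-2670.626, 11, 7510),
}

for label, (log_lik, k, n) in MODELS.items():
    print(f"{label}: AIC={aic(log_lik, k):.3f}  BIC={bic(log_lik, k, n):.3f}")
```

Under these assumed parameter counts, the computed values agree with the tabulated AIC and BIC to within rounding. Each base+norms model adds four fixed effects, so AIC favors the larger model in both cases, while BIC favors it for RT but not for ACC.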
Table 9
Comparing human-generated and LLM-generated norms.
| | DEPENDENT VARIABLE | | | |
|---|---|---|---|---|
| | RT | | ACC | |
| | LINEAR MIXED-EFFECTS | | GENERALIZED LINEAR MIXED-EFFECTS | |
| | HUMAN | CHATGPT | HUMAN | CHATGPT |
| | (1) | (2) | (3) | (4) |
| no. of letters | 52.229*** (11.246) | 56.792*** (11.256) | 0.041 (0.156) | 0.009 (0.164) |
| orthographic neighborhood size | 56.943*** (12.024) | 58.536*** (12.161) | –0.397** (0.152) | –0.366* (0.162) |
| mean bigram frequency | 3.701 (9.181) | –2.214 (9.470) | –0.249* (0.118) | –0.236 (0.128) |
| log frequency | –93.193*** (10.627) | –90.895*** (9.935) | 1.668*** (0.146) | 1.685*** (0.146) |
| valence | –20.299* (10.013) | | 0.456*** (0.132) | |
| arousal | –27.718* (11.355) | | 0.487*** (0.144) | |
| concreteness | –24.448* (10.405) | | 0.211 (0.128) | |
| humor | –13.877 (11.662) | | 0.192 (0.146) | |
| valence (gpt) | | –4.719 (9.280) | | 0.137 (0.126) |
| arousal (gpt) | | –10.646 (9.312) | | 0.245* (0.123) |
| concreteness (gpt) | | –22.781* (9.747) | | 0.165 (0.129) |
| humor (gpt) | | –32.490*** (9.469) | | 0.358** (0.129) |
| Constant | 776.354*** (17.488) | 777.769*** (17.509) | 2.023*** (0.177) | 2.031*** (0.181) |
| Observations | 5,554 | 5,554 | 7,510 | 7,510 |
| Log Likelihood | –37,960.160 | –37,961.750 | –2,670.626 | –2,677.244 |
| Akaike Inf. Crit. | 75,944.310 | 75,947.500 | 5,363.251 | 5,376.488 |
| Bayesian Inf. Crit. | 76,023.780 | 76,026.970 | 5,439.415 | 5,452.652 |
Note: *p < 0.05; **p < 0.01; ***p < 0.001.
