Representational level matters for tone-word recognition: Evidence from form priming

In a form priming experiment with a lexical decision task, we investigated whether the representational structure of lexical tone in lexical memory impacts spoken-word recognition in Mandarin. Target monosyllabic words were preceded by five types of primes: (1) the same real words (/lun4/-/lun4/), (2) real words with only tone contrasts (/lun2/-/lun4/), (3) unrelated real words (/pie3/-/lun4/), (4) pseudowords with only tone contrasts (*/lun3/-/lun4/), and (5) unrelated pseudowords (*/tai3/-/lun4/). We found a facilitation effect in target words with pseudoword primes that share the segmental syllable but contrast in tones (*/lun3/-/lun4/). Moreover, no evident form priming effect was observed in target words primed by real words with only tone contrasts (/lun2/-/lun4/). These results suggest that the recognition of a tone word is influenced by the representational level of tone accessed by the prime word. The distinctive priming patterns between real-word and pseudoword primes are best explained by the connectionist models of tone-word recognition, which assume a hierarchical representation of lexical tone.


Introduction
A key issue in the research of spoken-word recognition is how the human brain decodes the phonological make-up of words with remarkable speed. It is commonly assumed that rapid word recognition requires a specialised cognitive system that continuously maps input speech signals onto phonological representations in lexical memory (Liberman et al., 1967; Samuel, 2001). Naturally, the structure and organisation of phonological representations and how they impact lexical processing have become a central concern for modern models and theories of spoken-word recognition (e.g., the Cohort model in Marslen-Wilson, 1987; the TRACE model in McClelland & Elman, 1986). So far, these models have been put forward mainly on the basis of data from Indo-European languages; they only considered the role of segmental information, expressed by consonants and vowels, for single-word recognition, but did not include lexical prosody (e.g., stress, pitch accent), possibly because of the minor role of prosody in determining a word's identity.
However, doubts can be raised about the universality of these models for languages in other families, particularly when considering lexical recognition in tone languages, which account for about 60-70% of the languages in the world, including Mandarin, Cantonese, and Thai (Yip, 2006). Tone languages not only have segment-based systems of word phonology, but they also extensively employ tone, a type of lexical prosody, to differentiate meanings between words even when the words have the same segmental structure. For monosyllabic words in Mandarin, a widely known demonstration of the significance of lexical tone is the quadruplet built on the segmental template /ma/, meaning "mother" with Tone 1 (flat), "hemp" with Tone 2 (rising), "horse" with Tone 3 (dipping), and "to curse" with Tone 4 (falling) (Duanmu, 2002). Given the high information value of lexical prosody in tone languages, research on tone-word recognition would provide an important perspective for advancing potentially more universal models. In the present study, we investigated the representations of lexical tone in phonological memory and how they influence tone-word recognition.
One approach to understanding the role of lexical tone in word recognition is to assume a similar activation-competition process as proposed in the Cohort model (e.g., Marslen-Wilson, 1987). In this case, tonal information is processed in a bottom-up manner. In detail, it is assumed that before a tone word is recognised, segmental signals first activate possible candidates which share the same segments but contrast in tone (e.g., /ma1/, /ma2/, /ma3/, and /ma4/), resulting in lexical competition. Then, continuous input of tonal information is used to deactivate incompatible candidates, leaving only one candidate to represent the word to be recognised. To examine this assumed role of tone during spoken-word recognition, form priming experiments have been conducted, mostly limited to Mandarin (e.g., Lee, 2007; Sereno & Lee, 2015) and Cantonese (Yip, 2001; Yip et al., 1998). Form priming refers to the effects that a word produces on the processing of a subsequently presented word as a result of the phonological overlap between the two words during the recognition stage (see Zwitserlood, 1996, for a review). The effects, typically indexed by reaction times and/or accuracy rates, are assumed to reflect the characteristics of phonological representations, and are independent of semantic processing.
In an auditory lexical decision task, Lee (2007) investigated form priming in monosyllabic Mandarin word pairs overlapping in segments and/or tone. The author found robust facilitatory priming effects when the prime-target overlap was in both segments and tone (the classical repetition priming). However, there were no reliable form priming effects when a prime and a target had identical segments but contrastive tones (i.e., minimal tone pairs) or when they had only tone in common (e.g., /leng4/ and /wai4/), given either a long (250 ms) or a short (50 ms) interstimulus interval (ISI). Moreover, a facilitation effect was found in a mediated form priming experiment, in which a target word (/jian4zhu4/ "construction") was primed by a monosyllabic word (/lou3/ "hug") via an assumed mediator (/lou2/ "building") with a short ISI. Following the same line of reasoning as the bottom-up processing approach, the author's interpretation of these results was that when a prime word is recognised, tonal information is successfully used online to inhibit the activation of incompatible candidates (including its paired target word) sharing the same segments, thus producing no priming effects on the recognition of the target.
The interpretation in Lee (2007) was also inspired by previous studies that reported no priming effects in minimal prosodic word pairs in non-tonal languages, that is, when two paired words contrast only in stress or pitch accent, such as Ame ("rain") and aME ("candy") in Japanese, with the accented syllable capitalised (e.g., Cutler & Otake, 1999; Cutler & Van Donselaar, 2001; Slowiaczek et al., 2006). In these studies, the absence of priming effects was interpreted to suggest that prosodic information was successfully used to resolve the segmental ambiguity. In other words, if a prime and a target exhibit prosodic contrasts but the same segments, they can be recognised as two completely different words, rather than homophones.
In contrast to the absence of reliable priming effects in Lee (2007), Sereno and Lee (2015) replicated a part of Lee's (2007) study (Experiment 1) while further balancing the distribution of prime-target tone types, and found a marginally significant facilitation effect in minimal tone pairs, as well as reliable repetition priming effects. The authors' interpretation was that during the recognition of a prime word, tone played a weaker role (than segments) in blocking the activation of other lexical candidates with tone-only contrasts, whose activation facilitated the recognition of its paired target word. This finding appears to echo those of earlier priming studies in Cantonese with shadowing tasks. In Yip et al. (1998), Cantonese participants were asked to orally repeat the second word in a word pair which carried either an onset, rhyme, or tone mismatch, and facilitation effects were reported only when there was a mismatch in tones (see Yip, 2001, for a replication). In those studies, the facilitation effects were considered to indicate a higher sensitivity to segmental information than to tonal information in native Cantonese speakers. Similar suggestions were also advanced in other studies on tone-word processing, stating that tonal information plays a less important role than segmental information (e.g., Cutler & Chen, 1997; Taft & Chen, 1992; Tong et al., 2008; Ye & Connine, 1999), even though some other studies found evidence of a resemblance between the processing of tones and segments during the recognition of a tone word (e.g., Malins & Joanisse, 2010; Schirmer et al., 2005; Zhao et al., 2011).
To summarise, form priming paradigms with minimal tone pairs have so far been used to study the role of lexical tone during the recognition of Mandarin and Cantonese monosyllabic words. Following the bottom-up processing approach, the seemingly inconsistent priming data have been interpreted to suggest either an effective role (Lee, 2007) or a relatively weaker role (Sereno & Lee, 2015) played by tone in terms of constraining the activation of lexical candidates with only tone contrasts (e.g., /ma1/ and /ma3/). However, since the bottom-up processing views attribute the outcome of word recognition to lexical-level processing, they are not much concerned with the influence of the structure and organisation of tonal representations on tone-word recognition.
Several recent models of tone-word recognition, which are uniformly derived from the connectionist TRACE model (McClelland & Elman, 1986), provide an alternative approach to the role of tone in word recognition, which directly speaks to the representational structure of tone (e.g., Gao et al., 2019; Malins & Joanisse, 2012; Shuai & Malins, 2017; Ye & Connine, 1999; Zhao et al., 2011). In these models, lexical tone and segmental phonemes are represented separately at sublexical levels as independent modules, and the lexical level represents whole-word forms, which are the combinations of tone and segments. Moreover, as in TRACE, the representational units across the two levels are thought to be connected by bi-directionally excitatory linkages, which allow not only bottom-up processing but also top-down information flow from the lexical level back to sublexical levels. Mutually inhibitory connections are assumed to occur between the representational units within the same level. This assumption allows lexical competition between words sharing the same segments but differing in tone at the lexical representational level. Figure 1 illustrates how words with identical segments but contrastive tones are represented and processed in a lexical-sublexical structure assumed by these models with /lun1/, /lun2/, and /lun4/ as examples.
Although lexical tone has been included in these current models, to date, few studies have investigated the role of lexical tone from the alternative approach by considering the hierarchical representation of tone, referred to as the hierarchical representation approach hereafter. To address this issue, we aim in the present study to examine whether the organisation of lexical tone in lexical memory influences Mandarin word recognition by measuring the form priming effects in minimal tone pairs of monosyllabic words.
Many studies have investigated how form priming effects with segmental manipulations (e.g., the number of shared phonemes between a prime and a target) are related to a lexical and a sublexical level of phonological representation during spoken-word recognition in Indo-European non-tonal languages, such as English (e.g., Slowiaczek & Hamburger, 1992), French (e.g., Dufour & Nguyen, 2017), and Russian (e.g., Cook & Gor, 2015; Gor & Cook, 2020). A typical facilitatory priming effect (e.g., shorter reaction times or higher accuracy rates) is usually observed in word pairs with low phonological similarity, with one or two shared initial phonemes (e.g., smoke-still, steep-still in Slowiaczek & Hamburger, 1992), relative to the phonologically unrelated baseline. Even though a facilitatory priming effect has been argued to be related to strategic processing (Goldinger et al., 1992), it is normally considered to be form-based, resulting from the segmental overlap (Slowiaczek & Hamburger, 1992; Spinelli et al., 2001). Moreover, form-based facilitatory effects are independent of the lexicality of the primes and, thus, can be assumed to reflect sublexical processing (Cook & Gor, 2015; Gor & Cook, 2020; Slowiaczek & Hamburger, 1992).
As the number of shared phonemes increases in words with high phonological similarity, inhibition tends to be observed relative to the conditions with low similarity (Hamburger & Slowiaczek, 1996; Slowiaczek & Hamburger, 1992) and/or the unrelated baseline (Cook & Gor, 2015; Radeau et al., 1989). Inhibitory priming effects (e.g., longer reaction times and/or lower accuracy rates) are assumed to reflect lexically-based priming, induced by the competition between highly similar primes and targets (Dufour & Nguyen, 2017; Slowiaczek & Hamburger, 1992).
It is noteworthy that lexical inhibition with behavioural measures is likely to reflect the combination of two opposite priming effects, namely a facilitatory priming effect and an inhibitory priming effect, related to phonological similarity at a form level and a lexical level, respectively (Hamburger & Slowiaczek, 1996; Slowiaczek & Hamburger, 1992; Zhou & Marslen-Wilson, 1994; also see Forster & Veres, 1998, for similar reasoning on visual-word recognition). Therefore, inhibition at a lexical level is not necessarily indexed by an evident inhibitory priming effect, but it can be inferred by eliminating the lexical-level processing using non-word primes (Slowiaczek & Hamburger, 1992).
Inspired by these studies, in order to observe the effects of the representational hierarchy of tone on word recognition, we employed minimal tone pairs carrying either a pseudoword prime or a real-word prime. Pseudowords are word-like, meaningless word forms that follow the phonological rules of a language; thus, they are thought to be unlikely to access a specific lexical entry and are often used to distil a sublexical processing effect by eliminating lexical-level effects (Slowiaczek & Hamburger, 1992; also see Forster & Veres, 1998; Heathcote et al., 2018; Taft et al., 2021, for the same reasoning in visual-word studies). We constructed tone-manipulated pseudowords by replacing the tones of existing monosyllabic words with another lexical tone, resulting in novel, meaningless combinations. For example, /se/ + Tone 4 (/se4/) is a real word for "colour" (色), but /se/ + Tone 2 (noted as */se2/) does not carry any meaning. The lexicality of tone-manipulated pseudowords can be assumed to be "knocked out" from their original base words without breaking any of the sublexical, phonological rules. Furthermore, the sublexical nature of such pseudowords has been thoroughly tested in previous studies of our own, resulting in differential behavioural and electrophysiological responses to real words and tone-manipulated pseudowords in Mandarin during spoken-word perception (Yue et al., 2014, 2017, 2022).
According to the hierarchical representation approach to the role of tone during spoken-word recognition, the following results could be predicted. In minimal tone pairs, real-word primes and pseudoword primes produce different priming effects. To be specific, pseudoword primes (*/lun3/-/lun2/) could only induce form-based facilitation for the segmental overlap, and the sublexical processing of tone in a pseudoword prime would not influence the recognition of its paired target word by means of lexical-level processing (cf. Slowiaczek & Hamburger, 1992; Spinelli et al., 2001). Real-word primes (/lun4/-/lun2/) may not produce a robust, typical priming effect, as shown in some previous research (e.g., Lee, 2007; Sereno & Lee, 2015). This could be because priming effects to real words using behavioural measures reflect a balance between a lexically-based inhibition due to the enhanced lexical-level competition between the prime and target, and a form-based facilitation caused by the segmental overlap (Gor & Cook, 2020; Slowiaczek & Hamburger, 1992). If these predictions are correct, a further facilitation effect could be expected in minimal tone pairs with pseudoword primes (*/lun3/-/lun2/) relative to those with real-word primes (/lun4/-/lun2/).
The bottom-up processing approach suggested in previous form priming studies on tone-word recognition can also predict no typical priming effects in real-word primes, which is similar to the prediction of the hierarchical representation approach. However, the bottom-up processing views would not predict different priming effects in minimal tone pairs between real-word and pseudoword priming conditions; this is because, given a completely bottom-up process of lexical activation, lexical tone, whether in real words or pseudowords, is utilised at a lexical level to reduce the number of lexical candidates. Consequently, both real-word and pseudoword primes would produce the same priming effects on the recognition of a target. In contrast, in a connectionist model allowing both bottom-up and top-down processing, the tones carried by pseudowords would not access lexical-level representations due to the lack of feedback from the lexical representations.

Participants
Ninety adults living in Daqing, China voluntarily participated in the current experiment (female: 52; mean age = 21 years, SD = 3.4 years). They all reported Mandarin Chinese as their native language, no long-term exposure to other dialects of Chinese, and no history of auditory or speech disorders.

Materials and design
We selected 35 monosyllabic target real words in Mandarin Chinese. Following Lee (2007), the real-word stimuli that were selected had no homophonic words according to the Modern Chinese Word Frequency Dictionary (Language Teaching Research Centre for Beijing Language Institute, 1986), which endeavoured to calculate the usage frequencies of current Chinese words, instead of focusing on the frequency of each morpheme (or character). This control allowed us to avoid potential influences of the word forms and the semantics carried by homophones on the processing of spoken-word stimuli. Each target word was paired with five different types of primes: (1) a real-word prime sharing both segments and tone (RST for simplicity, according to the underlined letters; e.g., /lun4/-/lun4/); (2) a real-word prime sharing only segments (RS; e.g., /lun2/-/lun4/); (3) a real-word prime phonologically unrelated to the target (RUR; e.g., /pie3/-/lun4/); (4) a pseudoword prime sharing only segments (PS; e.g., */lun3/-/lun4/); and (5) a pseudoword prime phonologically unrelated to the target (PUR; e.g., */tai3/-/lun4/). The average word frequencies of the three types of real-word primes (RST, RS, RUR) were matched, F(2, 102) = .901, p = .409. No paired words could form a disyllabic word, and they had no overt semantic associations. See Table 1 for a demonstration of the design.
The repetition priming condition (RST) has been known to elicit facilitatory priming effects in Chinese monosyllabic words (e.g., Lee, 2007; Sereno & Lee, 2015), and was thus used to validate the current paradigm in eliciting form priming effects. The RUR primes and the PUR primes served as the baseline conditions to probe the priming effects in conditions with real-word primes (RST, RS) and pseudoword primes (PS), respectively.
In pursuit of a balanced design, we also generated the same number of tone-manipulated pseudowords for target use. The pseudoword targets were paired with the five types of primes that were the analogues of the priming and baseline conditions for the real-word targets but with opposite lexical status. In this case, the real words and pseudowords had an equal probability of being presented, whether as primes or as targets. This design also helps to control the potential strategic bias in the priming paradigm with a lexical decision task (Norris et al., 2002).
The prime-target combinations were equally distributed over five lists by applying a Latin Square arrangement. As a result, in one list, each target word was only primed by one of the five prime types. Seventeen word forms were used twice in different lists, but no repeated items occurred within one list. See the online Supplementary Material for a complete list of stimuli.
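The Latin Square arrangement described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `latin_square_lists`, the integer target indices, and the simple rotation scheme are hypothetical and are not taken from the authors' materials, so the actual assignment of items to lists may have differed.

```python
PRIME_TYPES = ["RST", "RS", "RUR", "PS", "PUR"]


def latin_square_lists(n_targets, prime_types=PRIME_TYPES):
    """Return one dict per list, mapping a target index to its prime type.

    Hypothetical rotation scheme: the prime type of each target is shifted
    by one step from list to list.
    """
    n_lists = len(prime_types)
    lists = []
    for list_idx in range(n_lists):
        # Within a list, every target receives exactly one prime type;
        # across the lists, every target meets each prime type once.
        assignment = {
            target: prime_types[(target + list_idx) % n_lists]
            for target in range(n_targets)
        }
        lists.append(assignment)
    return lists


# 35 real-word targets distributed over five lists, as in the design above.
lists = latin_square_lists(35)
```

With 35 targets and five prime types, each list in this sketch contains seven targets per prime type.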
The materials were recorded by a well-trained female native speaker of Mandarin in a soundproof cabin with high-quality recording equipment at Newcastle University, UK. All experimental materials were normalised to a duration of 550 ms and an intensity of 75 dB SPL with PRAAT (Boersma & Weenink, 2013).

Procedure
The participants were tested individually in a sound-attenuated room. Auditory stimuli included 70 prime-target pairs, presented via a pair of Cosonic CD-778MV headphones at 75% of the full volume of an HP 540 laptop. The presentation of stimuli was pseudo-randomised for each participant to ensure that items of the same priming condition would not be presented in more than three successive trials. Participants were randomly assigned to one list and were instructed to decide if the second item (i.e., the target) in a stimulus pair was an existing word in Mandarin Chinese, by pressing the "O" key for a positive answer and the "P" key for a negative answer, as quickly and accurately as possible. A practice session with 10 prime-target pairs was presented before the beginning of the experiment.
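The run-length constraint on the pseudo-randomised order can be illustrated with a simple rejection-sampling sketch. The paper does not say how the constraint was implemented, so everything here is an assumption made for illustration: the helper names `max_run_length` and `constrained_shuffle`, the ten placeholder condition labels, and the seven trials per condition.

```python
import random


def max_run_length(labels):
    """Length of the longest run of identical consecutive labels."""
    longest = current = 0
    previous = object()  # sentinel that equals no real label
    for label in labels:
        current = current + 1 if label == previous else 1
        previous = label
        longest = max(longest, current)
    return longest


def constrained_shuffle(trials, key, max_run=3, seed=None, max_tries=10_000):
    """Reshuffle until no condition occurs in more than `max_run` successive trials."""
    rng = random.Random(seed)
    order = list(trials)
    for _ in range(max_tries):
        rng.shuffle(order)
        if max_run_length([key(t) for t in order]) <= max_run:
            return order
    raise RuntimeError("no valid order found within max_tries")


# Placeholder design: 10 conditions x 7 trials = 70 prime-target pairs.
conditions = [f"cond{i}" for i in range(10)]
trials = [(cond, item) for cond in conditions for item in range(7)]
order = constrained_shuffle(trials, key=lambda t: t[0], seed=1)
```

Rejection sampling is the simplest option here because, with many conditions and few trials per condition, most random orders already satisfy the constraint.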
A trial began with a set of fixation marks (*****) presented in the centre of the screen for 1000 ms. The prime was delivered 500 ms after the disappearance of the fixation marks, followed by the presentation of a target with an ISI of 250 ms to make our results comparable with those reported by Lee (2007) and Sereno and Lee (2015). The next trial automatically started as soon as an answer was given within 6 s. Response latency was measured from the onset of the target (Zwitserlood, 1996). The software program DMDX V4.0.4.6 (Forster & Forster, 2003) was used to control the delivery of auditory stimuli and to record RTs and accuracy. The entire protocol took about 10 min.

Results
Here, only the RT data in real-word targets were analysed. Trials where RTs were shorter than 400 ms or longer than 3000 ms were omitted as outliers (cf. Lee, 2007). Response accuracy was calculated to check the reliability of the data, leading to an exclusion of three items (175 trials in total) as their accuracy rates were around chance level (ranging from 40% to 60%). Only correct lexical decisions were included for statistical analyses. In total, 386 data points (12.3%) were trimmed off. The means of RTs are shown in Table 2.
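The trimming rule can be written as a small filter. The sketch below is illustrative only: the function name `keep_trial` and the example trials are hypothetical, and the total of 90 x 35 = 3150 real-word-target trials used in the consistency check is an assumption based on the reported numbers of participants and targets, not a figure stated in the text.

```python
def keep_trial(rt_ms, correct, low=400, high=3000):
    """Keep a trial only if the response is correct and the RT is within range."""
    return bool(correct) and low <= rt_ms <= high


# Hypothetical example trials: (RT in ms, response correct?)
trials = [(350, True), (812, True), (1240, False), (3100, True), (955, True)]
kept = [t for t in trials if keep_trial(*t)]

# Consistency check on the reported trimming rate, assuming a full design of
# 90 participants x 35 real-word targets (this total is an assumption).
total_trials = 90 * 35
trimmed_percent = round(100 * 386 / total_trials, 1)
```

Under that assumed total, 386 removed data points correspond to the 12.3% reported above.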
The RTs of lexical decisions in real-word targets were analysed as the dependent variable. We used linear mixed-effects modelling in an R environment (version 4.4.2) with the package lme4 (Bates et al., 2015) to evaluate the expected priming effects (Baayen et al., 2008). For each analysis, we first constructed a maximum model including random intercepts and slopes for both subjects and items. Then, we built a simple model including only by-subject and by-item random intercepts (cf. Wang et al., 2021). Once the data had been fitted to both models, likelihood ratio tests were to be performed to compare the goodness-of-fit of the simple model with that of the maximum model. Moreover, only converging models were considered.
We began by inspecting the repetition priming effect in word pairs with two identical words (RST condition versus RUR condition) to validate the current paradigm. As the maximum model failed to converge, the optimal model was the intercept-only simple model, RT ~ condition + (1|subject) + (1|item), which revealed a significant facilitation of repetition priming (t = −3.244, p = .0019). This supports the observation of a 120 ms shorter response latency to target words in identical word pairs as compared to the unrelated baseline for real-word primes, and validates the current priming paradigm. The summary of the model fit is presented in Table 3.
Then, we analysed form priming in minimal tone pairs of real-word-real-word and pseudoword-real-word structures. We first defined two fixed-effect factors with two levels each: CONDITION-TYPE (CT for short in the model, baseline versus priming condition) and PRIME-LEXICALITY (PL for short in the model, real-word versus pseudoword prime). As the maximum model failed to converge, the simple model became the optimal one, which only included by-subject and by-item intercepts along with the fixed effects and their interaction, RT ~ CT*PL + (1|subject) + (1|item). This intercept-only model suggests an interaction between CONDITION-TYPE and PRIME-LEXICALITY (t = 1.985, p = .049). Table 4 presents the summary of the optimal model.
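For readers less familiar with lme4 formula notation, an intercept-only model with two crossed fixed factors and their interaction can be written out as follows. This is a generic textbook form given for orientation; the indicator coding of CT and PL and all symbols are assumptions, not details reported by the authors.

```latex
% i indexes subjects, j indexes items; CT and PL are 0/1 indicators (assumed coding)
\[
RT_{ij} = \beta_0 + \beta_1\,CT_{ij} + \beta_2\,PL_{ij}
        + \beta_3\,(CT_{ij} \times PL_{ij}) + u_i + w_j + \varepsilon_{ij},
\]
\[
u_i \sim \mathcal{N}(0, \sigma_u^2), \qquad
w_j \sim \mathcal{N}(0, \sigma_w^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2).
\]
```

Here $u_i$ and $w_j$ are the by-subject and by-item random intercepts, and $\beta_3$ is the interaction term whose test statistic is reported above.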
Based on this interaction, three planned comparisons were further performed to directly test for the expected priming effects, namely pseudoword priming (i.e., PS versus PUR), real-word priming (RS versus RUR), and the difference between pseudoword and real-word priming (PS versus RS), by resetting the reference level to the pseudoword unrelated baseline condition (PUR), the real-word unrelated baseline condition (RUR), and the minimally contrasted real-word primes (RS), respectively. To control for Type I error given the multiple comparisons, the p values reported here were adjusted using Bonferroni's method (multiplying each p value by the number of comparisons). The results showed significant facilitatory priming effects in pseudoword-real-word pairs with minimal tone contrasts relative to the unrelated baseline (PS versus PUR: t = −2.636, p = .027) and in the comparison between the conditions containing pseudoword and real-word primes with minimal tone contrasts (PS versus RS: t = −3.077, p = .009). No significant effects were identified in minimal tone pairs of real words (RS versus RUR: t = 0.168, p > .1). The summary of the planned comparisons is presented in Table 5.
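The Bonferroni adjustment mentioned above amounts to multiplying each raw p value by the number of comparisons and capping the result at 1. The sketch below is a generic illustration; the function name `bonferroni` is hypothetical and the raw p values are invented for the example, since unadjusted values are not reported in the text.

```python
def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Illustrative raw p values for three planned comparisons (not from the paper).
raw = [0.009, 0.003, 0.87]
adjusted = bonferroni(raw)
```

With three comparisons, a raw value of .009 becomes roughly .027 after adjustment, and any product above 1 is reported as 1.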
These results suggest an 80 ms facilitation in the lexical decisions on words primed by pseudowords with minimal tone contrasts, relative to the unrelated baseline, and a 92 ms facilitation as compared to the lexical decisions on targets in minimal tone pairs of real words. However, no significant priming effects were identified in minimal tone pairs of real words, relative to the unrelated baseline. See Figure 2 for demonstrations of the priming effects.

Discussion
The current study investigated how lexical tone is represented in phonological memory by identifying its impact on tone-word recognition. A bottom-up processing approach to this issue, as concluded from the existing form priming studies of tonal languages, only assumed lexical-level processing for tonal information. In contrast, a hierarchical representation approach allowed interactive processing of tone at both a lexical and a sublexical level, which is suggested by the current tone-word recognition models, based on the TRACE model (McClelland & Elman, 1986). In order to test these approaches, we performed a form priming experiment with a lexical decision task, which included minimal tone pairs with real-word and pseudoword primes as stimuli. In this way, we could observe priming effects on the recognition of target words when the tonal information carried by primes was processed either lexically or sublexically. The validity of the current paradigm in eliciting form priming was attested by obtaining the classic repetition priming effect, as indexed by facilitated lexical decision on a target word when it was preceded by an identical prime word, carrying the same segments and tone (cf. Lee, 2007; Sereno & Lee, 2015).
Consistent with predictions of the hierarchical representation approach, we identified distinctive form priming patterns between minimal tone pairs with real-word and pseudoword primes. Specifically, a facilitatory priming effect was reliably identified when the same targets were primed by pseudowords with minimal tone contrasts, as compared to the unrelated pseudoword baseline (PS versus PUR). In contrast, no priming effect was directly identified in minimal tone pairs of real words (RS) relative to the baseline condition with phonologically unrelated prime words (RS versus RUR). These differential priming patterns suggest that the recognition of a tone word is influenced by the representational level of the tonal information carried by its preceding prime. Only when the lexicality of a prime is "knocked out" (i.e., the prime is a pseudoword) does the tone not influence the recognition of the target, leaving the shared segmental syllable to give rise to sublexical priming that facilitates the processing of its target word.
Here, the facilitation effect is in line with the literature on words in non-tonal languages with segmental manipulations. In particular, even though most previous segmental studies found facilitation effects in word pairs with only one or two shared phonemes (e.g., Slowiaczek & Hamburger, 1992; Slowiaczek et al., 1987), some other experiments also reported facilitation in prime-target pairs with high segmental similarity (e.g., three overlapping phonemes), in which the primes were nonwords (pseudowords) without specific lexical entries (Slowiaczek & Hamburger, 1992) or only had partially developed lexical-level representations in non-native speakers (noted as "fuzzy phonolexical" representations in Gor & Cook, 2020).
Along with the facilitation effect, the absence of an evident priming effect in minimal tone pairs of real words relative to the unrelated baseline can be well explained from a hierarchical representation perspective (cf. Slowiaczek & Hamburger, 1992). That is, a null priming effect in minimal tone pairs of real words reflects a counterbalance between two effects with opposite directions: a lexically-based inhibition due to the competition between two words with minimal tone contrasts and a form-based facilitation in the segments. This explanation is further attested in the comparison between the pseudoword priming condition (PS) and the real-word priming condition (RS), which revealed a facilitation effect. This priming effect, albeit not a typical one (compared to an unrelated baseline), suggests that the different form priming patterns in minimal tone pairs with pseudoword and real-word primes are independent from their own baselines, and could originate from the contrastive lexicality of the two types of primes. In detail, due to the lack of lexicality, pseudowords only induce a sublexical-level priming effect (e.g., /lun/ in */lun3/-/lun4/), whereas real-word primes produce priming both at a sublexical and a lexical level (e.g., /lun/ in /lun2/-/lun4/). Therefore, when the two conditions are compared, the most likely explanation for the remaining effect in the real-word priming condition is the lexical competition between the members of a minimal tone pair.
Although this inference of lexical competition is based on the observation of sublexical effects, our reasoning is in harmony with previous studies on form priming between words with segmental similarity, reporting inhibitory priming effects when the form-based (i.e., sublexical) priming is controlled (e.g., Dufour & Peereman, 2003; Gor & Cook, 2020; Slowiaczek & Hamburger, 1992). More specifically for Mandarin word recognition, evidence of lexical competition between monosyllabic Mandarin words sharing the same segments but carrying divergent tones has been found in several studies that employ eye-tracking paradigms (Malins & Joanisse, 2010; Yang & Chen, 2022). For example, Yang and Chen (2022) found significantly more eye fixations only on the picture of a segmental-syllable competitor (/chuang2/) on hearing a target word (/chuang1/) relative to the baseline, but no effects with other types of competitors that did not share all segments with the target word. Considering these findings together, it is reasonable to assume that the null priming effect in minimal tone pairs with real-word primes in the current study is the result of the counterbalance between sublexical- and lexical-level priming effects.
While these findings can be well predicted by the TRACE-based models of tone-word recognition, our priming data are not coherent with the predictions generated by the bottom-up processing approach. As reviewed before, the bottom-up processing views consider the form priming effects in minimal tone pairs as indications of either an effective (Lee, 2007) or a weaker (Sereno & Lee, 2015) role played by tone in inhibiting the activation of lexical candidates with tone-only contrasts. Admittedly, the null priming effect in minimal tone pairs of real words as revealed in the current study is a replication of Lee (2007) and is in line with several other studies on priming with minimal contrasts in other types of lexical prosody (e.g., Cutler & Otake, 1999; Cutler & Van Donselaar, 2001; Slowiaczek et al., 2006). Moreover, the assumption of a weaker role of tone in terms of constraining the activation of lexical candidates (e.g., Sereno & Lee, 2015) also seems to explain the facilitatory priming effect in pseudoword primes.
However, the bottom-up processing views cannot predict distinct priming effects for real-word and pseudoword primes in minimal tone pairs. Like the propositions of the Cohort model (Marslen-Wilson, 1987), these views assume that the inhibition of irrelevant lexical candidates relies on the decoding of sublexical input. This mechanism implies that pseudowords and real words differing only in tone would undergo similar recognition processes, because both need tonal information to rule out the lexical candidates activated by the same segmental input. A critical prediction of the bottom-up processing approach for form priming in minimal tone pairs, be it the successful role version (Lee, 2007) or the weaker role version (Sereno & Lee, 2015), is therefore that similar priming patterns should be observed with pseudoword primes and real-word primes, which is not consistent with our observations in the current study.
We must mention that the present study was not aimed at refuting the idea of investigating the role of lexical tone during spoken-word recognition by referring to the processing efficacy of tonal information, which is a hotly debated topic in psycholinguistics (see Yang & Chen, 2022, for a recent review). Instead, our more modest aim was to test whether the hierarchical structure of tonal representation should be considered in the modelling of tone-word recognition.
As a result, the hierarchical representation perspective may provide some tentative interpretations for the mixed priming results in minimal tone pairs, as reported previously in Lee (2007) and in Sereno and Lee (2007). Accordingly, the absence of a clear priming effect in minimal tone pairs in Lee (2007) can be interpreted through the same lexical-sublexical priming approach that we adopt to explain our current data. Furthermore, the tendency towards facilitation in Sereno and Lee (2007) may reflect the combination of a relatively stronger sublexical priming effect and a weaker lexical-level priming effect. This tentative explanation is plausible given that a small but growing number of studies have suggested a relatively more important role of the segmental syllable (onset + rime) than of tone, onset, or rime alone during spoken-word recognition (e.g., Ho et al., 2019; Yang & Chen, 2022). These findings suggest that in a priming context, where listeners are asked to focus on target words, the processing priority of the segmental syllable during the perception of prime words may become more prominent, leading to a relatively stronger sublexical priming effect. Some common ground can be found between this explanation and the interpretation of the facilitatory priming effects in two early Cantonese studies (Yip, 2001; Yip et al., 1998), in which the facilitation effects were taken to suggest that native Cantonese speakers are more sensitive to identical segments than to mismatched tone.
Our findings not only suggest how tonal knowledge is stored in the phonological memory system but may also shed some light on the general form of lexical prosody representations in the mental lexicon. Previously, researchers tended to consider that lexical stress or pitch accent is effectively used during spoken-word recognition, based on their observations of null priming effects in words paired by minimal prosodic contrasts (e.g., Cutler & Otake, 1999; Cutler & Van Donselaar, 2001; Slowiaczek et al., 2006). However, little attention has been given to the representational structure and organisation of lexical prosody, let alone to including prosody in models of spoken-word recognition. At least one cross-modal priming study with Spanish materials draws attention to this representational issue (Soto-Faraco et al., 2001). In that study, a visually presented word was primed by an auditory word fragment (e.g., princi-) placed at the end of a spoken sentence. A mismatch in prosody (stress) between the target words and the fragment primes led to inhibited lexical decision responses, suggesting lexical competition between candidates with similar segments but different prosody. This finding means that even in an Indo-European language, lexical prosody may be more than a tool for pinpointing the word to be recognised; it may also play a major role in lexical memory, forming complex phonological representations with a certain structure and organisation.
However, we do not mean to directly generalise the idea of hierarchical representations for lexical tone to other types of lexical prosody in non-tone languages. This is, first, because the information load of lexical prosody in Mandarin monosyllabic words is much higher than that in non-tone languages such as English, Spanish, and Japanese (cf. Lee, 2007). Second, the neural underpinnings of lexical prosody in a tone language and in a non-tone language seem to be, at least in part, different (e.g., Friederici & Alter, 2004; Gandour et al., 2003; Politzer-Ahles et al., 2016). Nevertheless, explorations in this direction may merit future study, provided that they proceed with great caution.
The last issue to be discussed is that although our data support a hierarchical representation approach to the role of lexical tone during spoken-word recognition, we only found direct evidence of sublexical facilitation with pseudowords. Future research should aim to demonstrate more directly the inhibitory priming effect that indexes lexical competition between the members of a minimal tone pair, for example by using experimental paradigms such as eye-tracking, as in Yang and Chen (2022), or by providing sentence context, as in Soto-Faraco et al. (2001).

Conclusion
In this study, we tested whether the representational structure of lexical tone in lexical memory influences tone-word recognition. By eliminating the potential lexical-level inhibition using pseudoword primes in minimal tone pairs, we identified a facilitatory priming effect relative to the phonologically unrelated baseline. In contrast, real-word primes did not produce any overt priming effects compared to the unrelated baseline. The distinctive priming patterns in real and pseudoword primes are predicted by several connectionist models of tone-word recognition, suggesting a hierarchical representation of lexical tone that matters for tone-word recognition.

Figure 1. The representation and processing of words with identical segments but contrastive tones (/lun1/, /lun2/, and /lun4/) in a connectionist tone-word recognition framework with a structure consisting of lexical and sublexical levels. Cubes with solid lines show the specific representations of the word examples. A cube with dashed lines indicates a category of representational units that the specific representations belong to. The excitatory linkages between levels are denoted by arrows, and the within-level inhibitory connections are represented by curved lines with two dots at the ends.

Figure 2. Demonstration of repetition priming (RST versus RUR) and the results of the planned comparisons (RS versus RUR, PS versus PUR, and PS versus RS).

Table 1. Demonstration of the priming conditions and examples of trials.

Table 2. Results of the priming conditions with real-word targets.

Table 3. Summary of the linear mixed-effect model for repetition priming.

Table 4. Summary of the linear mixed-effect model of the 2 by 2 factorial design of priming.

Table 5. Summary of planned comparisons with references reset to three baselines.