Speech productions of 40 English- and 40 Japanese-speaking children (aged 2–5) were examined and compared with the speech produced by 20 adult speakers (10 speakers per language). Participants were recorded while repeating words that began with “s” and “sh” sounds. Clear language-specific patterns in adults’ speech were found, with English speakers differentiating “s” and “sh” in 1 acoustic dimension (i.e., spectral mean) and Japanese speakers differentiating the 2 categories in 3 acoustic dimensions (i.e., spectral mean, standard deviation, and onset F2 frequency). For both language groups, children’s speech exhibited a gradual change from an early undifferentiated form to later differentiated categories. The separation processes, however, only occur in those acoustic dimensions used by adults in the corresponding languages.