It has been argued that attentional resources are preferentially allocated to information occurring early in the speech stream. However, a burgeoning behavioral literature has demonstrated that phoneme detection is faster and more accurate for prosodically salient (stressed) syllables than for unstressed syllables, regardless of their temporal position in a word. This finding suggests that word prosody is important in capturing attention during speech perception. We investigated whether, for unattended speech sounds, attention is captured by temporally salient (early-occurring) or by prosodically salient (stressed) information. In an auditory oddball paradigm, native English speakers were asked to ignore binaurally presented stimuli and to watch a silent movie while event-related potentials (ERPs) were recorded. After completing that phase, all volunteers took part in a behavioral discrimination task. The unattended phase was divided into two sessions. In the following, capital letters indicate the stressed syllable. In the first session, the acoustic stimuli were initially stressed disyllables: the standard was “BAga,” and the two deviants were “BAka” and “PAga.” In the second session, the stimuli were non-initially stressed disyllables: the standard was “baGA,” and the two deviants were “baKA” and “paGA.” While mismatch negativities (MMNs) were observed in the ERPs to all deviants in the unattended phase (with the largest amplitudes to “BAka” and “baKA”), the P3a, indicative of involuntary attentional capture, was seen only to the deviants “PAga” and “baKA,” the two deviants in which the changed phoneme fell in the stressed syllable. Behavioral discrimination was also higher for “PAga” and “baKA” than for the other two deviants. It is concluded that prosodic rather than temporal salience triggers involuntary attentional capture for unattended speech sounds.