On Dictionaries & Pronunciation

The fine folks over at Collins contacted me recently about their online dictionary. It’s in beta, but looks to be an excellent addition to a growing body of online word tools. I recommend checking it out.

Anyway, this got me thinking. Most online dictionaries have pronunciation guides featuring IPA characters and a recording of an actor reading the word aloud. This actor is going to have an accent, of course. So how does a dictionary compiler choose which accent to feature as the ‘standard’ pronunciation?

Collins is a British dictionary, so they use Received Pronunciation (more on this in a moment). But note that the pronunciations of words in their online English-Spanish dictionary, by contrast, are typical of this side of the world. They opt for the Latin American variety of the language, as shown by the pronunciation of ‘cierto’ as sjeɾto (in much of Spain, this would be θjeɾto, with the ‘c’ pronounced the same as English ‘th’).

This is true of more than one Romance language. I used to spend hours at the foreign language dictionary shelf of Barnes & Noble, because, well, I have a rather unusual sense of fun. One thing I found most interesting is that Portuguese-English dictionaries typically use the Brazilian variety of the language for their pronunciation guides. For the Latin American languages, why do many dictionaries go with the language typical of the colonies and not the original imperial power?

(While I’m on the subject, another question about (foreign language)-English dictionaries: Why are there never, or very rarely, IPA transcriptions for words in Asian languages?)

But back to English. I should mention that I know not a thing about the compilation of dictionaries, or what the editorial process is like. But I’d imagine it’s a herculean task to make the near-endless number of choices that go into creating a work so large and definitive. After you’ve decided which accent to go with, another question arises.  As accents evolve, when do dictionary writers decide to evolve with them?

Take the vowel in the word ‘goat.’ In British dictionaries over a century ago, this vowel was represented by the diphthong oʊ, in line with the pronunciation of the word in Victorian-era Received Pronunciation. At what point did they decide to switch over to the more ‘contemporary’ vowel, əʊ?

And here in America, the contrast between the vowels in ‘caught’ and ‘cot’ has been steadily eroding throughout the country.  When will pronunciation guides in dictionaries reflect this?

Obviously, I’m not discussing books like the Longman Pronunciation Dictionary, which are compiled under the direction of renowned linguists. But for more traditional dictionaries, who makes choices about pronunciation?

*By the way, I’m nearly as fascinated by the many accents of Spanish as I am by the accents of English. Anyone who wants to talk about Mexican vowel reduction or the various allophones of the Puerto Rican /r/ is more than welcome to do so.


About Ben

Ben T. Smith launched his dialect fascination while working in theatre. He has worked as an actor, playwright, director, critic and dialect coach. Other passions include linguistics, urban development, philosophy and film.
This entry was posted in English Phonetics, Uncategorized.

41 Responses to On Dictionaries & Pronunciation

  1. Aniruddha Gupta says:

    Well, the decision to go with Brazilian pronunciation over Portuguese pronunciation seems to be a recognition of the geopolitical power wielded by Brazil today, which far outstrips that of its former colonizer. For English, while the US is more powerful than the UK, the latter is still counted among the great powers.

  2. dw says:

    For the Latin American languages, why do many dictionaries go with the language typical of the colonies and not the original imperial power?

    Is “because the vast majority of native speakers of both languages live in the Americas” too obvious an answer?

    By the same principle, one might expect a North American variety of English to be used, but then Collins is a UK-based publisher.

    • trawicks says:

      That is the obvious answer. What’s interesting, though, is that for Spanish and Portuguese, the language of the majority is the ‘standard’ in these situations; while for English it’s the accent of an extreme minority (RP), even in the UK.

  3. The IPA does not seem to be internationally used, in spite of what the “I” stands for.

    • trawicks says:

      It’s used in most places (or at least anywhere where linguists are!) I’ve seen IPA characters used in an Italian pronunciation guide for Sicilian, just to cite one example.

  4. ella says:

    have you seen this one yet?

    • trawicks says:

      I haven’t! Reminds me of a rant I once heard a Spanish person go on about trying to speak the language in Puerto Rico (which I’d say is roughly comparable to an American trying to get on in a very working-class pub in Glasgow).

    • Elizabeth D. says:

      I don’t get it, because…you know…it’s in Spanish. I mean, obviously I understand the title, but not too much besides that.

      • ella says:

        sorry for the late reply – basically it’s a pastiche on the myriad different dialects that exist in the umbrella language ‘Spanish’. It’s a story about a Spanish student travelling the Spanish speaking world and having to re-learn how to say things everywhere he travels.

      • ella says:

        and if you’re interested, you can find the lyrics by clicking through to the page on YouTube.

  5. Sooryan FM says:

    UK-made US English dictionaries (what a paradox, but they exist) opt for the cot-caught distinction of the Back East type, which sounds regional to West Americans [and Canadians]; Hollywood is in the West, and not in New Jersey 😉

    Merriam Webster’s Learner’s dictionary recommends that foreign learners of American English pronounce COT as CAUGHT, CALLER as COLLAR, PAUL as POL, all with the unrounded back vowel:

    • Ellen K. says:

      No, pronouncing them unrounded would be pronouncing CAUGHT (rounded) as COT (unrounded). (Though they are in different places, in addition to the rounding difference.)

  6. Sooryan FM says:

    The Collins Spanish Dictionary
    is in fact a UK English – Peninsular Spanish dictionary
    [many peninsularisms are not labeled as such;
    PILLAR is entered as ”to catch, to get” but it should be like this:

    PILLAR (Spain, informal) = to catch, to get
    PILLAR (Argentina, informal) = to (take a) piss ]

    The Oxford Spanish dictionary is much better,
    it features all varieties of both Spanish and English,
    and it is not as Europe-centric.

    • boynamedsue says:

      Pillar is still used in Argentina, to mean catch, and is very much needed given they can’t use coger because that’s even worse. I’ve never heard “pillar” to mean piss in Argentinian, is it lunfardo or actual Rioplatense Spanish?

      Pillar (catch) /Pillar (piss) have probably only been universal homophones for 30 years, the zh pronunciation of “ll” has spread a lot recently, to exclude all other forms. Pillar meaning “piss” almost certainly comes from Italian “pisciare”, and would always have been used with a sh/zh consonant in the middle, while the other could be pronounced ly/y as well as zh/sh.

      • Sooryan FM says:

        Well, the Argentinian dictionary does not agree with you,
        enter PILLAR here and see for yourself:


        It’s the online version of the
        ”Diccionario integral del español de la Argentina”
        published by Voz Activa and Clarín in 2009

        It is a national dictionary used in all schools in Argentina.

        • boynamedsue says:

          Maybe you are right, I’ve certainly heard pillar from Argentinians in Spain though. I’m also certain that the Arg use comes from pisciare, I don’t see how it’s possible to be otherwise.

      • Claudia says:

        Yup, “pillar” (pronounced pe-shar) means to piss; but if you said it like a Spaniard (pe-lyar), we would understand it means to fetch. Context also helps.
        Hence, we never use it as a substitute for “coger”. You can use tomar, agarrar, sostener, tener, pegar and some more, but rioplatenses don’t use pillar as “to fetch”.
        I’m loving this blog!

  7. AL says:

    For Asian languages not being in IPA, is it because spelling them out with Latin letters is already (typically) done purely for phonetics? For example, using PinYin for Mandarin Chinese is already giving you a precise letter-to-sound correlation because the letters were only chosen to represent the sound?

    • trawicks says:

      Possibly. It depends on the language. One possibility with Chinese is that dictionary writers don’t want to bother with the tones (which would be a little awkward in small text).

  8. Ed says:

    The change in GOAT from oʊ to əʊ was by AC Gimson in 1962. As UCL Chair of Phonetics, his word was like an order from the Pope.

    I actually work with several people who use oʊ in GOAT. I think that some Northerners who are trying to produce a diphthong go from [o:] to [oʊ], as they’re used to having [o] as the first element.

  9. Sooryan FM says:

    It’s strange: Oxford dictionaries aimed at English L2 users feature the traditional IPA system, while the ones aimed at English L1 users/native speakers (like the [New] Oxford Dictionary of English and the Shorter Oxford English Dictionary) use the newer IPA description of Standard British English (called the Upton system), with TRAP being shown as [trap].

  10. IVV says:

    Growing up, we could never understand why there were different symbols for ä, ô, and ŏ. (What, an American dictionary using IPA? Never heard of it.) They all had the same sound! The exact same sound!

    • Sooryan FM says:

      yeah, father ~ bother ~ daughter

      • Elizabeth D. says:

        For you maybe, but not for everyone. According to the Atlas of North American English, most Americans make a distinction between caught and cot.

        • Julie says:

          Most? Maybe, since the East is more densely populated. But most Westerners don’t. As a kid, I ran afoul of a dictionary which defined the ô sound (as in daughter) as the vowel in “court.” Not in any accent I’d ever heard!

        • Sooryan FM says:

          Even in Vermont and in the northern parts of NY state, people are cot/caught merged (to a low back unrounded vowel). The merger is spreading. Listen to Lana del Rey, a c/c merger from Lake Placid NY.

        • Elizabeth D. says:

          I e-mailed William Labov myself and he said most people make the distinction, but the merged area is larger.

          @ Sooryan FM:
          It may be spreading, but the majority of people don’t have it right now. The area you mention is very sparsely populated.

        • IVV says:

          I’m aware that where I grew up is not indicative of the population as a whole, but it doesn’t change the fact that when you have no difference, and your teachers have no difference, and everyone you hear in your day-to-day activities has no difference, and GenAm is cot/caught merged… it’s frustrating for a child.

    • m.m. says:

      it’s funny you noticed this as a child [as you mentioned ‘growing up’].

      i never noticed the different symbols in american dictionaries (never IPA of course) until i learned IPA, and then i was like “WHY ARE THERE HIEROGLYPHS IN THE DICTIONARY?!”

      • Julie says:

        You mean you weren’t one of us nerdy types who actually read the dictionary? 🙂

        • m.m. says:

          Oh no, I read it, I just never EVER paid attention to what I now know to be a ‘pronunciation guide’. But after learning IPA, I went back and noticed something IPA-like in the dictionaries that I’d never noticed, and it was surreal xD

  11. Sooryan FM says:

    Most people with the Cot/Caught merger speak standard General American (people in the West, excluding sociolects like Surfer Dude or Valley Girl variants which do sound accented). Most people with the Cot/Caught unmerger sound distinctively regional (Southerners, people living on the East Coast, between Tampa and NYC, people in the Great Lakes area with the Northern Cities Vowel Shift).
    So, while the Queens (NYC) accent and the Chicago accent may be the same thing phonologically, these accents have nothing in common phonetically: you can hear young Chicagoans pronouncing ”caught” as [kä:t] and ”wall” as [wä:l] which is so different than the pronunciation in Queens (NYC) or New Jersey. In Alabama, even cot/caught unmerged people pronounce ”all, y’all” with the unrounded vowel [äl] and ”solve” with the rounded vowel, which is exactly opposite from the NYC accent.

    • Julie says:

      Hmmm….not necessarily, Sooryan. I think you may not have noticed the midwestern “unmerger.” There are people who don’t actually merge cot and caught, but pronounce both within the range of my (merged) vowel. That is, unless it’s brought to my attention, perhaps by hearing both sounds in quick succession, I won’t notice that they’re not merged. I think I was married for years before I noticed that my husband has an unmerged accent, and he speaks GA with no obvious regionalisms.

      • gaelsano says:

        Chicago (non-AAVE speakers) generally are cot-caught unmerged, but due to a chain shift, the THOUGHT vowel sounds like a merged person’s PALM-LOT vowel.

        Chicago people would seem (from the perspective of a Pennsylvanian or non-Great Lakes New York Stater) to say “My mom and Dad got a call” as “My mam and day-id gat a cahl.” The distinction is there. You can’t take one Chicagoan’s “caught” and compare it to a GenAm’s “cot” and declare that they’re merged.

        That being said, there are people with cot-caught merger and Northern Cities Vowel Shift (NCVS is quite variable in quality and degree from speaker to speaker). They seem to be scattered randomly and to be a product of NCVS parents growing up in a cot-caught merged area.

        I substituted in the Adirondacks of New York after doing some ESL teaching in a somewhat neutral accent (American cadence but father-bother artificially unmerged and cot-caught fully unmerged and Wales-Whales somewhat unmerged and atom-Adam unmerged–in fact, I now do Canadian raising on writing vs. riding while also maintaining a pure t sound).

        I found that nearly all students found my pronunciation of daughter a little off (they were used to hearing a clear t from broadcasters, but my THOUGHT vowel was more NORTH than a prolonged COT).

        They were all naturally cot-caught merged like their neighbors in Vermont.

        One girl though had VERY advanced NCVS despite being a native of the area. I had a laugh and a tease about the fact we were going to be reading about “Anne Frank” and not “Ian Frank.” I admit to having a similar confusion when young about Don and Dawn, since I grew up cot-caught merged in auto-otto and Don-Dawn, but didn’t rhyme doll with all. Her classmates tried to help her pin down the TRAP vowel, but they had to give it up. Poor kid’s gonna have the Don-Dawn and Ian-Anne mixup again someday and screw up the Mr/Ms distinction.

        Then again, Brits will screw up Donna-Donner in both directions due to intrusive “r.” I thought the BBC had recruited reindeer for the fourth season of Doctor Who when I heard about “Donner and er adventchas wiv the Docta.”

        • gaelsano says:

          “…a product of NCVS parents growing up in a cot-caught merged area.”

          “…a product of having NCVS parents and growing up in a cot-caught merged area.”*****

        • Sooryan FM says:

          Here is a nice research article on the spreading of the cot/caught merger in the State of New York, a very interesting read:

          ”Progress toward the low back merger in New York State” (the complete article is available free of charge)


      • gaelsano says:

        I think Sooryan was aware that Chicagoans and NYCers both pronounce wall and caught with a vowel different from their respective cots.

      • gaelsano says:

        I can’t say that I’m impressed with Collins here.

        I understand the traditional lack of yod coalescence in words like education and of course in tune.

        I do NOT understand why Asia is two syllables while glacier is three, though. Why has “real” been reduced from ree-el to rearl? If you’re going to go with traditional and neutral forms, then keep real as two syllables.

        I am also baffled as to why HAPPY and KIT have the same symbol. ARRGH! This means that someone can’t tell if a word like utopia is four syllables (u-top-i-a) or if it’s three syllables (rhyming with You-toe-peer). The OED is even more perplexing since it says that HAPPY and KIT are different, but that rapier is ray-peer and utopia is you-toe-peer.

        I avoided IPA since I want you to try reading aloud utopia with three syllables as you-toe-peer and see how strange it is. Idea (I-D-a) as I-deer, sure, but utopia as you-toe-peer?
