Nearly every vowel of English can be pronounced as a diphthong in at least some variety of the language. In fact, modern English largely emerged due to vowels that were once monophthongs (single vowels) shifting to become diphthongs. (The ‘i’ in ‘night’ and the ‘ou’ in ‘mouth’ are two examples of these.) Our language undergoes a perpetual process of single vowel phones evolving into two.
In American accents in particular, a specific kind of diphthong emerges. Namely, monophthongs are often followed by a schwa, the little ‘uh’ sound in the word ‘afraid.’ We generally refer to this as a centering diphthong, with the schwa itself an example of an off-glide.
Perhaps the most famous example of an American centering diphthong is the New York pronunciation of ‘coffee’ and ‘thought,’ which are roughly ‘caw-uhfee’ and ‘thaw-uht’ (IPA kɔəfi and θɔət). You can find parallels in a few non-American accents; Belfastians can pronounce the word ‘saw’ in a similar way (sɔə) and Cockneys do the same with words like ‘bore’ (bɔə). But in neither accent is this as widespread or systematic as it is in New York.
Then there is the ‘a’ in ‘trap,’ which has an off-glide in some accents, ranging from the Inland North to New England and California. Out West, I’ve noticed that it’s quite common for a schwa to follow many short/lax vowels, such as the vowel in ‘kit’ (kɪət) and ‘dress’ (dɹɛəs). Bostonians, meanwhile, can add a schwa after the vowel in ‘lot’ (lɒət), although this strikes me as indicative of the strongest accents.
To summarize, then, vowels don’t simply become longer in American English; they often become longer and become diphthongs. Why?
There are no clear answers, and those that I can think of are mere speculation. As I mentioned, schwa off-glides are typical in Northern Irish accents, and as one of the early settlement groups in the US was the Scots-Irish, one might see a correlation there (although many of these early settlers would have presumably spoken Scots, so maybe not).
Earlier varieties of Dutch (an important language in 17th-Century America) would have featured centering diphthongs of the type discussed here. In fact, contemporary Afrikaans features a long ‘o’ sound that is not dissimilar to the New York City vowel in ‘thought’ (oə). This is possibly pure coincidence, and I don’t know enough about 17th-Century Dutch to say if this was a common pronunciation then.
Of course, many non-American accents have centering diphthongs as well. British Received Pronunciation has three: the ‘ear’ in ‘fear’ (fɪə), the ‘air’ in ‘fair’ (feə), and the ‘oor’ in ‘poor’ (pʊə). But these all serve a very specific phonemic purpose. In American English, by contrast, it seems that a schwa or schwa-like off-glide can be tacked on to almost any vowel without it changing the meaning of a word*.
So at the end of the day, it’s puzzling why this tendency seems so much a part of the American dialect landscape. Why are we so inclined to add a schwa after vowels?
*Important exceptions are the ‘oo’ in ‘goose’ and the ‘ee’ in ‘fleece.’