North American Dialects On Twitter and YouTube

Posted on Wednesday, January 12th, 02011 by Austin Brown
Categories: Rosetta, Technology, The Big Here

Using data from the Atlas of North American English (ANAE) by William Labov, Sharon Ash, and Charles Boberg, combined with his own research, linguist Rick Aschmann created the detailed map above to show regional dialects throughout North America. One of the coolest features is that he’s linked over 600 YouTube videos to the map: clicking a region takes you to video clips of (mostly famous) people raised in that area, so you can hear a sample of the dialect.

Researchers at Carnegie Mellon have done similar research, though they’re using social media – Twitter specifically – as the data source itself, rather than just as a way to illustrate linguistic nuance. Jacob Eisenstein and his colleagues recently analyzed 380,000 geotagged tweets and explored the geographical dialects represented within them. They saw differences in the way people abbreviate words to fit the short medium and in the slang terms they use in informal messaging. From that variation they built a statistical model that could predict a user’s location to within about 300 miles based on the dialect used.

Twitter and other informal microblogging platforms give linguistics researchers a newly accessible, low-cost source of data, since uncovering patterns of informal speech no longer requires labor-intensive in-person interviews:

Studies of regional dialects traditionally have been based primarily on oral interviews, Eisenstein said, noting that written communication often is less reflective of regional influences because writing, even in blogs, tends to be formal and thus homogenized. But Twitter offers a new way of studying regional lexicon, he explained, because tweets are informal and conversational. Furthermore, people who tweet using mobile phones have the option of geotagging their messages with GPS coordinates.

- Carnegie Mellon University

Eisenstein also points out that the identifiable regional variation could indicate that the internet is less a force for homogenization than is often thought.

The Georgetown University Round Table on Languages and Linguistics later this year will explore many ways in which these “new worlds of words occasion innovative uses of language and new spaces for constructing identities, forming relationships, and expressing social meanings.” (GURT 2011)

So expect to see plenty more research mining social media, and remember to act normal online so you don’t throw off the results.

