Twitter is a place filled with #content. And sometimes (read: frequently), that #content also has at least a tinge of #sarcasm. So it’s no surprise that two computer scientists found a way to use Twitter’s proclivity for dry humor to help teach computers to identify when people are being sarcastic over text.
In the paper, titled “Contextualized Sarcasm Detection on Twitter,” researchers said that they trained their computers to detect sarcasm based on a variety of factors, including keywords (“clearly,” “shocked,” “gasp”) as well as hyperbole (“really”) and even hashtags like “#lol” and, yes, “#sarcasm.” The computers also analyzed a variety of related factors, such as geolocation, age, gender, and the history of communication both by the user and between the user and his or her audience.
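For the curious, here is a rough sketch of what turning a tweet into those kinds of signals might look like. It is not the researchers' code: the cue lists are seeded only with the examples quoted above, and the function name `extract_features` and the `author` fields (`verified`, `timezone`) are hypothetical names made up for illustration.

```python
import re

# Hypothetical cue lists, seeded only with the examples quoted in the article.
KEYWORD_CUES = {"clearly", "shocked", "gasp"}
INTENSIFIER_CUES = {"really"}
HASHTAG_CUES = {"#lol", "#sarcasm"}


def extract_features(text, author=None):
    """Return a dict of simple text and context features for one tweet.

    `author` is an optional dict of assumed metadata (illustrative only).
    """
    lowered = text.lower()
    words = set(re.findall(r"[a-z']+", lowered))
    hashtags = set(re.findall(r"#\w+", lowered))

    features = {
        "has_keyword_cue": bool(words & KEYWORD_CUES),
        "has_intensifier": bool(words & INTENSIFIER_CUES),
        "has_hashtag_cue": bool(hashtags & HASHTAG_CUES),
    }

    # Context features: assumed fields, standing in for the author signals
    # the article mentions (account details, location, and so on).
    if author is not None:
        features["author_verified"] = bool(author.get("verified", False))
        features["author_timezone"] = author.get("timezone", "unknown")

    return features


example = extract_features(
    "Clearly I am shocked. #sarcasm",
    author={"verified": False, "timezone": "US/Eastern"},
)
```

A real system would feed features like these, along with many more, into a trained classifier; this sketch only shows the feature-gathering step.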
Turns out, unverified male users from U.S. timezones are more likely to be sarcastic. Who knew?
By compiling both text and context signifiers, the researchers were able to train the computers to guess sarcasm at an 85 percent success rate. Additionally, the “#sarcasm” hashtag is actually misleading: it’s more commonly used to reinforce that an idea is sarcastic to unfamiliar audiences, rather than being a good indicator of sarcasm itself.
While some of the applications of this research are obvious (can anyone say ‘Turing Test’?), the detection of sarcasm in social media posts could lead to better, more sophisticated efforts at tracking and filtering specific kinds of language online. That is, a computer might be able to distinguish between a statement meant as a joke and one meant seriously. It could also mean that computers can better understand the rapport and context within a network of users, which holds its own promise down the line.
So now we know Twitter is at least good for one thing. Cue my shocked expression.
➤ My Favorite Thing about the Internet? Definitely the Sarcasm [MIT Tech Review]