Whether it abbreviates a common phrase or keeps outsiders at bay with in-group lingo, slang is a constantly changing linguistic phenomenon found in subcultures across the globe.
As a previous Luminoso blog post pointed out, a word like “wicked” can mean evil or morally wrong, or it can mean really, very, or great, depending on the population or locale using it. Suddenly, the wicked witch of the east isn’t so bad after all.
And the truth is, how well you handle dramatically different uses of the same (or new) words can have a huge impact on your ability to understand customers.
Where did slang originate?
The origins of slang are not exactly cut and dried, but many linguists point to the Norwegian word “slengenavn” (nickname) as a possible O.G. term. The related root “sling” (to throw) also fits well with slang being a quick and honest way to make a point.
In many ways, slang is an emergent phenomenon that humans use to describe new experiences that don’t fit neatly within their existing vocabulary. In fact, words like “strenuous” and “terrific” (which originally meant something terrifying) were once slang terms, but you’d never know it today.
How do AI systems interpret slang?
Most natural language technologies can’t rely on a single catch-all list of slang, because these terms evolve constantly, and dictionaries simply lack the context these systems need when deciphering new meanings.
To get around this issue, many natural language tools seek out massive training datasets they can use to begin understanding domain-specific vocabulary like slang. Word embeddings, which represent each word as a vector of numbers, are also commonly used to let these systems measure the semantic proximity of words.
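To make "semantic proximity" a little more concrete, here is a minimal Python sketch of how closeness between word vectors is often measured with cosine similarity. The three-dimensional vectors and the names TOY_EMBEDDINGS and most_similar are invented for illustration; real embeddings are learned from large corpora and typically have hundreds of dimensions, and this is not a description of any particular product's implementation.

```python
import math

# Toy vectors made up for this example (an assumption, not real model output).
TOY_EMBEDDINGS = {
    "wicked": [0.9, 0.8, 0.1],
    "awesome": [0.8, 0.9, 0.2],
    "invoice": [0.1, 0.0, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, embeddings):
    """Return the other word whose vector is closest to `word`."""
    target = embeddings[word]
    candidates = [w for w in embeddings if w != word]
    return max(candidates, key=lambda w: cosine_similarity(target, embeddings[w]))


if __name__ == "__main__":
    print(most_similar("wicked", TOY_EMBEDDINGS))
```

With these toy numbers, the slang sense of “wicked” lands next to “awesome” rather than “invoice”, which is the kind of signal a system can use to guess at an unfamiliar term's meaning.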
The Luminoso + ConceptNet advantage
One key point about Luminoso’s text parsing is that it learns from the data itself. In other words, Luminoso doesn’t need anyone to tell it what “unknown” words mean. By analyzing 10,000 comments for the new drug called Symdeko, for instance, Luminoso can understand the usage and what Symdeko “means” when people say “my doctor prescribed Symdeko,” or “I wasn’t happy with Symdeko's side effects.”
On the other hand, if Luminoso doesn’t see a slang term in the data often enough to understand its context, we rely on ConceptNet to supplement that knowledge. And because ConceptNet is kept up to date, Luminoso already knows that “stan” means a megafan and “fo shizzle” means for sure.
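The two-step idea described here (learn from the data when a term shows up often enough, otherwise fall back on background knowledge) can be sketched in a few lines of Python. Everything in this example is a hypothetical stand-in: BACKGROUND_KNOWLEDGE is a tiny hand-written dictionary rather than ConceptNet, MIN_OCCURRENCES is an arbitrary threshold, and describe_term is an illustrative helper, not Luminoso's actual logic.

```python
from collections import Counter

# Hypothetical stand-in for a background knowledge source such as ConceptNet.
BACKGROUND_KNOWLEDGE = {
    "stan": "megafan",
    "fo shizzle": "for sure",
}

# Arbitrary cutoff chosen for this sketch (an assumption).
MIN_OCCURRENCES = 3


def context_words(term, comments):
    """Count the words that appear in the same comment as `term`."""
    counts = Counter()
    for comment in comments:
        words = comment.lower().split()
        if term in words:
            counts.update(w for w in words if w != term)
    return counts


def describe_term(term, comments, background=BACKGROUND_KNOWLEDGE):
    """Prefer context learned from the data; otherwise fall back to background knowledge."""
    term = term.lower()
    # Matching is by whitespace-separated token, so multi-word phrases are
    # never counted as seen in the data in this simplified sketch.
    mentions = sum(1 for c in comments if term in c.lower().split())
    if mentions >= MIN_OCCURRENCES:
        top = [w for w, _ in context_words(term, comments).most_common(3)]
        return ("data", top)
    if term in background:
        return ("background", background[term])
    return ("unknown", None)
```

A term that appears in at least three comments is described by its most frequent neighboring words, while a rarer term such as “stan” is answered from the background dictionary instead.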
This incredibly valuable (and growing) layer of real-world terms and the relationships they have to one another allows Luminoso to stay hip to the newest lingo as we continue to create it.