Like a spider web, it’s beautiful, intricate, mathematical, and mysterious. The intriguing field of Computational Linguistics embodies all of these qualities and more. Don’t click away, my friends. This is interesting stuff, I assure you. It’s also incredibly important to technology. Okay, that sounds boring, but it’s really awesome, so stick with me.
Extracting meaning from natural language is what Computational Linguistics is all about. Computers can convert natural language into properly spelled texts and documents, but they cannot understand the meaning, or semantics, of the words.
Computational Linguistics strives to create computers that understand and correctly respond to natural language.
Human language…well, it’s not so easy to put in a nutshell. It’s complex; extremely complex. The great news is, there are scientists combining natural language with complex computer algorithms! Now stay with me and I’ll tell you why this is so cool.
Today, our communications are predominantly digitized. Using computers, we text, blog, Facebook, write, and download video billions of times a day. What if computers could understand what we’re saying in these digital bytes? I mean REALLY understand the meaning of our words.
Imagine downloading hundreds of research papers into a program that scans, deciphers, comprehends and paraphrases everything into a concise report. Amazing, right? What if a computer could quickly sift through customer reviews and understand feedback on products and services? The system would extract data, categorize responses and know exactly how the customer feels. Bada Bing!
Don’t we already use this technology to talk with our phones? Not really, but it’s a start and herein lies the tangled web. To understand the complexity issue we need a wee bit of language theory in encapsulated form; short and easy to swallow. I promise this will help explain why computers have problems comprehending natural language.
Formal Semantic Theory (FST) says: if you know the words, their definitions, and the sequence of the words, then the meaning of a sentence can be found using a complex, mechanical algorithm. Sounds pretty straightforward, right? In 1981, Dr. Hans Kamp, an expert in formal logic and the philosophy of language, put forward a theory explaining why this isn’t such a simple proposition. Behold the birth of Discourse Representation Theory (DRT). What?! Hang on, it’s not so bad. DRT builds on FST.
In FST, the emphasis is on each word and its contribution to a sentence’s meaning.
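To make that word-by-word idea concrete, here’s a toy sketch of the compositional intuition behind formal semantics: each word contributes a piece of meaning, and the sentence’s meaning falls out of combining those pieces mechanically. This is a drastic simplification of my own, not FST itself — the names and representation here are made up for illustration.

```python
# Toy compositional semantics: each word's "meaning" is a small
# Python value or function, and sentence meaning is built by
# mechanically combining them. (Purely illustrative.)

def peter():
    # A proper name contributes an individual.
    return "Peter"

def sleeps(subject):
    # An intransitive verb contributes a predicate applied to a subject.
    return f"sleeps({subject})"

# "Peter sleeps" -> apply the verb's meaning to the subject's meaning.
print(sleeps(peter()))  # sleeps(Peter)
```

Real formal semantics uses typed lambda calculus rather than Python functions, but the mechanical, word-driven spirit is the same.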
DRT looks at the dynamics of how sentences are connected to each other. It attempts to understand how context is built upon and changed with each succeeding sentence. Dr. Kamp explains,
The perspective is shifted to another aspect of the way we use language. A sentence sets the scene for what is to come next. The two-way interaction of each sentence uses and sets context.
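That “uses and sets context” idea can be sketched in code. Below is a toy model — emphatically not Kamp’s actual formalism — of a Discourse Representation Structure (DRS) that accumulates discourse referents and conditions as each sentence is processed. All names here are my own assumptions for illustration.

```python
# A toy sketch of a Discourse Representation Structure (DRS):
# each sentence adds referents and conditions to a growing context,
# and later sentences can refer back to earlier referents.

class DRS:
    def __init__(self):
        self.referents = []   # discourse referents introduced so far
        self.conditions = []  # facts asserted about them

    def add_sentence(self, new_referents, new_conditions):
        # Each sentence both USES the existing context (its conditions
        # may mention earlier referents) and SETS context for what follows.
        self.referents.extend(new_referents)
        self.conditions.extend(new_conditions)

drs = DRS()
drs.add_sentence(["x"], ["Peter(x)"])                  # "Peter walked in."
drs.add_sentence(["y"], ["donkey(y)", "owns(x, y)"])   # "He owns a donkey."

print(drs.referents)   # ['x', 'y']
print(drs.conditions)  # ['Peter(x)', 'donkey(y)', 'owns(x, y)']
```

Notice that the second sentence’s condition `owns(x, y)` reaches back to the referent `x` introduced by the first sentence — exactly the cross-sentence bookkeeping that static, sentence-at-a-time semantics struggles with.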
Language has Loopholes and Short Cuts!
Anaphora describes this tricky referencing capability in language. It’s the use of a word referring to or replacing a word used earlier in a sentence, to avoid repetition. This twist of pronoun referral can bog down mathematical approaches.
A prime example is the infamous Donkey Pronoun. Honestly, I’d never heard of it before and my head is about to explode, so I’ll summarize. In a podcast, Dr. Kamp used the sentence “If Peter owns a donkey then he beats it.” as an example. The “he” refers to Peter; the “it” refers to the donkey. Humans intuitively understand this. It’s a form of referencing that is natural for our brains, though scientifically this cognitive process is still a mystery. Anaphora represents a challenge in computational linguistics because referencing back can be extremely difficult and complex. The addition of the hypothetical “if” creates even more chaos. In other words, it’s a logistical nightmare for ye olde algorithms.
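To see why, here’s a toy “most recent matching noun” pronoun resolver — a naive heuristic of my own invention, not a real coreference system. The gender tags are hand-assigned assumptions.

```python
# A naive anaphora resolver: pick the most recent preceding noun
# whose (hand-assigned) gender tag matches the pronoun.

NOUNS = {"Peter": "masculine", "donkey": "neuter"}

def resolve(pronoun, preceding_nouns):
    """Return the most recent preceding noun matching the pronoun's tag."""
    wanted = {"he": "masculine", "it": "neuter"}[pronoun]
    for noun in reversed(preceding_nouns):
        if NOUNS[noun] == wanted:
            return noun
    return None

# "If Peter owns a donkey then he beats it."
seen = ["Peter", "donkey"]
print(resolve("he", seen))  # Peter
print(resolve("it", seen))  # donkey
```

The heuristic happens to get this sentence right, but it completely ignores the “if”: logically, “a donkey” is introduced inside a conditional, yet “it” still refers back to it. Capturing that scoping behavior is precisely what DRT was built for, and what shallow matching rules like this cannot do.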
This murky quagmire of uncertainty is familiar territory for those working in the field of Artificial Intelligence. Streamlining the subtle nuances of natural language into a defined algorithm seems impossible, but don’t tell them that. Can you imagine a future filled with machines that understand what you mean? I don’t think it’s a question of if we’ll make that quantum leap, but when.
“The easiest way to create an artificial form of life is to program a robot to repair & regenerate itself by buying spare parts off the internet.” ― Leslie Dean Brown
Excerpt from 2001: A Space Odyssey
Dave Bowman: Hello, HAL. Do you read me, HAL?
HAL: Affirmative, Dave. I read you.
Dave Bowman: Open the pod bay doors, HAL.
HAL: I’m sorry, Dave. I’m afraid I can’t do that.
Dave Bowman: What’s the problem?
HAL: I think you know what the problem is just as well as I do.
Dave Bowman: What are you talking about, HAL?
HAL: This mission is too important for me to allow you to jeopardize it.
Dave Bowman: I don’t know what you’re talking about, HAL.
HAL: I know that you and Frank were planning to disconnect me, and I’m afraid that’s something I cannot allow to happen.
Dave Bowman: [feigning ignorance] Where the hell did you get that idea, HAL?
HAL: Dave, although you took very thorough precautions in the pod against my hearing you, I could see your lips move.
Dave Bowman: Alright, HAL. I’ll go in through the emergency airlock.
HAL: Without your space helmet, Dave? You’re going to find that rather difficult.
Dave Bowman: HAL, I won’t argue with you anymore! Open the doors!
HAL: Dave, this conversation can serve no purpose anymore. Goodbye.