What is Natural Language Processing?
Natural language processing (NLP) is the field concerned with enabling computers to interpret text or speech expressed in human language. Since it’s not possible to teach a computer every question it might reasonably be asked, there needs to be a smarter solution.
Computers have a finite vocabulary and strict syntax; humans do not. We can quickly understand what we have heard and make reasonable assumptions about its meaning. The result can be clear, funny, or even frustrating communication.
For computers to better serve our needs in the future, they will need to learn our language. This is where natural language processing can help: it breaks down our sentences, processes their meaning, and attempts to answer questions it has never seen before.
Here’s a short video from Crash Course to help you understand more about natural language processing.
Languages differ not only in the words they use but also in their sentence structures. In English, a simple sentence follows these rules:
- Sentences must include a noun phrase.
- Sentences must include a verb phrase.
- A verb phrase must follow the noun phrase.
However, the way these phrases are constructed can vary dramatically. There are nine fundamental types of words in the English language. These are called parts of speech, and they play an essential role in sentence structure.
A noun phrase can be constructed in several different ways:
- [Noun Phrase] includes [Noun]
- [Noun Phrase] includes [Article][Noun]
- [Noun Phrase] includes [Adjective][Noun]
The same is true for verb phrases, which can be constructed in many different ways:
- [Verb Phrase] includes [Verb]
- [Verb Phrase] includes [Verb][Noun Phrase]
- [Verb Phrase] includes [Verb][Prepositional Phrase]
- [Verb Phrase] includes [Verb][Noun Phrase][Prepositional Phrase]
- [Verb Phrase] includes [Verb][Noun Phrase][Adverb]
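The phrase patterns above can be written down as data and checked mechanically. The following is a minimal sketch rather than a real NLP system: it assumes each word has already been labelled with its part of speech, and the names `GRAMMAR`, `match`, and `is_sentence` are made up for this illustration. The text does not define a prepositional phrase, so the sketch assumes the common definition of a preposition followed by a noun phrase.

```python
# Phrase patterns copied from the lists above.
# "NP", "VP" and "PP" inside a pattern refer to other phrase types.
GRAMMAR = {
    "NP": [["Noun"], ["Article", "Noun"], ["Adjective", "Noun"]],
    "VP": [
        ["Verb"],
        ["Verb", "NP"],
        ["Verb", "PP"],
        ["Verb", "NP", "PP"],
        ["Verb", "NP", "Adverb"],
    ],
    # Assumption: a prepositional phrase is a preposition plus a noun phrase.
    "PP": [["Preposition", "NP"]],
    # A sentence is a noun phrase followed by a verb phrase.
    "S": [["NP", "VP"]],
}


def match(symbol, tags, start):
    """Return every index at which `symbol` can end when it begins at `start`."""
    if symbol not in GRAMMAR:  # a plain part of speech, e.g. "Noun"
        if start < len(tags) and tags[start] == symbol:
            return {start + 1}
        return set()
    ends = set()
    for pattern in GRAMMAR[symbol]:
        positions = {start}
        for part in pattern:
            # Advance every candidate position past the next part of the pattern.
            positions = {end for pos in positions for end in match(part, tags, pos)}
        ends |= positions
    return ends


def is_sentence(tags):
    """True if the whole tag sequence forms a noun phrase followed by a verb phrase."""
    return len(tags) in match("S", tags, 0)


# "The dog chased the cat"
print(is_sentence(["Article", "Noun", "Verb", "Article", "Noun"]))
# "Chased dog" has no leading noun phrase
print(is_sentence(["Verb", "Noun"]))
```

Keeping the rules in a dictionary means new patterns can be added without touching the matching logic, which is one reason grammars are often stored as data.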
When computers perform natural language processing, they turn these phrases into a parse tree. The tree helps the computer analyse each word and then build an understanding of the sentence from its structure.
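To make the idea of a parse tree concrete, here is a small sketch that builds one as nested tuples. It is only an illustration under stated assumptions: `LEXICON` is a hypothetical hand-written word list, the function names are invented for this example, and prepositional phrases are left out to keep it short. Real parsers handle ambiguity and far larger vocabularies.

```python
# Hypothetical mini-lexicon mapping words to parts of speech.
LEXICON = {
    "the": "Article", "dog": "Noun", "ball": "Noun",
    "big": "Adjective", "chased": "Verb", "quickly": "Adverb",
}


def tag_at(words, i):
    """Part of speech of the word at index i, or None if out of range or unknown."""
    return LEXICON.get(words[i]) if i < len(words) else None


def parse_noun_phrase(words, start):
    """Parse [Noun], [Article][Noun] or [Adjective][Noun] beginning at `start`."""
    i = start
    children = []
    if tag_at(words, i) in ("Article", "Adjective"):
        children.append((tag_at(words, i), words[i]))
        i += 1
    if tag_at(words, i) == "Noun":
        children.append(("Noun", words[i]))
        return ("NounPhrase", children), i + 1
    return None, start  # no noun phrase here; consume nothing


def parse_verb_phrase(words, start):
    """Parse [Verb], [Verb][Noun Phrase] or [Verb][Noun Phrase][Adverb]."""
    if tag_at(words, start) != "Verb":
        return None, start
    children = [("Verb", words[start])]
    i = start + 1
    noun_phrase, after_np = parse_noun_phrase(words, i)
    if noun_phrase is not None:
        children.append(noun_phrase)
        i = after_np
        if tag_at(words, i) == "Adverb":
            children.append(("Adverb", words[i]))
            i += 1
    return ("VerbPhrase", children), i


def parse_sentence(text):
    """Return a parse tree for `text`, or None if it does not fit the rules."""
    words = text.lower().split()
    noun_phrase, i = parse_noun_phrase(words, 0)
    if noun_phrase is None:
        return None
    verb_phrase, i = parse_verb_phrase(words, i)
    if verb_phrase is None or i != len(words):
        return None
    return ("Sentence", [noun_phrase, verb_phrase])


tree = parse_sentence("The dog chased the ball")
print(tree)
```

Each node is a `(label, children)` pair, and each leaf is a `(part of speech, word)` pair, so later steps can walk the tree to find, for example, which noun phrase is the subject and which is the object.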
Google's Knowledge Graph
While NLP includes the understanding of natural language, it’s not limited to it. For something to count as natural language processing, there needs to be some further computation involved. An example of this is the Google Knowledge Graph.
By processing data and understanding the content it finds, Google is able to draw connections between topics and words. This process is different from Latent Semantic Indexing, which focuses on understanding the relationship between words in isolation.
An example of how this differs is the artist Leonardo da Vinci. When you search for him, Google recognises that he is a person from the Renaissance period. He is associated with topics such as art, science, and engineering, and he can also be linked to other famous artists such as Vincent van Gogh or Pablo Picasso, although they worked in much later periods.
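One common way to picture connections like these is as a list of (subject, relation, object) facts. The sketch below is purely illustrative and makes no claim about how Google's Knowledge Graph is built: the relation names (`is_a`, `period`, `field`) and the helper functions are invented for this example, and the facts are limited to those mentioned above.

```python
# Facts stored as (subject, relation, object) triples.
# Relation names are hypothetical labels chosen for this sketch.
TRIPLES = [
    ("Leonardo da Vinci", "is_a", "Person"),
    ("Leonardo da Vinci", "period", "Renaissance"),
    ("Leonardo da Vinci", "field", "Art"),
    ("Leonardo da Vinci", "field", "Science"),
    ("Leonardo da Vinci", "field", "Engineering"),
    ("Vincent van Gogh", "is_a", "Person"),
    ("Vincent van Gogh", "field", "Art"),
    ("Pablo Picasso", "is_a", "Person"),
    ("Pablo Picasso", "field", "Art"),
]


def facts_about(entity):
    """All (relation, object) pairs recorded for `entity`, sorted for stable output."""
    return sorted((rel, obj) for subj, rel, obj in TRIPLES if subj == entity)


def related(entity, relation):
    """Other entities that share at least one value of `relation` with `entity`."""
    values = {obj for subj, rel, obj in TRIPLES if subj == entity and rel == relation}
    return sorted({
        subj for subj, rel, obj in TRIPLES
        if subj != entity and rel == relation and obj in values
    })


print(facts_about("Leonardo da Vinci"))
print(related("Leonardo da Vinci", "field"))   # artists sharing a field
print(related("Leonardo da Vinci", "period"))  # nobody else is recorded for this period
```

The point of the structure is that a search for one entity can surface others through shared facts, which is a different kind of relationship from two words simply appearing near each other.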
Google created a video announcing its Knowledge Graph back in 2012. While the video does a great job of explaining what the Knowledge Graph is and why Google created it, it doesn’t discuss how it works. However, watching the video does give you a rough idea of what is required to turn content into something computers can understand.