Most models in machine learning require working with numbers.  After all, many of the machine learning algorithms we’ve seen are derived from statistics (Linear Regression, Logistic Regression, Naive Bayes, etc.).  Additionally, machines can understand and work with numbers far more easily than we humans can.

However, machines just process the numbers and execute algorithms.  They don’t interpret the numbers returned.  They don’t understand the context of the data.  They especially don’t understand human intricacies and can easily be taken advantage of by rogue players.

So then, is it actually possible for computers to understand humans?  Can we ever have conversations with computers?  In a sense, we already can!  This is thanks to a branch of AI called Natural Language Processing.

What is Natural Language Processing?

Natural Language Processing (NLP) focuses on the ability of computers to understand the natural language of human beings.  Although the field has been around since the 1950s, recent advances in AI have greatly advanced NLP across many different types of problems.

When doing a Google search on NLP, please be aware that neuro-linguistic programming often goes by the same abbreviation.  Thus, it’s best to be more descriptive in your searches.

How To Represent Words?

Machines cannot work with words as easily as they work with numbers.  To a computer, a string is just an array of bytes.  Most machine learning models simply won’t be able to work with raw words.  However, there are two popular ways to allow computers to somewhat understand natural language.

Here’s a quick overview of the two methods.  Please be aware that there is more to these methods than described below.

N-gram

In N-gram, word ordering matters.  For example, N-gram would process the sentence “The quick brown fox jumped over the lazy dog” from left-to-right (or right-to-left in other languages).  The N in N-gram sets the number of consecutive words in each sequence.  For example, referring back to our example sentence, a 2-gram would result in the following sequences:

(“The”, “quick”), (“quick”, “brown”), (“brown”, “fox”), …, (“lazy”, “dog”)
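To make the sliding window concrete, here is a minimal Python sketch.  The function name `ngrams` is my own, and splitting on whitespace is a simplifying assumption; real tokenizers also deal with punctuation and casing.

```python
def ngrams(sentence, n):
    """Return the n-word tuples found in `sentence`, in reading order."""
    # Assumption: words are separated by whitespace only.
    words = sentence.split()
    # Slide a window of n words across the sentence, one word at a time.
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


sentence = "The quick brown fox jumped over the lazy dog"
bigrams = ngrams(sentence, 2)
print(bigrams[:3])   # the first three 2-grams
print(bigrams[-1])   # the last 2-gram
```

Changing the second argument to 3 would give 3-grams such as (“The”, “quick”, “brown”).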

Bag-of-Words

With Bag-of-Words, word ordering does not matter.  Instead, the words are collected in an array and counted by occurrence, ignoring capitalization so that “The” and “the” count as the same word.  In our example sentence “The quick brown fox jumped over the lazy dog,” our words would be represented as

\begin{tabular}{ l l l l l l l l } The & quick & brown & fox & jumped & over & lazy & dog \\ 2 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ \end{tabular}

If we decided to convert the sentence “The brown fox jumped” using the same set of words, the representation would be

\begin{tabular}{ l l l l l l l l } The & quick & brown & fox & jumped & over & lazy & dog \\ 1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\ \end{tabular}
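The two tables above can be reproduced with a short Python sketch.  The helper names (`tokenize`, `build_vocabulary`, `bag_of_words`) are made up for this example, and the tokenizer is deliberately naive: it lowercases each word and strips a few punctuation marks.

```python
from collections import Counter


def tokenize(sentence):
    # Lowercase and strip surrounding punctuation so "The" and "the" match.
    return [word.strip('.,!?"').lower() for word in sentence.split()]


def build_vocabulary(sentence):
    # Unique words, kept in order of first appearance.
    return list(dict.fromkeys(tokenize(sentence)))


def bag_of_words(sentence, vocabulary):
    # Count every word, then read off the count for each vocabulary entry.
    counts = Counter(tokenize(sentence))
    return [counts[word] for word in vocabulary]


vocabulary = build_vocabulary("The quick brown fox jumped over the lazy dog")
print(vocabulary)
print(bag_of_words("The quick brown fox jumped over the lazy dog", vocabulary))
print(bag_of_words("The brown fox jumped", vocabulary))
```

Notice that the second sentence is counted against the vocabulary built from the first one.  That is what keeps every sentence the same length as a vector, which is the form most machine learning models expect.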

NLP Applications

Because natural language is the best way to communicate with humans, a wide range of applications use NLP to solve problems.  Some of them include

  • Sentiment Analysis – With this application, a program takes in a sentence as input and determines whether the sentence is positive or negative.
  • Document Classification – Using the number of word occurrences in a document, we can determine whether a document pertains to a subject such as sports, drama, or science.
  • Translation – Can’t understand a webpage in an unfamiliar language?  Simply launch an app and input the sentence.  Using NLP techniques, the app can give you a good idea of what people are saying.
  • Chatbots – Probably the most well-known application of the last few years.  Users can interact with these programs to get answers to questions and to have actions performed on their behalf.
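As a rough illustration of document classification by word occurrences, here is a toy Python sketch.  The topics and keyword lists are invented for this example, and counting keyword hits is only a stand-in for a trained model (such as Naive Bayes), which would learn its word weights from labeled documents.

```python
from collections import Counter

# Hypothetical keyword lists, chosen only for this illustration.
TOPIC_KEYWORDS = {
    "sports": {"game", "team", "score", "player"},
    "science": {"experiment", "theory", "data", "cell"},
}


def classify(document):
    # Count how often each word occurs in the document.
    counts = Counter(document.lower().split())
    # Score each topic by the total occurrences of its keywords.
    scores = {
        topic: sum(counts[word] for word in keywords)
        for topic, keywords in TOPIC_KEYWORDS.items()
    }
    # Pick the highest-scoring topic (ties go to the first topic listed).
    return max(scores, key=scores.get)


print(classify("The team won the game after a late score"))
print(classify("The experiment produced data that supports the theory"))
```

The first document should land in “sports” and the second in “science.”  A document with no keywords at all would still be assigned the first topic, which is one of many reasons real classifiers are more involved than this.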

Challenges Of NLP

Despite the potential of human-computer interactions, there are several challenges that exist within NLP research.

Foreign to Machines

First, humans can understand language much more easily than computers can.  This is because, unlike numbers, language contains many nuances that can completely change the meaning of what was said.  Many humans take all these nuances for granted.  Computers, on the other hand, have a much harder time understanding what was meant, let alone what a word even means.  While there are advancements in NLP that improve on these challenges, that doesn’t mean that the recent advancements are a panacea for all problems within the domain.

In fact, if you go into robotics, you’ll end up hitting this issue across many domains, such as locomotion and vision.  This phenomenon is known as Moravec’s paradox.

Technical

While numbers are simple to process and follow a set size within a dataset, natural language is very flexible.  There are a lot of words and a lot of ways to express the same meaning.  Due to these attributes, natural language requires much more computing resources.  There is a reason why many chatbots run on the hardware of the major tech companies.

Linguistics

Linguistics is the study of human language.  Areas include, but are not limited to, context, parts-of-speech, phonetics, and morphology.  Humans have devised many different languages, each having specific properties.  Because of these properties, some languages are easier than others to handle in a given NLP problem.

Let’s suppose you’re adding text-to-speech conversion to an app that’s serving customers internationally.  We know that saying the same word differently can change its meaning.

For simplicity, our supported languages are English and Japanese.  When I studied Japanese in my free time during college, one of the first things I learned was that each kana symbol is almost always pronounced the same way.  For example, the word for car in Japanese is “kuruma” (くるま or 車).  It is not to be said any other way.  On a related note, many foreigners incorrectly say “konichi wa” instead of “konnichi wa” when saying hi to somebody.

The fact that Japanese symbols have such consistent sounds makes saying words easy since there are very few odd cases that could change the sound of a word.  For a computer, this property makes generating speech much simpler.

With English, a letter can have all sorts of sounds depending on the word and the letters around it.  For example, the past and present tense of “read” are pronounced differently despite being spelled the same way.  You can also have words that sound the same, but have totally different meanings, as with the words “hear” and “here”.  These many odd cases in phonetics make English words much harder for computers to pronounce.

On the flip side, when it comes to working with text, English is much easier as there are only 26 letters.  Japanese, on the other hand, is a nightmare when it comes to written text.

Japanese uses three writing systems: hiragana, katakana, and kanji.  Hiragana and katakana, the two phonetic scripts of Japanese, each have 46 basic symbols, and which one is used depends largely on whether a word is foreign.  Kanji add thousands more characters to the mix.  Unlike the first two scripts, a kanji can have several different sounds and meanings depending on the word it appears in.  Finally, a computer has to use a multi-byte encoding such as Unicode in order to even work with Japanese.  With the intricacies of the Japanese writing system, working with Japanese text is harder for a computer than working with English text.
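Here is a small Python sketch of what that looks like from the computer’s side.  It uses the standard `unicodedata` module, and the helper `script_of` is a name I made up; deciding the script from the prefix of the Unicode character name is a shortcut that works for these examples, not a complete classifier.

```python
import unicodedata


def script_of(char):
    # Unicode character names start with the script for kana,
    # and with "CJK UNIFIED IDEOGRAPH" for most kanji.
    name = unicodedata.name(char, "")
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("KATAKANA"):
        return "katakana"
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "kanji"
    return "other"


# "car" in English, then "kuruma" in hiragana, katakana, and kanji.
for word in ["car", "くるま", "クルマ", "車"]:
    encoded = word.encode("utf-8")
    # Compare the number of characters with the number of UTF-8 bytes.
    print(word, len(word), len(encoded), [script_of(c) for c in word])
```

Each English letter takes a single byte in UTF-8, while each of these Japanese characters takes three, and the same word can show up in any of the three scripts.  A program that compares raw strings would treat くるま, クルマ, and 車 as three unrelated words.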

The takeaway from these problems is that differences between languages can greatly affect the difficulty of a problem.

Conclusion

In recent years, we have been able to build programs that can understand human language to some degree.  It seems only natural that humans will eventually communicate with computers mainly via speech, and that computers will understand us without our having to type anything.  However, there are still challenges that need to be tackled before this occurs.  But when that time comes, who knows what life will be like for society?

Are you currently using NLP?  How are you using it?  What interests you about NLP?  Share your comments down below.