Kaggle's Digit Recognizer dataset

Deep Learning was one of the hottest disciplines in the tech industry in 2017.  Because of it, many startups put an emphasis on AI, and many frameworks were developed to make implementing these algorithms easier.  Google's DeepMind was even able to create AlphaGo Zero, which mastered the game of Go without relying on human game data.  However, the analysis in this post is much more basic than any of those recent developments.  In fact, the dataset is the popular MNIST database.  In other words, it consists of handwritten digits used to test computer vision algorithms.

Read more


Introduction to Cluster Analysis

If you go online and start shopping, chances are you'll be bombarded with suggestions from the sites you visit.  These suggestions aren't random; they are based on what you recently browsed and purchased.  How do the sites determine what to recommend and what to ignore?

The system described above is called a recommendation system.  One way to implement it is with a method called clustering, which is itself part of Cluster Analysis.

Read more


Algorithm: Bernoulli Naive Bayes

In my post on Naive Bayes, I mentioned that there are multiple variants that can be applied to different problems.  In this post, I will be introducing another variant of Naive Bayes, one that uses the Bernoulli distribution.
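As a rough sketch of what using the Bernoulli distribution means here, each feature is treated as a yes/no value, and a class is scored by multiplying (or, in log space, adding) the probability of each feature being present or absent.  The class names, priors, and per-feature probabilities below are made-up values for illustration, not numbers from the post.

```python
import math

def bernoulli_log_likelihood(x, p):
    """Log P(x | class) for binary features x, given P(feature = 1 | class) in p."""
    return sum(math.log(pi if xi == 1 else 1.0 - pi) for xi, pi in zip(x, p))

def predict(x, class_priors, feature_probs):
    """Pick the class with the highest log prior plus log likelihood."""
    scores = {
        c: math.log(class_priors[c]) + bernoulli_log_likelihood(x, feature_probs[c])
        for c in class_priors
    }
    return max(scores, key=scores.get)

# Hypothetical parameters: P(word present | class) for three words.
class_priors = {"spam": 0.4, "ham": 0.6}
feature_probs = {"spam": [0.8, 0.7, 0.1], "ham": [0.1, 0.2, 0.6]}

print(predict([1, 1, 0], class_priors, feature_probs))
```

In a real model these probabilities would be estimated from training data rather than written by hand.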

Read more


Introduction to Support Vector Machines

So far, I have mainly discussed classification algorithms that use probabilities to make decisions.  However, there are algorithms that don't require the computation of probabilities.  One such algorithm is the support vector machine.

Read more


Algorithm: ID3

In my decision tree post, I mentioned several different algorithms that can be used to create a decision tree.  Today, I'll be talking about one of them, the Iterative Dichotomiser 3 (ID3) algorithm.
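ID3 chooses which feature to split on by measuring information gain, the drop in entropy after a split.  The snippet below is a minimal sketch of that one step, not a full tree builder; the toy rows, labels, and function names are assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, feature_index):
    """Entropy reduction from splitting on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature_index], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy, made-up data: columns are (outlook, windy).
rows = [("sunny", "no"), ("sunny", "yes"), ("rainy", "no"), ("rainy", "yes")]
labels = ["play", "play", "stay", "stay"]

# Index of the feature with the highest information gain.
best = max(range(2), key=lambda i: information_gain(rows, labels, i))
```

Here the first column separates the two labels perfectly while the second tells us nothing, so the first is the one a split would be made on.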

Read more


Algorithm: Gaussian Naive Bayes

Recall from my Naive Bayes post that there are several variants.  The variant that I'll be talking about today is Gaussian Naive Bayes.

Read more


Algorithm: Decision Trees

In my previous algorithm post, I talked about a family of algorithms called Naive Bayes.  These algorithms use Bayes' theorem, independence, and probabilities to determine whether a test case can be positively categorized.  However, these algorithms don't take into account the relationships between features.  Additionally, it would be nice to visualize how the model actually makes its decisions.  Fortunately, decision trees allow us to visualize how each feature contributes to classifying a test case.

Read more


Algorithm: Naive Bayes

So far, the algorithms that I have talked about model the data in a linear manner.  While these algorithms can be effective for simple problems, they are not well suited to problems where there is a non-linear relationship between the features and the output.  Such problems include voice, text, and image recognition, anomaly detection, game-playing bots, and any problem where the output has no straightforward relationship with the features.

Some classes of non-linear algorithms that can solve these kinds of problems include neural networks, decision trees, and clustering.  These classes often have variants that suit different purposes.  In this post, I'll be talking about another classification algorithm called Naive Bayes.

Read more


Algorithm: Logistic Regression

As I mentioned in my Linear Regression post, linear regression works best with regression-type problems.  It wouldn't work well on classification problems, since it only returns numeric values on a continuous interval.  Now, you could define ranges of output values and classify a test case by the range it falls into, but it's not very good practice to use the algorithm in this manner.

So if you can't use linear regression for classification, what kind of algorithm can be used to classify test cases?  While there are many different algorithms that can be used in classification, one of the most basic algorithms is logistic regression.
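As a minimal sketch of the idea (not the full algorithm), logistic regression passes a linear score through the sigmoid function so that the output lands between 0 and 1, and a threshold then turns that value into a class.  The weights, bias, and 0.5 threshold below are hand-picked, hypothetical values rather than anything learned from data.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Probability of the positive class for one test case."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(score)

def classify(features, weights, bias, threshold=0.5):
    """Return 1 when the probability reaches the threshold, else 0."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical, hand-picked parameters (not learned from data).
weights = [2.0, -1.0]
bias = -0.5

print(classify([1.0, 0.0], weights, bias))
print(classify([0.0, 1.0], weights, bias))
```

Unlike the raw output of linear regression, the value being thresholded here can be read as a probability.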

Read more