### Introduction to Support Vector Machines

June 13, 2017 · algorithm, classification, supervised, Support Vector Machine · Algorithms, Machine Learning

So far, I have mainly discussed classification algorithms that use probabilities to make decisions. However, some algorithms don't require computing probabilities at all. One such algorithm is the support vector machine.
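Instead of probabilities, a linear SVM looks for a weight vector that separates the classes with the widest possible margin. As a rough, self-contained sketch of that idea — using toy data and Pegasos-style stochastic sub-gradient descent on the hinge loss, rather than the quadratic-programming solvers production libraries use:

```python
def train_linear_svm(X, y, epochs=200, lr=0.01, lam=0.01):
    """Train a linear SVM by stochastic sub-gradient descent on the
    regularized hinge loss. Labels in y must be -1 or +1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # point is misclassified or inside the margin: hinge update
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # point is safely outside the margin: only shrink the weights
                w = [wj * (1 - lr * lam) for wj in w]
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Two small, linearly separable clusters
X = [[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
```

In practice you'd reach for something like scikit-learn's `SVC`, which also supports kernels for non-linear boundaries; this sketch only covers the linear case.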

### Dataset: Belgian retail market

June 10, 2017 · dataset, association rule learning, apriori · Datasets

In this week's dataset, I worked with the Belgian retail market dataset. In my previous post, I talked about how Apriori can be used to generate association rules, so I searched for a good dataset on which to apply the algorithm. The dataset consists of over 88,000 transactions spanning over 16,000 different items. While the items are identified only by numbers, we can still apply the algorithm. This analysis demonstrates how support and confidence influence the number of rules generated.
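The effect of the two thresholds is easy to see even on toy data (the items below are hypothetical stand-ins for the dataset's numeric item IDs): lowering either the minimum support or the minimum confidence lets many more rules through.

```python
from collections import Counter
from itertools import permutations

# Toy transactions standing in for the (much larger) retail data
transactions = [
    {"bread", "milk"},
    {"bread", "beer", "eggs"},
    {"milk", "beer", "cola"},
    {"bread", "milk", "beer"},
    {"bread", "milk", "cola"},
]

def count_rules(transactions, min_support, min_confidence):
    """Count one-to-one rules A -> B that pass both thresholds.
    support(A -> B) = P(A and B); confidence(A -> B) = P(B | A)."""
    n = len(transactions)
    item_counts = Counter(i for t in transactions for i in t)
    pair_counts = Counter()
    for t in transactions:
        for a, b in permutations(sorted(t), 2):
            pair_counts[(a, b)] += 1
    rules = 0
    for (a, b), c in pair_counts.items():
        support = c / n
        confidence = c / item_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules += 1
    return rules

loose = count_rules(transactions, min_support=0.2, min_confidence=0.2)
strict = count_rules(transactions, min_support=0.6, min_confidence=0.7)
```

With the loose thresholds every co-occurring pair becomes a rule; the strict ones keep only the strongest associations.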

### Algorithm: Apriori

June 7, 2017 · algorithm, association rule learning, apriori · Algorithms, Machine Learning

So far, I've talked about regression or classification algorithms that can be used to solve problems. Sometimes though, we just want to discover some associations within our data. These associations can, in turn, be used by a business to optimize profits.

One of the fundamental algorithms for solving these kinds of problems is the Apriori algorithm.
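The core of Apriori is a level-wise search for frequent itemsets that exploits one property: every subset of a frequent itemset must itself be frequent, so candidates with an infrequent subset can be pruned without counting them. A minimal sketch of that frequent-itemset stage (rule generation would follow as a second step):

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return all frequent itemsets, built level by level with the
    Apriori pruning property."""
    n = len(transactions)
    # Level 1: frequent single items
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items
               if sum(i in t for t in transactions) / n >= min_support]
    frequent = list(current)
    k = 2
    while current:
        # Candidate generation: unions of frequent (k-1)-itemsets of size k
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune candidates that have an infrequent (k-1)-subset
        prev = set(current)
        candidates = [c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))]
        # Keep the candidates that actually meet the support threshold
        current = [c for c in candidates
                   if sum(c <= t for t in transactions) / n >= min_support]
        frequent.extend(current)
        k += 1
    return frequent

# Toy basket data (hypothetical items)
transactions = [
    {"bread", "milk"},
    {"bread", "beer", "eggs"},
    {"milk", "beer", "cola"},
    {"bread", "milk", "beer"},
    {"bread", "milk", "cola"},
]
result = apriori(transactions, min_support=0.6)
```

Here only `bread`, `milk`, `beer`, and the pair `{bread, milk}` clear the 60% support bar, so the search stops at level 2.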

### Dataset: Anime Recommendations Database

June 2, 2017 · dataset, Kaggle, recommendation system · Datasets

This week's goal is to determine the most recommended anime from a list of anime shows and user ratings. To produce a list of recommended shows, I built a very primitive recommendation system based on two criteria:
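The two criteria themselves are cut off in this excerpt. As a stand-in, here is a hedged sketch of one common primitive approach — ranking by mean rating while requiring a minimum number of votes — on hypothetical `(user, anime, rating)` triples, not the actual Kaggle data or the post's exact criteria:

```python
from collections import defaultdict

# Hypothetical rating triples standing in for the Kaggle dataset
ratings = [
    ("u1", "Fullmetal Alchemist", 10), ("u2", "Fullmetal Alchemist", 9),
    ("u3", "Fullmetal Alchemist", 9),  ("u1", "Obscure Show", 10),
    ("u2", "Popular Show", 7),         ("u3", "Popular Show", 8),
    ("u4", "Popular Show", 7),
]

def top_anime(ratings, min_votes=2):
    """Rank shows by mean rating, keeping only those with enough votes
    so a single enthusiastic rating can't dominate the list."""
    scores = defaultdict(list)
    for _, anime, score in ratings:
        scores[anime].append(score)
    ranked = [(sum(s) / len(s), anime)
              for anime, s in scores.items() if len(s) >= min_votes]
    return [anime for _, anime in sorted(ranked, reverse=True)]

recommended = top_anime(ratings)
```

The vote cutoff is what keeps the one-rating "Obscure Show" out of the ranking despite its perfect score.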

### Algorithm: ID3

May 30, 2017 · algorithm, classification, supervised, decision tree, ID3 · Algorithms, Machine Learning

In my decision tree post, I mentioned several different types of algorithms that can be used to create a decision tree. Today, I'll be talking about a decision tree algorithm called Iterative Dichotomiser 3 (ID3).
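The heart of ID3 is the split criterion: at each node it picks the feature whose split yields the largest information gain, i.e. the largest drop in entropy. A minimal sketch of that calculation on a toy dataset (the feature names are hypothetical):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy reduction from splitting on one feature index —
    the quantity ID3 maximizes when choosing the next node."""
    n = len(labels)
    remainder = 0.0
    for value in {r[feature] for r in rows}:
        subset = [l for r, l in zip(rows, labels) if r[feature] == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Toy data: feature 0 = outlook, feature 1 = windy
rows = [("sunny", True), ("sunny", False), ("rain", True), ("rain", False)]
labels = ["no", "yes", "no", "yes"]
```

Here `windy` predicts the label perfectly (gain of a full bit) while `outlook` tells us nothing (zero gain), so ID3 would split on `windy` first.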

### Dataset: Mushroom Data Set

May 26, 2017 · dataset, Naive Bayes · Datasets

This week's task is classifying the edibility of mushrooms given several attributes. I originally planned a comparison between Naive Bayes and decision trees on this dataset, but scikit-learn doesn't accept string features when training models, and I'm not yet equipped to write a decision tree algorithm from scratch. Despite these setbacks, running Naive Bayes against this dataset yields very good results, with 99% accuracy.
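The string-feature limitation is usually worked around by encoding each category as an integer before training. A minimal sketch of that preprocessing step (the mushroom attribute values shown are hypothetical examples):

```python
def ordinal_encode(column):
    """Map each distinct string value to an integer so a numeric
    model can consume the column. Returns the codes and the mapping."""
    mapping = {v: i for i, v in enumerate(sorted(set(column)))}
    return [mapping[v] for v in column], mapping

# A hypothetical slice of one mushroom attribute column
cap_shapes = ["convex", "bell", "convex", "flat"]
encoded, mapping = ordinal_encode(cap_shapes)
```

scikit-learn ships equivalents (`LabelEncoder` for targets, `OrdinalEncoder` for feature columns); the point is simply that each string column becomes a numeric one before `fit` is called.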

### Algorithm: Gaussian Naive Bayes

May 23, 2017 · algorithm, mathematics, classification, supervised, Naive Bayes, probability, Normal distribution · Algorithms, Machine Learning

Recall from my Naive Bayes post that there are several variants. The variant I'll be discussing today is Gaussian Naive Bayes.
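Gaussian Naive Bayes handles continuous features by modeling each one, per class, as a normal distribution: training just estimates a mean and variance per feature per class, and prediction multiplies the class prior by the Gaussian likelihood of each feature value. A minimal sketch on toy one-dimensional data:

```python
from math import pi, sqrt, exp

def gaussian_pdf(x, mean, var):
    """Normal density — the per-feature likelihood model in Gaussian NB."""
    return exp(-((x - mean) ** 2) / (2 * var)) / sqrt(2 * pi * var)

def fit(X, y):
    """Estimate a class prior plus per-feature (mean, variance) per class."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        stats = []
        for j in range(len(rows[0])):
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col)
            stats.append((mean, var))
        model[c] = (len(rows) / len(y), stats)
    return model

def predict(model, x):
    best, best_score = None, float("-inf")
    for c, (prior, stats) in model.items():
        # prior times the product of per-feature Gaussian likelihoods
        # (real implementations sum logs instead, to avoid underflow)
        score = prior
        for xi, (mean, var) in zip(x, stats):
            score *= gaussian_pdf(xi, mean, var)
        if score > best_score:
            best, best_score = c, score
    return best

# Two well-separated one-feature classes
X = [[1.0], [1.2], [0.8], [5.0], [5.2], [4.8]]
y = [0, 0, 0, 1, 1, 1]
model = fit(X, y)
```

scikit-learn's `GaussianNB` implements the same idea with the numerical safeguards this sketch omits.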

### Probability Distributions and Random Variables

May 21, 2017 · mathematics, Naive Bayes, probability · Mathematics

Suppose I had two coins and I flipped both of them. The possible outcomes are two heads, two tails, or one of each. These outcomes are all part of a *sample space*.

Now let's take this a step further: for the same experiment, we want to determine the probability that each combination occurs.

Assuming the coins are fair and the flips are independent, we derive the following probabilities: P(two heads) = 1/4, P(two tails) = 1/4, and P(one of each) = 1/2.

All of these probabilities belong in a *probability distribution*.
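The whole construction — enumerate the sample space, then assign each event its probability — can be sketched in a few lines:

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# Sample space of two fair coin flips: HH, HT, TH, TT
sample_space = list(product("HT", repeat=2))

# Random variable: number of heads in an outcome.
# Each of the 4 outcomes is equally likely, so P = count / 4.
counts = Counter(outcome.count("H") for outcome in sample_space)
distribution = {heads: Fraction(count, len(sample_space))
                for heads, count in counts.items()}
```

Note that "one of each" covers two outcomes (HT and TH), which is why its probability is twice that of two heads — and the probabilities, as in any distribution, sum to 1.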

### No dataset this week

I just want to inform you guys that since it's my birthday, I would like to take a quick rest from analyzing a dataset. However, I will prepare another one for you guys next week.

If you guys have any questions for me, feel free to leave a comment down below or contact me.

Happy coding!

### Algorithm: Decision Trees

May 17, 2017 · algorithm, regression, classification, supervised, decision tree · Algorithms, Machine Learning

In my previous algorithm post, I talked about a family of algorithms called Naive Bayes. These algorithms use Bayes' theorem, independence assumptions, and probabilities to determine whether a test case can be positively categorized. However, they don't take into account the relationships between features, and it would be nice to visualize how the model actually makes its decisions. Fortunately, decision trees let us visualize how each feature contributes to a classification.