Algorithm: Logistic Regression

As I mentioned in my Linear Regression post, linear regression works best on regression-type problems. It doesn't work well on classification problems because it only returns numeric values on a continuous interval. You could define a threshold and assign a class whenever the output crosses it, but using the algorithm this way isn't good practice.

So if you can't use linear regression for classification, what kind of algorithm can be used to classify test cases? While many different algorithms can be used for classification, one of the most basic is logistic regression.
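To make the idea concrete, here is a minimal sketch of how logistic regression turns a linear score into a class label: it squashes the score through the sigmoid function and then applies a cutoff. The weights, bias, and 0.5 threshold below are hand-picked assumptions for illustration, not values fitted to any dataset.

```python
import math


def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features):
    """Estimated probability that the example belongs to class 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_class(weights, bias, features, threshold=0.5):
    """Assign class 1 when the estimated probability reaches the threshold."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0


# Hypothetical parameters chosen by hand, only to show the mechanics.
weights = [2.0, -1.0]
bias = -0.5

label_a = predict_class(weights, bias, [1.0, 0.0])  # score is 1.5
label_b = predict_class(weights, bias, [0.0, 2.0])  # score is -2.5
```

Unlike a raw linear regression output, the sigmoid output always lies between 0 and 1, so it can be read as a probability before the cutoff is applied.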


Dataset: Most Popular Baby Names by Sex and Mother's Ethnic Group, New York City

This week's dataset focuses on popular baby names in New York City, broken down by sex and the mother's ethnic group.


Algorithm: Gradient Descent

When working with machine learning algorithms, you often have to train your algorithm before it can make predictions on new data. One of the most popular training methods is gradient descent.
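As a rough sketch of the idea, gradient descent repeatedly steps in the direction opposite to the gradient until it settles near a minimum. The example below minimizes a toy function, f(x) = (x - 3)^2, which I picked only because its minimum is easy to verify; the function, learning rate, and step count are all assumptions for illustration.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Take `steps` updates of size learning_rate against the gradient."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


def grad_f(x):
    """Derivative of the toy objective f(x) = (x - 3)**2."""
    return 2.0 * (x - 3.0)


# Starting from 0, the iterates move toward the minimizer at x = 3.
minimum = gradient_descent(grad_f, start=0.0)
```

When training a model, the same loop applies, except that x becomes the model's parameters and the gradient comes from the cost function measured on the training data.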


Dataset: Housing Data Set (Boston Massachusetts)

This week's dataset covers some housing data from Boston, Massachusetts. The dataset is provided by UCI and is primarily geared towards regression. The main point of this analysis is to determine how the cross-validation error and testing error behave as the number of cases increases.
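A simplified sketch of this kind of analysis is shown below. It does not use the Boston data: the points are synthetic (a noisy line generated with a fixed seed), the model is a one-variable least-squares fit, and a single held-out set stands in for the cross-validation and test sets. All names here are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares slope and intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(slope, intercept, xs, ys):
    """Mean squared error of the fitted line on the given points."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def learning_curve(train_x, train_y, held_x, held_y, sizes):
    """Training and held-out error after fitting on the first m cases."""
    rows = []
    for m in sizes:
        slope, intercept = fit_line(train_x[:m], train_y[:m])
        train_err = mse(slope, intercept, train_x[:m], train_y[:m])
        held_err = mse(slope, intercept, held_x, held_y)
        rows.append((m, train_err, held_err))
    return rows


# Synthetic stand-in data: y = 2x + 1 plus Gaussian noise.
random.seed(0)
xs = [random.uniform(0.0, 10.0) for _ in range(60)]
ys = [2.0 * x + 1.0 + random.gauss(0.0, 1.0) for x in xs]

curve = learning_curve(xs[:40], ys[:40], xs[40:], ys[40:], sizes=[2, 5, 10, 20, 40])
for m, train_err, held_err in curve:
    print(f"{m:>3} cases: train MSE {train_err:.3f}, held-out MSE {held_err:.3f}")
```

With only two cases the line passes through both points, so the training error is essentially zero even though the held-out error need not be. Comparing the two columns as the number of cases grows is the core of the analysis.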


Algorithm: Linear Regression

For anyone starting out with machine learning, linear regression is one of the first algorithms to learn.
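For a quick taste, here is a minimal sketch of linear regression with a single input variable, using the closed-form least-squares solution. The four data points are made up and lie exactly on the line y = 2x + 1, so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Return the least-squares slope and intercept for points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 4.0 + intercept  # predicted y for a new input x = 4
```

Real data won't sit perfectly on a line, but the same formulas give the line that minimizes the sum of squared errors.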


Dataset: Los Angeles Crimes 2012-2016

This week's dataset explores crimes that occurred in Los Angeles between 2012 and 2016. I had two objectives in mind when working with this dataset. The first was to observe crime patterns and see whether anything interesting popped out. The second was to get more experience manipulating data with pandas.