This week’s task is to determine the most recommended anime from a dataset of anime shows and user ratings.  To produce a list of recommended shows, I built a very primitive recommendation system based on two criteria:

  • For each user, a show is considered only if its rating falls within the range (\mu - \sigma,\mu + \sigma), that is, within one standard deviation of the mean rating.  In other words, we’ll be recommending shows with similar ratings.
  • Starting from the shows that pass the first criterion, the list is shortened further by keeping only those with an above-average membership base.  This prevents shows that have very few ratings from being recommended.
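
The two criteria can be sketched in a few lines of Python.  This is only a minimal illustration and not the code from the notebook.  It assumes that \mu and \sigma are the mean and (population) standard deviation of a single user’s own ratings, and that "above average" means above the mean membership count across all shows.  The function name `recommend`, the show titles, and all numbers are made up for the example.

```python
from statistics import mean, pstdev


def recommend(user_ratings, members):
    """Return the shows that pass both criteria for one user.

    user_ratings: dict mapping show title -> this user's rating
    members:      dict mapping show title -> membership count
    (Both structures are assumptions made for this sketch.)
    """
    mu = mean(user_ratings.values())
    sigma = pstdev(user_ratings.values())

    # Criterion 1: keep shows rated strictly inside (mu - sigma, mu + sigma).
    in_range = [
        show for show, rating in user_ratings.items()
        if mu - sigma < rating < mu + sigma
    ]

    # Criterion 2: of those, keep shows with above-average membership.
    avg_members = mean(members.values())
    return [show for show in in_range if members[show] > avg_members]


# Hypothetical data for a single user.
ratings = {"Show A": 2, "Show B": 6, "Show C": 7, "Show D": 8, "Show E": 10}
members = {"Show A": 500, "Show B": 100, "Show C": 900, "Show D": 1200, "Show E": 300}

print(recommend(ratings, members))  # ['Show C', 'Show D']
```

Here Show B passes the rating criterion but is dropped by the second one, because its membership count is below the average of 600.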

Of course, this recommendation system is really inefficient and doesn’t take other factors into account.  In fact, the idea of even investing in an efficient recommendation system could be seen as ridiculous.  After all, we’re in the age of self-driving cars, image recognition, and even music-generating bots.  Those things seem cooler than recommendation systems, so one might think businesses should invest in them instead.  While some businesses do invest in these areas, many have no interest in doing so.  Improving services for their customers, however, is a very important business problem.  This is where recommendation systems become important.

Recommendation systems can be used to provide better services to customers.  They track user activity and recommend similar products based on the activity of other users.  Here are several examples currently in use:

  • One of Facebook’s services recommends possible interests to you based on your friends’ interests.
  • Amazon’s recommendation system suggests similar products for users to buy.
  • Netflix’s recommendation system shows movies based on both user preference and what other people like.
  • Some online financial brokers have recommendation systems that suggest an asset allocation based on a person’s risk tolerance, time horizon, and objectives.

While I could go on about recommendation systems, that discussion is beyond the scope of this post and is best left for a separate one.

The PDF analysis can be found here.  The notebook can be found on GitHub.  You’ll have to decompress the file, since the uncompressed version exceeds the 100 MB file size limit for pushing to GitHub.

For those wanting to build their own recommendation systems on the dataset, the link can be found here.  How would you build a recommendation system?

Have any questions or comments?  Feel free to leave a comment down below and I’ll get back to you ASAP.