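In my previous post, you saw the derivative of the cost function for logistic regression as:

$$\frac{\partial}{\partial\theta_j}J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$

I bet several of you were thinking, “How on Earth could you differentiate a cost function like this:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right) + \left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right]$$

into a nice function like this:

$$\frac{\partial}{\partial\theta_j}J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$

?”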

Well, this post is going to go through the math.  Even if you already know it, it’s a good algebra and calculus problem.

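Before we begin, I want to make a few notes.  First, you would normally apply multi-dimensional calculus by taking the partial derivative $\frac{\partial}{\partial\theta_j}J(\theta)$.  However, to make things a bit easier, I’m going to treat $\theta$ as a single variable, writing $\theta x$ in place of $\theta^T x$ and differentiating $J(\theta)$ as $\frac{d}{d\theta}J(\theta)$.  For those wanting to use multivariate calculus, we’ll use the derivative of $\theta^T x$ with respect to $\theta_j$:

$$\frac{\partial}{\partial\theta_j}\theta^T x = x_j$$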

We define $h_\theta(x)$ as:

$$h_\theta(x) = g(\theta^T x)$$

We also denote log as the natural logarithm (ln).

Finally, we define the function g(x) as follows:

$$g(x) = \frac{1}{1+e^{-x}}$$
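
If it helps to see these definitions in code, here is a minimal sketch in plain Python. The function names `g`, `h`, and `cost` are my own choices for illustration, and the sketch assumes each example `x` is a list of numbers the same length as `theta`. It does not guard against overflow for very large negative inputs.

```python
import math

def g(z):
    """Sigmoid function: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def h(theta, x):
    """Hypothesis h_theta(x) = g(theta^T x) for equal-length lists."""
    return g(sum(t * xi for t, xi in zip(theta, x)))

def cost(theta, xs, ys):
    """J(theta) = -(1/m) * sum(y*log(h) + (1 - y)*log(1 - h)), with log = ln."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        p = h(theta, x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / m
```

With `theta` set to all zeros, every prediction is 0.5, so the cost should come out to ln(2) regardless of the labels, which makes a handy sanity check.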

Now that we’ve got the notes out of the way, let’s begin:

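The first thing we want to do is expand $J(\theta)$ by substituting $h_\theta(x) = g(\theta x) = \frac{1}{1+e^{-\theta x}}$:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(\frac{1}{1+e^{-\theta x^{(i)}}}\right) + \left(1-y^{(i)}\right)\log\left(1-\frac{1}{1+e^{-\theta x^{(i)}}}\right)\right]$$

Now, using the property $\log\left(\frac{a}{b}\right) = \log(a) - \log(b)$, together with $1-\frac{1}{1+e^{-\theta x}} = \frac{e^{-\theta x}}{1+e^{-\theta x}}$, we get:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log\left(1+e^{-\theta x^{(i)}}\right) + \left(1-y^{(i)}\right)\left(\log\left(e^{-\theta x^{(i)}}\right) - \log\left(1+e^{-\theta x^{(i)}}\right)\right)\right]$$

Using the property $\log(e^{a}) = a$ and collecting terms, we get:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\theta x^{(i)} - \theta x^{(i)} - \log\left(1+e^{-\theta x^{(i)}}\right)\right]$$

Using the property $\frac{d}{d\theta}\log(f(\theta)) = \frac{f'(\theta)}{f(\theta)}$, we differentiate each term with respect to $\theta$ and get:

$$\frac{d}{d\theta}J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}x^{(i)} - x^{(i)} + \frac{x^{(i)}e^{-\theta x^{(i)}}}{1+e^{-\theta x^{(i)}}}\right] = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}x^{(i)} - \frac{x^{(i)}}{1+e^{-\theta x^{(i)}}}\right]$$

Since we already know that $\frac{1}{1+e^{-\theta x}}$ is just $g(\theta x) = h_\theta(x)$, we get:

$$\frac{d}{d\theta}J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x^{(i)}$$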
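
A quick way to convince yourself the algebra holds is to check it numerically. The sketch below is only an illustration under my own assumptions: the data in `xs` and `ys` is made up, the function names are hypothetical, and `theta` is a single number as in the simplified derivation. It compares the original cost against one simplified form, $-\frac{1}{m}\sum\left[y\theta x - \theta x - \log(1+e^{-\theta x})\right]$, and compares the closed-form derivative against a central finite-difference estimate.

```python
import math

def g(z):
    """Sigmoid function: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def cost(theta, xs, ys):
    """Original cost J(theta) with h = g(theta * x)."""
    m = len(xs)
    return -sum(y * math.log(g(theta * x)) + (1 - y) * math.log(1 - g(theta * x))
                for x, y in zip(xs, ys)) / m

def cost_simplified(theta, xs, ys):
    """Simplified cost: -(1/m) * sum(y*theta*x - theta*x - log(1 + e^(-theta*x)))."""
    m = len(xs)
    return -sum(y * theta * x - theta * x - math.log(1 + math.exp(-theta * x))
                for x, y in zip(xs, ys)) / m

def grad(theta, xs, ys):
    """Closed-form derivative: (1/m) * sum((h - y) * x)."""
    m = len(xs)
    return sum((g(theta * x) - y) * x for x, y in zip(xs, ys)) / m

def numeric_grad(theta, xs, ys, eps=1e-6):
    """Central finite-difference approximation of dJ/dtheta."""
    return (cost(theta + eps, xs, ys) - cost(theta - eps, xs, ys)) / (2 * eps)

xs = [0.5, -1.2, 2.0, 0.3]  # made-up feature values
ys = [1, 0, 1, 0]           # made-up labels
theta = 0.7

cost_gap = abs(cost(theta, xs, ys) - cost_simplified(theta, xs, ys))
grad_gap = abs(grad(theta, xs, ys) - numeric_grad(theta, xs, ys))
print(cost_gap, grad_gap)  # both should be very close to zero
```

If either gap is not tiny, a sign or a logarithm property was probably misapplied somewhere along the way.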

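If one instead uses multi-dimensional calculus, with $\frac{\partial}{\partial\theta_j}\theta^T x = x_j$, we get:

$$\frac{\partial}{\partial\theta_j}J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$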
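
The multivariate result can be checked the same way. This is a rough sketch, not a reference implementation: the two-feature data set is invented for the example (the leading 1.0 in each row plays the role of an intercept term), and `gradient` and `numeric_gradient` are names I picked. Each partial derivative $\frac{1}{m}\sum\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$ is compared against a central finite difference in that coordinate.

```python
import math

def g(z):
    """Sigmoid function: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def h(theta, x):
    """Hypothesis h_theta(x) = g(theta^T x)."""
    return g(sum(t * xi for t, xi in zip(theta, x)))

def cost(theta, xs, ys):
    """J(theta) = -(1/m) * sum(y*log(h) + (1 - y)*log(1 - h))."""
    m = len(xs)
    return -sum(y * math.log(h(theta, x)) + (1 - y) * math.log(1 - h(theta, x))
                for x, y in zip(xs, ys)) / m

def gradient(theta, xs, ys):
    """Closed-form partials: dJ/dtheta_j = (1/m) * sum((h - y) * x_j)."""
    m = len(xs)
    return [sum((h(theta, x) - y) * x[j] for x, y in zip(xs, ys)) / m
            for j in range(len(theta))]

def numeric_gradient(theta, xs, ys, eps=1e-6):
    """Central finite difference, perturbing one coordinate of theta at a time."""
    grads = []
    for j in range(len(theta)):
        up = theta[:j] + [theta[j] + eps] + theta[j + 1:]
        down = theta[:j] + [theta[j] - eps] + theta[j + 1:]
        grads.append((cost(up, xs, ys) - cost(down, xs, ys)) / (2 * eps))
    return grads

xs = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.3]]  # made-up examples
ys = [1, 0, 1, 0]                                        # made-up labels
theta = [0.1, 0.7]

analytic = gradient(theta, xs, ys)
numeric = numeric_gradient(theta, xs, ys)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
print(max_diff)  # should be very close to zero
```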

The lesson

When I was attempting to work out the derivative of the cost function, I learned that you have to know your logarithms very well.  Knowing them lets you simplify the equation before differentiating.  Otherwise, if you start differentiating at the wrong step, you end up causing yourself a lot of grief down the road.  I eventually found this link to see where I was getting stuck.  From that point on, I was able to obtain the solution.