About

Log loss is convex, which means gradient descent is guaranteed to find weights that achieve the global minimum; there are no suboptimal local minima to get stuck in. 0/1 loss, by contrast, is not convex: it jumps abruptly at the decision threshold z = 0 and is flat everywhere else, so its gradient is zero almost everywhere and it is difficult to optimize with gradient-based methods.
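
To make the contrast concrete, here is a minimal sketch (plain NumPy, using an assumed single example with true label y = 1) that evaluates both losses as a function of the score z. Log loss changes smoothly as z varies, while 0/1 loss is a flat step that provides no gradient signal:

```python
import numpy as np

def sigmoid(z):
    """Map a linear score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(z, y):
    """Log (cross-entropy) loss for one example with label y in {0, 1}.

    Smooth and convex in z, so its gradient is informative everywhere.
    """
    p = sigmoid(z)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def zero_one_loss(z, y):
    """0/1 loss: 1 if the prediction (threshold at z = 0) is wrong, else 0.

    Piecewise constant: it jumps abruptly at z = 0 and its gradient is
    zero everywhere else, so gradient descent gets no useful signal.
    """
    prediction = (z >= 0).astype(int)
    return (prediction != y).astype(float)

z = np.linspace(-4, 4, 9)
y = 1  # assumed true label for this illustration
print("z:       ", z)
print("log loss:", np.round(log_loss(z, y), 3))       # decreases smoothly as z grows
print("0/1 loss:", zero_one_loss(z, y))                # steps from 1 to 0 at z = 0
```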