The objective function minimized when training a neural network is not a convex function of the parameters.

Strictly convex functions have exactly one global minimum. In a neural network, the hidden units within a layer are symmetric: given one solution, the weights can be permuted to produce another set of parameters that computes exactly the same function. Because the network is invariant to such permutations, many equivalent minima exist, so the objective is not a convex function. A small numerical check of this symmetry is sketched below.
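The following is a minimal NumPy sketch of the permutation argument: permuting the hidden units of a one-hidden-layer network (reordering the rows of the first weight matrix and the columns of the second) leaves the computed function unchanged. All names and sizes here are illustrative, not from the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# One hidden layer: x -> tanh(W1 x + b1) -> W2 h + b2
W1 = rng.normal(size=(4, 3))   # 4 hidden units, 3 inputs
b1 = rng.normal(size=4)
W2 = rng.normal(size=(2, 4))   # 2 outputs
b2 = rng.normal(size=2)

def forward(x, W1, b1, W2, b2):
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2

# Permute the hidden units: reorder rows of W1/b1 and columns of W2.
perm = rng.permutation(4)
W1p, b1p, W2p = W1[perm], b1[perm], W2[:, perm]

x = rng.normal(size=3)
print(np.allclose(forward(x, W1, b1, W2, b2),
                  forward(x, W1p, b1p, W2p, b2)))  # True: same function
```

Since every permutation of the hidden units yields a distinct parameter vector with identical loss, a single minimum in function space corresponds to many minima in parameter space, which rules out strict convexity.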

Since this is a non-convex optimization problem, a plateau in the training loss does not necessarily mean the parameters are close to optimal. Changing the learning rate in such cases is one way to test whether the training procedure can escape the local minimum, as sketched below.
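One concrete way to change the learning rate on a plateau is PyTorch's `ReduceLROnPlateau` scheduler, which adjusts the rate when a monitored loss stops improving. This is a minimal sketch, assuming a placeholder linear model and random data rather than any particular training setup from the notes:

```python
import torch

model = torch.nn.Linear(10, 1)                      # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# Halve the learning rate if the loss has not improved for 5 epochs.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=5)

for epoch in range(100):
    x = torch.randn(32, 10)                         # placeholder batch
    y = torch.randn(32, 1)
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step(loss.item())                     # adjust LR on plateau
```

Lowering the rate lets the optimizer settle if the plateau really is near a minimum; alternatively, temporarily raising the rate (e.g., a manual restart) probes whether the optimizer can jump out of the region entirely.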