I am working on a project that uses the nested logit model and non-linear optimization. To compute the gradient of my estimation moment, I need the derivative of the nested logit function with respect to the parameters. This is tedious to calculate, so I hoped someone had already done it. Nathan Miller’s note on nested logit derivatives, written in the Berry (1994) / BLP (1995) framework, looked promising, but it was not what I needed: the structural error in the Berry/BLP tradition differs from the one in my procedure, so the required derivatives differ as well.

Thus, here are some notes on the direct derivatives of the nested logit function with respect to the parameters: watson_nested_logit_gradient.pdf. Please let me know if I calculated anything wrong.
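For anyone who wants to sanity-check derivatives like these numerically, here is a small sketch (mine, not from the note) of the standard two-level nested logit share formula with an outside good, plus a central finite-difference derivative in the nesting parameter. The function names and the λ parameterization (within-nest scale, λ = 1 collapsing to plain logit) are assumptions on my part; adapt them to whatever normalization your own derivation uses.

```python
import numpy as np

def nested_logit_shares(delta, nest_ids, lam):
    """Two-level nested logit choice probabilities with an outside good.

    delta    : mean utilities of the inside goods (outside good normalized to 0)
    nest_ids : nest label for each inside good
    lam      : nesting parameter in (0, 1]; lam = 1 recovers plain logit
    """
    nests = np.unique(nest_ids)
    # D_g = sum_{k in g} exp(delta_k / lam)
    D = {g: np.sum(np.exp(delta[nest_ids == g] / lam)) for g in nests}
    # Denominator includes the outside good's contribution of 1
    denom = 1.0 + sum(Dg ** lam for Dg in D.values())
    # s_j = exp(delta_j / lam) * D_g^(lam - 1) / denom
    return np.array([
        np.exp(dj / lam) * D[g] ** (lam - 1.0) / denom
        for dj, g in zip(delta, nest_ids)
    ])

def d_shares_d_lam(delta, nest_ids, lam, h=1e-6):
    """Central finite-difference derivative of the shares in lam,
    useful for checking a hand-derived analytic gradient."""
    return (nested_logit_shares(delta, nest_ids, lam + h)
            - nested_logit_shares(delta, nest_ids, lam - h)) / (2.0 * h)
```

A quick consistency check: at λ = 1 the shares should match the plain multinomial logit, and the finite-difference derivative can then be compared entry by entry against the analytic expressions in the note.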