Talk:Hessian matrix


WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, which collaborates on articles related to mathematics.
Mathematics rating: Start-Class, Mid-priority. Field: Analysis

Initial discussion

This would really be a lot easier to understand if we could see a visual representation, something like:


H_f(x,y) = \begin{bmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial y\,\partial x} \\ \frac{\partial^2 f}{\partial x\,\partial y} & \frac{\partial^2 f}{\partial y^2} \end{bmatrix}

Is this correct?

In the second sentence of 'Second derivative test' ("If the Hessian is positive definite..."), shouldn't it instead read "If the determinant of the Hessian is positive definite..."?

A positive-definite matrix is a type of symmetric matrix. A determinant is just a real number, which may be positive or negative, but not positive definite. Follow the link. -GTBacchus(talk) 23:12, 5 March 2006 (UTC)
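To make the distinction above concrete, here is a minimal numerical check (my own illustration, using NumPy; not part of the original exchange). Both matrices below have determinant +1, but only the first is positive definite; the second corresponds to a local maximum, not a minimum, which is why the sign of the determinant alone cannot be the criterion.

    import numpy as np

    # Two symmetric 2x2 "Hessians" with the same determinant (+1):
    H_min = np.array([[ 1.0,  0.0],
                      [ 0.0,  1.0]])   # eigenvalues +1, +1: positive definite
    H_max = np.array([[-1.0,  0.0],
                      [ 0.0, -1.0]])   # eigenvalues -1, -1: negative definite

    for name, H in (("H_min", H_min), ("H_max", H_max)):
        eigs = np.linalg.eigvalsh(H)   # eigenvalues of a symmetric matrix
        print(name, "det:", np.linalg.det(H),
              "positive definite:", bool(np.all(eigs > 0)))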

Del

With regard to the del operator, is it that

H=\nabla\otimes\nabla\cdot f?

Or am I just confused? —Ben FrantzDale 08:13, 28 March 2006 (UTC)

I think that is close, but you need to transpose one of the dels as well as write f as a diagonal matrix:

H=\nabla\otimes\nabla^T\cdot \mathrm{diag}(f) = \begin{bmatrix}\frac {\partial}{\partial x_1} \\ \frac {\partial}{\partial x_2} \\ \vdots \\ \frac {\partial}{\partial x_n}\end{bmatrix}  \otimes   \begin{bmatrix}\frac {\partial}{\partial x_1} & \frac {\partial}{\partial x_2} & \cdots & \frac {\partial}{\partial x_n}\end{bmatrix} \cdot \mathrm{diag}(f)
H = \begin{bmatrix}
\frac{\partial^2}{\partial x_1^2} & \frac{\partial^2}{\partial x_1\partial x_2} & \cdots & \frac{\partial^2}{\partial x_1\partial x_n} \\
\frac{\partial^2}{\partial x_2\partial x_1} & \frac{\partial^2}{\partial x_2^2} & \cdots & \frac{\partial^2}{\partial x_2\partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2}{\partial x_n\partial x_1} & \frac{\partial^2}{\partial x_n\partial x_2} & \cdots & \frac{\partial^2}{\partial x_n^2}
\end{bmatrix} \cdot \begin{bmatrix}
f & 0 & \cdots & 0 \\
0 & f & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & f\end{bmatrix} = \begin{bmatrix}
\frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1\partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1\partial x_n} \\
\frac{\partial^2 f}{\partial x_2\partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2\partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f}{\partial x_n\partial x_1} & \frac{\partial^2 f}{\partial x_n\partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}


I'm pretty sure this is right--hope it helps. 16:56, 3 Apr 2006 (UTC)
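As a sanity check on the construction above, SymPy's built-in hessian produces exactly the matrix of second partials, including the Schwarz symmetry of the mixed terms (the test function here is my own arbitrary choice):

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')
    f = x1**2 * x2 + sp.sin(x1 * x2)      # arbitrary smooth test function

    H = sp.hessian(f, (x1, x2))           # 2x2 matrix of second partials of f
    print(H)

    # Schwarz/Clairaut symmetry holds for this smooth f:
    assert sp.simplify(H[0, 1] - H[1, 0]) == 0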

Examples

It would be good to have at least one example of the use of Hessians in optimization problems, and perhaps a few words on applications of Hessians to statistical problems, e.g. maximizing a likelihood over its parameters. --Smári McCarthy 16:01, 19 May 2006 (UTC)
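Until such a section exists, this is the sort of example I would expect there: a bare-bones Newton's method, where solving a linear system against the Hessian turns the gradient into a search direction. The objective and all names below are my own sketch, not anything from the article.

    import numpy as np

    def newton_minimize(grad, hess, x0, tol=1e-10, max_iter=50):
        # Newton's method: repeatedly solve H(x) @ step = -grad(x).
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            g = grad(x)
            if np.linalg.norm(g) < tol:
                break
            x = x - np.linalg.solve(hess(x), g)
        return x

    # Toy objective: f(x, y) = (x - 1)**2 + 10*(y + 2)**2, minimized at (1, -2).
    grad = lambda v: np.array([2.0 * (v[0] - 1.0), 20.0 * (v[1] + 2.0)])
    hess = lambda v: np.array([[2.0, 0.0], [0.0, 20.0]])
    print(newton_minimize(grad, hess, [0.0, 0.0]))   # ~ [ 1. -2.]

Since the objective is quadratic, the Hessian is constant and Newton's method converges in a single step.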

HUH??

The Hessian displayed is incorrect; it should be 1/2 of the second derivative matrix. Charlielewis 06:34, 11 December 2006 (UTC)
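For context, the factor of 1/2 that this comment presumably has in mind comes from the second-order Taylor term, where the Hessian appears multiplied by 1/2; the Hessian itself is defined without that factor:

f(x + h) \approx f(x) + \nabla f(x)^T h + \tfrac{1}{2}\, h^T H_f(x)\, h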

Bordered Hessian

It is not clear what a bordered Hessian with more than one constraint should look like. If I knew, I would fix it... --Marra 16:07, 19 February 2007 (UTC)

See the added "If there are, say, m constraints ...". Arie ten Cate 15:08, 6 May 2007 (UTC)
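For concreteness, with m constraints g_1, \ldots, g_m and Lagrangian L, the block form I understand to be standard is an (m+n) \times (m+n) matrix with an m \times m zero block in the corner, bordered by the Jacobian of the constraints:

H_B = \begin{bmatrix} 0_{m \times m} & \dfrac{\partial g_i}{\partial x_j} \\ \left(\dfrac{\partial g_i}{\partial x_j}\right)^{T} & \dfrac{\partial^2 L}{\partial x_j\,\partial x_k} \end{bmatrix}

Here \partial g_i / \partial x_j is the m \times n constraint Jacobian and \partial^2 L / \partial x_j \partial x_k is the ordinary n \times n Hessian of the Lagrangian.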

The Theorem is Wrong

I had learned that F_{xy} = F_{yx} is Young's theorem, not Schwarz's. —The preceding unsigned comment was added by Lachliggity (talk • contribs) 03:02, 16 March 2007 (UTC).

What if det H is zero?

It would be nice if someone could include what to do when the determinant of the Hessian matrix is zero. I thought you had to check higher-order derivatives, but I'm not too sure. Aphexer 09:52, 1 June 2007 (UTC)
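For what it's worth, a standard pair of examples shows the test is genuinely inconclusive in that case: f(x,y) = x^4 + y^4 and f(x,y) = x^4 - y^4 both have H equal to the zero matrix at the origin (every second partial of x^4 \pm y^4 is a multiple of x^2 or y^2, hence vanishes there), yet the first has a local minimum at the origin and the second a saddle point. So nothing can be concluded from the Hessian alone when \det H = 0.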

[edit] "Univariate" function?

Please note that "univariate" in the intro refers to a statistical concept, which I believe does not apply here. Even in function (mathematics) there is no mention of "univariate functions", which in any case suggests to me a function of one independent variable, and that is not what we are discussing. I'll be bold and remove it; please fix if you know what was meant. Thanks. 83.67.217.254 05:43, 9 August 2007 (UTC)

"Single-valued" perhaps? But do we really need to specify that? In the intro? I would just leave "function". 83.67.217.254 05:45, 9 August 2007 (UTC)

I think I made that change. My thought was that I wanted to differentiate "single-valued" from (keeping in this notation's spirit) "multi-valued", or, to quote from the second sentence and from the fifth section, "real-valued" vs. "vector-valued". I did not want the first sentence to leave any ambiguity that in general the Hessian is a matrix, which then has a tensor extension for vector-valued functions.
The term "univariate" does show my professional bias, and while I still think it's appropriate, "single-valued" is completely acceptable as well. I still have a concern that not qualifying the first sentence at all allows the tensor to be considered a case of a Hessian matrix, when I think it is better thought of as an extension of the concept, since it's not a matrix per se. However, I will not revert it, and will chime in on any discussion and clarification here. Baccyak4H (Yak!) 14:15, 9 August 2007 (UTC)

Vector-valued functions

"If is instead vector-valued, ..., then the array of second partial derivatives is not a matrix, but a tensor of rank 3."

I think this is wrong. Wouldn't the natural extension of the Hessian to a 3-valued function (i.e. f: \mathbb{R}^n \to \mathbb{R}^3) just be 3 Hessian matrices?

Is this sentence instead trying to generalize Hessian matrices to higher-order partial derivative tensors of single-valued functions?

68.107.83.19 07:17, 2 October 2007 (UTC)

I'm sure someone can point to a reference which will answer your question, but it would seem that, analogous to the Jacobian of a vector-valued function (which is a matrix, and not just a set of derivative vectors), a rank-3 tensor makes sense: one could take inner products with such a tensor, say in a higher-order term of a multivariate Taylor series. That operation doesn't make as much sense if all one has is a set of matrices. And it would seem one could always have one index of the tensor run over the elements of the function's vector, with an arbitrary number of additional indices running over the variables of differentiation. My $0.02. Baccyak4H (Yak!) 13:32, 2 October 2007 (UTC)
I can't see what you're getting at.
What I mean is that if f = (f_1, f_2, \ldots, f_n), where f maps to \mathbb{R}^n and each f_i maps to \mathbb{R}, then isn't
H(f) = (H(f_1), H(f_2), \ldots, H(f_n))?
And the only construction returning a tensor that makes sense to me is the array of higher-order partial derivatives of a real-valued function g(). E.g. if the rank-3 tensor T holds the 3rd-order partial derivatives of g(), then:
T_{i,j,k} = \frac{\partial^3 g}{\partial x_i\,\partial x_j\,\partial x_k}
If you disagree with this, can you explicitly state what entry T_{i,j,k} should be (in terms of f = (f_1, f_2, \ldots, f_n)) if T is supposed to be the "Hessian" of a vector-valued function? 68.107.83.19 22:57, 3 October 2007 (UTC)
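For what it's worth, the convention I have seen reconciles both views: let the first index run over the components of f, so that each slice of the tensor is exactly one of the Hessians in your list:

T_{i,j,k} = \frac{\partial^2 f_i}{\partial x_j\,\partial x_k}, \qquad T_{i,\cdot,\cdot} = H(f_i)

The tensor packaging matters when you contract, e.g. h^T T h yields the vector of quadratic forms (h^T H(f_1) h, \ldots, h^T H(f_n) h) in one operation.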

Riemannian geometry

Can someone write on this topic from the point of view of Riemannian geometry? (there should be links e.g. to covariant derivative). Commentor (talk) 05:15, 3 March 2008 (UTC)
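In the meantime, the definition I believe is standard: on a Riemannian manifold the Hessian of f is the second covariant derivative of f, i.e. \operatorname{Hess}(f) = \nabla \mathrm{d}f, with components involving the Christoffel symbols:

\operatorname{Hess}(f)_{ij} = \partial_i \partial_j f - \Gamma^k_{ij}\,\partial_k f

In Euclidean coordinates the Christoffel symbols vanish and this reduces to the ordinary matrix of second partials discussed in the article.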