Talk:Convex optimization

WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, which collaborates on articles related to mathematics.
Mathematics rating: Start-Class, Mid-priority. Field: Applied mathematics

The layout and presentation could be much better. I know it's a stub, but what little there is could at least be readable. 163.1.159.21 16:48, 14 May 2005 (UTC)

Convex functions

I see "convex optimization" applied to nonlinear functions with multiple minima. In that context are people really talking just about some convex portion of the domain around a local minimum? Similarly, is it still convex optimization if you are looking for the minimum of a Y-shaped valley where the minimum is at the bottom of the Y? What if it wasn't Y-shaped but more V-shaped so that you could draw an ellipse over either valley in which the function is convex? —Ben FrantzDale 19:50, 31 January 2007 (UTC)

For convex optimization the function to be optimized has to be convex. But then any local minimum is a global minimum, so local optimization is the same as global optimization. At least that's how I know it. Oleg Alexandrov (talk) 02:38, 1 February 2007 (UTC)
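The one-line argument behind "local implies global" uses nothing but the definition of convexity. A sketch: suppose x is a local minimum and some point y has f(y) < f(x). Then

    f(x + t(y - x)) \le (1 - t)\,f(x) + t\,f(y) < f(x) \quad \text{for all } t \in (0, 1],

and since x + t(y - x) can be taken arbitrarily close to x by shrinking t, the point x could not have been a local minimum after all. So no point of a convex function is merely locally optimal.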
Perhaps some edge-case examples would be helpful. For example, is Rosenbrock's banana function a convex function? I tend to think not (since the valley curves). —Ben FrantzDale 20:56, 5 February 2007 (UTC)
Well, if a function is not convex, then you can't do convex optimization. You can still do optimization, however; it is just that you are not guaranteed a global minimum. I fail to see what your point is with the above. Sorry. :) Oleg Alexandrov (talk) 04:42, 6 February 2007 (UTC)
My point is I think I understand what it means for a function to be convex, but I'm not 100% sure. For that reason I'd like to see some edge-cases explained. Is a curving valley such as the banana function convex? (I think it is not.) My sense is that a convex function is one that has an n-dimensional valley that doesn't curve too much so that if you keep heading down hill you'll be moving closer to the minimum. (Whereas in a curved valley you might be heading downhill but away from the actual minimum.) Is that right? —Ben FrantzDale 13:37, 6 February 2007 (UTC)

There is no graph on the "banana function" page, so I can't tell. Here is a convex function, Image:Convex-function-graph-1.png, and here is a function which is not convex, Image:Quasiconvex function.png. A greater help would be to read the convex function article rather than this one. Oleg Alexandrov (talk) 02:15, 7 February 2007 (UTC)
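For what it's worth, the banana-function question can be checked numerically: a twice-differentiable function is convex on a region exactly when its Hessian is positive semidefinite there. A minimal sketch for the standard two-variable form f(x, y) = (1 - x)^2 + 100(y - x^2)^2 (the helper name below is just for illustration):

    import numpy as np

    def rosenbrock_hessian(x, y):
        """Hessian of f(x, y) = (1 - x)**2 + 100*(y - x**2)**2."""
        return np.array([[2.0 - 400.0*(y - x**2) + 800.0*x**2, -400.0*x],
                         [-400.0*x,                              200.0]])

    # Positive definite at the minimizer (1, 1), but indefinite at (0, 1),
    # so the function is convex near the bottom of the valley yet not convex overall.
    for point in [(1.0, 1.0), (0.0, 1.0)]:
        print(point, np.linalg.eigvalsh(rosenbrock_hessian(*point)))

So the guess above is right: the curving valley is convex only locally, which is why the Rosenbrock function is a standard non-convex test problem.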

I've read up on convex function. The thing I want to understand is what makes convex optimization special. It feels like it's because, with a convex function, moving by ε downhill always moves you closer to the minimum. (Intuitively it seems like that property would make an optimization problem much more tractable.) Alternately, I'd like to understand what goes wrong in optimizing a function that isn't quite convex. —Ben FrantzDale 04:28, 7 February 2007 (UTC)
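The "ε downhill" feeling can be made precise for differentiable convex functions; here is a sketch, writing x* for a global minimizer. The first-order characterization of convexity, f(x*) >= f(x) + \nabla f(x)^\top (x^* - x), rearranges to

    \nabla f(x)^\top (x - x^*) \;\ge\; f(x) - f(x^*) \;\ge\; 0,

so the steepest-descent direction -\nabla f(x) never makes an obtuse angle with the direction from x toward x*, and a sufficiently small gradient step cannot move you farther from the minimizer. In a curved non-convex valley that inequality can fail, which is exactly the "heading downhill but away from the actual minimum" situation described above.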
Not much goes wrong if the function to optimize is not convex. You may just have more than one local minimum. Try to see how your "ε downhill" intuition would work for Image:Nonquasiconvex function.png which is not convex. Oleg Alexandrov (talk) 04:47, 7 February 2007 (UTC)
Actually that picture is not a very good example since the function in question is not convex in general, but it is convex around the local minima. See again Image:Quasiconvex function.png which is not convex at all but for which your "ε downhill" argument seems to still work. Oleg Alexandrov (talk) 04:55, 7 February 2007 (UTC)
Both of those pictures obviously satisfy my "ε downhill" argument around their minima. I followed the external links to a full-length PDF textbook. On page 16 of that (p. 30 of the PDF) it says
In this context Rockafellar stated, in his 1993 SIAM Review survey paper [Roc93],
In fact the great watershed in optimization isn’t between linearity and nonlinearity, but convexity and nonconvexity.
Presumably there is something special about convex functions that I am missing. (Or else, the analysis doesn't work for non-convex functions but actual algorithms might work reasonably well for "almost convex" functions.) Thanks. —Ben FrantzDale 05:39, 7 February 2007 (UTC)
OK, I don't know convex optimization so well. What I know is that convexity is profoundly important in the sense that if the function is convex, any local minimum you find is a global minimum, and if it is strictly convex, that minimum is unique. This is immensely important, since then you will arrive at the minimum starting from anywhere. If your function is not convex, you always need to worry, especially in problems involving millions of variables, that the solution you found is only a lousy local minimum, and that there are many other solutions in the search space which are much better but which can't be easily found. There are many algorithms which can help in that case, which search for a solution randomly, etc., but the problem of finding the global minimum of a non-convex function is (I think!) NP-hard, meaning that it is generally intractable in practice. Oleg Alexandrov (talk) 16:30, 7 February 2007 (UTC)
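To see the "lousy local minimum" issue concretely, here is a minimal sketch (the tilted double-well function is just an illustrative pick): plain gradient descent on the same non-convex function, started from two different points, settles into two different minima, and only one of them is the global one.

    def f(x):
        """Tilted double well: two local minima, only one of them global."""
        return (x**2 - 1)**2 + 0.3*x

    def grad(x):
        return 4.0*x*(x**2 - 1) + 0.3

    def gradient_descent(x, step=0.01, iters=2000):
        for _ in range(iters):
            x -= step * grad(x)
        return x

    for x0 in (-2.0, 2.0):
        x = gradient_descent(x0)
        print(f"start {x0:+.1f} -> x = {x:+.3f}, f(x) = {f(x):+.3f}")

Started from -2 the iteration finds the global minimum near x = -1; started from +2 it gets stuck in the shallower well near x = +1, with a strictly larger function value. With a convex objective that cannot happen.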
Thanks. Perhaps someone with more background on the subject can comment. What you said sounds right, although as I understand convexity, there are plenty of non-convex functions for which there is exactly one minimum and that is the only point with a zero derivative, which seems naively like it would be just as easy to minimize. Like I said, I'm trying to grok what makes convexity special, hence the "what goes wrong if..." questions. Thanks again. —Ben FrantzDale 16:50, 7 February 2007 (UTC)
It is just very hard to prove in general that a given non-convex function has a unique minimum. And most non-convex functions have many local minima. So practically speaking, non-convex function == no unique minimum, and what I said above applies. Oleg Alexandrov (talk) 04:17, 8 February 2007 (UTC)
After thinking about it more, it looks like gradient descent solvers can work on nonconvex functions, but e.g., the simplex method does not because the real minimum could "leak" out of the simplex as the simplex shrinks. —Ben FrantzDale 20:43, 10 April 2007 (UTC)
Gradient descent works for non-convex problems, but one ends up only with local minima. Now, the simplex method works only for linear problems as far as I know, so that is even more particular than convex problems. Oleg Alexandrov (talk) 03:03, 11 April 2007 (UTC)
Hmm. There are non-convex nonlinear functions with a single global minimum (e.g., a bowl with an extra dent at the bottom). Basically I'm trying to figure out why convexity is important. If it is convexity that is important, rather than just "functions with a single global minimum", then there has to be a case in which a non-convex function that still has a single global minimum causes an algorithm to fail. It could be that I don't understand the simplex method exactly. —Ben FrantzDale 04:42, 11 April 2007 (UTC)
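Such a case does exist in exactly the sense described: for example log(1 + x^2) (an illustrative pick, not something from the article) has a single stationary point and a single global minimum, yet it is not convex, because its second derivative turns negative for |x| > 1. Gradient descent still finds its minimum without trouble; what is lost without convexity is the guarantee, since in general there is no cheap way to certify that a non-convex function is unimodal. A minimal sketch:

    import numpy as np

    def f(x):
        """log(1 + x^2): unique minimum at x = 0, but not convex."""
        return np.log1p(x**2)

    def grad(x):
        return 2.0*x / (1.0 + x**2)

    def second_derivative(x):
        return (2.0 - 2.0*x**2) / (1.0 + x**2)**2

    # The curvature changes sign, so the function is not convex ...
    print(second_derivative(0.0), second_derivative(2.0))

    # ... yet gradient descent from a distant start still reaches the unique minimizer at 0.
    x = 5.0
    for _ in range(500):
        x -= 0.5 * grad(x)
    print(x, f(x))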