Talk:Isotonic regression

From Wikipedia, the free encyclopedia

This article may be too technical for a general audience.
Please help improve this article by providing more context and better explanations of technical details to make it more accessible, without removing technical details.

This article is within the scope of WikiProject Statistics, which collaborates to improve Wikipedia's coverage of statistics. If you would like to participate, please visit the project page.

[edit] Pool-Adjacent Violators Algorithm

I've created this here since I'm not a user. I apologize but I do not recall my sources. I found something on Google (I think I searched "pool-adjacent violators" and it was the first hit), and borrowed it for my purposes. I suppose I could claim the exact syntax of the psuedocode is somewhat original, although it is based on something I read, so that some symbol conventions are clearly borrowed. I'll let people who care about this stuff fix it up. All I personally care about is the math itself, not the people behind it. Math transcends people. I think it is silly for people to claim to "own" math. Still, I am very grateful for all who came before me and all deserve credit, but I would rather fill my brain with ideas, rather than record who created them (waste of disk space).


Suppose we have a set of  n\ numbers  X_1 , ... , X_n\ with weights  W_1 , ... , W_n\ . The goal of the algorithm is to optimally shift these X-values so that the result, which we'll call  Y_1\ , ... , Y_n\ , are monotone increasing; that is,  Y_1 \le Y_2 \le ... \le Y_n . The shifting takes into account the weights, so that values with larger weights are shifted less than values with smaller weights. It is optimal since it can be shown (I don't know how) that the weighted mean squared deviation of the X's from the Y's is minimimzed.


We start with the first value and proceed right until we reach the first pair of values that are not in order. Suppose we call these  X_i , X_{i+1}\ . Note that  X_i > X_{i+1}\ . Next, we "pool" these values and replace both with the weighted mean (call it  M\ ) of the two values. Then, we proceed backward one value at a time, and check to see if anything that was in order before, has now become out of order. Suppose we reach a  j\ where  X_j > M \ . Then we pool again, yet this time we pool all the values  X_j , ... , X_{i+1}\ and replace them with their weighted mean (also, we redefine  M\ to be this new mean). Eventually this backtracking will terminate, and then we can jump back to position  i+1\ , and proceed as we did initially. This back and forth process will terminate, leaving a monotone sequence.


Here is pseudocode

( Caution, it is UNTESTED!! )

 For i = 1 to n
 Do
   Y(i) = X(i)
 End
 For i = 1 to n-1
 Do
   If Y(i) > Y(i+1) Then
   Do
     TotY = W(i) * Y(i)  +  W(i+1) * Y(i+1)
     TotW = W(i)         +  W(i+1)
     MeanY = TotY / TotW
     j = i - 1
     While ( j > 0 and Y(j) > MeanY )
     Do
       TotY = TotY  +  W(j) * Y(j)
       TotW = TotW  +  W(j)
       MeanY = TotY / TotW
       j = j - 1
     End
     j = j + 1
     Do k = j to i+1
       Y(k) = MeanY
     End
   End
 End