Talk:Automatic differentiation


WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, which collaborates on articles related to mathematics.
Mathematics rating: B-Class, Low-priority. Field: Applied mathematics


Maybe one can add pictures containing linearized computational graphs to describe the Forward and Reverse Modes in general. The general method (sticking in Cartesian basis vectors from the input or output space) is missing, too...


What is the connection between Automatic Differentiation and Dual Numbers?

Well, I would say that if P is a polynomial, and ε² = 0 in the usual way of the dual numbers, then
P(a + ε) = P(a) + P′(a)ε

is an identity in the dual numbers that expresses 'automatic differentiation'.

Charles Matthews 15:57, 12 Jul 2004 (UTC)
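
To illustrate the connection, here is a minimal Python sketch of how that dual-number identity drives forward-mode AD in practice. This is only an illustration with names of my own choosing, not code from any particular AD library: an ordinary polynomial, written against overloaded arithmetic, yields both P(a) and P′(a) in a single evaluation.

```python
# Minimal dual-number class: a + b*eps with eps**2 == 0.
# The eps coefficient carries the derivative alongside the value.

class Dual:
    def __init__(self, value, deriv=0.0):
        self.value = value    # real part a
        self.deriv = deriv    # infinitesimal part b (the derivative)

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps**2 = 0
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__


def P(x):
    # An ordinary polynomial, written with no knowledge of Dual.
    return 3 * x * x + 2 * x + 1

a = Dual(2.0, 1.0)                # "a + eps": seed the derivative with 1
result = P(a)
print(result.value, result.deriv)  # 17.0 and P'(2) = 6*2 + 2 = 14.0
```

Evaluating P at Dual(2.0, 1.0) returns both the value P(2) = 17 and the derivative P′(2) = 14 in one pass, which is exactly the identity above read computationally.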


I believe the article should be changed so that the mathematics is separated from the use of the mathematical idea in computer programs. Something like this: first the pure mathematical stuff, then something about how automatic differentiation can be implemented either by operator overloading, which is nice if you want to do numerical differentiation, or by automatically transforming a program to calculate the derivatives of functions when they are called, which is nice if you need derivatives of functions but don't want to edit the source code. Comments? --DanielJanzon 16:14, 8 October 2005 (UTC)

I agree that it's generally a good idea to separate the maths from the implementation. Your comment implies that this isn't done in the current article; however, it seems to me that the implementation is not mentioned at all. Anyway, please go ahead and describe the implementation techniques you mentioned (operator overloading and transforming the source code). -- Jitse Niesen (talk) 17:11, 8 October 2005 (UTC)


Partial rewrite

I rearranged parts of the article. The former article presented the augmented arithmetic as part of forward accumulation, which is wrong. Also, I think the derivation using dual numbers is appropriate here, even though it is also present in the dual numbers article. Added two images, which need some more explanation. Berland 07:20, 19 January 2007 (UTC)

There is a technical misunderstanding here. Augmented arithmetic (using Dual numbers) *is* the way forward-mode AD works. Or to be more pedantic, using programming language theory terminology, forward-mode AD is a nonstandard interpretation of a computer program replacing reals by dual numbers and lifting the basis functions appropriately. But reverse-mode AD works in an entirely different way. The current article essentially does not discuss reverse mode, at least not in a way that someone who does not already understand reverse mode could understand. (I suggest segregating discussion of forward versus reverse-mode, and only after both have been explained mentioning hybrids. The discussion of dual numbers belongs inside forward-mode.) Barak 12:56, 29 January 2007 (UTC)

You are right, reverse-mode needs to be "written", it is not considered done yet. Feel free to contribute. Berland 14:32, 29 January 2007 (UTC)
Um, I did. I wrote an introduction that briefly explained the two modes, I wrote a long discourse on forward mode, and had a place ready for reverse mode to go in. Someone ripped out the intro, ripped out the bits of text distinguishing forward from reverse, added inaccuracies (like saying that AD solved the exponential increase in complexity of calculating higher derivatives, which it most certainly does not!), et cetera. Right now the article is not ready for reverse mode, because the intro material needs to be fixed and the transitions need to be fixed, all just to have a place for reverse mode. Also, I'd feel sort of guilty removing the example and graph so carefully inserted, even though I'm not sure it is the right thing for someone who does not already have an idea of AD. Is the source code for the graph available? This would make it possible to edit the graph instead of just ripping it out. Barak 07:30, 30 January 2007 (UTC)
I have the source for the graphs, yes. Berland 07:55, 30 January 2007 (UTC)
I'm moving the dual numbers stuff down because I think it can be a bit intimidating, and it's perhaps peripheral to AD. It's my impression that an understanding of it isn't necessary for understanding or using AD. Feel free to revert if you disagree. Lyonsam (talk) 17:07, 6 April 2008 (UTC)
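
Since the thread above notes that reverse mode still lacks an accessible explanation, here is a minimal tape-based sketch of the idea in Python. It is illustrative only, with hypothetical names: the forward sweep records each operation's parents and local partial derivatives, and the reverse sweep pushes adjoints back from the output to the inputs.

```python
# Minimal tape-based reverse-mode sketch: record the computation during the
# forward sweep, then propagate adjoints backwards through the recorded graph.

class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # sequence of (parent Var, local partial)
        self.adjoint = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    __rmul__ = __mul__


def backward(output):
    # Build a topological order of the recorded graph, then sweep in reverse.
    order, seen = [], set()
    def visit(v):
        if id(v) not in seen:
            seen.add(id(v))
            for parent, _ in v.parents:
                visit(parent)
            order.append(v)
    visit(output)
    output.adjoint = 1.0
    for v in reversed(order):
        for parent, local in v.parents:
            parent.adjoint += v.adjoint * local


x = Var(2.0)
y = Var(3.0)
z = x * y + x                 # z = x*y + x
backward(z)
print(x.adjoint, y.adjoint)   # dz/dx = y + 1 = 4.0, dz/dy = x = 2.0
```

One reverse sweep yields the partial derivatives of the single output z with respect to every input, which is why reverse mode is attractive for functions with many inputs and few outputs.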

Link to "ADIFF Online Derivatives Calculator"

I am removing this link because the site it points to falls under the category of symbolic differentiation, which AD is specifically NOT. I think the distinction is subtle enough that links such as these can only make the article more confusing. Lyonsam 18:34, 29 October 2007 (UTC)

P(x + εx′) = P(x) + P′(x)x′ε

This part is really confusing because (from the looks of it, I'm not entirely sure) the prime symbol ′ is being used in two different ways. On x, x′ just means "some other real number", whereas (I think) P′ is literally the derivative of P. Is this correct? Either way, this part could use some clarification. —Preceding unsigned comment added by 98.203.237.75 (talk) 10:32, 21 January 2008 (UTC)

You are correct in that the prime symbol is used in two slightly different ways. On P, it signifies the derivative with respect to x, but x′ is just "some number". However, the latter use is beneficial when you look at the later sections, where it will be evident that x′ actually can mean the derivative of some function at x. Any changes to the notation are most welcome, as long as they are well thought through and consistent with the rest of the text. --Berland (talk) 21:29, 21 January 2008 (UTC)
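
A short worked example of that point about x′ (my own illustration, using the same dual-number identity as above and an arbitrary inner function g): if x = g(t) and the seed x′ is set to g′(t), then

```latex
\[
  P\bigl(g(t) + g'(t)\,\varepsilon\bigr)
    = P(g(t)) + P'(g(t))\,g'(t)\,\varepsilon
    = P(g(t)) + \frac{\mathrm{d}}{\mathrm{d}t}\,P(g(t))\,\varepsilon ,
\]
```

so the coefficient of ε is exactly the chain-rule derivative of the composite P∘g at t, and seeding x′ = 1 recovers the plain derivative P′(x).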

Query regarding NP-Hard

"Forward and reverse accumulation are just two (extreme) ways of traversing the chain rule. In order to compute a full Jacobian of f:\mathbb{R}^n \rightarrow \mathbb{R}^m, there is a path somewhere in between which is optimal, however, locating that path is probably an NP-hard problem."

Is there a source for this assertion? I don't doubt that it is true, but I'd like to see something to back it up. Joecainey (talk) 09:56, 20 March 2008 (UTC)

From memory, I would guess that this is discussed in Andreas Griewank (2003), "A Mathematical View of Automatic Differentiation", Acta Numerica, Cambridge University Press, but I don't have physical access to it right now in order to verify. --Berland (talk) 10:56, 2 April 2008 (UTC)
Well, it really depends very much on how you define the problem exactly. The Acta Numerica paper mentions the NP-hardness as a conjecture, I believe. However, Uwe Naumann has recently shown that this "optimal Jacobian accumulation" problem is NP-complete in general. I will add a reference to the paper to the article. Lyonsam (talk) 16:32, 6 April 2008 (UTC)
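
For readers puzzling over the quoted claim, a sketch of what "a path somewhere in between" refers to (my own illustration): writing f as a composition f = f₃ ∘ f₂ ∘ f₁, the Jacobian is a product of local Jacobians, and forward and reverse accumulation are the two extreme ways of associating that product:

```latex
\[
  J_f = J_3\, J_2\, J_1, \qquad
  \underbrace{J_3\,(J_2\, J_1)}_{\text{forward: from the input side}}
  \quad\text{vs.}\quad
  \underbrace{(J_3\, J_2)\, J_1}_{\text{reverse: from the output side}} .
\]
```

Choosing the cheapest accumulation order for a general computational graph is the "optimal Jacobian accumulation" problem that Naumann's paper shows to be NP-complete.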