Derivative
From Wikipedia, the free encyclopedia
- In some places this article assumes an acquaintance with algebra, analytic geometry, or the limit. For a non-technical overview of the subject, see Calculus.
In mathematics, a derivative is the rate of change of a quantity. A derivative is an instantaneous rate of change: it is calculated at a specific instant rather than as an average over time. The process of finding a derivative is called differentiation. The reverse process is integration. The two processes are the central concepts of calculus and the relationship between them is the fundamental theorem of calculus.
For a real-valued function of a single real variable, the derivative at a point is the slope of the line tangent to the graph of the function at that point. For functions of several variables, the derivative also has an interpretation as a linear approximation to the function at that point.
Differentiation has many applications throughout all numerate disciplines. For example, in physics, the derivative of the position of a moving body is its velocity and the derivative of the velocity is the acceleration. More generally, equations defined using derivatives, called differential equations, are fundamental to the description of natural phenomena.
Derivatives can also be used to find where a function has its maximum or minimum values, which is useful in many fields, including optimization, economics and physics. The concept can be extended to functions of complex variables (see complex analysis) where it has many spectacular applications. The useful properties of the derivative have also prompted generalizations throughout mathematics, in fields as diverse as algebra, algebraic topology, differential geometry and functional analysis.
Differentiation and the derivative
Differentiation is a method to compute the rate at which a quantity, y, changes with respect to the change in another quantity, x, upon which it is dependent. This rate of change is called the derivative of y with respect to x. In more precise language, the dependency of y on x means that y is a function of x, and if the graph of y is plotted against x, the derivative measures the slope of this graph at each point. This functional relationship is often denoted y = f(x), where f denotes the function.
The simplest case is when y is a linear function of x, i.e., the graph of y against x is a straight line. In this case, y = f(x) = m x + c, for real numbers m and c, and the slope m is given by
- m = Δy / Δx,
where the symbol Δ (Delta) is used to refer to change in a quantity. This formula is true because
- y + Δy = f(x+ Δx) = m (x + Δx) + c = m x + c + m Δx.
Now, since y = m x + c, it follows that Δy = m Δx.
This gives an exact value for the slope of a straight line. If the graph of f is not a straight line, however, then the change in y divided by the change in x varies from point to point: differentiation is a method to find an exact value for this rate of change at any given value of x.
The idea, illustrated by Figures 1-3, is to compute the rate of change as the limiting value of the ratio of the differences Δy / Δx as Δx becomes infinitely small.
In Leibniz's notation, such an infinitesimal change in x is denoted by dx, and the derivative of y with respect to x is written
- dy/dx,
suggesting the ratio of two infinitesimal quantities. (The above expression is pronounced in various ways such as "d y by d x" or "d y over d x". The oral form "d y d x" is often used conversationally, although it may lead to confusion.)
The most common approach[1] to turn this intuitive idea into a precise definition uses limits, but there are many other methods, such as non-standard analysis (see Differential (infinitesimal) for an overview).
Definition via difference quotients
Let y = f(x) be a function of x. The derivative of y with respect to x is geometrically the slope of the tangent line to the graph of f at x. The slope of the tangent line is the limit of the slopes of the secant lines between two points on this curve, as the distance between the two points goes to zero. If the horizontal distance between two points is denoted by h (which can be either positive or negative depending on whether the second point is to the right or the left of the first), then the slope of the line through the points (x, f(x)) and (x + h, f(x + h)) is
- (f(x + h) − f(x)) / h.
This expression is Newton's difference quotient. The derivative is the value of the difference quotient as the secant lines get closer and closer to the tangent line.
Hence, formally, the derivative of the function f at x is the limit
- f '(x) = lim_{h→0} (f(x + h) − f(x)) / h
of the difference quotient as h approaches zero, where f '(x) is one of several common notations (see below).
Equivalently, the derivative satisfies the property that
- lim_{h→0} (f(x + h) − f(x) − f '(x) h) / h = 0,
which has the intuitive interpretation (see Figure 1) that the tangent line to f at x gives the best linear approximation
- f(x + h) ≈ f(x) + f '(x) h
to f near x (i.e., for small h). This has many generalizations (see below).
The derivative cannot be obtained by directly substituting 0 for h in the difference quotient, since it will result in division by zero. Instead, the difference quotient should be regarded as a function Q(h) of the parameter h, for any (small) h not equal to zero. Then, the idea is to look for a value for Q(h) at h = 0, which makes Q continuous at 0 (intuitively, Q(h) does not "jump" in value as h crosses zero). If such a value exists (i.e., the limit exists as a real number) — and it may not — then the function f is said to be differentiable at the point x.
In practice, the continuity of the difference quotient Q(h) at h = 0 is shown by modifying the numerator to cancel h in the denominator. This process can be long and tedious for complicated functions, and many short cuts are commonly used to simplify the process.
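The behaviour of Q(h) near h = 0 can also be explored numerically. The following is a minimal sketch, not a proof: the helper name `difference_quotient` and the choice of the sine function at x = 0 are assumptions made for illustration. It shows that Q(0) cannot be evaluated directly, while Q(h) settles towards a single value as h shrinks from either side.

```python
import math


def difference_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h))."""
    if h == 0:
        raise ValueError("h must be nonzero: Q(0) would divide by zero")
    return (f(x + h) - f(x)) / h


# Q(h) for the sine function at x = 0, with h approaching 0 from both sides.
for h in (0.5, 0.1, 0.001, -0.001, -0.1):
    q = difference_quotient(math.sin, 0.0, h)
    print(f"h = {h:>7}: Q(h) = {q:.9f}")
```

The printed values approach 1, which agrees with the derivative cos(0) = 1 of the sine function at 0.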
Example
The squaring function f(x) = x² is differentiable at x = 3, and its derivative there is 6. This is proven by writing the difference quotient as follows:
- (f(3 + h) − f(3)) / h = ((3 + h)² − 9) / h = (9 + 6h + h² − 9) / h = (6h + h²) / h = 6 + h.
This function of h is then seen to be continuous at h = 0 with value 6 at h = 0. Hence the slope of the graph of the squaring function at the point (3, 9) is 6, and so its derivative at x = 3 is f '(3) = 6.
More generally, a similar computation shows that the derivative of the squaring function f at any x is f '(x) = 2x.
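The cancellation of h can be checked with exact rational arithmetic. This is a small sketch using Python's standard `fractions` module; the helper names are illustrative assumptions. For the squaring function the difference quotient at x equals 2x + h exactly, so it differs from 2x only by h.

```python
from fractions import Fraction


def difference_quotient(f, x, h):
    """Newton's difference quotient (f(x + h) - f(x)) / h for nonzero h."""
    return (f(x + h) - f(x)) / h


def square(x):
    return x * x


# At x = 3 the quotient equals 6 + h for every nonzero h tried here.
for h in (Fraction(1, 2), Fraction(1, 100), Fraction(-1, 1000)):
    q = difference_quotient(square, Fraction(3), h)
    print(h, q, q == 6 + h)
```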
Aside: continuity and differentiability
If a function y = f(x) is not continuous at a point, then there is no tangent line and the function is not differentiable at that point. However, even if a function is continuous at a point, it may not be differentiable there. For example, the function y = |x| is continuous at x = 0, but it is not differentiable there, because the limit in the above definition does not exist (the limit from the right is 1 while the limit from the left is −1). Graphically, we see this as a "kink" in the graph at x = 0. Even a function with a smooth graph is not differentiable at a point where its tangent is vertical: for instance the function y = x^(1/3) (the cube root of x) is not differentiable at x = 0. Differentiability implies continuity, but not vice versa. One famous example of a function that is continuous everywhere but differentiable nowhere is the Weierstrass function.
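Both kinds of failure show up in the difference quotients. In the sketch below, the helper names are illustrative, and the cube root is used as an assumed example of a graph with a vertical tangent at 0. The quotients of |x| at 0 stay at 1 from the right and −1 from the left, while those of the cube root grow without bound.

```python
def difference_quotient(f, x, h):
    """Newton's difference quotient (f(x + h) - f(x)) / h for nonzero h."""
    return (f(x + h) - f(x)) / h


def cube_root(x):
    """Real cube root, defined for negative inputs as well."""
    return -((-x) ** (1.0 / 3.0)) if x < 0 else x ** (1.0 / 3.0)


for h in (0.1, 0.001, 0.00001):
    right = difference_quotient(abs, 0.0, h)         # slope from the right
    left = difference_quotient(abs, 0.0, -h)         # slope from the left
    steep = difference_quotient(cube_root, 0.0, h)   # equals h ** (-2/3)
    print(f"h = {h}: |x| right {right}, left {left}; cube root {steep:.1f}")
```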
The derivative as a function
The derivative of a function f at x is a quantity which varies if x varies. Hence, as long as it is well defined, it can be viewed as a function of x.
A function is differentiable on an interval if it is differentiable at every point within the interval. More generally, if the derivative of a function f exists at every point x in the domain of f, the derivative of f on this domain can be defined as a function, often denoted f ', whose value at a point x is the derivative of f at x.
This means that differentiation is an operation on functions: applied to a differentiable function f, the result is the derivative function f '.
Such a concept is fundamentally more advanced than the elementary algebraic idea of a function which takes a number as its input and produces another number as its output. For example, given the input x = 3, the squaring function f(x) = x² outputs 9, whereas the doubling function g(x) = 2x outputs 6. In contrast, differentiation takes a function as its input and produces another function as its output. For example, if the input to differentiation is the squaring function f, then the output is the doubling function f ' = g, because the doubling function gives the slope of the squaring function at any given point.
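In programming terms, differentiation behaves like a higher-order function: it accepts a function and returns a function. The sketch below is only a numerical stand-in for the exact operation. The name `derivative`, the central-difference formula and the default step `h` are all assumptions chosen for illustration.

```python
def derivative(f, h=1e-5):
    """Return a new function that approximates f' by a central difference.

    The step h is an assumed default; the result is a numerical
    approximation, not the exact limit.
    """
    def f_prime(x):
        return (f(x + h) - f(x - h)) / (2 * h)
    return f_prime


def square(x):
    return x * x


def double(x):
    return 2 * x


square_prime = derivative(square)    # a function, not a number
for x in (-1.0, 0.0, 3.0):
    print(x, square_prime(x), double(x))
```

Up to rounding error, the values of `square_prime` match those of the doubling function.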
Higher derivatives
If a function f is differentiable on its domain (e.g., an interval), then its derivative f' is also a function on that domain. If f' is differentiable, then f is said to be twice differentiable and the derivative of the derivative is called the second derivative f'' of f. Similarly, the derivative of a second derivative (if it exists) is called the third derivative of f, and so on. At a given point or on a given interval, a function may have no derivative, a finite number of successive derivatives, or an infinite number of successive derivatives. If k successive derivatives exist, the function is said to be k times differentiable, whereas if infinitely many successive derivatives exist, the function is said to be infinitely differentiable or smooth.
On the real line, every polynomial function is infinitely differentiable. After differentiating a finite number of times (given by the degree of the polynomial), a constant function is reached, and all subsequent derivatives are identically 0.
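This can be seen by differentiating a polynomial repeatedly. The sketch below represents a polynomial by its list of coefficients, lowest degree first; this representation and the name `differentiate_poly` are assumptions made for illustration.

```python
def differentiate_poly(coeffs):
    """Differentiate a0 + a1 x + a2 x^2 + ... given as [a0, a1, a2, ...]."""
    derived = [k * a for k, a in enumerate(coeffs)][1:]
    return derived or [0]


p = [5, 0, -3, 2]            # 5 - 3x^2 + 2x^3, a polynomial of degree 3
steps = [p]
for _ in range(5):
    steps.append(differentiate_poly(steps[-1]))
for k, q in enumerate(steps):
    print(f"derivative number {k}: {q}")
```

After three differentiations only a constant remains, and every later derivative is the zero polynomial.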
More generally, the derivatives of a function f at a point x provide polynomial approximations to that function near x. For example, if f is twice differentiable, then
- f(x + h) ≈ f(x) + f '(x) h + ½ f ''(x) h²
in the sense that
- lim_{h→0} (f(x + h) − f(x) − f '(x) h − ½ f ''(x) h²) / h² = 0.
If f is infinitely differentiable, then this is the beginning of the Taylor series for f.
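The quality of a second-order approximation can be checked numerically. The sketch below uses the exponential function as an assumed example, since its first and second derivatives are again the exponential; the name `taylor2` is illustrative. The error divided by h² shrinks as h shrinks, which is the sense in which the approximation holds.

```python
import math


def taylor2(f, df, d2f, x, h):
    """Second-order approximation f(x) + f'(x) h + f''(x) h^2 / 2."""
    return f(x) + df(x) * h + 0.5 * d2f(x) * h * h


# Assumed test function: exp, whose first and second derivatives are both exp.
x = 0.0
for h in (0.5, 0.1, 0.01):
    error = math.exp(x + h) - taylor2(math.exp, math.exp, math.exp, x, h)
    print(f"h = {h}: error = {error:.3e}, error / h^2 = {error / h**2:.3e}")
```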
Notations for differentiation
Leibniz's notation
- See also: Leibniz's notation
The notation for derivatives introduced by Gottfried Leibniz is one of the earliest. It is still commonly used when the equation y = f(x) is viewed as a functional relationship between dependent and independent variables. Then the first derivative is denoted by
- dy/dx.
Higher derivatives are expressed using the notation
- dⁿy/dxⁿ
for the nth derivative of y = f(x) (with respect to x).
With Leibniz's notation, we can write the derivative of y at the point x = a in two different ways:
- dy/dx |_{x=a} = (dy/dx)(a).
Leibniz's notation allows one to specify the variable for differentiation (in the denominator). This is especially relevant for partial differentiation. It also makes the chain rule easy to remember[2]:
- dy/dx = (dy/du) · (du/dx).
Lagrange's notation
One of the most common modern notations for differentiation is due to Joseph Louis Lagrange and uses the prime mark, so that the derivative of a function f(x) is denoted f '(x) or simply f '. Similarly, the second and third derivatives are denoted f '' and f '''. Beyond this point, some authors use Roman numerals such as f^IV for the fourth derivative, whereas other authors place the number of derivatives in parentheses: f^(4) in this case. The latter notation generalizes to yield the notation f^(n) for the nth derivative of f.
Newton's notation
Newton's notation for differentiation (also called the dot notation for differentiation) places a dot over the function name: if y is a function of t, then ẏ denotes the first derivative of y with respect to t and ÿ denotes the second derivative. This notation becomes unmanageable for many more derivatives, but is often used in mechanics and ODE theory, where only a few derivatives are usually needed.
Euler's notation
Euler's notation uses a differential operator D, which is applied to a function f to give the first derivative Df. The second derivative is denoted D²f, and the nth derivative is denoted Dⁿf.
If y = f(x) is a dependent variable, then often the subscript x is attached to the D to clarify the independent variable x. Euler's notation is then written Dₓy or Dₓf(x), although this subscript is often omitted when the variable x is understood, for instance when this is the only variable present in the expression.
Euler's notation is useful for stating and solving linear differential equations.
Computing the derivative
The derivative of a function can, in principle, be computed from the definition by considering the difference quotient, and computing its limit. For some examples, see Derivative (examples). In practice, once the derivatives of a few simple functions are known, the derivatives of other functions are more easily computed using rules for obtaining derivatives of more complicated functions from simpler ones.
Rules for finding the derivative
In many cases, complicated limit calculations by direct application of Newton's difference quotient can be avoided using differentiation rules. Some of the most basic rules are the following.
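- Constant rule: if f(x) is constant, then f '(x) = 0.
- Linearity: (af + bg)' = af ' + bg ' for all differentiable functions f and g and all real numbers a and b.
- Product rule: (fg)' = f 'g + fg ' for all differentiable functions f and g.
- Chain rule: if f(x) = h(g(x)), then f '(x) = h'(g(x)) · g'(x).
Derivatives of elementary functions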
In addition, the derivatives of some common functions are useful to know.
- Derivatives of powers: if f(x) = x^r, for some real number r, then f '(x) = r x^(r−1) (wherever this is defined).
When r = 0, this recovers the constant rule.
- Exponential and logarithm functions:
- exp'(x) = exp(x)
- ln'(x) = 1 / x.
- Trigonometric functions:
- sin'(x) = cos(x) and cos'(x) = − sin(x).
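Because these rules are mechanical, a program can apply them step by step. The sketch below uses a technique that this article does not describe: forward-mode automatic differentiation with "dual numbers", in which each quantity carries its value together with its derivative. The class `Dual` and the helpers `sin`, `exp` and `derivative_at` are hypothetical names introduced for illustration, and only addition, multiplication and two elementary functions are covered.

```python
import math


class Dual:
    """A value together with its derivative, propagated by the usual rules."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(other):
        # Plain numbers are constants, so their derivative is 0 (constant rule).
        return other if isinstance(other, Dual) else Dual(other, 0.0)

    def __add__(self, other):
        other = Dual.lift(other)
        # Sum rule: (f + g)' = f' + g'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (fg)' = f'g + fg'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def sin(u):
    # Chain rule combined with sin' = cos
    return Dual(math.sin(u.value), math.cos(u.value) * u.deriv)


def exp(u):
    # Chain rule combined with exp' = exp
    return Dual(math.exp(u.value), math.exp(u.value) * u.deriv)


def derivative_at(f, x):
    """Evaluate f on Dual(x, 1) and read off the derivative part."""
    return f(Dual(x, 1.0)).deriv


# d/dx [3 x^2 + sin(2x)] = 6x + 2 cos(2x), compared at x = 1
g = lambda x: 3 * x * x + sin(2 * x)
print(derivative_at(g, 1.0), 6.0 + 2.0 * math.cos(2.0))
```

The design choice here is to overload arithmetic so that an ordinary-looking expression computes its own derivative; a symbolic system would instead manipulate the formula itself.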
Example computation
The derivative of
- f(x) = x⁴ + sin(x²) − ln(x) eˣ + 7
is
- f '(x) = 4x³ + 2x cos(x²) − (1/x) eˣ − ln(x) eˣ.
Here the second term was computed using the chain rule and the third using the product rule; the known derivatives of the elementary functions x², x⁴, sin(x), ln(x) and exp(x) = eˣ were also used.
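A computation of this kind can be sanity-checked numerically. The sketch below takes f(x) = x⁴ + sin(x²) − ln(x) eˣ + 7 as an assumed example built from the elementary functions named above, and compares a hand-derived formula with a central-difference estimate. The helper name `central_difference` and its step size are illustrative choices, and agreement at a few points is evidence rather than proof.

```python
import math


def f(x):
    return x**4 + math.sin(x**2) - math.log(x) * math.exp(x) + 7


def f_prime(x):
    # Power rule, chain rule on sin(x^2), product rule on ln(x) * exp(x);
    # the constant 7 contributes nothing.
    return (4 * x**3 + 2 * x * math.cos(x**2)
            - math.exp(x) / x - math.log(x) * math.exp(x))


def central_difference(g, x, h=1e-6):
    """Numerical estimate of g'(x); h is an assumed small step."""
    return (g(x + h) - g(x - h)) / (2 * h)


for x in (0.5, 1.0, 2.0):
    print(x, f_prime(x), central_difference(f, x))
```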
Derivatives in higher dimensions
- See also: vector calculus and multivariable calculus
Derivatives of vector-valued functions
The derivative of real-valued functions is easily extended to vector-valued functions y(t) from R (or an interval) to Rⁿ, including in particular parametric curves in R² or R³. The derivative of such a parametric curve or vector-valued function y at a point t determines its tangent vector y'(t). It can be computed by taking the derivative of each component of y(t) separately, or by defining
- y'(t) = lim_{h→0} (y(t + h) − y(t)) / h
(if the limit exists) using subtraction of vectors instead of scalars. If the derivative of y exists at every point t, then y' is another vector-valued function.
This generalization is very useful, for example, if y(t) is the position vector of a particle at time t; then the derivative y'(t) is the velocity vector of the particle at time t.
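The componentwise computation is easy to sketch. The example below assumes, for illustration only, a particle moving on the unit circle with position y(t) = (cos t, sin t), and approximates each component of the velocity by a central difference; the names `curve` and `tangent_vector` are hypothetical.

```python
import math


def curve(t):
    """Assumed example: position on the unit circle, y(t) = (cos t, sin t)."""
    return (math.cos(t), math.sin(t))


def tangent_vector(y, t, h=1e-6):
    """Approximate y'(t) one component at a time with a central difference."""
    ahead, behind = y(t + h), y(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))


t = math.pi / 3
print(tangent_vector(curve, t))            # numerical velocity vector
print((-math.sin(t), math.cos(t)))         # exact velocity (-sin t, cos t)
```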
Several variables
- See also: partial derivative and total derivative
When a function f depends on more than one variable, it has a derivative with respect to each variable. The simplest way to obtain such a derivative is called a partial derivative: one simply forgets that the function depends on other variables. For example if f(x,t) is a function of two variables, then the partial derivative of f with respect to t is the derivative of f as a function of t, where x is assumed to be constant. Such a partial derivative is often denoted ∂f/∂t (where ∂ is a rounded 'd' known as the partial derivative symbol and often pronounced "del" instead of "dee").
In some situations, however, x might depend on t, in which case there is also a total derivative of f with respect to t, which is the derivative of the function g(t) = f(x(t),t).
An important example of a function of several variables is the case of a scalar-valued function f(x₁, ..., xₙ) on a domain in Euclidean space Rⁿ (e.g., on R² or R³). In this case f has a partial derivative ∂f/∂xⱼ with respect to each variable xⱼ. At each point, these partial derivatives define a vector
- (∂f/∂x₁, ..., ∂f/∂xₙ)
and hence (if f is differentiable at every point in some domain) a vector field on Rⁿ called the gradient of f.
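A gradient can be approximated by varying one variable at a time while holding the others fixed. The sketch below is a numerical illustration under stated assumptions: the function `gradient`, its step size, and the example f(x, y) = x²y + 3y are all chosen here for illustration.

```python
def gradient(f, point, h=1e-6):
    """Approximate the vector of partial derivatives of f at `point`."""
    grad = []
    for j in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[j] += h       # vary only the j-th variable,
        backward[j] -= h      # holding all the others constant
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


def f(p):
    x, y = p
    return x * x * y + 3 * y      # assumed example: df/dx = 2xy, df/dy = x^2 + 3


print(gradient(f, [1.0, 2.0]))    # the exact gradient at (1, 2) is (4, 4)
```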
Directional derivatives
If f is a scalar-valued function on a domain in Rⁿ then the partial derivatives of f can be viewed as the derivatives of f in the direction of the coordinates xⱼ. More generally, the directional derivative of f(x) = f(x₁, ..., xₙ) in the direction of a vector v = (v₁, ..., vₙ) is defined to be the limit
- lim_{h→0} (f(x + hv) − f(x)) / h.
If all the partial derivatives of f exist and are continuous at x, then this limit is given by
- v₁ ∂f/∂x₁ + ... + vₙ ∂f/∂xₙ.
In particular, it is a linear function of v, and is often denoted Dₓ f(v) or dfₓ(v).
A similar idea may be used to define the directional derivative of a vector-valued function on a domain in Rⁿ, or more generally a function with values in Rᵐ for any natural number m.
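The relation between the directional derivative and the partial derivatives can be checked on a small example. The sketch below assumes the function f(x, y) = x²y + 3y and the direction v = (3, −1), both chosen for illustration, and uses a symmetric difference in place of the one-sided quotient in the definition.

```python
def directional_derivative(f, x, v, h=1e-6):
    """Approximate the limit of (f(x + h v) - f(x)) / h by a symmetric difference."""
    ahead = [xi + h * vi for xi, vi in zip(x, v)]
    behind = [xi - h * vi for xi, vi in zip(x, v)]
    return (f(ahead) - f(behind)) / (2 * h)


def f(p):
    x, y = p
    return x * x * y + 3 * y      # assumed example with partials 2xy and x^2 + 3


x, v = [1.0, 2.0], [3.0, -1.0]
partials = [2 * x[0] * x[1], x[0] ** 2 + 3]              # exact values at x: (4, 4)
print(directional_derivative(f, x, v))                    # numerical estimate
print(sum(vj * pj for vj, pj in zip(v, partials)))        # 3*4 + (-1)*4 = 8.0
```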
The Jacobian and the differential
Suppose f is a function from a domain in Rⁿ to Rᵐ (for instance a scalar or vector-valued function on R³). Then f has components (f₁, f₂, ..., fₘ). If the partial derivatives of all of these components exist (at a point x), they form an m by n matrix, called the Jacobian matrix Jₓ(f) of f (at x), whose (i, j) entry is the partial derivative
- ∂fᵢ/∂xⱼ.
Matrix multiplication by the Jacobian defines a linear map from Rⁿ to Rᵐ sending a vector v = (v₁, ..., vₙ) to the vector whose ith component is
- (∂fᵢ/∂x₁) v₁ + ... + (∂fᵢ/∂xₙ) vₙ.
As long as the partial derivatives are continuous, this vector is the directional derivative dfₓ(v). By the definition of the directional derivative, the linear map dfₓ has the property[3] that
- lim_{h→0} ‖f(x + h) − f(x) − dfₓ(h)‖ / ‖h‖ = 0,
where h is now a vector in Rⁿ.
The intuitive interpretation of this property is that dfₓ is the best linear approximation to f at x. A function which has a best linear approximation in this sense is said to be differentiable at x; dfₓ is then called the (total) differential[4] or (total) derivative of f at x, and it is the linear map whose matrix is the Jacobian matrix of partial derivatives.
This generalizes the characterization of the one-dimensional derivative as the best linear approximation: if f is a function from R to R, then the differential of f exists at x if and only if the derivative of f exists at x, and the matrix of dfₓ is the 1×1 matrix whose only entry is f '(x). This 1×1 matrix defines a (linear) function from R to R, sending h to f '(x)h, and the linear approximation already discussed may be written
- f(x + h) ≈ f(x) + dfₓ(h).
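The Jacobian matrix and the linear approximation it provides can be sketched numerically. In the example below, the map from R² to R³, the helper names `jacobian` and `apply`, and the step sizes are all assumptions made for illustration.

```python
def jacobian(f, x, h=1e-6):
    """Approximate the m-by-n matrix of partial derivatives of f at x."""
    m = len(f(x))
    rows = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        ahead, behind = list(x), list(x)
        ahead[j] += h
        behind[j] -= h
        fa, fb = f(ahead), f(behind)
        for i in range(m):
            rows[i][j] = (fa[i] - fb[i]) / (2 * h)
    return rows


def apply(matrix, v):
    """Matrix-vector product: the i-th component is sum_j matrix[i][j] * v[j]."""
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]


def f(p):
    x, y = p
    return [x * y, x + y * y, x * x]      # assumed example map from R^2 to R^3


x, step = [1.0, 2.0], [0.01, -0.02]
J = jacobian(f, x)
linear = [fi + di for fi, di in zip(f(x), apply(J, step))]
exact = f([x[0] + step[0], x[1] + step[1]])
print(J)                 # close to [[2, 1], [1, 4], [2, 0]]
print(linear, exact)     # the two vectors agree to within about 1e-3
```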
History of differentiation
The concept of a derivative in the sense of a tangent line is a very old one, familiar to Greek geometers such as Euclid (c. 300 BCE), Archimedes (c. 287 BCE – 212 BCE), and Apollonius of Perga (c. 262 BCE – c. 190 BCE)[5]. Archimedes also introduced the use of infinitesimals, although these were primarily used to study areas and volumes rather than derivatives and tangents — see Archimedes' use of infinitesimals.
The use of infinitesimals to study rates of change can be found in Indian mathematics, perhaps as early as 500 CE, when the astronomer and mathematician Aryabhata (476 – 550) used infinitesimals to study the motion of the moon[6]. The use of infinitesimals to compute rates of change was developed significantly by Bhaskara (1114-1185): indeed, it has been argued[7] that many of the key notions of differential calculus can be found in his work.
The modern development of calculus is usually credited to Isaac Newton (1643 – 1727) and Gottfried Leibniz (1646 – 1716), who provided independent[8] and unified approaches to differentiation and derivatives. The key insight, however, that earned them this credit, was the fundamental theorem of calculus relating differentiation and integration: this rendered obsolete most previous methods for computing areas and volumes, which had not been significantly extended since the time of Archimedes[9]. For their ideas on derivatives, both Newton and Leibniz built on significant earlier work by mathematicians such as Isaac Barrow (1630 – 1677), René Descartes (1596 – 1650), Christiaan Huygens (1629 – 1695), Blaise Pascal (1623 – 1662) and John Wallis (1616 – 1703). In particular, Isaac Barrow is often credited with the early development of the derivative[10]. Nevertheless, Newton and Leibniz remain key figures in the history of differentiation, not least because Newton was the first to apply differentiation to theoretical physics, while Leibniz systematically developed much of the notation still used today.
Since the 17th century many mathematicians have contributed to the theory of differentiation. In the 19th century, calculus was put on a much more rigorous footing by mathematicians such as Augustin Louis Cauchy (1789 – 1857), Bernhard Riemann (1826 – 1866), and Karl Weierstrass (1815 – 1897). It was also during this period that differentiation was generalized to Euclidean space and the complex plane.
Applications of derivatives
Maxima, minima and critical points
If f is a differentiable function on R (or an open interval) and x is a local maximum or a local minimum of f, then the derivative of f at x is zero; points where f '(x) = 0 are called critical points or stationary points (and the value of f at x is called a critical value). (The definition of a critical point is sometimes extended to include points where the derivative does not exist.) Conversely, a critical point x of f can be analysed by considering the second derivative of f at x:
- if it is positive, x is a local minimum;
- if it is negative, x is a local maximum;
- if it is zero, then x could be a local minimum, a local maximum, or neither. (For example, f(x) = x³ has a critical point at x = 0, but it has neither a maximum nor a minimum there.)
This is called the second derivative test. An alternative approach, called the first derivative test, involves considering the sign of f ' on each side of the critical point.
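The two tests can be written as short procedures. The sketch below is illustrative only: the function names, the returned labels and the offset `eps` are assumptions, and the first derivative test is approximated by sampling f ' at one point on each side of the critical point.

```python
def second_derivative_test(d2f_at_x):
    """Classify a critical point from the value of f'' there."""
    if d2f_at_x > 0:
        return "local minimum"
    if d2f_at_x < 0:
        return "local maximum"
    return "inconclusive"


def first_derivative_test(df, x, eps=1e-3):
    """Classify a critical point from the sign of f' just left and right of x.

    eps is an assumed small offset; it must be small enough that no other
    critical point lies within eps of x.
    """
    left, right = df(x - eps), df(x + eps)
    if left < 0 < right:
        return "local minimum"
    if left > 0 > right:
        return "local maximum"
    return "no sign change detected"


# f(x) = x^3: f'(x) = 3x^2 and f''(0) = 0
print(second_derivative_test(0.0), "|", first_derivative_test(lambda x: 3 * x**2, 0.0))
# f(x) = x^4: f'(x) = 4x^3 and f''(0) = 0, yet x = 0 is a local minimum
print(second_derivative_test(0.0), "|", first_derivative_test(lambda x: 4 * x**3, 0.0))
```

The second example shows why the first derivative test is a useful alternative: it can still decide cases in which the second derivative vanishes.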
Taking derivatives and solving for critical points is therefore often a simple way to find local minima or maxima, which can be useful in optimization. By the extreme value theorem, a continuous function on a closed interval must attain its minimum and maximum values at least once. If the function is differentiable, the minima and maxima can only occur at critical points or endpoints.
This also has applications in graph sketching: once the local minima and maxima of a differentiable function have been found, a rough plot of the graph can be obtained from the observation that it will be either increasing or decreasing between critical points.
In higher dimensions, a critical point of a scalar-valued function is a point at which the gradient is zero. The second derivative test can still be used to analyse critical points by considering the eigenvalues of the Hessian matrix of second partial derivatives of the function at the critical point. If all of the eigenvalues are positive, then the point is a local minimum; if all are negative, it is a local maximum. If there are some positive and some negative eigenvalues, then the critical point is a saddle point, and if none of these cases holds (i.e., some of the eigenvalues are zero) then the test is inconclusive.
Physics
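For a function of two variables the Hessian is a symmetric 2×2 matrix, whose eigenvalues have a closed form. The sketch below restricts itself to that case; the function names and the example Hessians are assumptions chosen for illustration, and a general implementation would use a numerical linear algebra library instead.

```python
import math


def eigenvalues_symmetric_2x2(a, b, d):
    """Eigenvalues (smaller, larger) of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius


def classify(hessian_entries):
    """Apply the eigenvalue test to Hessian entries (f_xx, f_xy, f_yy)."""
    low, high = eigenvalues_symmetric_2x2(*hessian_entries)
    if low > 0:
        return "local minimum"
    if high < 0:
        return "local maximum"
    if low < 0 < high:
        return "saddle point"
    return "inconclusive"


# Assumed examples, each with a critical point at the origin:
print(classify((2.0, 0.0, 2.0)))      # f = x^2 + y^2
print(classify((-2.0, 0.0, -2.0)))    # f = -x^2 - y^2
print(classify((0.0, 1.0, 0.0)))      # f = x y
print(classify((2.0, 0.0, 0.0)))      # f = x^2 + y^4
```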
Calculus is of vital importance in physics: many physical processes are described by equations involving derivatives, called differential equations. Physics is particularly concerned with the way quantities change and evolve over time, and the concept of the "time derivative" — the rate of change over time — is essential for the precise definition of several important concepts. In particular, the time derivatives of an object's position are significant in Newtonian physics:
- velocity is the derivative (with respect to time) of an object's displacement (distance from the original position)
- acceleration is the derivative (with respect to time) of an object's velocity, that is, the second derivative (with respect to time) of an object's position.
For example, if an object's position on a line is given by
- x(t) = −16t² + 16t + 32,
then the object's velocity is
- x'(t) = −32t + 16,
and the object's acceleration is
- x''(t) = −32,
which is constant.
Generalizations
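The chain from position to velocity to acceleration can be checked numerically. The sketch below assumes, for illustration, the quadratic position x(t) = −16t² + 16t + 32; the helper `central_difference` and its step size are likewise assumptions. Differentiating the position reproduces the velocity, and differentiating the velocity gives the same acceleration at every time.

```python
def position(t):
    return -16 * t**2 + 16 * t + 32      # assumed example, quadratic in t


def velocity(t):
    return -32 * t + 16                  # derivative of position


def acceleration(t):
    return -32                           # derivative of velocity, constant


def central_difference(g, t, h=1e-4):
    """Numerical estimate of g'(t); h is an assumed small step."""
    return (g(t + h) - g(t - h)) / (2 * h)


for t in (0.0, 0.5, 2.0):
    print(t,
          central_difference(position, t), velocity(t),
          central_difference(velocity, t), acceleration(t))
```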
The concept of a derivative can be extended to many other settings. The common thread is that the derivative of a function at a point serves as a linear approximation of the function at that point.
- A very important generalization of the derivative concerns complex functions of complex variables, such as functions from (a domain in) the complex numbers C to C. The notion of the derivative of such a function is obtained by replacing real variables with complex variables in the definition. However, this innocent definition hides some very deep properties. If C is identified with R² by writing a complex number z as x + i y, then a differentiable function from C to C is certainly differentiable as a function from R² to R² (in the sense that its partial derivatives all exist), but the converse is not true in general: the complex derivative only exists if the real derivative is complex linear, and this imposes relations between the partial derivatives called the Cauchy–Riemann equations (see holomorphic functions).
- A natural generalization concerns functions between differentiable or smooth manifolds. Intuitively speaking, such a manifold M is a space which can be approximated near each point x by a vector space called its tangent space: the prototypical example is a smooth surface in R³. The derivative (or differential) of a (differentiable) map f: M → N between manifolds, at a point x in M, is then a linear map from the tangent space of M at x to the tangent space of N at f(x). The derivative function becomes a map between the tangent bundles of M and N. This definition is fundamental in differential geometry and has many uses; see pushforward (differential) and pullback (differential geometry).
- Differentiation can also be defined for maps between infinite dimensional vector spaces such as Banach spaces and Fréchet spaces. There is a generalization both of the directional derivative, called the Gâteaux derivative, and of the differential, called the Fréchet derivative.
- One deficiency of the classical derivative is that not very many functions are differentiable. Nevertheless, there is a way of extending the notion of the derivative so that all continuous functions and many other functions can be differentiated, using a concept known as the weak derivative. The idea is to embed the continuous functions in a larger space called the space of distributions and only require that a function is differentiable "on average".
- The properties of the derivative have inspired the introduction and study of many similar objects in algebra and topology — see, for example, differential algebra.
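The first item above, on complex functions, can be illustrated with Python's built-in complex numbers. The sketch below compares difference quotients taken along the real and the imaginary directions; the helper name and the two test functions are assumptions chosen for illustration. For z² the two quotients agree, whereas for complex conjugation they do not, so conjugation has no complex derivative even though its real partial derivatives all exist.

```python
def complex_quotients(f, z, h=1e-6):
    """Difference quotients of f at z along the real and imaginary directions."""
    along_real = (f(z + h) - f(z)) / h
    along_imag = (f(z + 1j * h) - f(z)) / (1j * h)
    return along_real, along_imag


z = 1.0 + 2.0j
print(complex_quotients(lambda w: w * w, z))           # both close to 2z = 2 + 4j
print(complex_quotients(lambda w: w.conjugate(), z))   # close to 1 and to -1
```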
See also
- Automatic differentiation
- Differentiability class
- Differintegral
- Linearization
- Numerical differentiation
- Techniques for differentiation
Notes
- ^ Spivak, Calculus (1994), chapter 10.
- ^ In the formulation of calculus in terms of limits, the du symbol has been assigned various meanings by various authors. Some authors do not assign a meaning to du by itself, but only as part of the symbol du/dx. Others define "dx" as an independent variable, and define du by du = dx•f '(x). In non-standard analysis du is defined as an infinitesimal. It is also interpreted as the exterior derivative du of a function u. See differential (infinitesimal) for further information.
- ^ Apostol, T.M., Calculus (1967).
- ^ For one explanation of the terminology, see Differential (infinitesimal).
- ^ See Euclid's Elements, The Archimedes Palimpsest and O'Connor, John J., and Edmund F. Robertson. "Apollonius of Perga". MacTutor History of Mathematics archive.
- ^ O'Connor, John J., and Edmund F. Robertson. "Aryabhata the Elder". MacTutor History of Mathematics archive.
- ^ Ian G. Pearce. Bhaskaracharya II.
- ^ Newton began his work in 1666 and Leibniz began his in 1676. However, Leibniz published his first paper in 1684, predating Newton's publication in 1693. It is possible that Leibniz saw drafts of Newton's work in 1673 or 1676, or that Newton made use of Leibniz's work to refine his own. Both Newton and Leibniz claimed that the other plagiarized their respective works. This resulted in a bitter controversy between the two men over who first invented calculus which shook the mathematical community in the early 18th century.
- ^ This was a monumental achievement, even though a restricted version had been proven previously by James Gregory (1638 – 1675), and some key examples can be found in the work of Pierre de Fermat (1601 – 1665).
- ^ Eves, H. (1990).
References
Print
- Anton, Howard (1980). Calculus with analytical geometry. John Wiley and Sons, New York. ISBN 0-471-03248-4.
- Apostol, Tom M (1967). Calculus, 2nd edition. Wiley. ISBN 0-471-00005-1 and ISBN 0-471-00007-8.
- Eves, Howard (1990). An Introduction to the History of Mathematics, Saunders. ISBN 0-03-029558-0.
- Larson, Ron; Hostetler, Robert P.; and Edwards, Bruce H. (2003). Calculus of a Single Variable: Early Transcendental Functions (3rd edition). Houghton Mifflin Company. ISBN 0-618-22307-X.
- Spivak, Michael (1994), Calculus, 3rd edition, Publish or Perish Press. ISBN 0-914098-89-6.
- Thompson, Silvanus Phillips (1998), Calculus made easy : being a very-simplest introduction to those beautiful methods of reckoning which are generally called by the terrifying names of the differential calculus and the integral calculus (introduced by Martin Gardner), St. Martin's Press, New York, ISBN 0-312-18548-0.
Online books
- Crowell, Benjamin, Calculus, Fullerton College, an online textbook
- Garrett, Paul, Notes on First-Year Calculus, University of Minnesota
- Hussain, Faraz, Understanding Calculus, an online textbook
- Keisler, H. Jerome, Elementary Calculus: An Approach Using Infinitesimals, University of Wisconsin
- Mauch, Sean, Sean's Applied Math Book, CIT, an online textbook that includes a complete introduction to calculus
- Sloughter, Dan, Difference Equations to Differential Equations, an introduction to calculus
- Stroyan, K.D., A Brief Introduction to Infinitesimal Calculus, University of Iowa
- Wikibook of Calculus
External links
- WIMS Function Calculator makes online calculation of derivatives; this software also enables interactive exercises.
- ADIFF online symbolic derivatives calculator.