User:Justin545/Valuables

From Wikipedia, the free encyclopedia

Gravitational Field vs. Electric Force Field. Why?

In the last paragraph of section Quantum Mechanics and General Relativity:

"...it is not clear how to determine the gravitational field of a particle, if under the Heisenberg uncertainty principle of quantum mechanics its location and velocity cannot be known with certainty...".

I am curious why the electric force field in the central-force problem (finding the wave function of the electron bound to the nucleus of a hydrogen atom) can be determined, while it is not clear how to determine the gravitational field of a particle.

To figure out the wave function Ψ for the electron of the central-force problem, the potential energy field V in the time-independent Schrödinger equation

E\Psi=-\frac{\hbar^2}{2m}\nabla^2\Psi + V\Psi

must be determined. But the potential energy field V is known once the force field \mathbf{F} exerted on the electron is determined.

In my quantum mechanics textbook, the force field \mathbf{F} is just the central force due to the charges of the nucleus and the electron in the hydrogen atom.

The nucleus, a proton, is itself a particle and should obey the Heisenberg uncertainty principle, so its location and velocity are uncertain. How, then, can we say that the electric force field between the proton and the electron has the form of a central force? Or, why can't we say that the gravitational field between them also has the form of a central force, just as in classical mechanics?

P.S. I'm new to quantum mechanics, so the questions here may be ridiculous or stupid. Please forgive me if so.

Justin545 (talk) 11:01, 13 January 2008 (UTC)

Hi Justin - I'd be more than glad to answer some of your questions. The quote above comes from an article on arXiv that attempts to circumvent what's called a singularity. In general, this is nothing out of the ordinary: in electromagnetism (namely Quantum electrodynamics), for example, such a singularity exists as well, and it would result in infinite polarization of the vacuum around an electric point charge - but (for some reason) a procedure called Renormalization happens to be able to resolve this successfully. For gravity, unfortunately, the singularity is more complex, because the gravitational field itself becomes a source of gravity. Gravitational charges (e.g. point masses) and the resulting gravitational fields are in a dynamic balance and can no longer simply be separated (contrary to electrodynamics, where one can separate the fields from test charges and the electromagnetic fields can simply be added through superposition; not so for gravity). Naturally, the singularity of a point mass becomes more complex: in addition to a singularity at the origin, there is an additional singularity (though of a different quality) at the Schwarzschild radius, where space and time get quite weird and counterintuitive. So, the Schrödinger equation that you wrote above still holds as a good approximation for gravity if the field self-interaction can be neglected. The problem is that the gravitational force would, in this case, be so terribly weak that it is futile to even consider: Richard Feynman once calculated (in Acta Physica Polonica, if I remember correctly) that the gravitational force of a proton in a hydrogen atom would have shifted the quantum mechanical phase of the electron in that same atom by just a few dozen arcseconds ... during 100 lifetimes of our universe!
In order to get meaningfully close to anything that could possibly ever be measured for quantum gravity, to the best of today's knowledge, one would have to go to energies and length scales at which charges/masses and their resulting fields are tightly coupled. So, on first look, the uncertainty principle is just one out of a spectrum of problems (but nevertheless surely is one of them). Hope this helps! Jens Koeplinger (talk) 00:00, 14 January 2008 (UTC)
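To give a rough sense of how weak gravity is inside a hydrogen atom, here is a minimal Python sketch comparing the Coulomb and Newtonian attractions between the proton and the electron. It is only a classical estimate, not Feynman's phase calculation, and the constants are CODATA 2018 values supplied as assumptions rather than taken from the discussion. The ratio comes out on the order of 10^39.

```python
# Rough classical comparison of the Coulomb and Newtonian attractions between
# the proton and the electron in a hydrogen atom. Both forces fall off as
# 1/r^2, so their ratio does not depend on the separation r.

# Assumed constants (CODATA 2018, SI units); not part of the original thread.
COULOMB_K = 8.9875517923e9            # N m^2 / C^2
ELEMENTARY_CHARGE = 1.602176634e-19   # C
GRAVITATIONAL_G = 6.67430e-11         # N m^2 / kg^2
ELECTRON_MASS = 9.1093837015e-31      # kg
PROTON_MASS = 1.67262192369e-27       # kg


def force_ratio() -> float:
    """Return (electric force) / (gravitational force) for proton + electron."""
    electric = COULOMB_K * ELEMENTARY_CHARGE ** 2
    gravitational = GRAVITATIONAL_G * ELECTRON_MASS * PROTON_MASS
    return electric / gravitational


ratio = force_ratio()
print(f"electric / gravitational = {ratio:.3e}")
```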
I know very little about special/general relativity, and nothing about quantum field theory. It seems they are required to truly understand your explanation. I started to study quantum mechanics because of my curiosity about how quantum computers work, especially entanglement. I found the more I learn the more questions bother me. Your answer is good guidance for me, and it helps. Thank you for your patience and time in answering my questions!
Justin545 (talk) 12:17, 14 January 2008 (UTC)
Ok ... I'm glad my 'sweep' that touches several points of interest seems helpful to you :) - Now, just re-reading what you wrote: "I found the more I learn the more questions bother me." Welcome to the club, you're in good company. You seem interested in Quantum information, Quantum computer, and also Quantum entanglement - from an engineering point of view maybe. If you're interested in the foundations of quantum mechanics, there's one thing I might want to recommend to you studying early on, which is Bell's theorem. Good luck! Jens Koeplinger (talk) 04:22, 17 January 2008 (UTC)

How Composite Quantum System Relates to Tensor Product?

Consider two noninteracting systems A and B, with respective Hilbert spaces H_A and H_B. The Hilbert space of the composite system is the tensor product

H_A \otimes H_B \qquad (8)

My question is why the composite Hilbert space of the two noninteracting systems is their tensor product as (8)?

The tensor product always underlies the concept of composite quantum systems, and quantum entanglement especially. It is also a big deal for quantum computation: the massive Hilbert space of the composite system dramatically boosts the power of quantum computers. According to a postulate of quantum mechanics, the Hilbert space of a composite system is the Hilbert space tensor product of the state spaces associated with the subsystems. But it's rare to see an article that points out where such a postulate comes from. Is the postulate due to overwhelming experimental evidence? Or is it a consequence derived from more fundamental quantum theory?

It's difficult to convince me of the ability and the power of quantum computation if no one can tell how the composite quantum system relates to the tensor product. Hopefully, the postulate is a derived consequence of quantum theory rather than something that rests on experimental evidence alone. After reviewing the original EPR paper, I came up with an idea, so I tried to explain it myself. Although the explanation is very likely to be wrong, and may even seem naive and optimistic, I would like to put it here to see if anyone can offer advice or correct my mistakes. For simplicity, the following assumes all relevant state spaces are finite dimensional.

For a composite system of two particles A and B, the wave function is

\Psi(x_1,x_2) \qquad (1)

where x_1 and x_2 are the respective positions of A and B. Similar to the idea of Separation of variables for solving PDEs discovered by Leibniz, suppose the wave function can be separated into a product of two functions such that

\Psi(x_1,x_2) = U(x_1)V(x_2) \qquad (2)

Then the functions U(x_1) and V(x_2) can be viewed as wave functions for A and B, respectively. Furthermore, U(x_1) and V(x_2) are in the Hilbert spaces H_A and H_B, respectively. Therefore, the two functions can be expanded in their respective bases such that

U(x_1)=\sum_i a_i |i\rangle_A \qquad (3)

V(x_2)=\sum_j b_j |j\rangle_B \qquad (4)

where \{|i\rangle_A\} and \{|j\rangle_B\} are the respective bases for H_A and H_B. Substituting (3) and (4) into (2), we have

\Psi(x_1,x_2)=\left(\sum_i a_i |i\rangle_A\right) \left(\sum_j b_j |j\rangle_B\right)

=\left(a_1 |1\rangle_A + a_2 |2\rangle_A + \cdots + a_m |m\rangle_A\right)\left(b_1 |1\rangle_B + b_2 |2\rangle_B + \cdots + b_n |n\rangle_B\right)

=\sum_{i,j} a_ib_j|i\rangle_A |j\rangle_B \qquad (5)

Since |i\rangle_A and |j\rangle_B are in different Hilbert spaces, their multiplication is equivalent to their tensor product. Thus

|i\rangle_A|j\rangle_B=|i\rangle_A \otimes |j\rangle_B \qquad (6)

Substituting (6) into (5), we have

\Psi(x_1,x_2)=\sum_{i,j} a_ib_j \left(|i\rangle_A \otimes |j\rangle_B\right)=\sum_{i,j} a_ib_j |ij\rangle_{AB} \qquad (7)

So (7) is a state, or vector, in the Hilbert space H_A \otimes H_B, and (7) can be generalized to systems that involve more than two particles or subsystems. However, the argument has problems, such as:

  1. Separation of variables is not guaranteed to yield a solution for every class of PDE. Likewise, not every wave function of the form (1) can be separated into a product of two functions of the form (2).
  2. Even if the wave function (1) can be separated into the form (2) "mathematically", does it make physical sense to say that the functions U(x_1) and V(x_2) are two "wave functions" describing the component systems of \Psi(x_1,x_2)?

Well, I am neither a mathematician nor a physicist. I don't mean to offend or mislead anyone with my words. I am just hoping this discussion gives me more clues for answering the question "How Composite Quantum System Relates to Tensor Product?". Thanks! - Justin545 (talk) 00:50, 23 February 2008 (UTC)

Truth is a very difficult concept (with apologies to Alan Clarke MP (deceased))


All of your math looks right. As you say, some states of the composite system can be written in this way and some can't; those that can are called separable. It's correct to refer to U and V as wave functions, and in fact all wave functions are like that. You can't describe the whole universe with a wave function, only separable parts of it.
There's nothing quantum mechanical about the idea of phase space or separability or combining systems by taking the tensor product. For a classical analogy, take a system of three classical bits. This system has 2^3 = 8 states, which can be written |000\rangle, |001\rangle, and so on. An example of a computational step on these bits might be "flip the third bit if at least one of the first two is set", which can be written with the transition matrix
\scriptstyle\left(\begin{array}{cccccccc}1&&&&&&&\\&1&&&&&&\\&&&1&&&&\\&&1&&&&&\\&&&&&1&&\\&&&&1&&&\\&&&&&&&1\\&&&&&&1&\end{array}\right)
(all other entries zero). You can think of this matrix as acting (by left-multiplication) on a state vector, which is an 8-component column vector that has a 1 at the index corresponding to the state of the system and zeroes everywhere else. (So |000\rangle is (1,0,0,0,0,0,0,0)^t, |001\rangle is (0,1,0,0,0,0,0,0)^t, and so on.) Or, more generally, you can think of it as acting on a probability distribution over possible states of the computer, for example \frac13|001\rangle + \frac23|011\rangle = (0,\frac13,0,\frac23,0,0,0,0)^t. The probabilities all have to be between 0 and 1 and they must sum to 1. If you only allow reversible computations, then the only matrices that preserve that property are the permutation matrices.
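The 3-bit step above can be sketched in a few lines of Python. This is only an illustration under my own conventions: the helper names `step` and `apply` are hypothetical, and states are indexed 0 to 7 in the order |000\rangle, |001\rangle, ..., |111\rangle with the third bit as the least significant one.

```python
from fractions import Fraction

N_STATES = 2 ** 3  # 8 basis states |000>, |001>, ..., |111>


def step(state: int) -> int:
    """Flip the third (last) bit if at least one of the first two bits is set."""
    first_two = state >> 1  # drop the third bit
    return state ^ 1 if first_two != 0 else state


# T[new][old] = 1 when the step sends `old` to `new`; all other entries are 0.
T = [[1 if step(old) == new else 0 for old in range(N_STATES)]
     for new in range(N_STATES)]


def apply(matrix, vector):
    """Left-multiply a column vector by a matrix."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# The probability distribution 1/3 |001> + 2/3 |011> from the text.
p = [Fraction(0)] * N_STATES
p[0b001] = Fraction(1, 3)
p[0b011] = Fraction(2, 3)

q = apply(T, p)  # |001> is left alone, |011> becomes |010>
print(q)
```

Because `T` is a permutation matrix, `q` is again a probability distribution: the entries are only moved around, never mixed.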
If you have two of these computers, you can describe them as a single system using the 64-dimensional tensor product of the individual 8-dimensional phase spaces, for which the natural basis is |000\rangle|000\rangle, |000\rangle|001\rangle, \ldots, |000\rangle|111\rangle, |001\rangle|000\rangle, \ldots. As long as the two subsystems don't interact, the composite state can be written as a tensor product of the states of the subsystems (as you did above), and the transitions can be written as the matrix tensor product of the transitions of the subsystems. If the subsystems do interact (e.g. a bit in one is flipped or not flipped depending on a bit in the other), then the subsystems may become correlated, in which case they can't be written this way any more.
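A small sketch of that statement, using two 1-bit subsystems instead of two 3-bit computers to keep the vectors short. The helper names (`kron_vec`, `kron_mat`, `apply`, `is_product`) and the sample probabilities are assumptions made for illustration.

```python
from fractions import Fraction


def kron_vec(u, v):
    """Tensor (Kronecker) product of two column vectors."""
    return [a * b for a in u for b in v]


def kron_mat(A, B):
    """Tensor (Kronecker) product of two matrices."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def apply(M, v):
    """Left-multiply a column vector by a matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def is_product(joint):
    """A 2x2 joint distribution factorizes exactly when p00*p11 == p01*p10."""
    p00, p01, p10, p11 = joint
    return p00 * p11 == p01 * p10


NOT = [[0, 1], [1, 0]]    # flip the bit
IDENT = [[1, 0], [0, 1]]  # leave the bit alone

p_a = [Fraction(1, 3), Fraction(2, 3)]  # distribution of subsystem A
p_b = [Fraction(1, 4), Fraction(3, 4)]  # distribution of subsystem B

# Noninteracting case: evolve the joint system, or evolve the parts and combine.
lhs = apply(kron_mat(NOT, IDENT), kron_vec(p_a, p_b))
rhs = kron_vec(apply(NOT, p_a), apply(IDENT, p_b))

# A perfectly correlated distribution ("both bits equal") is not a product.
correlated = [Fraction(1, 2), 0, 0, Fraction(1, 2)]
print(lhs == rhs, is_product(lhs), is_product(correlated))
```

The correlated vector still lives in the 4-dimensional product space; it just cannot be written as `kron_vec(u, v)` for any `u` and `v`.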
To get quantum computing from this, all you do is replace the classical probabilities which sum to 1 with complex numbers whose squared absolute values sum to 1. Because the square norm is much more symmetric (the space of valid vectors is a sphere instead of a simplex), there are a lot more reversible computations you can do; in fact, any unitary matrix is a valid computation. Permutation matrices are unitary matrices, so classical computations are a subset of quantum computations. The quantum states that would be called "correlated" classically are called "entangled" instead. I do think a new name is justified because there is something new in the quantum case, namely violation of Bell's inequality, but the mathematics is the same.
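The claim that permutation matrices are a special case of unitary matrices can be checked numerically. The sketch below is a minimal illustration with hypothetical helper names; the Hadamard matrix is a standard single-qubit gate chosen here as an example of a unitary that is not a permutation.

```python
import math


def dagger(M):
    """Conjugate transpose of a square matrix."""
    n = len(M)
    return [[M[j][i].conjugate() for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_unitary(M, tol=1e-12):
    """Check that M^dagger M is the identity, up to rounding error."""
    product = matmul(dagger(M), M)
    n = len(M)
    return all(abs(product[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))


s = 1 / math.sqrt(2)
NOT = [[0, 1], [1, 0]]          # a permutation matrix (classical reversible step)
HADAMARD = [[s, s], [s, -s]]    # unitary, but not a permutation
AVERAGE = [[0.5, 0.5], [0.5, 0.5]]  # a valid stochastic matrix, not unitary

# Apply the Hadamard matrix to the amplitude vector of |0>.
amplitudes = [sum(h * a for h, a in zip(row, [1, 0])) for row in HADAMARD]
norm_squared = sum(abs(a) ** 2 for a in amplitudes)
print(is_unitary(NOT), is_unitary(HADAMARD), is_unitary(AVERAGE), norm_squared)
```

The squared absolute values of the output amplitudes still sum to 1, which is the quantum counterpart of probabilities summing to 1.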
It's unfortunately true that a lot of introductions to quantum computing don't explain the connection to classical computing and often attribute the extra power of quantum computers to the exponential size of the phase space or to entanglement. The exponential-size explanation doesn't make much sense given that this property is inherited from the classical case. (Edit: I think it was a mistake to mention entanglement here since there are different notions of entanglement, and it's reasonable to relate quantum computing to entanglement in some senses.) The real nature of the extra power of quantum computers isn't well understood. There seems to be a class of problems in between P and NP which is efficiently solvable on quantum but not classical computers. It includes interesting number-theoretic problems like factoring and discrete logarithm, and it may be related to public-key cryptography somehow. To my knowledge the only interesting quantum algorithm outside that class is Grover's algorithm, which is often described as "database search" but is actually a SAT solver. It's faster (in the worst case) than the best known classical algorithm, but still very slow. No one has found an efficient quantum algorithm for any NP-complete problem, and it seems likely that there aren't any. In other words, a quantum computer's power seems to be very limited compared to the naive idea of a parallel-universe computer that does exponentially many calculations in parallel, since such a computer could solve NP-complete problems efficiently (basically by the definition of NP).
If you don't like the Hilbert space and the tensor products and the exponential size, you can look at the path integral formulation of quantum mechanics. It coexists with the Hilbert space approach because a lot of problems are much easier to solve in one than the other. You might also be interested in this paper. -- BenRG (talk) 19:34, 23 February 2008 (UTC)
Your answer is pretty clear and understandable, especially when you are explaining the transition matrix of the 3-bit computer. My understanding of your answer is that interacted=entangled=correlated and non-interacted=separable. But some new problems appear after reading:
1. Consider the two particles A and B in my question above. When they are entangled, or non-separable, the wave function \Psi(x_1,x_2) can NOT be written as a product of separate wave functions U(x_1)V(x_2). Therefore, we may NOT write
\Psi(x_1,x_2)=\sum_{i,j} a_ib_j \left(|i\rangle_A \otimes |j\rangle_B\right)
Does that mean the entangled state of the composite system is NOT in H_A \otimes H_B? However, we can still see several examples of entanglement in which the state of the composite system is written in terms of the basis of H_A \otimes H_B, such as the following entangled state:
|\Psi\rangle={1 \over \sqrt{2}} \bigg( |0\rangle_A \otimes  |1\rangle_B - |1\rangle_A \otimes |0\rangle_B \bigg)
It seems contradictory...
2. How do we determine whether two particles are composite or non-composite? Can we say the two particles are two non-composite systems when they are far apart, and one composite system when they are very close to each other, like the electron and the proton in a hydrogen atom?
Well, I don't quite understand quantum computing. I think a quantum computer can only solve decision problems such as SAT, but not problems that are more like programming and need many steps of calculation. There seem to be many useful problems that are NP-complete and are not likely to be solved by a quantum computer. It sounds somewhat disappointing. We don't know if the quantum computer is a useful and universal machine even if we can really build a 500-bit (or more than 500-bit) quantum computer. - Justin545 (talk) 03:51, 24 February 2008 (UTC)
Apologies for neglecting this thread. On your first point, the state space IS H_A \otimes H_B, but you can't write arbitrary elements of that space as a sum of products of elements of the subspaces weighted by a_ib_j. You can write arbitrary elements with arbitrary weights c_{ij}. There are no a_0, a_1, b_0, b_1 \in \mathbb{C} such that a_0b_1 = -a_1b_0 = 1/\sqrt{2} and a_0b_0 = a_1b_1 = 0\,\!, but there are c_{00}, c_{01}, c_{10}, c_{11} \in \mathbb{C} such that c_{01} = -c_{10} = 1/\sqrt{2} and c_{00} = c_{11} = 0\,\!. On the second point, particles that are causally interacting like the electron and proton need to be treated together, and particles that aren't causally interacting can usually be treated separately even if they're nonclassically entangled. The only case where the entanglement of noninteracting particles matters is if you do measurements on both particles and later compare the results; then you can get nonclassical correlations. If you're only doing measurements on one particle then you can always describe it without reference to the other. If the two particles are unentangled then your particle can be represented by a state vector; otherwise it has to be represented by a density matrix. Measuring a property of your particle destroys its entanglement with the other particle (in that property), so once you've measured all the properties that the density matrix says you're classically uncertain about, you can again represent your particle by a state vector. Incidentally, I shouldn't have said that entanglement is just the quantum name for correlation, since it's often used to mean just the nonclassical part of the correlation (the part that violates Bell's inequality).
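For two two-level systems, the first point can be checked mechanically: weights c_{ij} factor as a_i b_j exactly when the 2x2 matrix of weights has zero determinant, c_{00}c_{11} - c_{01}c_{10} = 0. The sketch below applies this test to the singlet state; the function name `is_product_state` and the sample product state are assumptions made for illustration.

```python
import math


def is_product_state(c, tol=1e-12):
    """Return True if c[i][j] can be written as a[i] * b[j] (2x2 case only).

    A 2x2 matrix has the form a_i * b_j exactly when its determinant vanishes.
    """
    (c00, c01), (c10, c11) = c
    return abs(c00 * c11 - c01 * c10) < tol


s = 1 / math.sqrt(2)

# Singlet state: c01 = -c10 = 1/sqrt(2), c00 = c11 = 0.
singlet = [[0, s], [-s, 0]]

# A product state built from a = (s, s) and b = (0.6, 0.8), for comparison.
a, b = (s, s), (0.6, 0.8)
product = [[a_i * b_j for b_j in b] for a_i in a]

print(is_product_state(singlet), is_product_state(product))
```

Both sets of weights describe vectors in H_A \otimes H_B; only the second one also happens to be of the separable form (7).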
Quantum computers are universal; they can solve the same problems as classical computers with the same efficiency as classical computers, in terms of big-O notation. But there's not much point using a quantum computer to run a classical algorithm, especially because the constant factor will probably be enormously higher. There are some specific problems for which specifically quantum algorithms are known, but, as you say, they mostly don't seem very useful. There's a big exception that I forgot to mention, which is simulation of quantum systems. I don't know anything about this, but I think that quantum computers could potentially revolutionize fields like lattice QCD. Also, a large quantum computer is a great test of the principles of quantum mechanics; successful factorization of the RSA challenge numbers would be a dramatic confirmation of quantum mechanics and would definitively falsify a large class of hidden variable theories, and for that reason alone I think it's an experiment worth doing. -- BenRG (talk) 16:20, 27 February 2008 (UTC)
Indeed, you have done a pretty good job of proving why the entangled state below

|\Psi\rangle={1 \over \sqrt{2}} \bigg( |0\rangle_A \otimes  |1\rangle_B - |1\rangle_A \otimes |0\rangle_B \bigg) \qquad (9)

is not separable, i.e. the state cannot be written in the form of (7), since you have proven there are no numbers a_0,a_1,b_0,b_1 \in\mathbb{C} which can satisfy the conditions you listed above. It is understandable and clear. I apologize for stating my question obscurely last time. Let me try to clarify it.
We have proven that the state of a composite system is in H_A \otimes H_B if the state is separable. That's because when the state of a composite system is separable, the corresponding wave function can be written in the form of (2), which implies (7) is true, and therefore we can say the state of the composite system is in H_A \otimes H_B. Moreover, we have also proven that a separable state is expanded over m * n basis vectors, which is just a basic property of the tensor product, when H_A has m basis vectors and H_B has n basis vectors.
However, that is not enough to say "all" states of composite systems are in H_A \otimes H_B. We have only proven that all "separable" states are in H_A \otimes H_B (as I did in steps (1) to (7)), but we "have not" proven that all "non-separable" (entangled) states are in H_A \otimes H_B. Since any state of a composite system is either separable or non-separable (entangled), we cannot say "all" states are in H_A \otimes H_B until we can prove that "both" separable and non-separable (entangled) states are in H_A \otimes H_B.
So my question last time was "How to prove all non-separable (entangled) states of any composite system are also in H_A \otimes H_B?". For example, if we take a look at the non-separable (entangled) state (9), we find the state is in H_A \otimes H_B since its basis vectors |0\rangle_A\otimes|1\rangle_B and |1\rangle_A\otimes|0\rangle_B are in H_A \otimes H_B. But I have no idea where the two basis vectors |0\rangle_A\otimes|1\rangle_B and |1\rangle_A\otimes|0\rangle_B come from. The state (9) is written in bra-ket notation, but I have no idea what its corresponding wave function looks like. Can we use a similar approach (like steps (1) to (7)) to prove that all non-separable (entangled) states of any composite system are also in H_A \otimes H_B? If we can prove it, we will be able to say "all" states of composite systems are in H_A \otimes H_B, and we can also explain how the wave function of a separable or non-separable state relates to its bra-ket notation. - Justin545 (talk) 01:42, 3 March 2008 (UTC)
Turing proved that the Turing machine is universal. If we can simulate a Turing machine on a quantum computer, we can show that a quantum computer is at least as capable as a Turing machine and is therefore universal. However, since, as you said, there's not much point in using a quantum computer to run a classical algorithm, it would be necessary to combine a quantum computer with a classical computer to achieve a function similar to that of a Turing machine.
I agree with you that it's an experiment worth doing. But I am afraid that quantum computers are not scalable (well, I'm not sure). Although D-Wave has announced a working prototype of a 16-qubit quantum computer (with more qubits to come later), there seems to be some coupling problem with the prototype. It sounds like D-Wave simply put four quantum computers, each of them 4-qubit, together. I cannot see any significant advance on the question of whether quantum computers are scalable. On the other hand, keeping the system entangled is also difficult, especially when more quanta are involved. That would limit the time available for quantum operations and therefore limit the complexity of the problems it can solve. Some experts have predicted that useful quantum computers will appear in one or two decades. Is it just a matter of time? Well, I am not so sure. By contrast, DNA computers are more stable than quantum computers. But DNA computing does not provide any new capabilities from the standpoint of computational complexity theory. It seems only quantum computers have such potential. Other than quantum computers and DNA computers, aren't there any other natural analogues of quantum computers that are also stable enough? That question drives me to study why quantum computers are so powerful from a mathematical point of view. - Justin545 (talk) 08:56, 3 March 2008 (UTC)

Chain Rule and Higher Derivative

Let

z = f(x,y)
x = g(s,t)
y = h(s,t)

Then

\frac{\partial z}{\partial t}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial t}=\left(\frac{\partial}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial}{\partial y}\frac{\partial y}{\partial t}\right)z \qquad (1)

\frac{\partial^2 z}{\partial t^2}=\frac{\partial}{\partial t}\frac{\partial z}{\partial t}=\frac{\partial}{\partial t}\left[\left(\frac{\partial}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial}{\partial y}\frac{\partial y}{\partial t}\right)z\right]=\left(\frac{\partial}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial}{\partial y}\frac{\partial y}{\partial t}\right)\frac{\partial z}{\partial t} \qquad (2)

Substitute (1) into (2)

\frac{\partial^2 z}{\partial t^2}=\left(\frac{\partial}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial}{\partial y}\frac{\partial y}{\partial t}\right)\left(\frac{\partial}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial}{\partial y}\frac{\partial y}{\partial t}\right)z \qquad (3)

\frac{\partial^2 z}{\partial t^2}=\left[\frac{\partial^2}{\partial x^2}\left(\frac{\partial x}{\partial t}\right)^2+\frac{\partial^2}{\partial x\partial y}\frac{\partial x\partial y}{\partial t^2}+\frac{\partial^2}{\partial y\partial x}\frac{\partial y\partial x}{\partial t^2}+\frac{\partial^2}{\partial y^2}\left(\frac{\partial y}{\partial t}\right)^2\right]z \qquad (4)

\frac{\partial^2 z}{\partial t^2}=\frac{\partial^2 z}{\partial x^2}\left(\frac{\partial x}{\partial t}\right)^2+\frac{\partial^2 z}{\partial x\partial y}\frac{\partial x\partial y}{\partial t^2}+\frac{\partial^2 z}{\partial y\partial x}\frac{\partial y\partial x}{\partial t^2}+\frac{\partial^2 z}{\partial y^2}\left(\frac{\partial y}{\partial t}\right)^2 \qquad (5)

Is every step above correct? - Justin545 (talk) 07:07, 27 February 2008 (UTC)

No.
\frac{\partial z}{\partial x}\frac{\partial x}{\partial t}
is not the same as
\left(\frac{\partial}{\partial x}\frac{\partial x}{\partial t}\right)z.
For example, if z = x = t, the former evaluates to 1·1 = 1, and the latter to 0·z = 0. Think of \tfrac{\partial}{\partial x} as an operator, and abbreviate it as D. Also abbreviate \tfrac{\partial x}{\partial t} as U. Then in your first line of equations you replaced Dz × U by DU × z.  --Lambiam 09:46, 27 February 2008 (UTC)
Reply to Lambiam: You are right indeed. According to your answer, does that mean (5) is also an incorrect result? What is the correct answer if (5) is incorrect? (i.e. what is \frac{\partial^2 z}{\partial t^2} equal to?) Thanks! - Justin545 (talk) 12:15, 28 February 2008 (UTC)
If you take f(x,y) = x and g(s,t) = t^2, then z = t^2, and the result should be 2. However, the right-hand side of (5) evaluates to 0. I think two terms have gone AWOL. Applying the product rule,
\frac{\partial}{\partial t}\left(\frac{\partial z}{\partial x}\cdot\frac{\partial x}{\partial t}\right) = 
\frac{\partial}{\partial t}\frac{\partial z}{\partial x}\cdot\frac{\partial x}{\partial t} + \frac{\partial z}{\partial x}\cdot\frac{\partial}{\partial t}\frac{\partial x}{\partial t}.
The second term seems to be missing, and likewise with x replaced by y. Taking them together, the following should be added to the right-hand side of (5):
+ \frac{\partial z}{\partial x}\cdot\frac{\partial^2 x}{{\partial t}^2} + \frac{\partial z}{\partial y}\cdot\frac{\partial^2 y}{{\partial t}^2}.
 --Lambiam 21:59, 28 February 2008 (UTC)
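A numerical spot check of this correction, as a minimal sketch: the sample functions f, g, h and the evaluation point are arbitrary choices made for illustration, their partial derivatives are entered by hand, and the second derivative on the left is approximated by a central difference. In the thread's notation, \frac{\partial x\partial y}{\partial t^2} stands for the product \frac{\partial x}{\partial t}\frac{\partial y}{\partial t}.

```python
# Assumed sample functions (not from the thread):
#   z = f(x, y) = x^2 * y,   x = g(s, t) = s * t^2,   y = h(s, t) = s + t^3


def g(s, t):
    return s * t ** 2


def h(s, t):
    return s + t ** 3


def z_of(s, t):
    x, y = g(s, t), h(s, t)
    return x ** 2 * y


def second_t_derivative(s, t, step=1e-4):
    """Central-difference approximation of d^2 z / dt^2 at fixed s."""
    return (z_of(s, t + step) - 2 * z_of(s, t) + z_of(s, t - step)) / step ** 2


s, t = 1.5, 0.8
x, y = g(s, t), h(s, t)

# Hand-computed partial derivatives of the sample functions.
f_x, f_y = 2 * x * y, x ** 2
f_xx, f_xy, f_yy = 2 * y, 2 * x, 0.0
x_t, x_tt = 2 * s * t, 2 * s
y_t, y_tt = 3 * t ** 2, 6 * t

# Right-hand side of (5): second-order terms only.
rhs_without_extra = f_xx * x_t ** 2 + 2 * f_xy * x_t * y_t + f_yy * y_t ** 2
# Adding the two missing product-rule terms.
rhs_full = rhs_without_extra + f_x * x_tt + f_y * y_tt

lhs = second_t_derivative(s, t)
print(lhs, rhs_full, rhs_without_extra)
```

At this point the finite-difference value agrees with the full expression and differs clearly from the right-hand side of (5) alone.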
According to your reply

\frac{\partial}{\partial t}\left(\frac{\partial z}{\partial x}\cdot\frac{\partial x}{\partial t}\right) = \frac{\partial}{\partial t}\frac{\partial z}{\partial x}\cdot\frac{\partial x}{\partial t} + \frac{\partial z}{\partial x}\cdot\frac{\partial}{\partial t}\frac{\partial x}{\partial t} \qquad (6)

Then

\frac{\partial}{\partial t}\left(\frac{\partial z}{\partial x}\cdot\frac{\partial x}{\partial t}\right)=\frac{\partial}{\partial t}\frac{\partial z}{\partial x}\cdot\frac{\partial x}{\partial t}+\frac{\partial z}{\partial x}\cdot\frac{\partial^2 x}{\partial t^2} \qquad (7)

But what does \tfrac{\partial}{\partial t}\tfrac{\partial z}{\partial x}\cdot\tfrac{\partial x}{\partial t} evaluate to? Does it evaluate to 0 or does it evaluate to the rest of terms on the right-hand side of (5)? Thanks! - Justin545 (talk) 03:56, 29 February 2008 (UTC)
The latter (or, more precisely, half of these terms – there is also the same with x replaced by y).  --Lambiam 00:50, 1 March 2008 (UTC)
According to our discussion above, we may conclude that

\frac{\partial^2 z}{\partial t^2}=\frac{\partial^2 z}{\partial x^2}\left(\frac{\partial x}{\partial t}\right)^2+\frac{\partial^2 z}{\partial x\partial y}\frac{\partial x\partial y}{\partial t^2}+\frac{\partial^2 z}{\partial y\partial x}\frac{\partial y\partial x}{\partial t^2}+\frac{\partial^2 z}{\partial y^2}\left(\frac{\partial y}{\partial t}\right)^2+\frac{\partial z}{\partial x}\frac{\partial^2 x}{{\partial t}^2}+\frac{\partial z}{\partial y}\frac{\partial^2 y}{{\partial t}^2} \qquad (8)

where

z = f(x,y) \qquad (9)

x = g(s,t) \qquad (10)

y = h(s,t) \qquad (11)

The conclusion seems to be correct since I can now verify our conclusion by a related problem in my text book of quantum mechanics:
Show that (12) is true

\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}=\frac{\partial^2\psi}{\partial r^2}+\frac{1}{r}\frac{\partial\psi}{\partial r}+\frac{1}{r^2}\frac{\partial^2\psi}{\partial \phi^2} \qquad (12)

where

\psi = \Psi(x,y) \qquad (13)

x = r\cos\phi \qquad (14)

y = r\sin\phi \qquad (15)

Eqs. (14) and (15) also imply

x^2 + y^2 = r^2 \qquad (16)

Fortunately and finally, I can solve the problem above with your great help! Although at the moment I don't understand how \tfrac{\partial}{\partial t}\tfrac{\partial z}{\partial x}\cdot\tfrac{\partial x}{\partial t} becomes half of the remaining terms on the right-hand side of (5), I think I will figure it out later or open a new question here. Our discussion may end here. The following is just how I verified our conclusion with the problem, to make the thread more complete, so you can simply skip the material below. Thanks for your help :-)
My proof of (12) is as below:
Replace z by ψ, t by r, s by φ, f by Ψ in (8), (9), (10) and (11). We have

\frac{\partial^2\psi}{\partial r^2}=\frac{\partial^2\psi}{\partial x^2}\left(\frac{\partial x}{\partial r}\right)^2+\frac{\partial^2\psi}{\partial x\partial y}\frac{\partial x\partial y}{\partial r^2}+\frac{\partial^2\psi}{\partial y\partial x}\frac{\partial y\partial x}{\partial r^2}+\frac{\partial^2\psi}{\partial y^2}\left(\frac{\partial y}{\partial r}\right)^2+\frac{\partial\psi}{\partial x}\frac{\partial^2 x}{{\partial r}^2}+\frac{\partial\psi}{\partial y}\frac{\partial^2 y}{{\partial r}^2} \qquad (17)

\psi = \Psi(x,y) \qquad (18)

x = g(\phi,r) \qquad (19)

y = h(\phi,r) \qquad (20)

Evaluate derivatives

\frac{\partial x}{\partial r}=\cos\phi, \quad \left(\frac{\partial x}{\partial r}\right)^2=\cos^2\phi, \quad \frac{\partial^2 x}{\partial r^2}=0,

\frac{\partial y}{\partial r}=\sin\phi, \quad \left(\frac{\partial y}{\partial r}\right)^2=\sin^2\phi, \quad \frac{\partial^2 y}{\partial r^2}=0,

\frac{\partial x\partial y}{\partial r^2}=\frac{\partial y\partial x}{\partial r^2}=\sin\phi\cos\phi \qquad (21)

Substitute (21) into (17)

\frac{\partial^2\psi}{\partial r^2}=\frac{\partial^2\psi}{\partial x^2}\cos^2\phi+\frac{\partial^2\psi}{\partial x\partial y}\sin\phi\cos\phi+\frac{\partial^2\psi}{\partial y\partial x}\sin\phi\cos\phi+\frac{\partial^2\psi}{\partial y^2}\sin^2\phi+\frac{\partial\psi}{\partial x}\cdot 0+\frac{\partial\psi}{\partial y}\cdot 0 \qquad (22)

\frac{\partial^2\psi}{\partial r^2}=\cos^2\phi\frac{\partial^2\psi}{\partial x^2}+2\sin\phi\cos\phi\frac{\partial^2\psi}{\partial x\partial y}+\sin^2\phi\frac{\partial^2\psi}{\partial y^2} \qquad (23)

In a similar way, following the steps from (17) to (23) with r replaced by \phi, we have

\frac{\partial^2\psi}{\partial\phi^2}=r^2\sin^2\phi\frac{\partial^2\psi}{\partial x^2}-2r^2\sin\phi\cos\phi\frac{\partial^2\psi}{\partial x\partial y}+r^2\cos^2\phi\frac{\partial^2\psi}{\partial y^2}-r\cos\phi\frac{\partial\psi}{\partial x}-r\sin\phi\frac{\partial\psi}{\partial y} \qquad (24)

Multiply (23) by r^2

r^2\frac{\partial^2\psi}{\partial r^2}=r^2\cos^2\phi\frac{\partial^2\psi}{\partial x^2}+2r^2\sin\phi\cos\phi\frac{\partial^2\psi}{\partial x\partial y}+r^2\sin^2\phi\frac{\partial^2\psi}{\partial y^2} \qquad (25)

Add (24) and (25) together

r^2\frac{\partial^2\psi}{\partial r^2}+\frac{\partial^2\psi}{\partial\phi^2}=r^2\left(\sin^2\phi+\cos^2\phi\right)\frac{\partial^2\psi}{\partial x^2}+r^2\left(\sin^2\phi+\cos^2\phi\right)\frac{\partial^2\psi}{\partial y^2}-r\cos\phi\frac{\partial\psi}{\partial x}-r\sin\phi\frac{\partial\psi}{\partial y} \qquad (26)

r^2\frac{\partial^2\psi}{\partial r^2}+\frac{\partial^2\psi}{\partial\phi^2}=r^2\frac{\partial^2\psi}{\partial x^2}+r^2\frac{\partial^2\psi}{\partial y^2}-r\left(\cos\phi\frac{\partial\psi}{\partial x}+\sin\phi\frac{\partial\psi}{\partial y}\right) \qquad (27)

Divide (27) by r^2

\frac{\partial^2\psi}{\partial r^2}+\frac{1}{r^2}\frac{\partial^2\psi}{\partial\phi^2}=\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}-\frac{1}{r}\left(\cos\phi\frac{\partial\psi}{\partial x}+\sin\phi\frac{\partial\psi}{\partial y}\right) 

 

 (28)

 

By chain rule, we know

\frac{\partial\psi}{\partial r}=\frac{\partial\psi}{\partial x}\frac{\partial x}{\partial r}+\frac{\partial\psi}{\partial y}\frac{\partial y}{\partial r}=\cos\phi\frac{\partial\psi}{\partial x}+\sin\phi\frac{\partial\psi}{\partial y} 

 

 (29)

 

Replace (29) into (28)

\frac{\partial^2\psi}{\partial r^2}+\frac{1}{r^2}\frac{\partial^2\psi}{\partial\phi^2}=\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}-\frac{1}{r}\frac{\partial\psi}{\partial r} 

 

 (30)

 

\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}=\frac{\partial^2\psi}{\partial r^2}+\frac{1}{r}\frac{\partial\psi}{\partial r}+\frac{1}{r^2}\frac{\partial^2\psi}{\partial \phi^2} 

 

 (31)

 

Q.E.D. - Justin545 (talk) 07:19, 4 March 2008 (UTC)
The dy/dx notation is not exactly a fraction, although often you can treat it as a fraction (chain rule, integration by substitution) --wj32 t/c 05:47, 28 February 2008 (UTC)
Reply to wj32 t/c: That is true. I think (well, I'm just a layman) Leibniz notation is somewhat confusing to me. In all the years since I learned derivatives, I have not seen (yes, I am not experienced enough) any formal introduction that states when I can treat it as a fraction and when I can't. It seems to be a mystery to me! :) - Justin545 (talk) 12:15, 28 February 2008 (UTC)
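The identity (31) can also be spot-checked numerically. The following is a minimal sketch, not part of the original derivation: the test function `psi`, the sample point and the step size `h` are arbitrary choices made for illustration, and the derivatives are only central-difference estimates.

```python
import math

def psi(x, y):
    # Arbitrary smooth test function (an assumption for illustration only).
    return math.exp(x) * math.sin(2.0 * y) + x**3 * y

def first_diff(f, t, h):
    # Central-difference estimate of the first derivative of f at t.
    return (f(t + h) - f(t - h)) / (2.0 * h)

def second_diff(f, t, h):
    # Central-difference estimate of the second derivative of f at t.
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

def laplacian_cartesian(x, y, h=1e-4):
    # Left-hand side of (31): d2psi/dx2 + d2psi/dy2.
    return (second_diff(lambda s: psi(s, y), x, h)
            + second_diff(lambda s: psi(x, s), y, h))

def laplacian_polar(r, phi, h=1e-4):
    # Right-hand side of (31), using x = r cos(phi), y = r sin(phi).
    polar = lambda rr, pp: psi(rr * math.cos(pp), rr * math.sin(pp))
    d2r = second_diff(lambda s: polar(s, phi), r, h)
    d1r = first_diff(lambda s: polar(s, phi), r, h)
    d2phi = second_diff(lambda s: polar(r, s), phi, h)
    return d2r + d1r / r + d2phi / (r * r)

r, phi = 1.3, 0.7
lhs = laplacian_cartesian(r * math.cos(phi), r * math.sin(phi))
rhs = laplacian_polar(r, phi)
print(lhs, rhs)  # the two estimates should agree to several decimal places
```

Agreement at one point is of course not a proof; it only guards against sign or factor slips in steps (17) to (31).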

Quantum Mechanics: Operator and Eigenvalue

For a given wave function ψ(x) of a particle at position x, the momentum p of the particle is the eigenvalue of (1)

\hat{P}\psi(x)=p\psi(x) 

 

 (1)

 

where

\hat{P}=-i\hbar\frac{\partial}{\partial x} 

 

 (2)

 

For example, if the wave function of a particle is

\psi(x)=\exp\left(i\frac{p_0}{\hbar}x\right)=e^{i(p_0/\hbar)x} 

 

 (3)

 

, the corresponding momentum p will be

-i\hbar\frac{\partial}{\partial x}e^{i(p_0/\hbar)x}=pe^{i(p_0/\hbar)x} 

 

 (4)

 

-i\hbar\cdot e^{i(p_0/\hbar)x}\cdot\left(i\frac{p_0}{\hbar}\right)=pe^{i(p_0/\hbar)x} 

 

 (5)

 

p_0e^{i(p_0/\hbar)x}=pe^{i(p_0/\hbar)x} 

 

 (6)

 

p = p_0

 

 (7)

 

Therefore, the momentum of the particle (3) is p_0. But does it make sense to say the coordinate x of the particle (3) is the eigenvalue of (8)? It seems that we will always get x=\hat{X} if we substitute (3), or any other wave function, into (8)!

\hat{X}\psi(x)=x\psi(x) 

 

 (8)

 

Justin545 (talk) 11:05, 9 March 2008 (UTC)

I think the problem you are running into here is Heisenberg's uncertainty principle, which states that there is an unavoidable minimum uncertainty in the product of the momentum and position observables:

\Delta x \Delta p \ge \frac{\hbar}{2}

If you claim to know with certainty that the momentum of the particle is p_0, then the position must, of necessity, be completely indeterminate. There is a similar relationship between other pairs of observables, such as Energy and Time. SpinningSpark 13:17, 9 March 2008 (UTC)

Uncertainty principle aside, most wavefunctions are not eigenvectors of most Hermitian operators. In fact no proper (normalizable) wavefunction satisfies \hat{P}\psi(x)=p\psi(x) for any p. An equation like \hat{A}\psi=a\psi is not meant to be solved for a as a function of ψ, it's meant to be solved for ψ as a function of a. Most wave functions won't be in the solution set, but they'll be expressible as a sum of elements of the solution set. If you like you can think of Hermitian operators like \hat{P} as an odd way of specifying an orthogonal basis with a real number attached to each basis vector. -- BenRG (talk) 15:50, 9 March 2008 (UTC)
According to the reply, it turns out that what I did was just substitute the solution, or basis function, (3) into (1). I also forgot that the position of a particle is uncertain; it is good to recall Heisenberg's uncertainty principle. I made a ridiculous generalization and thought that the operator \hat{X} could be used in (8) in the same way as \hat{P} is used in (1) :p But it seems the only use of the operator \hat{X} is to calculate the mean value \langle\hat{X}\rangle=\int_{-\infty}^\infty\psi^*(x)\left[\hat{X}\psi(x)\right]dx - Justin545 (talk) 03:37, 10 March 2008 (UTC)
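The point of the thread can be illustrated numerically. This is a minimal sketch under stated assumptions: natural units with `HBAR = 1`, an arbitrary value for `P0`, and a central-difference approximation of the derivative in (2). For the plane wave (3), the ratio (P ψ)/ψ is the same constant everywhere, while (X ψ)/ψ is just x and changes from point to point, so (3) is an eigenfunction of the momentum operator but not of the position operator.

```python
import cmath

HBAR = 1.0   # assumption: natural units
P0 = 2.5     # assumption: arbitrary momentum chosen for the example

def psi(x):
    # Plane wave from equation (3).
    return cmath.exp(1j * P0 * x / HBAR)

def momentum_op(f, x, h=1e-5):
    # -i*hbar*d/dx from equation (2), via a central difference.
    return -1j * HBAR * (f(x + h) - f(x - h)) / (2.0 * h)

def position_op(f, x):
    # The position operator multiplies the wave function by x.
    return x * f(x)

sample_points = (-1.0, 0.2, 0.4, 3.0)
p_ratios = [momentum_op(psi, x) / psi(x) for x in sample_points]
x_ratios = [position_op(psi, x) / psi(x) for x in sample_points]

print(p_ratios)  # every entry is close to P0
print(x_ratios)  # entries follow the sample points, so there is no single eigenvalue
```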

Quantum Mechanics: Entangled Wave Function

Equation 9, or (EPR9) for short here, in the original paper on the EPR paradox gives a wave function of two entangled particles

\Psi(x_1,x_2)=\int_{-\infty}^\infty e^{(2\pi i/h)(x_1-x_2+x_0)p}dp 

 

 (EPR9)

 

where h is Planck's constant, x_1 and x_2 are the variables describing the two particles and x_0 is just some constant. According to reduction of the wave packet, when an observable B of the first particle is measured, (EPR9) can be expanded by the eigenfunctions v_1(x_1), v_2(x_1), v_3(x_1), ... of B in the form

\Psi(x_1,x_2)=\sum_{s=1}^\infty\varphi_s(x_2)v_s(x_1) 

 

 (EPR8)

 

where \varphi_1(x_2),\varphi_2(x_2),\varphi_3(x_2),... are the coefficients corresponding to the eigenfunctions. If B is a continuous observable, namely the coordinate of the first particle, (EPR8) can be written as

\Psi(x_1,x_2)=\int_{-\infty}^\infty \varphi_x(x_2)v_x(x_1)dx 

 

 (EPR15)

 

According to the paper, the eigenfunctions of B are

v_x(x_1) = \delta(x_1 - x) 

 

 (EPR14)

 

which has corresponding eigenvalue x. The first question is: why are the eigenfunction and the eigenvalue of B (EPR14) and x, respectively? It seems that

B = x_1 

 

 (1)

 

and if we let

f(x_1) = x_1 - x 

 

 (2)

 

then

v_x(x_1) = \delta(x_1 - x) = \delta(f(x_1)) 

 

 (3)

 

Find the solution of

f(x_1) = 0 

 

 (4)

 

we have

x_1 - x = 0 

 

 (5)

 

x_1 = x 

 

 (6)

 

the right-hand side of (6) is the eigenvalue of B. Similarly, the eigenvalue of the observable

Q = x_2 

 

 (EPR17)

 

can be found by knowing

\varphi_x(x_2)=\int_{-\infty}^\infty e^{(2\pi i/h)(x-x_2+x_0)p}dp=h\delta(x-x_2+x_0) 

 

 (EPR16)

 

and let

g(x_2) = x - x_2 + x_0 

 

 (7)

 

The solution of

g(x_2) = 0 

 

 (8)

 

is

x_2 = x + x_0 

 

 (9)

 

Again, the right-hand side of (9) is the eigenvalue of Q, which agrees with the paper. But it still doesn't explain how to figure out the eigenfunction (EPR14).

To continue the unresolved discussion from last time, the second question is how to write the entangled wave function (EPR9) in bra-ket notation. If it can be done, it should help with respect to the last discussion. The bra-ket notation of (EPR9) is supposed to be in the Hilbert space which is the tensor product of the state spaces associated with the two particles. - Justin545 (talk) 06:36, 12 March 2008 (UTC)

Hi, I'm sorry I haven't followed up to the old thread yet, but maybe a response here will serve the same purpose.
There are many ways to write (EPR9) in bra-ket notation; for example I could just write |\Psi\rangle where Ψ is defined by (EPR9). In terms of tensor products of kets inhabiting the state spaces of the individual particles, I could write for example \Psi = \int_{-\infty}^\infty |x\rangle_1 \, |x+x_0\rangle_2 \,dx = \int_{-\infty}^\infty e^{(2\pi i/h) x_0 p} \, |p\rangle_1 \, |{-}p\rangle_2 \,dp. I'm not sure those are properly normalized, to the extent that these mathematical monstrosities can be considered to be normalized to begin with. The product |a\rangle|b\rangle might also equivalently be written |a\rangle \otimes |b\rangle or |a,b\rangle or |ab\rangle. The subscripts 1 and 2 just indicate which subspaces the kets inhabit; they could be left off since the two subspaces are isomorphic in this case.
I'm not sure I understand your first question. Finding eigenfunctions of the position operator in a single-particle space involves solving equations of the form BΨ = xΨ where B(z) = z and BΨ is a pointwise function product. It should be clear enough that the only possibilities for Ψ here are functions that are zero everywhere except at a point, and the "normalized" versions of these functions are the delta functions, which form an orthonormal eigenbasis. In the two-particle space things are a bit more interesting. You're now solving BΨ = xΨ where B(z_1,z_2) = z_1. The normalized solutions here are Ψ_x(x_1,x_2) = δ(x − x_1)g(x_2) where g is any normalized function of x_2. These do not form a basis; there are far too many of them for that. You have to choose arbitrarily some orthonormal basis for the functions g. This happens because there are degenerate eigenvalues; the discrete analogy is that there's only one orthonormal eigenbasis for diag(1,2,3) but many for diag(1,1,2). -- BenRG (talk) 12:55, 12 March 2008 (UTC)
It's reasonable to make the ket a function of the corresponding eigenvalue, since each eigenvalue identifies a unique basis vector or eigenfunction. But I am a bit confused by the bra-ket notation \Psi=\int_{-\infty}^\infty|x\rangle_1|x+x_0\rangle_2\,dx since I expected the bra-ket notation to be in the form

\Psi=c_1|a_1\rangle_1|b_1\rangle_2+c_2|a_2\rangle_1|b_2\rangle_2+c_3|a_3\rangle_1|b_3\rangle_2+... 

 

 (10)

 

rather than in the form

\Psi=\int_{-\infty}^\infty\left(c_1|a_1\rangle_1|b_1\rangle_2+c_2|a_2\rangle_1|b_2\rangle_2+c_3|a_3\rangle_1|b_3\rangle_2+...\right)dx 

 

 (11)

 

It seems the integral \int_{-\infty}^\infty\cdot\,dx surrounding the ket can not be removed. But, will the integral of the ket yield another "ket" in the same space? Another confusion is about the momentum part of the bra-ket example \Psi=\int_{-\infty}^\infty e^{(2\pi i/h)x_0p}|p\rangle_1|{-}p\rangle_2\,dp. I am not able to figure out e^{(2\pi i/h)x_0p} in it.
Apologies for obscuring my first question. My first question is just to understand why the eigenfunction of B is a "delta function". I just wonder how the delta function (EPR14) is mathematically derived. As you said, "Finding eigenfunctions of the position operator in a single-particle space involves solving equations of the form BΨ = xΨ where B(z) = z and BΨ is a pointwise function product." But I cannot understand why it's pointwise. Excuse my poor quantum mechanics, I left so many question marks here :-) Justin545 (talk) 08:43, 13 March 2008 (UTC)
The integral is the sum, it just happens to be a sum with uncountably many terms. You need an uncountable sum here because the wave function is a superposition of uncountably many tensor-product states: the particles could be at x and x+x_0 for any real x. I picked somewhat arbitrarily the position basis vectors |x\rangle(x') = \delta(x-x') and the momentum basis vectors |p\rangle(x) = e^{(2\pi i/h) x p}. They're somewhat arbitrary because they're only unique up to scalar multiplication, but they're eigenvectors of the appropriate operators with the appropriate eigenvalues (unless I got the sign convention backwards). Then |x_1\rangle|x_2\rangle(x_1',x_2') = \delta(x_1-x_1')\delta(x_2-x_2') = \delta((x_1,x_2) - (x_1',x_2')) and |p_1\rangle|p_2\rangle(x_1,x_2) = e^{(2\pi i/h) (x_1 p_1 + x_2 p_2)}. So in particular e^{(2\pi i/h) x_0 p} |p\rangle\,|{-}p\rangle(x_1,x_2) = e^{(2\pi i/h) (x_1 - x_2 + x_0) p}, which is where my momentum integral form came from. The position integral one is odder. When x_1 - x_2 + x_0 \ne 0, EPR9 gives \int_{-\infty}^\infty e^{i(\text{some nonzero real})p}\,dp, which to a mathematician is undefined but to a physicist is zero. When x_1 - x_2 + x_0 = 0, EPR9 gives \int_{-\infty}^\infty dp, which to a physicist is the peak of a delta function. So EPR9 describes a "function" that's zero everywhere except on the line x_1 - x_2 + x_0 = 0 where it's infinity, and my position integral expressed that more directly.
Let me explain the operators in a finite-dimensional case. Let's say we have a four-state system with position states \left\{\left(\begin{array}{r}1\\0\\0\\0\end{array}\right), \left(\begin{array}{r}0\\1\\0\\0\end{array}\right), \left(\begin{array}{r}0\\0\\1\\0\end{array}\right), \left(\begin{array}{r}0\\0\\0\\1\end{array}\right)\right\} and momentum states \frac12 \left\{\left(\begin{array}{r}1\\1\\1\\1\end{array}\right), \left(\begin{array}{r}1\\i\\-1\\-i\end{array}\right), \left(\begin{array}{r}1\\-1\\1\\-1\end{array}\right), \left(\begin{array}{r}1\\-i\\-1\\i\end{array}\right)\right\} (the Fourier basis). We can arbitrarily assign a distinct real number to each position and to each momentum. Say the positions are 1,2,3,4 and the momenta are 0,1,2,−1. Then there exists a matrix which scales each position/momentum axis by the corresponding real number. For the position basis it's just diag(1,2,3,4), while for the momentum basis it's U diag(0,1,2,−1) U^{-1}, where U is the unitary matrix whose columns are the aforementioned Fourier basis vectors. This matrix will always be Hermitian (it's a theorem that a matrix is Hermitian if and only if it can be written in the form UAU^{-1} where U is unitary and A is real diagonal). In this case the scaling factors were all distinct, so by solving the eigenvalue equation we can recover the original basis from the Hermitian matrix. If some scaling factors are equal then all you can tell is that a particular (hyper)plane was scaled by that factor; you can't uniquely recover the basis vectors lying in that hyperplane. That's the case for a four-state system that's the product of two single-particle two-state systems, where the position of the two particles might be represented by the matrices diag(1,2,1,2) and diag(3,3,4,4) respectively. In the continuous case you can't write down matrices any more, but the differential operators serve the same purpose. 
In order to get the right eigenbasis and eigenvalues, the position operator has to multiply the wave function by a real number corresponding to the position, which is why I described it as a pointwise function product. It might have been better to say that B is an operator defined by (B f)(x) = x f(x). -- BenRG (talk) 14:32, 14 March 2008 (UTC)
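The four-state example above is small enough to check by hand or by machine. The following is a minimal sketch in plain Python; the helper names (`dagger`, `matmul`, `diag`, `apply`, `close`) are introduced here only for illustration. It builds U from the listed Fourier basis, forms U diag(0,1,2,−1) U^{-1}, and checks that the result is Hermitian and that each column of U is an eigenvector with the assigned momentum. It also shows the degenerate case diag(1,2,1,2), where a mixture of two basis vectors is still an eigenvector.

```python
import math

N = 4
# Momentum states from the post (the Fourier basis), each scaled by 1/2.
fourier_vectors = [
    [1, 1, 1, 1],
    [1, 1j, -1, -1j],
    [1, -1, 1, -1],
    [1, -1j, -1, 1j],
]
momenta = [0, 1, 2, -1]     # real number attached to each momentum state
positions = [1, 2, 3, 4]    # real number attached to each position state

# U has the momentum states as its columns, indexed as U[row][col].
U = [[0.5 * fourier_vectors[col][row] for col in range(N)] for row in range(N)]

def dagger(A):
    # Conjugate transpose.
    return [[complex(A[j][i]).conjugate() for j in range(N)] for i in range(N)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def diag(values):
    return [[values[i] if i == j else 0 for j in range(N)] for i in range(N)]

def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(N)) for i in range(N)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) <= tol for i in range(N) for j in range(N))

U_dag = dagger(U)                               # equals U^{-1} when U is unitary
P = matmul(matmul(U, diag(momenta)), U_dag)     # momentum operator
X = diag(positions)                             # position operator

is_unitary = close(matmul(U_dag, U), diag([1, 1, 1, 1]))
is_hermitian = close(P, dagger(P))

# Each column of U should satisfy P u_k = p_k u_k.
columns = [[U[i][k] for i in range(N)] for k in range(N)]
eigen_ok = all(
    abs(apply(P, columns[k])[i] - momenta[k] * columns[k][i]) <= 1e-12
    for k in range(N) for i in range(N)
)

# Degenerate case: diag(1,2,1,2) scales both e0 and e2 by 1, so a mixture
# of the two is also an eigenvector with eigenvalue 1.
X1 = diag([1, 2, 1, 2])
mixed = [1 / math.sqrt(2), 0, 1 / math.sqrt(2), 0]
degenerate_ok = apply(X1, mixed) == mixed

print(is_unitary, is_hermitian, eigen_ok, degenerate_ok)
```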

Speed of ...

Say you have a rigid object, such as a meter long titanium pole. When you pull one end, the other end presumably doesn't move instantly, because the force would have to move faster than the speed of light. It must be determined by the flexibility between the bonds of the titanium atoms/molecules. So ignoring the impracticality of its weight and size, if you were to pull a light-year, or some similar length, long titanium pole, the other end wouldn't notice it's been pulled until all the bonds are at their maximum pulling length? At that point, it still can't be instant if it's pulled further? Could anyone elaborate on what goes on? -- MacAddct  1984 (talkcontribs) 15:16, 17 March 2008 (UTC)

I'm not sure what you are asking, but yes, it is impossible to pull all of it instantaneously. The rod will deform elastically and the pull will travel down it at the speed of sound in that material (much slower than the speed of light). Theresa Knott | The otter sank 15:28, 17 March 2008 (UTC)
This is a Reference Desk Frequently-Asked Question but I think you've already sussed-out the answer: What we consider to be the "structural strength" of solid materials is actually the electromagnetic interaction of the electron shells of the constituent atoms. And these electromagnetic interactions can never propagate faster than the speed of light. So if you pull or push on that light-year-long titanium rod, a compression or expansion wave propagates through the material. It certainly doesn't go faster than the speed of light and probably only travels at the speed of sound in that material.
Atlant (talk) 15:31, 17 March 2008 (UTC)
Thanks! Yeah, that's about what I was thinking. It's hard to think in terms of familiar objects moving in ways we're not familiar with.
As for it being a FAQ, is there a FAQ (official or unofficial) for the Reference Desk? If not, there really should be one started... -- MacAddct  1984 (talkcontribs) 15:53, 17 March 2008 (UTC)
There is a FAQ page, at Wikipedia:Reference_desk/FAQ. It is embryonic and underused, because it is hidden. We should link to it from the main RD page, and maybe mention it in the Before asking a question/Search first section at the top of each RD page. --169.230.94.28 (talk) 19:04, 17 March 2008 (UTC)
So what happens if you move one end faster than the speed of sound ? StuRat (talk) 17:36, 17 March 2008 (UTC)
You'd be breaking the rod. By definition you would be moving the atoms faster than they could convey that movement on to the atoms next to it; the rod wouldn't be structurally stable if you could do that, by definition. Remember it's the speed of sound in that material, not the speed of sound in air, which is normally what we think the speed of sound as being. --Captain Ref Desk (talk) 18:08, 17 March 2008 (UTC)
Or, if you are moving in a compressive direction, and buckling doesn't occur, you may create a shock wave. --169.230.94.28 (talk) 19:04, 17 March 2008 (UTC)
It's not speed that would break or buckle the rod, but acceleration. 196.2.113.148 (talk) 22:07, 17 March 2008 (UTC)
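To put a rough number on the answers in this thread, here is a back-of-the-envelope sketch. The speed of sound in titanium used below (about 6 km/s) is an assumed approximate value that does not come from the discussion, and the rod is idealized as perfectly uniform.

```python
LIGHT_YEAR_M = 9.4607e15          # metres in one light-year
SPEED_OF_LIGHT = 299_792_458.0    # m/s
SOUND_SPEED_TITANIUM = 6.07e3     # m/s, ASSUMED approximate value, not from the thread
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Time for the pull to reach the far end if it travels at the speed of sound.
delay_seconds = LIGHT_YEAR_M / SOUND_SPEED_TITANIUM
delay_years = delay_seconds / SECONDS_PER_YEAR

# How many times slower than light the pull travels.
slowdown = SPEED_OF_LIGHT / SOUND_SPEED_TITANIUM

print(delay_years, slowdown)
```

Under this assumed speed the far end of a light-year-long rod would not move for roughly fifty thousand years, compared with one year for a light signal.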

Proof of Chain Rule

Let

y = f(u) 

 

 (1)

 

u = g(x) 

 

 (2)

 

where f(u) and g(x) are both differentiable functions. Then

\frac{dy}{dx} 

 

 (3)

 

=\lim_{\Delta x\rightarrow 0}\frac{f(g(x+\Delta x))-f(g(x))}{\Delta x} 

 

 (4)

 

=\lim_{\Delta x\rightarrow 0}\left[\frac{f(g(x+\Delta x))-f(g(x))}{g(x+\Delta x)-g(x)}\cdot\frac{g(x+\Delta x)-g(x)}{\Delta x}\right] 

 

 (5)

 

=\lim_{\Delta x\rightarrow 0}\frac{f(g(x+\Delta x))-f(g(x))}{g(x+\Delta x)-g(x)}\cdot\lim_{\Delta x\rightarrow 0}\frac{g(x+\Delta x)-g(x)}{\Delta x} 

 

 (6)

 

Treat the arrow \rightarrow as an equal sign =. We can do the same operation on both sides of the arrow without changing the relationship

\Delta x\rightarrow 0 

 

 (7)

 

x+\Delta x\rightarrow x 

 

 (8)

 

Function g(x) is continuous since it is differentiable. Apply g(\cdot) to both sides of (8)

g(x+\Delta x)\rightarrow g(x) 

 

 (9)

 

g(x+\Delta x)-g(x)\rightarrow 0 

 

 (10)

 

Let

Δu = g(x + Δx) − g(x) 

 

 (11)

 

Replace (11) into (10)

\Delta u\rightarrow 0 

 

 (12)

 

Therefore

\lim_{\Delta x\rightarrow 0}\cdot  implies \lim_{\Delta u\rightarrow 0}\cdot 

 

 (13)

 

Replace (2) into (11)

g(x + Δx) = u + Δu 

 

 (14)

 

Replace (2), (11), (13) and (14) into (6)

\frac{dy}{dx}=\lim_{\Delta u\rightarrow 0}\frac{f(u+\Delta u)-f(u)}{\Delta u}\cdot\lim_{\Delta x\rightarrow 0}\frac{g(x+\Delta x)-g(x)}{\Delta x}=\frac{dy}{du}\frac{du}{dx} 

 

 (15)

 

Q.E.D.

Is the proof of chain rule above correct and rigorous? - Justin545 (talk) 06:25, 18 March 2008 (UTC)

There are some questionable details. First, if we want a proof we can consider "rigorous", we would want to avoid treating functions as quantities (e.g., u instead of g(x)) and using Leibniz notation (\tfrac{du}{dx}). So as a first step you should try formulating the proof without using u or y, only f, g and their composition h = f \circ g (equivalently, h(x) = f(g(x)). Second, the limit notation, \lim_{x \to x_0}f(x)=L, is one unit. You shouldn't take out the x \to x_0 and treat it as something that stands on its own. This would be acceptable for a handwaving proof, but not for a rigorous one. -- Meni Rosenfeld (talk) 07:37, 18 March 2008 (UTC)
>> "the limit notation, \lim_{x \to x_0}f(x)=L, is one unit. You shouldn't take out the x \to x_0 and treat it as something that stands on its own."
I think you mean the result of (13) is incorrect or not rigorous. Does it mean the whole proof should be re-derived in a completely different way or we can somehow fix the problem so that we don't have to re-derive the whole proof? If (13) is not rigorous, is there any example which opposes it? Thanks! - Justin545 (talk) 09:00, 18 March 2008 (UTC)
(13) and the derivations that lead to it are "not even wrong" in the sense that in the standard framework of calculus they are pretty much meaningless - if you look at the standard rigorous definitions of limits, you will see that they do not allow a function to be used as a variable. It is "correct" in the sense that intuitively, the limit of a function "when" the variable approaches some value is equal to the limit when some other function approaches its appropriate limit value. However, this "when" business lacks a rigorous interpretation and is haunted by Bishop Berkeley's ghosts.
I have thought about how one might amend the proof, and realized that you also have a mistake much earlier. Step (5), dividing and multiplying by g(x + Δx) − g(x), is only valid if g(x+\Delta x)-g(x) \neq 0, but there is no reason to assume that should be the case. Take, for example
g(x)=\left\{\begin{array}{ll}x^2\sin\tfrac1x&x\neq0\\0&x=0\end{array}\right.
- a perfectly differentiable function at 0, and yet g(x) = 0 = g(0) infinitely many times in any neighborhood of 0. Thus your proof will not work for it. Those kinds of pathological counterexamples are one of the things that separates rigorous proofs from not-so-rigorous ones. -- Meni Rosenfeld (talk) 10:58, 18 March 2008 (UTC)
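The behaviour of this example can be seen numerically. The sketch below is an illustration only; the sample values of n and x are arbitrary, and in floating point the computed values at x = 1/(nπ) are tiny rather than exactly zero, because π cannot be represented exactly. It shows that g vanishes (up to rounding) at points arbitrarily close to 0, and that the difference quotient g(x)/x is bounded by |x|, which is why g'(0) = 0.

```python
import math

def g(x):
    # The example function: x^2 sin(1/x) away from 0, and 0 at 0.
    return x * x * math.sin(1.0 / x) if x != 0.0 else 0.0

# Points 1/(n*pi) get arbitrarily close to 0 as n grows.
zero_points = [1.0 / (n * math.pi) for n in (1, 10, 100, 1000, 10**6)]
values_at_zero_points = [g(x) for x in zero_points]

# Difference quotient at 0: (g(x) - g(0)) / (x - 0) = x sin(1/x), bounded by |x|.
small_x = [10.0 ** (-k) for k in range(1, 8)]
quotients = [(g(x) - g(0.0)) / x for x in small_x]

print(values_at_zero_points)  # all essentially zero
print(quotients)              # shrink towards 0 along with x
```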
>> "Step (5), dividing and multiplying by g(x + Δx) − g(x), is only valid if g(x+\Delta x)-g(x) \neq 0, but there is no reason to assume that should be the case. Take, for example..."
I think g(x + Δx) − g(x) will never be zero since Δx "is not zero"; Δx is just a value that is "very close to zero". Thus, g(x + Δx) − g(x) will only be close to zero but will not be zero, and I believe step (5) would still be correct. As for your example, we may first need to evaluate

\lim_{\Delta x\rightarrow 0}g(0+\Delta x) 

 

 (16)

 

=\lim_{\Delta x\rightarrow 0}(0+\Delta x)^2\sin\frac{1}{0+\Delta x} 

 

 (17)

 

=\lim_{\Delta x\rightarrow 0}{\Delta x}^2\sin\frac{1}{\Delta x} 

 

 (18)

 

=\lim_{\Delta x\rightarrow 0}{\Delta x}^2\cdot\lim_{\Delta x\rightarrow 0}\sin\frac{1}{\Delta x} 

 

 (19)

 

But what will \lim_{\Delta x\rightarrow 0}\sin\frac{1}{\Delta x} evaluate to? I'm not sure... - Justin545 (talk) 01:41, 19 March 2008 (UTC)
You've made two mistakes here. First, g(x + Δx) − g(x) can be zero for arbitrarily small values of Δx. That's what Meni's example shows. Your (18)=(19) is also mistaken: it would be valid if both limits in (19) existed, but as it happens the second one doesn't. Btw, your error at step (5) is a reasonably common one: IIRC, it occurs in the first few editions of G H Hardy's A Course of Pure Mathematics. Though there are other ways round it, perhaps the best is to avoid division at all in the proof. This has the advantage that your proof immediately generalises to the multi-dimensional case. Algebraist 02:36, 19 March 2008 (UTC)
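The failure of the step from (18) to (19) can be seen numerically. This is a minimal sketch with arbitrarily chosen sample sequences: along one sequence of Δx values tending to 0 the factor sin(1/Δx) stays at 1, along another it stays at −1, so that limit does not exist. The product Δx² sin(1/Δx) is nevertheless squeezed between −Δx² and Δx², so the limit in (18) is 0.

```python
import math

def s(dx):
    return math.sin(1.0 / dx)

ns = (1, 10, 100, 1000)
# 1/dx = 2*pi*n + pi/2 gives sin = 1; 1/dx = 2*pi*n + 3*pi/2 gives sin = -1.
peaks = [1.0 / (2.0 * math.pi * n + math.pi / 2.0) for n in ns]
troughs = [1.0 / (2.0 * math.pi * n + 3.0 * math.pi / 2.0) for n in ns]

peak_values = [s(dx) for dx in peaks]        # all close to +1
trough_values = [s(dx) for dx in troughs]    # all close to -1

# The full expression in (18) is bounded by dx^2, which tends to 0.
products = [dx * dx * s(dx) for dx in peaks + troughs]
bounded = all(abs(p) <= dx * dx for p, dx in zip(products, peaks + troughs))

print(peak_values, trough_values, bounded)
```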
>> "First, g(x + Δx) − g(x) can be zero for arbitrarily small values of Δx. That's what Meni's example shows."
It is not obvious to me from Meni's example why g(x + Δx) − g(x) = 0 where

g(x)=\left\{\begin{array}{ll}x^2\sin\tfrac1x&x\neq0\\0&x=0\end{array}\right. 

 

 (20)

 

Could you provide more explanation for it? Or could you tell what theorem supports that g(x + Δx) − g(x) could be exactly zero?
>> "perhaps the best is to avoid division at all in the proof."
Division could be avoided altogether, but it is "intuitive" since the definition of the derivative involves division. Besides, I think even this proof involves division. If it does involve division, the proof would be considered non-rigorous. - Justin545 (talk) 03:08, 19 March 2008 (UTC)
>> "(13) and the derivations that lead to it are "not even wrong" in the sense that in the standard framework of calculus they are pretty much meaningless - if you look at the standard rigorous definitions of limits, you will see that they do not allow a function to be used as a variable."
I'm afraid I don't understand why "rigorous definitions of limits do not allow a function to be used as a variable" and why the derivations leading to (13) are meaningless. - Justin545 (talk) 03:37, 19 March 2008 (UTC)
A better question is how are they not meaningless. Where in your textbook did anyone mention taking the \Delta x \to 0 notation, treating it as a formula on its own, and doing manipulations on it? -- Meni Rosenfeld (talk) 16:48, 19 March 2008 (UTC)
This proof really is from my textbook, except that the steps from (7) to (14) are missing there. The missing steps are my own creation, since I had no idea how step (6) becomes step (15). I want to know, in detail, how step (6) becomes step (15), so I added those steps and started the discussion here to see whether they are correct. - Justin545 (talk) 05:30, 20 March 2008 (UTC)
In this case, the proof in your book is wrong (that happens too). Step 5 cannot be justified without more assumptions on g. Your steps 7-12 describe intuitively correct ideas but are far from being rigorous. If g is "ordinary" enough for step 5 to hold, it is possible to justify the leap from (6) to (15), but if you want it to be rigorous you need to rely only on the definition of limits, not on your intuitive ideas of what they mean. -- Meni Rosenfeld (talk) 12:03, 20 March 2008 (UTC)
If you want a similar proof that really works, one way would be to apply the mean value theorem to f at (4). This allows you to replace f(g(x + h)) − f(g(x)) 163.1.148.158 (talk) 12:54, 18 March 2008 (UTC)
The mean value theorem I found is
f'(c)=\frac{f(b)-f(a)}{b-a}
where c\in(a,b). But I have no idea how to apply it to f at (4) and why it's needed to replace f(g(x + h)) − f(g(x))? Thanks! - Justin545 (talk) 02:38, 19 March 2008 (UTC)
To the OP: it is not necessary to avoid division to make the proof rigorous, but it is one way of doing it. I meant division specifically by values of the domain or codomain of f and g (since these are the things that become vectors when you generalise), but I see I failed to say it. Apologies. The definition of the derivative need not involve such division (the one lectured to me didn't, for example), and one could argue that it shouldn't. Not sure if one would be right, mind. To your specific question, Meni's function is zero whenever x is 1/(nπ) (n a non-zero integer). Thus we have g(x)=0 for arbitrarily small x. Algebraist 03:34, 19 March 2008 (UTC)
>> "The definition of the derivative need not involve such division (the one lectured to me didn't, for example), and one could argue that it shouldn't."
The familiar definition of derivative is

\frac{dy}{dx}=f'(x)=\lim_{\Delta x\rightarrow 0}\frac{f(x+\Delta x)-f(x)}{\Delta x} 

 

 (21)

 

It seems you were saying that (21) is not a "rigorous" definition. It sounds pretty odd to me. I thought (21) was the only way of defining the derivative. Many lemmas and theorems about derivatives in my textbook originate from (21). It's not easy to imagine that there are other definitions without division. - Justin545 (talk) 05:13, 19 March 2008 (UTC)
No, that's not what he was saying. He said that you can define the derivative without division, not that you should. Definition (21) (at least the f'(x)=\lim_{\Delta x\rightarrow 0}\frac{f(x+\Delta x)-f(x)}{\Delta x} part) is rigorous and is indeed the standard definition. There is nothing wrong with division, except for division by zero. The main flaw in your proof is dividing by g(x + Δx) − g(x) which may be zero. Just because \Delta x \neq 0 doesn't mean that g(x+\Delta x) \neq g(x). This is just common sense, you don't need my complicated example for that. -- Meni Rosenfeld (talk) 16:48, 19 March 2008 (UTC)
>> "To your specific question, Meni's function is zero whenever x is 1/(nπ) (n a non-zero integer). Thus we have g(x)=0 for arbitrarily small x."
I'm afraid I'm not able to prove that (20) is zero when x\in\left\{\frac{1}{n\pi}\Bigg| n\in\mathbb{Z}\land n\ne 0\right\}. But I think g(x + Δx) − g(x) will be zero when g(x) = a where a is any fixed constant. (Edit: which means I was ridiculously wrong. Apologies.) - Justin545 (talk) 05:42, 19 March 2008 (UTC)
x^2 sin(1/x) is zero whenever sin(1/x) is zero, which happens whenever 1/x is a multiple of π, which happens whenever x = 1/(nπ) for some nonzero integer n. You know, you're not really all that wrong. You have the right idea, you just don't have the tools to implement it. Here's roughly how my analysis textbook solves the problem. First, you define a new function h(y). I'll skip the details about intervals and mappings, and just say that it's focused on f and ignoring g, and assumes some interesting value c has been chosen. Let h(y) = (f(y)-f(g(c)))/(y-g(c)) if y does not equal g(c), and let h(y) = f'(y) if y=g(c). All that should be possible by assumption. Since g is differentiable at c, g is continuous at c, so h of g is continuous at c, so lim x->c (hog)(x)=h(g(c))=f'(g(c)). By the definition of h, f(y)-f(g(c))=h(y)(y-g(c)) for all y, so ((fog)(x)-(fog)(c)) = (hog(x))(g(x)-g(c)), so for x not equal to c we have ((fog)(x)-(fog)(c))/(x-c) = (hog(x))(g(x)-g(c))/(x-c). Taking the limit of both sides as x->c, then (fog)'(c)=lim x->c ((fog)(x)-(fog)(c))/(x-c) = (lim x->c hog(x))(lim x->c (g(x)-g(c))/(x-c)) = f'(g(c))g'(c). Black Carrot (talk) 06:36, 19 March 2008 (UTC)
>> "x2sin(1/x) is zero whenever sin(1/x) is zero, which happens whenever 1/x is a multiple of pi, which happens whenever x = 1/npi for some integer n."
Thanks! Now I understand it.
>> "You know, you're not really all that wrong. You have the right idea, ..."
Excuse my rewriting of your response for readability:
Here's roughly how my analysis textbook solves the problem. First, you define a new function h(y). I'll skip the details about intervals and mappings, and just say that it's focused on f and ignoring g, and assumes some interesting value c has been chosen. Let
h(y)=\begin{cases}
\frac{f(y)-f(g(c))}{y-g(c)},&y\ne g(c)\\
f'(y),&y=g(c)
\end{cases}
All that should be possible by assumption. Since g is differentiable at c, g is continuous at c, so h of g is continuous at c, so
\lim_{x\rightarrow c}h(g(x))=h(g(c))=f'(g(c)).
By the definition of h,
f(y) - f(g(c)) = h(y)[y - g(c)], \forall y\ne g(c),
so
f(g(x)) − f(g(c)) = h(g(x))[g(x) − g(c)],
so for x not equal to c we have
\frac{f(g(x))-f(g(c))}{x-c}=\frac{h(g(x))[g(x)-g(c)]}{x-c}.
Taking the limit of both sides as x\rightarrow c, then
(f\circ g)'(c)=\lim_{x\rightarrow c}\frac{f(g(x))-f(g(c))}{x-c}=\left[\lim_{x\rightarrow c}h(g(x))\right]\left[\lim_{x\rightarrow c}\frac{g(x)-g(c)}{x-c}\right]=f'(g(c))g'(c).
Did I misunderstand your response? Thanks! - Justin545 (talk) 09:01, 19 March 2008 (UTC)
After "By the definition of h", it should be for all y. If y=g(c), both sides are equal to zero, and the equality still holds. That one line is pretty much the goal of the whole thing, finding a way to get that conclusion without dividing by zero anywhere. Black Carrot (talk) 16:02, 19 March 2008 (UTC)
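The role of the auxiliary function h can be illustrated with a short sketch. The functions `f`, `g_const` and the sample points below are arbitrary choices made for illustration, and the helper names are hypothetical. With a constant inner function, g(x) − g(c) is exactly zero, so the quotient used in step (5) of the original proof divides by zero, while the product h(g(x))·(g(x) − g(c))/(x − c) is still defined and gives the expected value 0. For an ordinary inner function such as sin, the same product approaches f'(g(c))·g'(c).

```python
import math

def make_h(f, f_prime, gc):
    # h(y) as defined above: a difference quotient away from g(c),
    # and the derivative f'(g(c)) at y = g(c) itself.
    def h(y):
        if y != gc:
            return (f(y) - f(gc)) / (y - gc)
        return f_prime(gc)
    return h

def naive_quotient(f, g, c, x):
    # Step (5) of the original proof: divides by g(x) - g(c), which may be zero.
    dg = g(x) - g(c)
    return ((f(g(x)) - f(g(c))) / dg) * (dg / (x - c))

def safe_quotient(f, f_prime, g, c, x):
    # Same difference quotient of f(g(x)), rewritten with h so that
    # nothing is ever divided by g(x) - g(c).
    h = make_h(f, f_prime, g(c))
    return h(g(x)) * (g(x) - g(c)) / (x - c)

def f(u):
    return u * u + 3.0 * u

def f_prime(u):
    return 2.0 * u + 3.0

def g_const(x):
    return 5.0    # constant inner function: g(x) - g(c) is always zero

try:
    naive_quotient(f, g_const, 1.0, 1.1)
    naive_failed = False
except ZeroDivisionError:
    naive_failed = True

safe_value = safe_quotient(f, f_prime, g_const, 1.0, 1.1)

# Ordinary case: g = sin, where the quotient should approach f'(sin c) * cos c.
c = 0.3
approx = safe_quotient(f, f_prime, math.sin, c, c + 1e-6)
exact = f_prime(math.sin(c)) * math.cos(c)

print(naive_failed, safe_value, approx, exact)
```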
But I think it should be

\begin{cases}
f(y)-f(g(c))=h(y)[y-g(c)]&\forall y\ne g(c)\\
f'(y)=h(y)&\forall y=g(c)
\end{cases} 

 

 (22)

 

by definition of h.
By the way, I think derivatives of composite functions can be rewritten in Leibniz notation as below

(f\circ g)'(c)=\frac{df(g(c))}{dc}=\frac{d}{dc}f(g(c)) 

 

 (23)

 

f'(g(c))=\frac{df(g(c))}{dg(c)}=\frac{d}{dg(c)}f(g(c)) 

 

 (24)

 

Justin545 (talk) 07:02, 20 March 2008 (UTC)

Gödel's Incompleteness Theorems: Is The Math Reliable?

Many sciences depend on the math to prove something and use it for rigorous study. But Gödel's incompleteness theorems states:

For any consistent formal, computably enumerable theory that proves basic arithmetical truths, an arithmetical statement that is true, but not provable in the theory, can be constructed.1 That is, any effectively generated theory capable of expressing elementary arithmetic cannot be both consistent and complete.

Therefore, I would like to know: are all the theories we use (for biology, chemistry, physics, medicine, computer science, etc.) considered to be consistent theories themselves? And is all of the math we learn from elementary school to university considered to be reliable and free of contradictions? - Justin545 (talk) 07:00, 19 March 2008 (UTC)

What do you mean by "reliable"? I would say the mathematics underlying biology, chemistry, etc is far less likely to be in error than the biology and chemistry themselves. But if you're looking for apodeictic certainty -- the sort of thing that, by its nature, cannot be wrong -- well, sorry, we don't have any of that. In my humble opinion, anyway. We'll settle for being right; we don't have to be completely certain.
Or as the Eagles put it -- "I could be wrong, but I'm not". --Trovatore (talk) 07:18, 19 March 2008 (UTC)
Math is used as a tool for studying many sciences. If the tool itself is "problematic" or "questionable", the consequences of employing it are very likely to be wrong! By "reliable" I mean "consistent", i.e. it does not contradict itself. The incompleteness theorems, in other words, state: if every true arithmetical statement is provable in the theory, then the theory is complete but inconsistent. So what I want to know is whether the math we use is either
(1) consistent but not complete, or
(2) complete but not consistent - Justin545 (talk) 07:48, 19 March 2008 (UTC)
Well, we don't know for certain, but the general view is that we are in the consistent but not complete case, which is really not as bad as it sounds at first. If you know any group theory, consider that there are plenty of facts about groups that cannot be deduced from the axioms for a group alone -- the theory of groups, as given by the most basic group axioms, is not complete. In some sense this is because there are different models, different groups, that all meet those basic axioms, and thus have truths that are not derivable from just those axioms. You can think of arithmetic as being similar, with different models, just with the proviso that, unlike groups, we haven't found models that disagree on any arithmetic facts you or I would generally care about. -- Leland McInnes (talk) 12:11, 19 March 2008 (UTC)
Could you give an example of what you mean, there? I've done quite a bit of group theory and have never come across something that's true but can't be proven from the axioms of a group (together with ZF). --Tango (talk) 13:39, 19 March 2008 (UTC)
>> "which is really not as bad as it sounds at first"
It sounds bad to me... since we are not able to justify our math.
>> "there are plenty of facts about groups that cannot be deduced from the axioms for a group alone...the theory of groups is not complete."
I don't know any group theory, but: could that set of un-deducible facts itself be taken as axioms? Will group theory be complete if we make those facts axioms? - Justin545 (talk) 02:50, 20 March 2008 (UTC)
What I mean is that the group axioms don't uniquely define the group, but rather a whole slew of possible objects each of which satisfies the axioms of being a group. Thus there isn't a unique model of "group" specified by the axioms, but rather each and every different group is a different model that satisfies the basic group axioms. There are things that are true of particular groups that you can't deduce from just the group axioms -- you need more information (more axioms in essence) to pin down which group (or class of groups) you are talking about. Thus there are truths that occur in systems that fulfill the group axioms that are not provable from the group axioms alone. Does that make more sense? -- Leland McInnes (talk) 17:26, 19 March 2008 (UTC)
Right, but arithmetic (and set theory) are quite a different case from group theory. Arithmetic is not the study of models of arithmetic; it's the study of numbers. All models of arithmetic have (copies of) all the true natural numbers, but some of them also have fake natural numbers. The one true Platonic intended model of arithmetic has only the true ones, and none of the fake ones, and is unique up to a canonical isomorphism. There's a limit to what we can find out about the behavior of the true natural numbers from a fixed set of axioms and first-order logic alone. That doesn't mean we have to stop there. --Trovatore (talk) 17:49, 19 March 2008 (UTC)
Let me insert a response here: essentially, yes. I was going for a loose analogy suggesting that incompleteness isn't really a horrible thing. As to models of arithmetic, there is the question of what the intended model is, and, for the sufficiently messy cases where we can't practically distinguish it from some fake model, whether it even matters. I would liken it (again, an analogy, so don't take it too literally) to science trying to model (in a different sense of the word) some objective reality -- we can't know the objective reality, only our model of it, but as long as we can't tell the difference between our model and the reality (i.e. where our model hasn't been falsified) we may as well consider our model as true. -- Leland McInnes (talk) 20:55, 19 March 2008 (UTC)
Sure, there are plenty of things that can't be proven using just the axioms of a group, but those things aren't true. gh=hg\quad\forall g,h\in G can't be proven just from the group axioms, because it isn't true in general. That's not incompleteness, it's just a false statement. If you want it to be true, you have to add an additional assumption (that the group G be cyclic, say). If the statement can be stated in terms of only the group axioms, and is true, then it can be proven using only the group axioms. If it can't be stated using only those axioms, then it being impossible to prove isn't a case of incompleteness. A framework is incomplete if there are unprovable true statements within that framework. --Tango (talk) 18:06, 19 March 2008 (UTC)
Tango, I think you have not thought these things through terribly well. At least it isn't clear to me what you mean by a framework, or unprovable but true within a framework. Is a framework a first-order theory, or a model, or what exactly? --Trovatore (talk) 18:19, 19 March 2008 (UTC)
Let me be a little less Socratic and hopefully more constructive (took me more time to figure out how to say this than it did to ask a question). Let's take a specific example. Peano arithmetic neither proves nor (we suppose) refutes the claim "Peano arithmetic is consistent" (the claim is usually abbreviated Con(PA)). Therefore there are models of PA in which Con(PA) is true, and there are models of PA in which Con(PA) is false. So we can make an analogy with your example statement "multiplication is commutative": There are models of group theory (that is, groups) in which "multiplication is commutative" is true, and there are other models of group theory in which it's false.
Here's the big difference: There's no such thing as "the intended group", the group that defines the truth value of "multiplication is commutative in group theory". We're interested in Abelian groups, and we're also interested in non-Abelian groups, and you just have to specify which ones you're talking about.
But Peano arithmetic (we suppose) really is consistent. The models of PA that think otherwise are wrong about that. That's not to say they're not interesting (people devote whole careers to them), but merely by their opinion on this one issue, they prove that they are not the intended model. --Trovatore (talk) 18:35, 19 March 2008 (UTC)
Ok, I think I understand what you're saying now. I'm not sure I agree, though. Group theory is defined in terms of set theory. Once you've determined a model of set theory, your model of group theory is completely determined (a group is simply a set together with a function - both concepts defined outside of group theory). Is there a (reasonable) model of set theory in which all groups are abelian? --Tango (talk) 18:56, 19 March 2008 (UTC)
Whoah, we have to be careful here -- the phrase "group theory" is being used in two different ways (my fault, probably). When I say "model of group theory==group", I'm using "group theory" to mean the first-order theory defined by the three axioms (identity existence, existence of two-sided inverses, associativity). That's different of course from "group theory" as in "the study of groups", which is not a formal first-order theory at all. Please re-read my remarks keeping this clarification in mind -- they won't have made any sense at all if you were thinking of "model of group theory" as meaning "model of the study of groups(?)". --Trovatore (talk) 19:02, 19 March 2008 (UTC)
Ok, but I think my point still stands. Group theory, in that sense, is still built on set theory. Any model of group theory must be a model of set theory, since it has to satisfy ZF plus the 3 axioms of a group. Can you have such a model of set theory in which all groups are abelian? For example, set theory provides all kinds of methods of combining sets to produce sets - those methods can be used to combine groups and produce other groups. Is there a model in which all such possible combinations are abelian? --Tango (talk) 19:22, 19 March 2008 (UTC)
No, of course not. It's a theorem of ZF that there exist non-Abelian groups. But you're still mixing things in a confusing way -- whatever a "model of group theory" is, it's certainly not something that "satisfies ZF plus the three axioms of a group"; that doesn't even make sense; the ZF axioms are in a different language from the group axioms. If by "group theory" we mean the three axioms, then "model of group theory" means precisely "group", and does not imply that the model satisfies the ZF axioms. That's the sense in which I was using the phrase "model of group theory". --Trovatore (talk) 19:47, 19 March 2008 (UTC)
[edit conflict] I think you're still misunderstanding. Sure, it's possible to define groups as a special kind of set in ZF set theory. But that is not what we are talking about here. You are probably confused by the fact that ZF is an immensely more complex system than the meager 3 axioms of groups (to which I will refer as GP). But they are the same thing for this discussion. Each of them is a collection of rules governing a world of objects. A bag of objects can either satisfy these rules, in which case it is called a model of the theory, or not. In the case of ZF, the models are very complicated and hard to point out, but I think Gödel's constructible universe is an example of one. For GP, every simple little group is a model, and the elements of the group are the basic objects. In ZF, you can have models that satisfy choice, and models that don't; in GP, you can have models that satisfy commutativity, and models that don't. -- Meni Rosenfeld (talk) 19:55, 19 March 2008 (UTC)
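
To make the "models of GP" point concrete, here is a small Python sketch (an illustration only; the function names `is_group` and `is_commutative` are hypothetical, and closure of the operation is checked explicitly even though the three axioms take it for granted). It verifies by brute force that both Z/3Z under addition and the permutation group S3 under composition satisfy the group axioms, while only the first satisfies commutativity. So "multiplication is commutative" is true in one model of GP and false in another.

```python
from itertools import permutations, product

def is_group(elements, op):
    """Brute-force check of the group axioms on a finite set."""
    elements = list(elements)
    elem_set = set(elements)
    # Closure: op must map pairs of elements back into the set.
    if any(op(a, b) not in elem_set for a, b in product(elements, repeat=2)):
        return False
    # Associativity.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # Existence of a two-sided identity.
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # Existence of two-sided inverses.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)

def is_commutative(elements, op):
    """The extra statement that the group axioms leave undecided."""
    elements = list(elements)
    return all(op(a, b) == op(b, a) for a, b in product(elements, repeat=2))

# Model 1: integers mod 3 under addition.
Z3 = [0, 1, 2]
def add_mod3(a, b):
    return (a + b) % 3

# Model 2: permutations of {0, 1, 2} under composition.
S3 = list(permutations(range(3)))
def compose(p, q):
    # (p after q)(i) = p[q[i]]
    return tuple(p[q[i]] for i in range(3))
```

Calling `is_group` on either pair returns `True`, while `is_commutative(Z3, add_mod3)` and `is_commutative(S3, compose)` disagree.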
Ok, I get you. So, if I'm understanding your definition of completeness correctly, a theory being complete is basically equivalent to there only being one model satisfying it? Since, if there are two models of the theory, they must differ in some way and that way gives rise to a statement which is true in one model and not true in the other. --Tango (talk) 20:05, 19 March 2008 (UTC)
Not quite. It's possible for two models to satisfy all the same first-order statements, but to be nonisomorphic. For example the theory of torsion-free abelian groups is complete, but there are nonisomorphic torsion-free abelian groups. --Trovatore (talk) 20:38, 19 March 2008 (UTC)
If memory serves, all torsion-free abelian groups are of the form \Bbb{Z}\times\cdots\times\Bbb{Z}. I sometimes get a little confused with the orders of logical statements, but is \exists g\in G, \forall h\in G,\exists n\in\Bbb{N}, h=g^n not a first order statement satisfied by only one of those groups? --Tango (talk) 21:32, 19 March 2008 (UTC)
It's not a first-order statement in the language of groups. The language of groups has no function symbol for nth power and no symbol for the set of all natural numbers. --Trovatore (talk) 21:40, 19 March 2008 (UTC)
Ah, good point. I think we've got there, I have no more questions! Thank you. (Well, I'm sure one will come to me at 3am, but that can wait until tomorrow. ;)) You know... I really do wish my Maths dept. had a proper course on logic, it seems a really major topic to miss out (we did a bit in 1st year, but it was really just half a module on set theory in rather vague terms - the phrase "first order logic" did not appear once). I've done some reading on the subject, I should do some more... --Tango (talk) 21:47, 19 March 2008 (UTC)
For the record, the rationals Q, the reals R, and the p-adic integers Zp are all torsion-free abelian groups. Your structure theorem holds for finitely generated abelian groups. Tesseran (talk) 16:03, 21 March 2008 (UTC)
Excellent point. That doesn't change my (nevertheless incorrect) point, though. --Tango (talk) 16:11, 21 March 2008 (UTC)
Theories in physics (thought to be the trunk of the science "tree") are not necessarily consistent. The helpfulness of established theories begins and ends with orders of magnitude. This is why we have semiclassical physics, and the mesoscopic scale, and why we differentiate "Physics in the Classical Limit," Relativity, and quantum theory. Mac Davis (talk) 08:01, 19 March 2008 (UTC)
Did I misunderstand Gödel's incompleteness theorems, or are the incompleteness theorems really about distinguishing between classical physics and modern physics? I thought the incompleteness theorems were all about math, not physics, and that they should apply to all kinds of science, not just physics. I mean no offense; I just hope someone can clarify the concept. - Justin545 (talk) 09:23, 19 March 2008 (UTC)
The incompleteness theorems don't apply particularly well to the kind of math you're probably familiar with. That is, they aren't relevant. They claim that a specific very sensible, very general way of justifying things doesn't work very well in certain contexts. That doesn't mean that what we were trying to justify is wrong, just that we'll have to look somewhere else for confidence in it. It also throws essentially no doubt on actual arithmetic, which deals only with fairly small numbers and can be justified by direct experience and some common sense. Black Carrot (talk) 08:10, 19 March 2008 (UTC)

Theorems are proved based on axioms. Experience in proving theorems made mathematicians conjecture that every true statement could eventually be proved. This conjecture turned out to be naive. The incompleteness theorem states that the conjecture is not true: the fact that some statement cannot be proved does not imply that the statement is false. The incompleteness theorem does not threaten the reliability of mathematics. Bo Jacoby (talk) 11:06, 19 March 2008 (UTC).

>> "The incompleteness theorem does not threaten the reliability of mathematics."
I think you mean the mathematics we use is consistent but not complete, since there are still some true statements that cannot be proven by mathematics, and you also said that mathematics is reliable. But your opinion sounds a bit different from the others'. For example, someone said "mathematicians believe that mathematics is consistent", which means mathematicians "cannot prove" that mathematics is consistent. - Justin545 (talk) 01:20, 20 March 2008 (UTC)
Before the incompleteness theorem mathematics was supposed to be consistent and complete. After the incompleteness theorem mathematics is known to be incomplete. The incompleteness theorem does not clarify whether mathematics is consistent or not. So I do not say that mathematics is reliable as a consequence of the incompleteness theorem, nor do I say that mathematics is unreliable as a consequence of the incompleteness theorem. Bo Jacoby (talk) 05:03, 22 March 2008 (UTC).

In general, mathematicians believe that mathematics (however we may choose to define that term) is consistent. This is mainly because we have not found an inconsistency (a statement P such that both P and not-P can be proved). We can even express this "conjecture" as a (humungously complex) arithmetical statement. Problem is that we also know, thanks to Gödel, that we cannot prove this statement - at least, not without stepping up to some more powerful axiom system, which then leads to a "turtles all the way down" type of regression. Bottom line is, most mathematicians say "that's interesting and slightly weird" but they don't lose sleep worrying that mathematics might be inconsistent. On a scale of rational evidence-based confidence, you can put the consistency of mathematics right up at the 99.99% mark. Gandalf61 (talk) 12:21, 19 March 2008 (UTC)

>> "(a statement P such that both P and not-P can be proved) ... thanks to Gödel, that we cannot prove this statement"
I believe Gödel used "logic" to build his incompleteness theorems. But isn't logic a kind of mathematics? If logic is a kind of mathematics, Gödel was using a tool whose consistency cannot be assured to prove his incompleteness theorems. In other words, the incompleteness theorems are questionable since logic itself is questionable. - Justin545 (talk) 01:57, 20 March 2008 (UTC)
It sounds like we cannot use logic to justify logic itself. It's meaningless! If we doubt logic, we should also doubt the natural languages we use, such as English, Chinese, etc., since logic is just a symbolization of our natural language. We can do inference with logic and we can also do inference with our language. - Justin545 (talk) 02:15, 20 March 2008 (UTC)
Learn to live with uncertainty. (Like you have a choice....) --Trovatore (talk) 02:23, 20 March 2008 (UTC)
Maybe, learn to live with confidence in the logic and the language. - Justin545 (talk) 02:28, 20 March 2008 (UTC)
Confidence is one thing; fully justified certainty is quite a different thing. You seem to be looking for the latter. You're not going to find it. --Trovatore (talk) 02:33, 20 March 2008 (UTC)
Knowing the incompleteness theorems somewhat shakes my confidence in the logic and math. I was just trying to find my confidence in them by this discussion. I'm not pursuing the absolute certainty. Imperfection is allowed. - Justin545 (talk) 03:01, 20 March 2008 (UTC)
Ah, I see. Well, I suppose maybe they should shake your confidence. Just not very much. The take-away message is that mathematics is not really different in kind from the experimental sciences -- you can have confidence in it because it's observed to work, not because it's built up from an unassailable foundation via unassailable steps. The latter idea never really did make sense, even before Gödel -- there was always an infinite regress built into it, as you've noticed. But Gödel does seem to have made people come to terms with this more. --Trovatore (talk) 03:13, 20 March 2008 (UTC)
Literally, your prior response "I would say the mathematics underlying biology, chemistry, etc is far less likely to be in error than the biology and chemistry themselves." seems to contradict "mathematics is not really different in kind from the experimental sciences". Well, just my picking hobby, I'm not trying to "offend" you again. And excuse my English, I don't know why you use "kind" in italic.
>> "not because it's built up from an unassailable foundation via unassailable steps."
The foundation may not be unassailable, but the steps are unassailable, I think. That's why I like deduction more than induction.
>> "The latter idea never really did make sense"
The latter idea? - Justin545 (talk) 03:40, 20 March 2008 (UTC)
I said the mathematics was "far less likely" to be in error. That's a difference in degree, not a difference in kind. Your English seems to be pretty good, but I see on your user page that you're not a native speaker -- are you familiar with the phrases "different in degree" and "different in kind"? --Trovatore (talk) 04:15, 20 March 2008 (UTC)
Sure, I know what are "different in degree" and "different in kind". Maybe I understand your point now. Well, thanks for the compliment. But actually I can not write articles without a dictionary. Besides, my English grammar is questionable. - Justin545 (talk) 05:03, 20 March 2008 (UTC)

[edit] Quantum Mechanics: Orthogonality of Dirac Delta Function

The functions u(x) and v(x) are said to be orthogonal on the interval (\alpha_1,\beta_1) if their inner product is zero

\int_{\alpha_1}^{\beta_1}u(x)v(x)\,dx=0 \qquad (1)

For complex-valued functions or kets f(x)=|f\rangle and g(x)=|g\rangle, they are said to be orthogonal on the interval (\alpha_2,\beta_2) when

\langle f|g\rangle=\int_{\alpha_2}^{\beta_2}f^*(x)g(x)\,dx=0 \qquad (2)

 

To continue my last discussion, Quantum Mechanics: Entangled Wave Function, my question is how to prove the orthogonality of the Dirac delta function δ(x) mathematically? Or is there some related resource? Thanks! - Justin545 (talk) 08:53, 20 March 2008 (UTC)

This is really a maths question, but the way to prove it is to look at the definition. The Dirac delta function is zero everywhere except where its argument is zero. If you multiply this zero value by another Dirac delta, you will get zero, unless the arguments are the same, in which case it will be greater than zero: δ(t − x)δ(t − y) = 0 if x is not equal to y. Another way to look at the Dirac delta is as a sampling function: when you integrate its product with another function, it samples the other function at the point where the argument of the Dirac delta vanishes. Graeme Bartlett (talk) 11:14, 20 March 2008 (UTC)
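
Both remarks can be checked numerically with a narrow Gaussian standing in for the delta function. This is only a sketch: the Gaussian "nascent delta" of width eps, the midpoint-rule integrator and the names `delta_eps`, `sift` and `overlap` are assumptions of this example, not something from the thread. The overlap integral of δ_eps(t − x) and δ_eps(t − y) equals exp(−(x − y)²/(4 eps²)) / (2 eps √π), which tends to 0 for x ≠ y and grows without bound for x = y as eps shrinks. That is the sense in which ⟨x|y⟩ = δ(x − y).

```python
import math

def delta_eps(t, eps):
    """Gaussian of width eps and unit area; tends to the Dirac delta as eps -> 0."""
    return math.exp(-t * t / (2.0 * eps * eps)) / (eps * math.sqrt(2.0 * math.pi))

def integrate(func, a, b, n=20_000):
    """Plain midpoint rule on [a, b] with n sub-intervals."""
    step = (b - a) / n
    return step * sum(func(a + (k + 0.5) * step) for k in range(n))

def sift(f, x, eps):
    """Sampling property: integral of f(t) * delta_eps(t - x) dt, close to f(x)."""
    return integrate(lambda t: f(t) * delta_eps(t - x, eps),
                     x - 10.0 * eps, x + 10.0 * eps)

def overlap(x, y, eps):
    """Inner product of two nascent deltas centred at x and y."""
    lo = min(x, y) - 10.0 * eps
    hi = max(x, y) + 10.0 * eps
    return integrate(lambda t: delta_eps(t - x, eps) * delta_eps(t - y, eps),
                     lo, hi)
```

For example, `sift(math.cos, 0.5, 1e-3)` is close to `math.cos(0.5)`, `overlap(0.0, 1.0, 0.01)` is negligibly small, and `overlap(0.0, 0.0, eps)` grows like 1/eps.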

[edit] Quantum: Measurement vs. Schrödinger Equation

1. Article Copenhagen interpretation: Each measurement causes a change in the state of the particle, known as wavefunction collapse.

2. Article Schrödinger equation: The Schrödinger equation is commonly written as an operator equation describing how the state vector evolves over time.

Although I don't fully understand quantum mechanics, the two items above seem to be related to each other.

When an observable of a quantum system is measured, the state |\psi(t)\rangle of the system can be expressed as

|\psi(t)\rangle=\sum_i\psi_i|i\rangle \qquad (1)

where |i\rangle is the ith eigenfunction of the observable, associated with eigenvalue i, and

\psi_i=\langle i|\psi(t)\rangle \qquad (2)

Upon measurement, the state will "suddenly" or "discretely" collapse from |\psi\rangle to one of the terms, say \psi_a|a\rangle, on the right-hand side of (1). The remaining terms, which are not associated with eigenvalue a, simply vanish after the measurement.

On the other hand, the Schrödinger equation

\hat H(t)\left|\psi\left(t\right)\right\rangle = \mathrm{i}\hbar \frac{d}{d t} \left| \psi \left(t\right) \right\rangle \qquad (3)

where

\hat H(t)=-\frac{\hbar^2}{2m}\nabla^2+V\left(\mathbf{r},t\right) \qquad (4)

describes how the state vector |\psi(t)\rangle evolves over time. When the state |\psi(t)\rangle of the system is measured, the apparatus measuring the system will interact with the system and change the potential field V\left(\mathbf{r},t\right). Therefore, the state |\psi(t)\rangle should evolve "smoothly" or "continuously" according to the varying potential V\left(\mathbf{r},t\right) during the measurement. According to the Schrödinger equation (3) and (4) together with V\left(\mathbf{r},t\right), we should be able to figure out the final state of the system after the measurement.

It seems that the measuring process can be explained by the two ways, wavefunction collapse & Schrödinger equation, above. Do they contradict? Is "wavefunction collapse" compatible with "Schrödinger Equation"? - Justin545 (talk) 08:12, 26 March 2008 (UTC)

Yes, they do contradict. There is no place for a collapse in Schrödinger's Equation, which is one reason why David Bohm concluded that there can be no collapse of a wave function, that it's a figment of the model. — kwami (talk) 08:34, 28 March 2008 (UTC)
Figment of the model? I'm amazed that they do contradict, since the two items are considered to be postulates of quantum mechanics in some textbooks of quantum mechanics, IIRC. It should imply that at least one of the two items is wrong. So has David Bohm or someone else solved the contradiction? And how about the experimental evidence? Which one does the experimental evidence support? - Justin545 (talk) 08:55, 28 March 2008 (UTC)
The Copenhagen interpretation is just that, an interpretation. It has no empirical support (or at least it didn't some years ago) and is in no way an axiom of QM. I've heard people who use it make the excuse that none of the other interpretations have any empirical support either, even though some of them are less counter-intuitive than Copenhagen. Bohm attempted to create a deterministic hidden-variable QM, but was unable to solve some fundamental problems before he died. One of his students continued with his work, but I don't know if he ever got anywhere. — kwami (talk) 09:04, 28 March 2008 (UTC)
I think neither the Schrödinger equation nor wavefunction collapse could be an axiom of QM. Therefore, they are considered to be "postulates" of QM. The Schrödinger equation seems to correctly predict the spectral lines of each atomic model. On the other hand, wavefunction collapse seems to correctly predict the phenomenon of quantum entanglement. And both of the predictions have been observed in many experiments. The experimental results seem to support both of the two items. But some subtle differences may be missing (enough precision? relativity?). When reading the article Copenhagen interpretation, we should also notice the sentence "The Copenhagen interpretation consists of attempts to explain the experiments and their mathematical formulations in ways that do not go beyond the evidence to suggest more (or less) than is actually there." - Justin545 (talk) 09:41, 28 March 2008 (UTC)
>> "There is no place for a collapse in Schrödinger's Equation"
Theoretically, is it possible to build a thought experiment in which the measuring process is simulated and use the Schrödinger equation to find out the result of the experiment? Had some one done this job before? - Justin545 (talk) 10:02, 28 March 2008 (UTC)
As above, it is an open problem. There are ongoing efforts to create "measurement" systems that can be fully modeled quantum mechanically via Schrödinger's equation for all parts of the system. Observationally it is certainly true that wavefunctions "collapse", by which one means that a single particle state interacting with a much larger collection of particles will usually be observed to reside in an eigenstate. However, the mechanics of how this occurs is not well understood. The dynamical timescale is apparently quite short, and the systems that need to be modelled are fairly large (e.g. 30 or 40 plus particles evolving simultaneously). Dragons flight (talk) 16:02, 28 March 2008 (UTC)
It is likely that only a numerical solution to Schrödinger's equation can be obtained for so many particles. Finding an exact closed-form solution for so many particles seems impossible.
After reviewing the article wavefunction collapse this morning, I noticed this:
By the time John von Neumann wrote his famous treatise Mathematische Grundlagen der Quantenmechanik in 1932[1], the phenomenon of "wave function collapse" was accommodated into the mathematical formulation of quantum mechanics by postulating that there were two processes of wave function change:
1. The probabilistic, non-unitary, non-local, discontinuous change brought about by observation and measurement, as outlined above.
2. The deterministic, unitary, continuous time evolution of an isolated system that obeys Schrödinger's equation (or nowadays some relativistic, local equivalent).
In general, quantum systems exist in superpositions of those basis states that most closely correspond to classical descriptions, and -- when not being measured or observed, evolve according to the time dependent Schrödinger equation, relativistic quantum field theory or some form of quantum gravity or string theory, which is process (2) mentioned above. However, when the wave function collapses -- process (1) -- from an observer's perspective the state seems to "leap" or "jump" to just one of the basis states and uniquely acquire the value of the property being measured, e_i, that is associated with that particular basis state. After the collapse, the system begins to evolve again according to the Schrödinger equation or some equivalent wave equation.
It seems that we should treat wave function change as an if-then-else statement in programming: if the change is discrete, then use the wavefunction collapse method; else, if the change is continuous, use Schrödinger's method. Not quite an elegant way of doing science. - Justin545 (talk) 02:30, 29 March 2008 (UTC)
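
The "if-then-else" picture can be written down literally. The following Python sketch is a toy under stated assumptions, not a description of how measurement really works: the system has a finite number of levels, the Hamiltonian is time-independent and diagonal in the measured basis, ħ is set to 1, and the function names `evolve` and `measure` are hypothetical. `evolve` is process (2), deterministic and norm-preserving; `measure` is process (1), a random jump to one basis state with probability |ψ_i|².

```python
import cmath
import math
import random

def evolve(state, energies, t, hbar=1.0):
    """Process (2): unitary evolution; each amplitude only picks up a phase
    exp(-i E t / hbar) because H is assumed diagonal in this basis."""
    return [amp * cmath.exp(-1j * E * t / hbar)
            for amp, E in zip(state, energies)]

def measure(state, rng):
    """Process (1): pick outcome i with probability |psi_i|^2 and collapse
    the state onto basis vector |i> (global phase dropped)."""
    probs = [abs(a) ** 2 for a in state]
    r = rng.random() * sum(probs)
    acc = 0.0
    outcome = len(state) - 1  # fallback guards against rounding at the top end
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            outcome = i
            break
    collapsed = [0j] * len(state)
    collapsed[outcome] = 1 + 0j
    return outcome, collapsed

def norm(state):
    return math.sqrt(sum(abs(a) ** 2 for a in state))
```

For an equal superposition of two levels, `evolve` changes only the relative phase, while repeated calls to `measure` on copies of that state return each outcome about half the time. Nothing in `evolve` can reproduce the random jump in `measure`, which is the tension discussed in this thread.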
Any mathematical model that involves "alakazaam!" is obviously fundamentally flawed. However, QM is also the most precisely confirmed theory in human history. As a result, you get the null "Shut up and calculate!" interpretation, which seems to be what most people actually abide by. — kwami (talk) 18:19, 28 March 2008 (UTC)
It sounds like you (kwami) are really sick of quantum theories and those people who are learning it. Unfortunately, I am not here trying to pick a fight with someone over my post. I mean maybe you want to ignore this post and take a rest for a while. - Justin545 (talk) 00:41, 29 March 2008 (UTC)

[edit] Quantum: Determine the Force between the Electrons

According to Coulomb's law, when two electrons are put close to each other, there will be electrostatic forces acting on them, and the force can be determined as

F = {1 \over 4\pi\varepsilon_0}\frac{q_1q_2}{r^2}

where r is the distance between the two electrons. But according to the uncertainty principle, we cannot be sure of the positions of the electrons, so how can we decide r? - Justin545 (talk) 09:18, 19 April 2008 (UTC)

Coulomb's law is a classical approximation and only applies when r is large compared to inter-atomic distances and the charges are stationary or moving slowly compared to the speed of light. For the full quantum Monty, you need quantum electrodynamics, which explains non-classical phenomena such as the Casimir effect. Gandalf61 (talk) 11:45, 19 April 2008 (UTC)
Well, you don't actually need QED to make use of Coulomb's law in quantum mechanics. I just finished an elementary quantum mechanics course (where we never touched QED), and we used Coulomb's law all the time. It turns out that "force" is not a very useful concept in quantum mechanics. Much more often, one speaks of the potential, which is
V = {1 \over 4\pi\varepsilon_0}\frac{q_1q_2}{r}
(If you know any vector calculus, the potential is defined so that F = -\nabla V, where \nabla is the gradient operator.) So the way you actually use Coulomb's law is that you have a space of possible positions of some particles (say, a proton and an electron in a hydrogen atom), and you use Coulomb's law to assign a value of the potential to every point of this space. Then you solve the Schrödinger equation on this space to obtain the possible energy states of the system. As you say, the distance between the two particles is uncertain, but this is not a problem because the electrostatic contribution to the energy is also uncertain. It is only the total energy that is certain; it is uncertain what fraction of that is electrostatic potential energy and what fraction is kinetic energy. (If you're confused about how the sum of two uncertain numbers can be certain, imagine I flip a penny and a nickel and don't tell you the results, but I tell you I got one head and one tail. The total number of heads is now certain, but which coin came up heads is uncertain.) —Keenan Pepper 14:42, 19 April 2008 (UTC)
Suppose a wave function of two electrons is ψ(x,y), where x and y are the positions of the respective electrons. Then |ψ(x,y)|^2 should be the probability density function for finding the first electron at x and the second electron at y. And the normalization condition should be \int_{-\infty}^\infty\int_{-\infty}^\infty|\psi(x,y)|^2\,dx\,dy=1. The distance between them should be r = |x − y|. The probability of finding the first electron at x' and the second electron at y' should be P(x',y')=\int_{y'}^{y'+dy}\int_{x'}^{x'+dx}|\psi(x,y)|^2\,dx\,dy. So do you mean the potential is determined by the weighted sum in discrete form
V={1\over 4\pi\varepsilon_0}\left(P_1\frac{q_1q_2}{r_1}+P_2\frac{q_1q_2}{r_2}+P_3\frac{q_1q_2}{r_3}+...\right)
\begin{align}
V=&{1\over 4\pi\varepsilon_0}\left[P(x'_1,y'_1)\frac{q_1q_2}{|x'_1-y'_1|}+P(x'_1,y'_2)\frac{q_1q_2}{|x'_1-y'_2|}+P(x'_1,y'_3)\frac{q_1q_2}{|x'_1-y'_3|}+...\right]\\
+&{1\over 4\pi\varepsilon_0}\left[P(x'_2,y'_1)\frac{q_1q_2}{|x'_2-y'_1|}+P(x'_2,y'_2)\frac{q_1q_2}{|x'_2-y'_2|}+P(x'_2,y'_3)\frac{q_1q_2}{|x'_2-y'_3|}+...\right]\\
+&{1\over 4\pi\varepsilon_0}\left[P(x'_3,y'_1)\frac{q_1q_2}{|x'_3-y'_1|}+P(x'_3,y'_2)\frac{q_1q_2}{|x'_3-y'_2|}+P(x'_3,y'_3)\frac{q_1q_2}{|x'_3-y'_3|}+...\right]\\
&\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\vdots\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\vdots\\
\end{align}
V={1\over 4\pi\varepsilon_0}\sum_{x'}\sum_{y'}P(x',y')\frac{q_1q_2}{|x'-y'|}
V={1\over 4\pi\varepsilon_0}\sum_{x',y'}P(x',y')\frac{q_1q_2}{|x'-y'|}
or in continuous form
V={1\over 4\pi\varepsilon_0}\int_{-\infty}^\infty\int_{-\infty}^\infty|\psi(x,y)|^2\frac{q_1q_2}{|x-y|}\,dx\,dy
? - Justin545 (talk) 06:00, 20 April 2008 (UTC)
Getting out of my depth here so this might be a stupid comment. I don't see how you could use that in a real calculation. ψ(x,y) and hence P(x',y') is not a given. The starting information is usually the potential function V(r,φ,z), which is then fed into the Schrödinger equation to get the answer. Also, I cannot understand why you are working in two dimensions only; you need x,y,z in Cartesian co-ordinates - or was that just for brevity? SpinningSpark 08:53, 20 April 2008 (UTC)
The use of the symbol y may be confusing, but y doesn't mean the y-axis perpendicular to the x-axis. For brevity, I assume both electrons lie on the same line, so that they can only move in one-dimensional space (imagine two electrons in the same wire of infinite length).
The starting information, the potential field V, is where my question came from. For a system that consists of only one charged particle, an electron for example, the potential field V of the particle should be determined by its environment. But for a system of more than one charged particle, the potential field V should be a function of those charged particles as well as their environment, since the charged particles will interact with each other (either attraction or repulsion because of electrostatic forces). - Justin545 (talk) 11:02, 20 April 2008 (UTC)
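On the weighted-sum question in this thread: the integral of |ψ|^2 times the Coulomb term is the expectation value of the potential energy, a single number that can only be computed once ψ is known. The V that goes into the Schrödinger equation is instead the Coulomb expression itself, kept as a function of the particle coordinates. A minimal sketch of the difference, assuming the standard textbook hydrogen ground state ψ = e^{-r}/\sqrt{\pi} in atomic units (this state and the grid parameters are assumptions, not taken from the discussion):

```python
import numpy as np

# Electron-proton distance grid (atomic units); the small offset avoids r = 0.
r = np.linspace(1e-6, 40.0, 400_001)

# Assumed ground state: psi(r) = exp(-r) / sqrt(pi).
psi_sq = np.exp(-2.0 * r) / np.pi
radial_density = 4.0 * np.pi * r**2 * psi_sq  # probability density in r

# The potential as a function of position, as it enters the equation.
V = -1.0 / r


def integrate(values):
    """Trapezoidal rule on the grid r."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(r)))


norm = integrate(radial_density)            # should be close to 1
mean_V = integrate(radial_density * V)      # weighted average <V>
mean_V2 = integrate(radial_density * V**2)  # <V^2>
spread_V = np.sqrt(mean_V2 - mean_V**2)     # uncertainty of V

print(f"norm = {norm:.4f}, <V> = {mean_V:.4f}, spread = {spread_V:.4f}")
```

Under these assumptions the average potential energy comes out near -1 hartree with a spread of about 1 hartree, while the total energy of this state is exactly -1/2 hartree. The potential energy alone is uncertain; only the total is sharp.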

11:11, 26 April 2008 (hist) (diff) User:Justin545/Valuables (+Quantum: Determine the Force between...) (top)