Wikipedia:Reference desk archive/Mathematics/2006 September 12

From Wikipedia, the free encyclopedia

< Wikipedia:Reference desk archive | Mathematics

< September 11	Mathematics desk archive	September 13 >

Humanities

Science

Mathematics

Computing/IT

Language

Miscellaneous

Archives

The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions at one of the pages linked to above.

< August	September	October >

1 September 12

[edit] September 12

[edit] Fermat's Factoring Method

Suppose that n is odd composite. Then, Fermat assures us that it ay be written $m 2 - d 2 = n$ for some m and d integers. Suppose $n \cong 1 \ (\mbox{mod 4})$ . Then we may show that $m^2 \cong 1 \ (\mbox{mod 4}) \and d^2 \cong 0 \ (\mbox{mod 4})$ . Equivalently, $m \cong 1 \ (\mbox{mod 2}) \and d \cong 0 \ (\mbox{mod 2})$ . For what moduli does a theorem of this form hold, and how do we lift from a statement for a small modulus to a statement about a larger modulus without a quadratic increase in the number of cases to be retained? (I.e. if we lift to the modulus 16, then depending on the residue of n we find that one of m and d is constrained to one value (mod 4) and the other is constrained to two values (mod 8) (that are not congruent (mod 4). If we then lift (mod 3) then we get two or four cases (depending on whether either of m or d can be congruent to zero (mod 3)), and using the Chinese remainder theorem to glue these cases to the cases derived from n (mod 16), we end up with four or eight cases -- some from residue classes (mod 12) and some from residue classes (mod 24).)

So, how do we encode the retained cases without creating an exponential explosion in the encoding of the retained cases and retaining the ability to perform additinoal lifting and additional applications of the CRT. -- Fuzzyeric 03:33, 12 September 2006 (UTC)

[edit] STATISTICS

Require a method to calculate confidence intervals for weighted sums. The sum is of the form SUM wi xi and SUM wi = 1.

Thank You

Gert Engelbrecht Pretoria South Africa

This is not possible without further information about the distributions of the random variables x_i. --Lambiam ^Talk 15:50, 12 September 2006 (UTC)

[edit] square centimeters

need to know how many square centimeters are in a piece of tisse that measures 4cm X16cm --need the answer - not the formula. I am not a student needing assistance with homework ---

4 x 16. --jpgordon^∇∆∇∆ 15:38, 12 September 2006 (UTC)

Why do they teach pupils to do multiplication before they teach them to read the large notice at the top of the page saying 'NO HOMEWORK'? They also seem to have omitted the lesson in which pupils are taught to thank people who answer their questions. —D a niel (‽) 20:03, 12 September 2006 (UTC)

The answer is 4 * 16 = 2101₃ square centimeters. There I have just given you the exact answer. 202.168.50.40 23:06, 12 September 2006 (UTC)

[edit] octahedral rotational symmetry

Can you draw a graph to show octahedral rotational symmetry

rotation about an axis from the center of a face to the center of the opposite face by an angle of 90°: 3 axes, 2 per axis, together 6
ditto by an angle of 180°: 3 axes, 1 per axis, together 3
rotation about a body diagonal by an angle of 120°: 4 axes, 2 per axis, together 8

Many thanks!--82.28.195.12 20:17, 12 September 2006 (UTC)Jason

No. What variables would be plotted on the graph?

There is some discussion of symmetries in octahedron. ColinFine 23:21, 12 September 2006 (UTC)

Better still, try octahedral symmetry. It has many figures. Consider whether you wish to deliberately exclude any reflection symmetry; most simple examples naturally include it. --KSmrq^T 23:28, 12 September 2006 (UTC)

[edit] Calculus, Limits and First Principles.

Hello. This is one of those problems which hits you hard when realise you don't know how to do it.

-So- much mathematics is based on the result that d/dx(e^x) is e^x, or written in a different way.. the integral of 1/x is ln(x). The question is, how do we prove this?

We can go back to first principles easily enough, and say that the derivative of a^x, as h tends to zero, is:

(a^(x+h) - a^x)/h

Factorise out a^x, and get:

a^x(a^h - 1)/h

Now, we know from basic calculus that differentiating this should go ln(a).a^x, so we're looking to show that the below limit is true:

(a^h - 1) / h = ln(a), as h tends to zero.

This doesn't look tricky, does it? But remember that we're trying to prove a result fundamental to calculus, so what we can use is limited (no pun), we can't use l'hopital's rule (which would give the right answer), as it relies on differentiating an exponential - our thing to be proven.

So, basically, I would really, really appreciate it if somebody could attempt to prove my limit is true, or comment that they could not (so I know the ability levels it's going to take).

Thank you, and remember: No circular reasoning! No proving that e^x differentiates to itself by assuming it in the first place!

Michael.blackburn 20:56, 12 September 2006 (UTC)

The first question, I suppose, is the definition of e. If you use the Taylor series for e^x, as the definition of e, you can use term by term differentiation of a polynomial:

$e^x = \sum_{i=0}^\infty \frac{x^i}{i!}$

Assuming (without proof here, although it can be proven) that

$\frac{d}{dx}\sum_{i=0}^\infty \frac{x^i}{i!} = \sum_{i=0}^\infty \frac{d}{dx}\left( \frac{x^i}{i!}\right)$

you can differentiate and find the result. --TeaDrinker 21:01, 12 September 2006 (UTC)

I found an easy solution:

$\frac{d}{dx} e^x = \lim_{h \to 0} e^x \left( \frac{e^h}{h} - \frac{1}{h} \right)$
Using definition $e = \lim_{h \to 0} (1 + h) ^{\frac{1}{h}}$
$= \lim_{h \to 0} (e^x) \frac{\left( (1 + h) ^{\frac{1}{h}} \right) ^h}{h} - \frac{1}{h}$
$= \lim_{h \to 0} (e^x) \frac{1 + h}{h} - \frac{1}{h} = \lim_{h \to 0} (e^x) \frac{h}{h} = e^x ?$
so $\frac{d}{dx} e^x = e^x$ iff $\lim_{h \to 0} \frac{h}{h} = 1$
This would seem to require L'Hôpital's rule, but what do you all think? M.manary 21:11, 12 September 2006 (UTC)

For reference: The proof of the fact that $\lim_{h \to 0} \frac{h}{h} = 1$ does not require l'Hôpital. There is a rather elementary proof, directly based on the definition of the limit of an expression. Hint: Can you simplify $\frac{h}{h}$ ? JoergenB 18:02, 18 September 2006 (UTC)

I also believe that TeaDrinker's solution will rely on a formula already using d/dx e^x, as Taylor series can ONLY be derived from that notion (try it yourself and see), so that solution is no-go. M.manary 21:15, 12 September 2006 (UTC)

Formally, I have actually used the Taylor series as the definition of e^x, so no derivation of the Taylor series is needed. The proof from first principles does depend on your definition of e. --TeaDrinker 21:25, 12 September 2006 (UTC)

M.manary, I think using L'Hôpital's rule is okay here, as long as we don't use it with exponentials. It can be proven from fairly basic principles. Michael.blackburn 21:17, 12 September 2006 (UTC)

I just found a proof that you can read at: http://www.ltcconline.net/greenl/courses/106/ApproxOther/lhop.htm so: $\frac{d}{dx} e^x = e^x$ Q.E.D. M.manary 21:19, 12 September 2006 (UTC)

Nice. The way we learned it in Calculus, and the article agrees, the natural logarithm is defined as the area under the graph of 1/x from 1 to b. And the natural exponent is defined as the inverse of the natural logarithm. So,

f(x) = ln(x)
f'(x) = 1/x
g(x) = e^x
f(g(x)) = x
f'(g(x))g'(x) = 1
g'(x) = 1/f'(g(x))
g'(x) = 1/(1/(e^x))
g'(x) = e^x

The fifth step uses the chain rule, which makes no assumptions about the functions themselves, other than that they can be differentiated in the first place. Black Carrot 01:15, 13 September 2006 (UTC)

This is fine if you replace "e^x" by "exp(x)", where exp is defined to be the inverse function of ln. Next, it has to be shown that there exists a constant e such that exp(x) = e^x. --Lambiam ^Talk 03:41, 13 September 2006 (UTC)

I was under the impression such a proof existed. And that it in no way compromised this one. Do you know it? Black Carrot 06:14, 13 September 2006 (UTC)

After defining "exp", you define power as a^b = exp (b lna ). You need to show that this agrees with the definition of powers with rational exponents, which isn't hard. It is also easy to show that ln is bijective, so you define e as the preimage of 1. Then you're pretty much good to go. -- Meni Rosenfeld (talk) 08:08, 13 September 2006 (UTC)

[edit] (Another) on Calculus, with Radians.

Okay, thank you (Very) much for the awnsers to the previous question. This one is probably a lot more obvious, yet through searching Wikipedia and the internet I've still failed to find a solution. My mathematics teachers were also uncertain.

Basically, the question is "Why radians?"

From thinking about it, it's obvious that the Calculus can only work with a single angle measure, and obviously it's the radian. But the best explaination I've recieved is that solving Taylor series of Trig. functions works in radians only... but those Taylor series surely required Radians in the first place.

Can -anybody- offer a sensible 'proof' or reasoning that you have to use 2pi 'units' per circle for the Calculus to work, without starting in radians in the first place?

Thank you again!

It's not a proof at all, but consider a small angle segment of a circle. The arclength = rθ - with theta in radians. But this is approximately equal to r * sin (θ). So sin(θ) ~ θ, if theta is in radians (for small angles). You can show this by simple geometry Richard B 21:36, 12 September 2006 (UTC)

Oh, so.. if you used degrees for example, and arclength is.. kθ, so sin(θ) ~ kθ in degrees (where k isn't one), and so the results go icky. Is that it? Michael.blackburn 21:44, 12 September 2006 (UTC)

So the definition of a radian is that in a circle of radius r, a central angle of one radian will define an arclength of the circle that is exactly r long. If the radius of a circle is r, then there are (by circumference) 2π radii in the arclength of the circle. SO when we say an angle = 2 radians, it inscribes an arclength of 2r out of its circle.

The numbers actually don't work out that well, as you may notice, because we always have to say an angle is π/3 or 2π/7 radians, which really aren't that great numbers... M.manary 23:12, 12 September 2006 (UTC)

2π/7 radians, eh? How'd you write that in degrees, 360/7°, or 51.43° or 51°25' or (51+3/7)°? If you write that, can you tell faster how many of those you need for a full cycle? Personally, I think even π/5 is easier to understand than 36 degrees. – b_jonas 11:23, 13 September 2006 (UTC)

If you look at the Taylor series' defining sin and cos, you'll notice that there aren't any unnecessary coefficients. If you wanted to express the same thing in degrees, you would need to put π/180 in front of every x term (and square it for x^2, etc...). This will give you a complicated sequence of coefficients to keep track of. The radian is the simplest unit to use in this kind of calculation. There are also interesting geometric properties, and probably other reasons as well, but the main reason (to me) is that it avoids keeping track of unnecessary coefficients in calculation, or when using calculus. - Rainwarrior 23:27, 12 September 2006 (UTC)

Exactly. It's the same as why we have to use e as the base of exponentials. If you want to solve the linear differential equation $a_n \frac{d^ny}{dx^n} + \dots + a_1 \frac{dy}{dx} + a_0 y = 0$ , you find the roots of the polynomial $a_n z^n + \dots + a_1 z + a_0$ in the form of

z = p + q i

and then the base solutions are

y = e p x sin(q x)

(not counting the cases of roots with multiplicity). Here, you have to use

e

as the base of the exponent and the sine function using radians. – b_jonas 11:15, 13 September 2006 (UTC)

[edit] Inverse

We are learning determinants and inverses in matrices in math. My teacher said that if you have a determinant of "0' then there is no inverse. I went on to say that in some remote field of mathematics, there is probably a way to fond the inverse of matrix with d=0. She said, "Maybe, but I doubt it. Why don't you look that up and tell us tomorrow." I took look at abstract algebra, and it was kind of confusing. I did some relevant Google searches, but to no avail. My question is: am I right? Is there, in some field of mathematics, a way to find the inverse of a matrix that has a determinant of 0? schyler 23:37, 12 September 2006 (UTC)

I don't think so. If a matrix A is invertible, then AA^-1=I where I is the identity matrix. You can check by simuntaneous equations that you get two contradictions:

$\begin{pmatrix} 1&2\\2&4 \end{pmatrix} \begin{pmatrix} a&b\\c&d \end{pmatrix}=\begin{pmatrix} 1&0\\0&1 \end{pmatrix}$

If you go on to more difficult mathematics, you might understand a bit more about matrices when you learn systems of linear equations and related topics. x42bn6 Talk 02:12, 13 September 2006 (UTC)

The articles on Determinant, Invertible matrix, Identity matrix, and Matrix multiplication look nice, and link to other stuff. I don't know much about matrices, but I think I can give some general advice. First, if they say it's true, there's probably a good reason, especially in something as exhaustively studied and widely used as matrices. Even if in some obscure branch of mathematics someone decided they could do it, or made up a system in which it worked, that doesn't change that if you wind up with {{1,1},{1,1}}A={{1,0},{0,1}}, and you need A, you're screwed. I wish you luck finding the exception, though. While I was in Junior High I decided it was ridiculous to say that division by 0 was impossible, as everyone kept repeating, so I spent a few years figuring out how it works. I was, as you might imagine, rather upset to find out that limit notation and functional analysis have existed for a few centuries and nobody told me. But still, that doesn't change that 1/0 is meaningless in arithmetic and algebra. Black Carrot 02:21, 13 September 2006 (UTC)

There is a thing called a Pseudoinverse, which is the closest thing to an inverse matrix which a given matrix (even a singular or non-square one) has. If the determinant of some matrix is not 0 but an infinitesimal, you will be able to talk of an inverse with infinite entries (of course, this does not work with real numbers). -- Meni Rosenfeld (talk) 05:46, 13 September 2006 (UTC)

A nice way to see why d = 0 is a problem is to look at the following. Simplifying this somewhat, if A is a matrix and d its determinant, its inverse can be given by 1/d X, where X is some other matrix (see Application section on Adjugate matrix). Clearly if d = 0, you will have problems -- division by zero. Dysprosia 05:53, 13 September 2006 (UTC)

Another nice way to understand why an inverse matrix does not exist if det(A)=0 is to think of an nxn matrix A as representing a linear transformation of n dimensional space. So a 2x2 matrix is a linear transformation of 2-d space (i.e. the plane). The inverse matrix (if it exists) then represents the inverse of this transformation. But the inverse transformation is only defined if the original transformation is 1-1. Some linear transformations are not 1-1 because they map n dimensional space onto a linear sub-space of itself in a many-1 way - for example, the transformation represented by the matrix

$\begin{pmatrix} 1&2\\2&4 \end{pmatrix}$

maps the plane onto the line y=2x, because

$\begin{pmatrix} 1&2\\2&4 \end{pmatrix} \begin{pmatrix} x\\y \end{pmatrix} = \begin{pmatrix} x+2y\\2x+4y \end{pmatrix}=\begin{pmatrix} t\\2t \end{pmatrix}$

and so all the points along each of the parallel lines x+2y=t are mapped to the single point (t,2t). A many-1 linear transformation (for which an inverse transformation cannot, by definition, exist) is always represented by a matrix with determinant 0, which in turn does not have a matrix inverse. Gandalf61 10:50, 13 September 2006 (UTC)

The question is provocative, and fruitful. Multiplication is defined for two compatible matrices, where compatibility means that the number of columns of the left matrix equals the number of rows of the right matrix. The usual algebraic definition of inverse depends on a definition of identity. If a matrix is not square, we might have two different identities (left and right), suggesting the possibility of a left or right inverse. For example, the 2×3 matrix

$A = \begin{bmatrix}1&0&0\\0&0.6&0.8\end{bmatrix}$

can be said to have a right inverse 3×2 matrix

$B = \begin{bmatrix}1&0\\0&0.6\\0&0.8\end{bmatrix}$

because the product AB is the 2×2 identity matrix. This also shows that A is a left inverse for B.

Determinants, however, are only defined for square matrices. It is impossible for a square matrix to have a left inverse but not a right inverse (or vice versa), because the row rank and column rank are always equal. The determinant of a square matrix is nonzero precisely when the matrix has full rank, which means that if we look at the image space of n-vectors under the action of the n×n matrix, it also has dimension n.

Put more geometrically, a singular matrix collapses one or more dimensions, smashing them flat. The determinant measures the ratio of output volume to input volume, so a zero determinant tells us that such a collapse has occurred. And because the flattening has thrown away information in at least one dimension, we can never construct an inverse to recover that information. The rank of a matrix is simply the number of dimensions of the image space, while the nullity is the number of dimensions that are flattened. (Thus we have the rank-nullity theorem, which says that the sum of the two is the size n.)

Thus your teacher's skepticism is justified. But in some practical situations we will be satisfied with less than a full inversion. If we can only recover the dimensions that are not flattened, that's OK. The singular value decomposition of a matrix (of any shape) reveals both the rank and the nullspace of a matrix beautifully.

$A = U \Sigma V^T , \,\!$

where A is m×n, U is an m×m orthogonal matrix, V is an n×n orthogonal matrix, and Σ is an m×n diagonal matrix with non-negative entries. (The diagonal entries of Σ are called the singular values of A.) Let Σ⁺ be Σ with each nonzero entry replaced by its reciprocal. Then we may define the Moore–Penrose pseudoinverse of A to be

$A^{+} = V \Sigma^{+} U^T . \,\!$

This accomplishes what we hoped, and perhaps a bit more. For rectangular matrices whose columns (or rows) are linearly independent, we get the unique left (or right) inverse discussed earlier. For invertible square matrices, we get the ordinary inverse. And for singular matrices, we indeed get a matrix that inverts as much as we can. (This discussion also extends to matrices with entries that are complex rather than real.)

In applied mathematics, use of the pseudoinverse is both valuable and common. We need not wander into some remote and esoteric realm of pure mathematics to find it. So congratulations on your instincts; you may have a promising career ahead of you. (And, of course, congratulations to your teacher, whose instincts were also correct.) --KSmrq^T 10:36, 13 September 2006 (UTC)

[edit] Estimating a contour map from known spot heights.

Please tell me if I've got this right, or is there a better formula I could use?

I have ten spot heights scattered irregularly over a retangular map. I intend to estimate the height of every point on this map so that I can create a contour map.

I am going to estimate the height of each point by the weighted average of all the spot heights. The weight I am going to use is the inverse of the distance squared.

In fact I am going to use a further refinement - instead of just using the square, I am going to estimate the exact power to use by disregarding each of the ten spot heights in turn, and finding what the power is that best predicts the disregarded spot height from the nine known spot heights, and then calculate the arithmetic average of these ten powers.

So my weights will be 1/d^n where d is distance and n is the power. My questions are-

a) is there any better (ie more accurate estimator) formula to use for the weights than 1/d^n?

b) should I use an arithmetic average of the ten powers, or some other kind of average?

c) is there any better approach I could use, even though the scheme described above is easy to program? Thanks 62.253.52.8 23:55, 12 September 2006 (UTC)

I find the above idea interesting, but here's a totally different one. Get the delaunay triangulation of the ten points, then for every other point find the triangle it's in and just find the height from the plane the triangle defines. Your contour map would be all straight lines. A problem is that anything outside the convex hull of your spots is undefined, using the plane of the closest triangle should work. (You might be able to fix that and add curvature to the triangles using the adajacent triangles and spline interpolation, maybe.)

Are the points with known heights just random samples or were they chosen because they are relative maxima/minima ? Your method would work well in the latter case. However, the inability of your method to give elevations above the highest sample point or below the lowest sample point would be a problem if you are just using random points, resulting in a flatter geography than is really the case. StuRat 05:25, 13 September 2006 (UTC)

The spot heights are not max/min, so they must be the other choice. Actually they are not actually heights: I am interested in creating a contour map of house prices. The spot heights represent the prices in various towns. The house prices in the surrounding country are sometimes more, sometimes less. Another possibility would be to also weight the 'spot heights' by the population of each town, so I suppose I would get something like

w = p / d n

where p is population.

I see three problems here:

There is no reason to think that house prices are a continuous function. In fact, houses prices frequently vary dramatically on the other side of some barrier, such as a river, highway, railroad tracks, or city/school district boundary. So, you are using a method that depends on having a continuous function when you don't actually have one, which will lead to poor results.

I am working with a large area, so the fine detail is unimportant. If I was working with just a town or city, then such step changes are the very things I would like my map to make clear.

You can only compare prices on comparable houses. For example, 20 year old, 2000 square foot, 3 bedroom, 2 car garage houses. Otherwise, you are comparing "apples and oranges", so this doesn't tell you much about the premium/penalty of placing a home in that location. To distinguish between this "location premium" and the size and quality of houses built in an area, you might want to compare vacant lot prices. This should only give you the "location premium".

The statistical series I am working with are for a very large number of sales, not individual houses. The sub-series are for different types of house. I agree that prices may vary both by the quality of the house, and the favourableness of the location. This is less of a problem when comparing year on year price changes.

If you only look at houses offered for sale, there might be a built in bias there, in that people will want to move more when something is wrong (basement floods continuously ?). Therefore, houses offered for sale may not be typical of the true value of houses in the area. StuRat 02:07, 16 September 2006 (UTC)

All the statistical series are for sold houses. I am in the UK: here the statistical series are I'm sure different from what you have in what I assume is the US.

Another suggestion (though I have no idea if it is a good one) is to match your data points to a polynomial of the form

a + b x + c y + d x 2 + e x y + f y 2 + g x 3 + h x 2 y + i x y 2 + j y 3

which, conveniently, has 10 coefficients. -- Meni Rosenfeld (talk) 11:04, 13 September 2006 (UTC)

Sorry, I don't quite understand this. Would a.....j be the spot heights, and x, y the position on the plane, or what please?

Thanks. Although Delaunay triangulation is attractive, I think it would be far too difficult to program - it would require several days I expect.

I did wonder if there would be any advantage in using weights based on formulas such as:

$w = 1 / (x + d n)$ where x is a constant. I think the magnitude of x would determine to what extent the estimated local height was based on the average of all heights, rather than those of the nearer spot heights. Are there any better formulas I could use?

I don't actually know how I would estimate x and n by regression or otherwise - could any help me please? Thanks.

There's a fundamental problem here. To pick a method of fitting to data, some additional requirements have to be applied. In sciences where there is a theory, the theory predicts a form for the data. Just add data, fit the form, and then you know the values of the (typically unspecified) constants. However, in the absence of a model (i.e. a form with some parameters identified as free for fitting), onehas a vastly wider range of equally bad solutions. Examples:

Construct the Voronoi diagram of the locations of the data points, then for each sample, raise it's height to the sampled altitude. This is a solution that from (at least) one point of view assumes the least: no intermediate altitudes are inferred and each point is the height indicated by the nearest sample.
Fit the bivariate polynomial suggested by Meni, above. This has the advantage that the result is smooth. It has the disadvantage that it can "blow up" going rapidly to "unreasonable" values outside the region where the data is provided.
Take the logs of your input heights, fit the bivariate polynomial to the logs, then take $exp(fit )$ .

This much freedom should indicate that there are a lot of ways to do this, and which one is right depends on what a priori information you have (or even suspect) about the result. This is similar to trying to capture imprecise or subjective priors in Bayesian analysis.

The method you describe of omitting some data to evaluate a proposed model is a good one (q.v. Cross-validation), but... Ten data points is small to begin with, so one should expect large random effects caused by small sample sizes using this method. It may turn out that your data is "very nice" meaning that the method (surprisingly) works well.

Given that, the method for gluing the (slightly) mismatched exponents is also a problem in model choice. Perhaps a better method would be to use Bayesian inference on your 10 subsets to estimate the maximum likelihood choice for the exponent. -- Fuzzyeric 01:58, 15 September 2006 (UTC)

I have been wondering how I should best average ten different formulas of the form

1 / (x + d n)

. I'm not sure if taking the arithmetic average of x and perhaps the geometric average of n would necessarily be the best thing to do.

To clarify my polynomial, if you choose to give it a try - x and y are indeed the coordinates on the map, but a...j are coefficients you need to find. By substituting in the polynomial the x, y coordinates of a known point, and equating it to its known height, you get an equation in a...j. By doing it for all 10 points, you'll get ten equations, which you can solve. -- Meni Rosenfeld (talk) 04:40, 15 September 2006 (UTC)

Retrieved from "http://en.wikipedia.org../../../r/e/f/Wikipedia%7EReference_desk_archive_Mathematics_2006_September_12_a420.html"