Talk:Binomial coefficient

From Wikipedia, the free encyclopedia

Mathematics Portal

This article is within the scope of WikiProject Mathematics, which collaborates on articles related to mathematics.

Mathematics rating:

B Class

Mid Priority

Field: Algebra

One of the 500 most frequently viewed mathematics articles.

Please update this rating as the article progresses, or if the rating is inaccurate. Click to show/hide comments.
Please add to or update the comments to suggest improvements to the article.
The article talks about pizza in the first sentence, then jumps into a formula. An encyclopedic summary lead is necessary if this article is to become GA. --Cronholm144 02:26, 22 May 2007 (UTC) The pizza stuff is nice, but maybe better in the body rather than the lead? Geometry guy 20:19, 9 June 2007 (UTC)

[edit] use a damn header

Does anyone want to add the general binomial theorem (for all n). Also I have some weird alternative formulas for C(n,r) with n a non-negative integer, but I'd have to set up the pictures on a stable server or something.—The preceding unsigned comment was added by 69.22.195.18 (talk • contribs) .

[edit] Definition of binomial coefficients for n < 0

To Maxal and whoever else is willing to comment:

I would say that the recently inserted equation

$(1+x)^n = {n\choose 0} + {n\choose 1}x + {n\choose 2}x^2 + \dots = \sum_k {n\choose k} x^k.$

is redundant, because of

$(x+y)^n = \sum_{k=0}^{n} {n \choose k} x^k y^{n-k} \qquad (2)$

which shows up below.

The latter formula makes sense only for integer $n \geq 0$ . While the former one is true for any n. --Maxal 22:47, 13 Apr 2005 (UTC)

The point is, everywhere in this article it is assumed that n ≥ 0. Just read through it. The Pascal triangle for example, does not hold for n < 0. I find your addition not in the right place. I think it rather belongs in a generalization section at the end of the article. What do you think? Oleg Alexandrov 23:03, 13 Apr 2005 (UTC)

Disagree. It's the only complete definition for now. And it should come first. I think Pascal triangle and other middle-school-level stuff should form a separate section. --Maxal 23:07, 13 Apr 2005 (UTC)

Please notice that the generalization to n < 0, actually, for any n complex, is at already in the article, see several sections below in there. Oleg Alexandrov 23:10, 13 Apr 2005 (UTC)

First of all, there is a bad mixture of notations. $C_n^k$ or C(n,k) is the number of combinations and it's defined only for non-negative integer n,k (Pascal's triangle is actually defined for C(n,k)). The binomial coefficient ${n\choose k}$ is a generalization of $C_n^k$ which is defined for integer k and arbitrary n. As of generalization of ${n\choose k}$ to complex n, it's ok except for notation. C(n,k) is inappropriate.

Second, the article makes accent on combinatorics where complex n have no (or very limited) application. Hence, it's ok to keep the generalization to complex n at the end. --Maxal 23:26, 13 Apr 2005 (UTC)

So, what are the applications of negative n to combinatorics? Oleg Alexandrov 23:28, 13 Apr 2005 (UTC)

Say, they often appear in expansions of generating functions. And there many self-reverse relations including them. --Maxal 23:37, 13 Apr 2005 (UTC)

So, if you want to really make some changes to this article, you should read it all, then see how to make things look nice overall. What do you think?

I've read it all, and there was no correct definition for binomial coefficients for n<0. Moreover, in the discussion you can see a request for the definition "general" binomial coefficient. I've provided a general definition of binomial coefficients (most popular in combinatorics). From this very same expansion binomial coefficients can be also defined for non-integer n but that's mainly related to analysis. --Maxal 22:47, 13 Apr 2005 (UTC)

Also, please see the sentence:

This is generalized by the binomial theorem, which allows the exponent n to be negative or a non-integer.

in the article. Did you notice it? Oleg Alexandrov 22:42, 13 Apr 2005 (UTC)

Yes but that article should not prevent for providing correct and complete definition of the binomial coefficients. --Maxal 22:47, 13 Apr 2005 (UTC)

[edit] Rewriting the first section in this article?

I am a bit unahappy about the first section in this article. There, one starts with the expansion of

(1 + x) n

only to switch later in the same section to

(x + y) n .

One starts with an arbitrary integer n, only to continue with natural n in the next sections, and come back with arbitrary complex n a while below.

I would suggest to make the first paragraph talk only about the case of the natural number n. That is, start with the most elementary case. This will be followed naturally by the Pascal triangle, in the next section. One can expand and generalize on this later. This is consistent with Wikipedia:How to write a Wikipedia article on Mathematics. Any comments? Oleg Alexandrov 23:50, 22 Apr 2005 (UTC)

There was way too many changes to revert them at once. Reverting back. If you don't like the way it is currently presented, add/change just the first section instead of reverting the whole thing. --Maxal 23:56, 30 Apr 2005 (UTC)

As you noticed, I reverted not right away, but waited a while, then wrote the above, then waited more. I reverted all of it, because I did not want to do lots of work then you come back and revert again.

Now, I still believe the first section is better off the way it was before your changes. You made things more general at the expense of damaging the logical flow of the article.

Once we agree on that, I will put back the first section the way it was before you changed, and keep your other modifications. Oleg Alexandrov 00:06, 1 May 2005 (UTC)

There should be a clear distinction between C(n,k) and ${n\choose k}$ . What is currently defined at the top is actually C(n,k), not ${n\choose k}$ . So in the first two sections C(n,k) should be used while ${n\choose k}$ should appear later with complete definition as a generalization of C(n,k). --Maxal 11:46, 3 May 2005 (UTC)

I have no problems with that. However, one should mention on top that ${n\choose k}$ is an alternative notation. Oleg Alexandrov 15:46, 3 May 2005 (UTC)

[edit] Question: what is this thing?

Anyone ever see something that looks like this:

$\sum_{k=0}^n {n \choose k} y^k \frac {1}{k!} \frac{d^k}{dx^k} f(x)$

Any identities, relations, etc. for such a thing? Does it have a name? Its painfully suggestive of something but I can't put my finger on it ... (its not discussed in sheffer sequence or in binomial type or in hypergeometric series). Alternately, how about a non-integer generalization:

$\sum_{k=0}^\infty {k-s-1 \choose k} (-y)^k \frac {1}{k!} \frac{d^k}{dx^k} f(x)$

for s not an integer? How about the special cases y=1 or y=1/2? ... linas 05:00, 19 August 2005 (UTC)

[edit] Derivation from binomial expansion

I wonder what people think of the recent section Binomial coefficient#Derivation from binomial expansion. I copy the material below from KSmrq's talk page. Oleg Alexandrov 00:48, 7 September 2005 (UTC)

Hi KSmrq. Do you really think we need that lenghy derivation at Binomial coefficient? I think it gets the reader bogged down into unnecessary detail. I would think the reader needs to know how the coefficient is defined (the binomial expansion), how it is calcuated (Pascal's triangle), how it is used (combinatorics, etc). The derivation you put in the very first section, in my view, does not really help. Maybe we could put it in another article, or at the bottom of this article, what do you think?

Also, if you feel like working on this article, the section Binomial coefficient#Formulas involving binomial coefficients should probably make you mad. There, in many places they use sentences made up of only formulas (that is, x or so is the subject, the equal sign is the verb, etc :).

Let me know what you think (on this page). Oleg Alexandrov 17:28, 5 September 2005 (UTC)

I've got a guilty conscience for other articles left hanging, so I don't want to get sucked into working a lot on this one until I've done right by the others. By making an extra section, I hoped to support any decision by those of you who are heavily involved, either to keep it, toss it, move it, or fold it in.

As for why I wrote that section and placed it there: Yes, I felt the need, for the material if not the length. The article begins with formulae, then shows Pascal's triangle and mentions combinatorics, all with no intuitive connection or motivation. What I wrote is what works for me to understand the connections. The formula n?k = n! / (n−k)! k! does derive from the expansion, which was stated but nowhere shown. And the way it is derived, as I try to show, intimately relates to both Pascal's triangle and to the combinatoric connection.

If you and the other collaborators would prefer to state a string of definitions, facts, and formulae early and push intuition, derivation, and connection later, do what you think best. I would not make that choice, because of my understanding of how my mind (as most, I think) absorbs and retains material. On the other hand, if all I want is a quick reference putting essential facts at my fingertips, then the pedagogical material might be entirely superfluous.

I'm going to clean up my contribution a little, then abandon it to its fate. Chances are I'll visit again in the future to see how things have evolved. I hope the effort goes well, because I've actually done some research work that draws on multinomial coefficients, and it would be nice if WP has a good article. --KSmrq^T 19:38, 2005 September 5 (UTC)

You have good points. However, the first section might not be the right place to state the derivation, as it is too complicated. I would prefer to have the reader first accept that formula (by faith if you wish), then understand the Pascal triangle (which is in my opinion more significant than all that induction and factorial thing). Later, when the reader also sees the applications, he/she might be tempted to understand why that is true, then your explanation kicks in. How about that? (BTW, that article is indeed in sorry state, so I'd rather have you not abandon it for a while but instead work on it :) Oleg Alexandrov 20:06, 5 September 2005 (UTC)

[edit] Analogy to Vandermondes formula

Take a sample of $n$ items from a population of $N$ items. Get $i$ special items in the sample from $I$ special items in the population. This can be done in the following number of ways:

${I \choose i}{N-I \choose n-i}$

Vandermondes formula is

$\sum_{i=0}^n {I \choose i}{N-I \choose n-i}={N \choose n}$

A dual formula is

$\sum_{I=0}^N {I \choose i}{N-I \choose n-i}={N+1 \choose n+1}$

Example: N=4, n=2
    i=0 i=1 i=2
I=0  6   0   0
I=1  3   3   0
I=2  1   4   1
I=3  0   3   3
I=4  0   0   6

From the rows you deduce $i$ knowing $I$ . The row sum is ${4 \choose 2}=6$ . From the columns you induce $I$ knowing $i$ . The column sum is ${4+1 \choose 2+1}=10$

See the article on inferential statistics.

Bo Jacoby 07:09, 8 September 2005 (UTC)

See Vandermonde's identity. Michael Hardy 22:50, 8 September 2005 (UTC)

[edit] Uniform definitions

The last edit of the article made the notation "uniform," but I think in the wrong way. What do people prefer? ${n \choose k}$ or $C(n, k)$ . From my experience, the first form is used more. I also think it is easier to read when used in summation equations, much like the section "Formulas involving binomial coefficients" uses extensively. Stolee 08:10, 16 November 2005 (UTC)

Yeah,

C(n, k)

is a bad choice.

n C k

is better, but definitely the most widely used notation is $n\choose k$ . Apparently the reason

C(n, k)

was chosen was for "compactness", but if you need it to be more compact, try $\big({}^n_k\big)$ , produced by <math>\big({}^{n}_{k}\big)</math>, which looks very similar to what should be produced by <math>\textstyle n\choose k</math> if the <math> tag supported \textstyle. I'm not sure why it doesn't; I should investigate that. —Bkell 01:30, 17 November 2005 (UTC)

Okay, \textstyle has been requested over at MediaZilla. It's bug 4000, which is a nice number. —Bkell 02:08, 17 November 2005 (UTC)

It was me who made the notation "uniform". C(n, k) also has the advantage that it can be written in plain text in the same font, which would not work with $\big({}^n_k\big)$ . The notation $n\choose k$ is just too huge to fit anywhere inline, one's got to use it only in displayed formulas, on separate lines.

And my primary aim was indeed to make the notation uniform, as at least three notations coexisted. I don't mind $\big({}^n_k\big)$ , but I don't see a big need for it, and again PNG images inline are not desirable, as they look of size different than text, see also the math style manual. Oleg Alexandrov (talk) 02:36, 17 November 2005 (UTC)

I am going to change most of the C(n,k) notation in displaystyle formulas to the $\big({}^n_k\big)$ notation. Being the more widely used notation (which I don't have hard evidence for but I don't think anyone disagrees), it would make the article more readable and easier to search. But I agree that we should keep C(n,k) for inline use, just because $\big({}^n_k\big)$ would mess up the line spacing. 09:21, 9 June 2006 (UTC)

The reason I prefer $\big({}^n_k\big)$ over C(n, k) is mostly due to the fact that I don't see C(n, k) in use very often; it seems to be used as a fall-back if $\big({}^n_k\big)$ is impossible, and I find it more difficult to read and work with. But you have some very good points, so I guess I'll be content with whatever is used, as long as it's consistent. [If you're going to write C(n, k) as text, though, be sure to write it as C(''n'', ''k'').] —Bkell 04:19, 17 November 2005 (UTC)

[edit] The case of complex z

I partially undid Bo Jacoby's recent changes to this article. I don't think it is a good idea to start with a discussion of the binomial coefficient as a polynomial C(z, n) in a complex variable z. Most uses of the binomial coefficient is for C(m, n) with m and n positive integers, and most material in this article is devoted to that.

Per the math style manual, an article should start as elementary as possible, and the way binomial coefficents are first taught to people, high school or college, is using combinations (m choose n). That C(m, n) can be generalized to complex numbers shows up only when dealing with Taylor series of $(1 + x) z .$ I belive this can be best stated as a generalization of binomial coefficient, rather than a definition of that coefficient. Oleg Alexandrov (talk) 22:56, 23 November 2005 (UTC)

PS I agree that the article needs work. Help is appreciated. Oleg Alexandrov (talk) 22:58, 23 November 2005 (UTC)

[edit] Answer to Oleg

Compare formula A:

${n\choose k} = \prod_{j=1}^{k}{n-k+j\over j}$

with formula B:

${n \choose k} = \frac{n!}{k!(n-k)!}$

1. Formula A does not assume that the reader knows the factorial function, as does formula B.

2. Evaluation of formula A is easier because the intermediate results are smaller numbers.

3. The definition

$n!=\prod_{k=1}^n k$

needs the pi-notation of the product anyway.

4. Formula A applies also to negative, fractional, real and complex values of n, where formula B is undefined for negative values of n.

So formula A is better than formula B in every respect.

I agree that the concept of complex (or even real) numbers is not needed (nor wanted) here, but there is no need to have another definition in the special case (natural n) than in the general case (complex n). Bo Jacoby 18:22, 24 November 2005 (UTC)

Your formula uses the product notation too. But I agree with you that it is easier, I added it to the article. Oleg Alexandrov (talk) 21:39, 24 November 2005 (UTC)

[edit] Recent cleanup

I saw this page on the math project list for cleanup, so I made a couple changes, revising the induction paragraph (which had some nonsense before) and moving the multinomial stuff out of the way, in particular directing to multinomial theorem. I'm not planning on spending more time on this page, so feel free to accept/reject the changes as you see fit.--Luke Gustafson 06:44, 1 January 2006 (UTC)

I reintroduced the efficient calculation of binomial coefficients in the example. It is important. Bo Jacoby 20:01, 21 January 2006 (UTC)

[edit] Proposed MERGE with Combination

This page looks like it is about the exact same topic, and I think the pages should be merged. Any takers? Fresheneesz 23:14, 4 February 2006 (UTC)

Well observed. Go ahead and merge ! Bo Jacoby 12:46, 5 February 2006 (UTC)

JA: Keep Separate For Now. There is a signficant distinction between a combinatorial species, like a combination, and the enumerator thereof, like a binomial coefficient. One needs to think carefully, not just about the current state of the WP articles, which may be sparse, but about the literatures on the subjects, and also about the likely future developments of the two concepts, before merging the discussions of them. Jon Awbrey 13:48, 5 February 2006 (UTC)

Don't merge. These are distinct topics. The article on "combination" could use a major cleanup, however. linas 06:07, 6 February 2006 (UTC)

I cleaned the combination article. Bo Jacoby 07:53, 6 February 2006 (UTC)

Thanks, looks better now. Oleg Alexandrov (talk) 21:07, 6 February 2006 (UTC)

I am pleased that you approve. Bo Jacoby 04:54, 8 February 2006 (UTC)

[edit] Choose function

Looking on the google for '"choose function" combinatorics', I found a bunch of different references to people calling it the choose function. I also found a bunch of more reliable references calling it the choose function (not just choose). One even explained that the reason it got that name is because it is used so often as a function of other variables. We talked about the "choose function" alot in my CS40 class. Among others, I found these references:

I'll leave it up to you Oleg, if you think they're sufficient references for puting my edit back : P Fresheneesz 03:34, 8 February 2006 (UTC)

JA: Well, aside from being grammtically barbaric, it seems likely to cause confusion with "choice functions" in set theory. For example Constructive Mathematics. Google on <Troelstra "choice function"> for more. Jon Awbrey 05:08, 8 February 2006 (UTC)

Well, whether or not its confusing, its the way I, and many others, are taught it. I had never heard of "Binomial coefficient" before visiting this pace, even though I had learned all about it already. I think that if you're worried about confusion, a due note could be made about that in the article. Fresheneesz 08:02, 8 February 2006 (UTC)

JA: It never ceases to e-maze me that people who would actually say out loud "I never heard of X" would think themselves qualified to write e-cyclicals about X. Jon Awbrey 14:48, 8 February 2006 (UTC)

Hmm, that not very many Google hits, but undeniably there are some people calling it the "choose function". So I agree with Fresheneesz that it should be mentioned. Because of its rare use, I didn't put it in the first line though.

Jon, a more collegial attitude would be much appreciated. The nice thing about collaboration is that everybody can contribute what they know. -- Jitse Niesen (talk) 17:07, 8 February 2006 (UTC)

Thanks Jitse, you saved me the work I was supposed to do. Oleg Alexandrov (talk) 17:44, 8 February 2006 (UTC)

JA: That sounds real nice, Jitse. And, yes, a more collegial attitude would be appreciated. I'll be sure to quote you the next time somebody reverts a month's worth of my contributions based on a lack of what they've heard of. Jon Awbrey 17:50, 8 February 2006 (UTC)

I don't think that such a vague statement is appropriate, Jon. Whatever you may be referring to, it does not help in solving disputes. Oleg Alexandrov (talk) 20:55, 8 February 2006 (UTC)

Ok it looks great. Thanks you guys! Fresheneesz 21:12, 8 February 2006 (UTC)

[edit] Divisors of binomial coefficients

"A somewhat surprising result by David Singmaster (1974) is that any non-zero integer divides all but finitely many binomial coefficients." Gauge 'clarified' David Singmasters result about almost all to be all but finitely many. The number 1 occurs an infinite number of times in the pascal triangle, so he cannot mean that for only a finite number of pairs (n,k) is the binomial coefficient C(n,k) not divisible by the integer in consideration. What does almost all binomial coefficients mean? If that question cannot be answered the reference to David Singmaster's result must be omitted. Bo Jacoby 10:52, 9 March 2006 (UTC)

I hope my edit is a better clarification. For instance, the binomial coefficients that are odd form the Sierpinski triangle, which has area zero, so "almost all" binomial coefficients are even. -- Jitse Niesen (talk) 11:15, 9 March 2006 (UTC)

Jitse's clarification makes sense to me. Let's hope that it is also true. Thank you. Bo Jacoby 14:10, 9 March 2006 (UTC)

I inserted the original Singmaster reference, and Jitse's clarification makes sense to me also. Pgc512 20:12, 9 March 2006 (UTC)

Whoops. Sorry about that. I obviously didn't think that edit through; I'm glad that it's all cleared up now. - Gauge 05:07, 28 June 2006 (UTC)

[edit] 1 to 37 numberscalled jueteng

How we can compute numbers involving 1 to 37 numbers

[edit] practical calculation of the binomial coefficient

((((5/1)×6)/2)×7)/3 is there any reason not to clean this up? E.g. (5/1)×(6/2)×(7/3). Am I missing something?

You won't get integer byproducts in this way, meaning that the calculation will be more difficult.195.113.31.79 18:15, 12 March 2007 (UTC)

yes, and round-off errors are introduced, because 7/3 cannot be represented exactly as a decimal fraction or as a dyadic fraction. Bo Jacoby 16:21, 15 March 2007 (UTC).

[edit] Formula syntax corrections

Many of the formulas do not display. Instead their TEX contents are shown. Now that I have made them human-readable through IE6, could someone verify these for accuracy? (I did not change the content -- only the syntax used to properly display content). Also, not being familiar with generating functions, could someone who knows something about these make them human-readable? Vonkje 23:00, 15 March 2007 (UTC)

[edit] First sentence

I see that the first sentence was changed to give the algebraic definition (coefficients in the binomial expansion) instead of the combinatorial one (number of ways to choose k objects out of n). No reason was given. I think the latter definition is more accessible and hence that one should be mentioned in the first sentence. -- Jitse Niesen (talk) 01:11, 28 August 2007 (UTC)

As the name 'binomial coefficient' relates to the concept of a binomial, it seems pedagogical to begin with the binomial expansion, even if the combinatorial interpretation is the more accessible. See Multiset#Polynomial_notation for a unification of the two approaches. Bo Jacoby 06:59, 28 August 2007 (UTC).

[edit] 10:05, 3 November 2007 Lantonov (clean up using AWB)

What is the use of this 'cleanup'? The exponents 2 and 3 now have another style than other exponents, and they are harder to read. Bo Jacoby 16:13, 3 November 2007 (UTC).

[edit] Proofs

Many cases with a formula, I see statements like "Using (3) and induction, one can show that...," but may we actually see the proof written out, or at least have the proof be sourced? One thing about this article is that at least the online sources given do not provide proofs for a lot of the formulas present in this article and the article itself doesn't do it. Is it possible that either the proofs could be written in the article or links to online articles or resources be given showing the proofs?

I for one would really like to see proofs for (as of this revision) formulas (6) and (11) written out, as they don't have names and as such it is difficult to search the web for such proofs as resources discussing binomial coefficients don't always talk about everything.

Also, 2 formulas are labelled as (12). Cornince (talk) 23:25, 22 November 2007 (UTC)

Here's a good source here:

http://binomial.csueastbay.edu/Pascal1.html

http://binomial.csueastbay.edu/Identities.html

This mentions (5), which is called Pascal's 5th Identity, among others. Perhaps we can start inserting the names of at least some of the identities next to them? Cornince (talk) 00:48, 25 November 2007 (UTC)

Thanks for catching the double-labelling. The first (12) defines multinomial coefficients, so I moved it straight to that section and removed its label. My response to your comment on proofs is on your talk page.

Regarding the name issue, there are too many binomial identities for each one to have a name (and most of them are "trivial", in any case; see W Z theory for details, if the article is ever written). Only a few of them (Pascal's rule, the Vandermonde identity) really have names, and those are already named in the article. Michael Slone (talk) 08:44, 25 November 2007 (UTC)

[edit] C-code

The code of the article computes intermediate fractional numbers:

long double accum = 1;  for (int i = 0; i < k; ++i) accum = accum / (k-i) * (n-i);

It is better to use integer arithmetic all along such that for instance

choose(8,4) = ((((((5/1)*6)/2)*7)/3)*8)/4

Bo Jacoby (talk) 02:38, 21 January 2008 (UTC).

The intent was to provide simple code that avoids overflow (as the arguments get large) and works well in the typical desktop environment (which has fast floating-point hardware). Bo's algorithm, which might be worth mentioning as requiring only integer operations, seems more susceptible to overflow. — DAGwyn (talk) 00:06, 23 January 2008 (UTC)

While we're on the subject, a Pascal triangle based method would avoid all multiplication and division operations, using just addition, but is limited in many practical cases by recursion depth (as many levels as the larger of k or n-k). The double recursion could perhaps be changed to iteration, but then the bookkeeping would need an inordinate amount of storage. I wish I had my copy of Knuth's TAOCP vol. 1 nearby; I'm sure there must be bin.coef. algorithms in there. — DAGwyn (talk) 00:20, 23 January 2008 (UTC)

Just reverse the order of operations such that all intermediate results get exact integer values, even if stored in floating point format. ('i' is supposed to take values from 1 thru 'k'. Please check the code. I did not test it).

long double accum = 1;  for (int i = 1; i < k+1; ++i) accum = (accum * (n-k+i))/i;

Bo Jacoby (talk) 15:57, 23 January 2008 (UTC). PS. Hi DAGwyn. 'accum' means accumulator, which usually indicates a result of additions, not of multiplications and divisions. So the variable name 'accum' is not well chosen. Also there is no fear of overflow, so there is no reason for converting to floating point numbers for intermediate results and converting back to integer format. Try

long long t = 1;  for (int i = 1; i < k+1; ++i) t = (t*(n-k+i))/i;

I am not sure if integer division is called / or something else in C, so please improve, but the present code is not suited for publication. Bo Jacoby (talk) 08:46, 28 January 2008 (UTC).

It is an accumulator; the result is accumulated factor by factor. Think logarithmically if you have a need to tie it to addition. I appreciate improvements to the code, so long as they meet all the requirements (avoiding premature overflow, etc.) — DAGwyn (talk) 16:39, 15 February 2008 (UTC)

Also note that k+1 doesn't work if k and n havee the maximum representable value. — DAGwyn (talk) 17:14, 15 February 2008 (UTC)

[copied from User talk:DAGwyn] Hi, Doug. In the code you provide you use a long double to accumulate and you then mess with rounding error. If you rearranged your computation, you could make use of the fact that $n (n - 1)...(n - m)$ is always divisible by $m!$ and be assured that your calculation will always remain integer. So "9 choose 3" would be computed $(((((9 / 1) * 8) / 2) * 7) / 3)$ . Perhaps you chose to do your arithmetic in long double precision in case the value of "n choose m" will fit as an unsigned long long but the last multiplication would have overflowed before the last division had a chance to tame it? Similarly, do you do add the .5 because we're stuck with finite precision? Remember, I'm a theoretical computer scientist, so when I say integer I usually think the mathematician's meaning, not some element of { $- k,...,0,1,2,..., k$ } for a very finite $k$ . Regards... PaulTanenbaum (talk) 02:28, 15 February 2008 (UTC)

There were three reasons for my choice:

to avoid unnecessary overflow such as you mentioned,
to directly implement a formula exhibited earlier in the article, and
to keep the code simple and transparent.

The +.5 is needed in any event when converting a (positive) floating-point representation to integer, since otherwise the approximate (or even exact) integer value could truncate downward by nearly 1; this is allowed by the C standard and actually occurs in practice (sometimes). For the code I originally presented, there are also accumulated rounding errors due to intermediate values that are not exactly representable in floating-point format, resulting from some of the divisions. See above suggestions for ways to ensure that all intermediate values are (theoretically) exact integers.

It is possible for a standard-conforming C implementation to have more significant bits in the representation of type "long long" than in type "long double", alas. It might be useful for a similar function to return type long double, considering that many applications involve floating-point anyway.

I'll consider revising the code, but whatever formula it directly implements will need to be established in the article text. — DAGwyn (talk) 16:39, 15 February 2008 (UTC)

Okay, the code has been revised to take into account the above discussion. — DAGwyn (talk) 17:14, 15 February 2008 (UTC)

Is this really correct:

if (n <= 0 || k < 0 || k > n)
    return 0;

This also means that:

if (n == 0)
    return 0;

Which seems wrong to me. Because, ${0 \choose 0} = 1$ . --Joctee (talk) 00:53, 16 February 2008 (UTC)

Thanks for pointing that out. I fixed it in the article. — DAGwyn (talk) 22:36, 27 March 2008 (UTC)

I came across a very elegant implementation on this page: http://blog.plover.com/math/choose.html

       unsigned choose(unsigned m, unsigned n) {
         unsigned r = 1;
         unsigned d;
         if (n > m) return 0;
         for (d=1; d <= n; d++) {
           r *= m--;
           r /= d;
         }
         return r;
       }

Advantages are that is clear to the layman (at least to me) and it actually works. —Preceding unsigned comment added by Kieckerjan (talk • contribs) 16:25, 20 February 2008 (UTC)

Unfortunately that can overflow, and when it does C's rules for unsigned arithmetic practically guarantee that you get a wrong answer, silently. The code that was in the article can "unnecessarily" overflow only when int and long long have the same width, which is quite unusual, and then only for numbers near the top of the representable range. I have just updated the code to remove even that possibility and to simplify it. — DAGwyn (talk) 22:36, 27 March 2008 (UTC)

I don't understand the considerations of "near the top of the representable range". As you should be aware, binomial coefficients grow quite large down Pascal's triangle, especially near the center. For instance $\tbinom{68}{31}=21912870037044995008$ , which cannot be stored in an unsigned 64-bit integer, while 68 is not near the top of the representable range in any sense. So none of the functions considered could return such a value. Quality code should explicitly test for overflow on all possible occasions. The context where computing binomial coefficients by a procedure make sense seem to be limited to

all floating-point computations (for statistical applications for instance), or
all unlimited-size integer computations (for symbolic algebra for instance).

In both cases it is best to develop numerator and denominator in an interleaved fashion. When only those coefficients that certainly will not overflow machine integers are needed (in other words very few of them) using a precomputed (additively) table should be best. Marc van Leeuwen (talk) 15:10, 9 June 2008 (UTC)

[edit] Suggestions for improvement

I've only just come across this article, and I see that it has a long history, so I hesitate to correct without discussion things I think can be improved. Here is what I think needs changing, in order of importance

The section Binomial coefficient#Generalization to negative integers given an extension to values $k < 0$ that is completely unmotivated and conflicts with conventional use. Convention is that binomial coefficients are identically 0 when the second coefficient is a negative integer (see equation (5.1) in (Graham, Knuth, Patashnik; 1989), which book is extremely careful about validity of identities for non-natural number parameters. The given formula (actually two overlapping formulas for the cases $k\leq n<0$ which fortunately coincide there) seems to be inspired by aesthetics only: the pattern around the origin looks pretty, but fails Pascal's identity ${\tbinom {n-1}{k-1}}+{\tbinom {n-1}k}={\tbinom n k}$ exactly at the origin (in spite of its obvious aesthetic qualities, the equation $1 + 1 = 1$ is false), which I consider particularly insidious. In my opinion there should at most be a side remark that this is a possible definition, explaining that this preserves equation (4) (by design), but also explaining all the things that are broken by it.

Under Binomial coefficient#Definition, the first sentence "the binomial coefficient is defined to be the natural number ..." is followed by the symbol for the binomial coefficient and two different formulas. In my opinion this should be "the binomial coefficient, written $\tbinom nk$ is defined as the coefficient of x^ky^n−k in the expansion of the polynomial $(x + y) n$ . This number can be seen to be the number of distinct k-element subsets of some n-element set, and can be computed by the formulas ...". It could also be the coefficient of x^k in $(1 + x) n$ , or one could take the combinatorial description as first definition, that is not my point here; the point is that is seems silly that binomial coefficients should be defined by two different complicated formulas; nobody would care about such expressions if they did not have a combinatorial/algebraic interpretation. These formulas are only convenient for computing binomial coefficients, they belong to implementation, not specification, in computer science speak.

The article postpones an extension of binomial coefficients to non-natural number parameters, which is good in view of the increasing level of complication. However, I think that one could indicate that such an extension exists, and already tag identities with conditions on the (extended) parameters for their validity, so that readers have a way to know about this. For instance equation (4) ( $\tbinom nk = \tbinom n{n-k}$ ) could be tagged "integer $n\geq0$ , integer k", since it fails (while at least one member is still defined) when n is not a positive integer. One might omit the condition "integer k" since it is required for either member to be defined, given the condition on n. The cited book (Graham, Knuth, Patashnik; 1989) meticulously provides such tags, which is quite valuable in practice (and it makes it easy to tag the equations in this article).

The section Binomial coefficient#Generalization to real and complex argument gives the usual extension to arbitrary first parameter (it could also proclaim the convention that negative integer second parameter makes the binomial coefficient identically zero, so that ${\tbinom {n-1}{k-1}}+{\tbinom {n-1}k}={\tbinom n k}$ makes sense (and holds) everywhere). However I find the reference to real and complex numbers laughable, even though I know many authors make it. This has nothing to do with properties of real or complex numbers (the latter apparently being considered the summum of sophistication), it can also be used for any polynomial in Q[X], or for any quaternion for that matter (the expressions can always be evaluated in a commutative subring). So in fact, not only is " $f(z)=\textstyle{z\choose k}$ is a polynomial (function) in z of degree k with rational coefficients", it is actually true that that $\tbinom Xk$ is a polynomial (of degree k if $k\ge0$ ), in Q[X]. The point of the generalisation is that the first parameter can be treated as formal symbol, and all that matters is that multiplication and division by nonzero integers is defined (whence one cannot work in Z/pZ or Z[X]). If one really wants the general setting where the definition makes sense, I would say any Q-algebra. On the other hand the cases where I have seen practical uses of non-integer numerical values of the first parameter are limited to rational values.

Marc van Leeuwen (talk) 14:37, 9 June 2008 (UTC)

Are you sure that the definition in Graham et al. is the conventional one? Maple (software) uses the definition in the article. I remember though I was rather surprised when I discovered this for the same reason as you (it fails Pascal's identity).

I agree with point 3, though I'm not so sure that it is that useful if there is no agreement on how to define the binomial coefficients for negative k. I also agree with point 2, and I simply do not know about point 4. -- Jitse Niesen (talk) 15:50, 9 June 2008 (UTC)