Examples of generating functions

From Wikipedia, the free encyclopedia

The following examples are in the spirit of George Pólya, who advocated learning mathematics by doing and re-capitulating as many examples and proofs as possible. The purpose of this article is to present common tricks of the trade in context, so that the reader may incorporate them into his knowledge.

Contents

[edit] Worked example A: basics

New generating functions can be created by extending simpler generating functions. For example, starting with

G(1;x)=\sum_{n=0}^{\infty} x^n = \frac{1}{1-x}

and replacing x with 2x, we obtain

G(1;2x)=\frac{1}{1-2x} = 1+(2x)+(2x)^2+\cdots+(2x)^n+\cdots=G(2^n;x).

[edit] Worked example B: Fibonacci numbers

Consider the problem of finding a closed formula for the Fibonacci numbers Fn defined by F0 = 0, F1 = 1, and Fn = Fn−1 + Fn−2 for n ≥ 2. We form the ordinary generating function

f = \sum_{n \ge 0} F_n x^n

for this sequence. The generating function for the sequence (Fn−1) is xf and that of (Fn−2) is x2f. From the recurrence relation, we therefore see that the power series xf + x2f agrees with f except for the first two coefficients:

\begin{array}{rcrcrcrcrcrcr} f    & = & F_0x^0 & + & F_1x^1 & + & F_2x^2 & + & \cdots & + & F_ix^i & + &\cdots\\ xf   & = &        &  & F_0x^1  & + & F_1x^2 & + & \cdots & + &F_{i-1}x^i & + &\cdots\\ x^2f & = &        &  &         &   & F_0x^2 & + & \cdots & + &F_{i-2}x^i & +&\cdots\\ (x+x^2)f & = &    &  & F_0x^1  & + & (F_0+F_1)x^2 & + & \cdots & + & (F_{i-1}+F_{i-2})x^i & +&\cdots\\      & = &        &  &         &   & F_2x^2       & + & \cdots & + & F_ix^i & +& \cdots\\ \end{array}

Taking these into account, we find that

f = xf + x^2 f + x . \,\!

(This is the crucial step; recurrence relations can almost always be translated into equations for the generating functions.) Solving this equation for f, we get

f = \frac{x} {1 - x - x^2} .

The denominator can be factored using the golden ratio φ1 = (1 + √5)/2 and φ2 = (1 − √5)/2, and the technique of partial fraction decomposition yields

f = \frac{1}{\sqrt{5}} \left (\frac{1}{1-\varphi_1 x} - \frac{1} {1- \varphi_2 x} \right ) .

These two formal power series are known explicitly because they are geometric series; comparing coefficients, we find the explicit formula

F_n = \frac{1} {\sqrt{5}} (\varphi_1^n - \varphi_2^n).

[edit] Worked example C: edges in an undirected graph

The focus of the example is the use of generating functions in several variables to achieve the goal of classifying the elements m of a set E according to several parameters r_0(m), \, r_1(m), \, \ldots through the correspondence m \leftrightarrow w_0^{r_0(m)} w_1^{r_1(m)} \cdots. Here the variables of the generating function are w_0, \, w_1, \, \ldots and the generating function f is

f(w_0, w_1, \ldots) = \sum_{m\in E} w_0^{r_0(m)} w_1^{r_1(m)} \cdots

so that the number c of elements of E that correspond to the tuple of parameters (r_0, r_1, \ldots) is given by

c = [w_0^{r_0} w_1^{r_1} \cdots] f(w_0, w_1, \ldots),

where we have used the coefficient-extraction operator for formal power series.

The example we will study concerns a undirected graph G = (V,E) whose vertices V correspond to words of length t over an alphabet Σ containing exactly l letters, i.e.

\Sigma=\{a_1,a_2, \ldots,a_l\} \quad \mbox{and} \quad V \subseteq \Sigma^t.

In the following we will use the numbers from 0 to l − 1 in place of the letters to keep the notation simple.

Furthermore we are given a set of functions p_k: \Sigma \rightarrow 2^\Sigma that determine which of the l^t\, possible vertices occur in the graph, and what the edges are. More precisely, pk(λ) specifies a set of possible replacements for λ, if it occurs at position k. In that case, λ may be changed to any letter occuring in pk(λ), yielding an adjacent word, i.e., vertex. Edges exist between adjacent words, but only if they differ in exactly one position. Intuitively speaking, the pk provide the neighbors of a particular letter ocurring at position k, and the neighbors of a word are those words that may be obtained by selecting one position and replacing the letter λ at that position by one of its neighbors, i.e. an element of pk(λ). We are guaranteed that the pk are such that adjacency always works both ways, i.e. if \mu \in p_k(\lambda), then \lambda \in p_k(\mu). Words containing letters at positions where they have no neighbors i.e. where | pk(λ) | = 0 are omitted from the set of vertices. The question we seek to answer is, how many edges does this graph contain?

Here the set whose elements we seek to enumerate and classify is the set of vertices V, and we need to classify the vertices m according the number of letters having 1, 2, 3 \ldots neighbors, so that

m \leftrightarrow w_1^{r_1(m)} w_2^{r_2(m)} \cdots,

with r1(m) the number of letters having one neighbor at their position, r2(m) the number of letters having two neighbors at their position, etc., because then the number of edges originating at m is given by

\left( \sum_{j=1}^{l-1} j \frac{d}{d w_j} w_1^{r_1(m)} w_2^{r_2(m)} \cdots \right)_{w_1=\ldots=w_{l-1}=1}.

This is because w_q^{r_q} indicates the presence of rq letters having q neighbors in m, which contribute q \, r_q edges but

q \, r_q = q \, \left( \frac{d}{d w_q} w_q^{r_q} \right)_{w_q=1}.

This process will count edges twice (once at each endpoint), finally giving the formula for the number T of edges

T = \frac{1}{2} \left( \sum_{j=1}^{l-1} j \frac{d}{d w_j} f(w_1, \ldots, w_{l-1}) \right)_{w_1=\ldots=w_{l-1}=1}.

It remains to find f(w_1, \ldots, w_{l-1}). With this in mind we introduce the quantities vk(q), the number of letters at position k having q neighbors, i.e.

v_k(q) = [z^q] \sum_{\lambda\in\Sigma} z^{|p_k(\lambda)|}.

We will also need Lk, the number of letters actually ocurring at position k, i.e.

Lk = lvk(0).

(Recall that letters having no neighbors do not contribute to the set of possible vertices.)

Then the generating function f is given by

f(w_1, \ldots, w_{l-1}) = \prod_{k=0}^{t-1} \sum_{q=1}^{l-1} v_k(q) w_q.

[edit] Computing the answer

It happens that T can be computed explicitly, giving a simple and useful formula. This computation is based on the observation that

\left( \frac{d}{d w_j} f(w_1, \ldots, w_{l-1}) \right)_{w_1=\ldots=w_{l-1}=1}

evaluates to

\left( f(w_1, \ldots, w_{l-1}) \sum_{k=0}^{t-1} \frac{v_k(j)}{\sum_{q=1}^{l-1} v_k(q) w_q} \right)_{w_1=\ldots=w_{l-1}=1}

which simplifies to

\prod_{k=0}^{t-1} L_k \sum_{k=0}^{t-1} \frac{v_k(j)}{L_k},

and yields

T = \frac{1}{2} \left( \sum_{j=1}^{l-1} j \sum_{k=0}^{t-1} \frac{v_k(j)}{L_k} \right) \, \prod_{k=0}^{t-1} L_k.

[edit] A simple example

Let Σ = {0,1,2}, so that l = 3, and consider words (vertices) consisting of two letters, i.e. t = 2. The set of neighbor functions pk is given by

p_0(0)=\{1\}, \, p_0(1)=\{0,2\}, \, p_0(2)=\{1\} \quad \mbox{and} \quad p_1(0)=\{1\}, \, p_1(1)=\{0\}.

The generating function becomes

(w_1 + w_2 + w_1) (w_1 + w_1) = (2w_1 + w_2) 2 w_1 = 4 w_1^2 + 2 w_1 w_2 = \sum_{q=1}^2 v_0(q) w_q \sum_{q=1}^2 v_1(q) w_q.

Carrying out the differentiation we find the following expression for T:

\frac{1}{2} \left( 4 \times 1 \left( \frac{d}{d w_1} w_1^2 \right)_{w_1=1} + 2 \times \left( 1 \left( \frac{d}{d w_1} w_1 w_2 \right)_{w_1=w_2=1} + 2 \left( \frac{d}{d w_2} w_1 w_2 \right)_{w_1=w_2=1}  \right) \right),

or

\frac{1}{2} \left( 4 \times 2 + 2 \times ( 1 + 2 )  \right) = 7,

so we may expect seven edges.

Indeed, these edges are

00 \leftrightarrow 01, \, 00 \leftrightarrow 10, \, 10 \leftrightarrow 11, \, 10 \leftrightarrow 20, \, 01 \leftrightarrow 11, \, 11 \leftrightarrow 21, \, 20 \leftrightarrow 21.

The graph is shown below.

image:WordGraphForOGF.png

The result matches what we obtain by substituting into the formula. We have v_0(1) = 2, \, v_0(2)=1 and v_1(1) = 2\,, and L0 = 3 and L1 = 2, giving

\frac{1}{2} \left( 1 \times \left( \frac{2}{3} + \frac{2}{2} \right) + 2 \times \left( \frac{1}{3} \right) \right) \times 3 \times 2 = 7.

[edit] Special case

There is an important special case that yields a simpler formula for T. This is the case of there not being any "inert" letters, i.e. all letters have neighbors or vk(0) = 0 at all positions, which implies that Lk = l for all k. The simplified formula becomes

T = \frac{1}{2} \left( \sum_{j=1}^{l-1} j \sum_{k=0}^{t-1} v_k(j) \right) \, l^{t-1}.

Suppose, for example, that all letters are adjacent to those differing by one from their numerical value, independent of their position. Hence there are two letters having one neighbor (namely 0 and l − 1), and the remaining letters have two neighbors. Substitution then yields

T = \frac{1}{2} \left( 1 \times 2 t + 2 \times (l-2) t \right) l^{t-1} = t \, (l-1) \, l^{t-1}.

This example is from Les-Mathematiques.net. Variations on this problem, including a considerable amount of additional material, can be found among the external links.

[edit] Worked example D: bivariate generating functions and digit sums

Ordinary generating functions are not limited to representing sequences that depend only on a single parameter. More parameters are possible, e.g. two parameters, or more, as seen in the previous example. It is also possible to mix types of generating functions, as in bivariate probability generating functions, where an ordinary generating function generates an exponential one. These types of generating functions are often referred to as super generating functions. Here we show an example of how to work with bivariate generating functions. Let n be a natural number and consider the integers from 0 to 102n − 1, where those integers that have fewer than 2n digits are padded with zeros on the left, so that all of them may be considered to have 2n digits. E.g. for n = 3 the range goes from 000000 to 999999. We will use bivariate generating functions to answer the following question: what is the sum of those integers in [0,102n − 1] whose first n digits sum to the same value as the last n digits, i.e. the digit sum of the first half is equal to the digit sum of the second half. (The term digital sum is also used for digit sums.) We call these integers balanced. E.g. the integer 124511 would contribute to the sum for n = 3, because 1+2+4=7=5+1+1. The value that it would contribute is simply 124511. It is certainly impossible to compute this sum by direct enumeration for large n (take n = 20, for example).

We let En denote the desired sum, and introduce gn,k, which gives the number of integers n digits long (left-padding with zeros as before to obtain 10n values) whose digits sum to k, and sn,k, which is the sum of the integers n digits long whose digits sum to k. The key observation is to note that

E_n = \sum_{k=0}^{9 n} (10^n+1) s_{n, k} \, g_{n, k}.

To see this, fix k, the digit sum, and consider the set of balanced integers of length 2n, the digit sum of the first and second half of which is k. This set of integers is created by pairing any integer of length n and digit sum k with any other, so any particular value occurs gn,k times in the first half (multiplied by 10n), and gn,k times in the second half. Adding all of these integers yields sn,k. The number of summands that contribute to En is given by

F_n = \sum_{k=0}^{9 n} g_{n, k}^2.

We now introduce the following two bivariate generating functions:

G(z, u) = \sum_{n \ge 1, \,k \ge 0} g_{n, k} u^k z^n  \quad \mbox{and} \quad S(z, u) = \sum_{n \ge 1, \,k \ge 0} s_{n, k} u^k z^n.

(We require that gn,k = 0 and sn,k = 0 for k > 9n, because 9n is the maximum digit sum possible.) We will also require two auxiliary functions to keep the notation simple:

V(u) = \sum_{r=0}^9 u^r \quad \mbox{and} \quad W(u) = \sum_{r=0}^9 r u^r.

Now from the definition of G(z,u) it follows immediately that

G(z, u) = \sum_{n \ge 1} V(u)^n z^n =  \frac{V(u) z}{1 - V(u) z}.

This is because the terms of V(u)n represent the process of choosing one decimal digit for the first position on the left, one for the second, and so on. The exponents are added so that the term uk occurs every time the digits sum to k.

The next key observation is that for n>1\,,

s_{n, k} = \sum_{r=0}^9 (10 s_{n-1, k-r} + g_{n-1, k-r} r).

This simply says that the integers that contribute to s_{n, k}\, can be obtained from those that contribute to s_{n-1, k-r}\,, where r is a decimal digit, by shifting the latter one position to the left (multiplication by ten) and appending r (recall that there are g_{n-1, k-r}\, integers that contribute to s_{n-1, k-r}.\,)

We now prepare to reconstitute S(z,u) by summing the above recurrence over n>1,\, k\ge 0. The term for [z^1] S(z, u)\,, i.e. n = 1, will be missing from the sum on the left, so we compute it and add it in. (In what follows, we will use the coefficient-extraction operator for formal power series.) We have

\sum_{k \ge 0} s_{1, k} u^k z  = \sum_{r=0}^9 r u^r z = W(u) z.

Adding everything, we find

S(z, u) = z W(u) + 10z V(u) S(z, u) + z W(u) G(z, u)\,

and hence

S(z, u) = \frac{z W(u) +  z W(u) G(z, u)}{1 - 10z V(u)}  = z W(u) \frac{1 + G(z, u)}{1 - 10z V(u)}

or

S(z, u) = \frac{z W(u)}{(1 - z V(u))(1 - 10z V(u))} = \frac{1}{9} \frac{W(u)}{V(u)} \left( \frac{1}{1 - 10z V(u))} - \frac{1}{1 - z V(u)} \right)

which is a simple rational generating function.

Recall our earlier formula, which we may re-write as

E_n = \sum_{k=0}^{9 n} (10^n+1) [z^n u^k] S(z, u) \, [z^n u^k] G(z, u).

We may now compute the En, since we have closed forms for G(z,u) and S(z,u). This yields the following table for n and En:

    1 495
    2 3349665
    3 27625972374
    4 240801497591985
    5 2162288199783771180
    . ...
   17 1185633898500558643116053969483499881436610149944135688394603051650
   18 11525195623906119101843912373578899988474804376093880898156087626421100
   19 112203312767859412537217211281711779998877966872321405874627827887182882200
   20 1093844474149520613133628019194480743399890615552585047938686637198080551925660.

[edit] Alternate explicit formula

An explicit formula for En, which however is not as amenable to symbolic computation as the previous form, can be obtained by expanding the geometric series in the partial fraction decomposition of S(z, u)\, and using the symmetry of [z^n] G(z, u)\, to observe that

E_n = (10^n+1) \sum_{k=0}^{9 n} s_{n, k} \, g_{n, 9n-k},

which gives

E_n = (10^n+1) [u^{9n}] ([z^n] G(z, u) [z^n] S(z, u)).\,

But

[z^n] S(z, u) =  \frac{1}{9} \frac{W(u)}{V(u)} \left( 10^n V(u)^n - V(u)^n \right) = \frac{1}{9} W(u) (10^n - 1) V(u)^{n-1}

so that

E_n = \frac{1}{9} (100^n - 1) [u^{9n}] W(u) V(u)^{2n-1}.

[edit] Alternate recurrence

The astute reader will have observed that we obtained the recurrence for sn,k by appending the digit r to a contributor to s_{n-1, k-r}\,. Of course it is entirely possible to prepend r to such contributors, which yields the alternate recurrence

s_{n, k} = \sum_{r=0}^9 (10^{n-1} g_{n-1, k-r} r + s_{n-1, k-r}).

Summing as before, we find

S(z, u) = z W(u) + z W(u) G(10z, u) + z V(u) S(z, u)\,

Solving this for S(z,u) produces, as it ought to, the same solution as before:

S(z, u) = z W(u) \frac{1 + G(10z, u)}{1 - z V(u)} = \frac{z W(u)}{(1 - z V(u))(1 - 10z V(u))}.

[edit] Verification

When working with generating functions, it is important to verify that they yield sensible values. By setting variables in multivariate ordinary generating functions to one, we obtain generating functions of superclasses of combinatorial structures or quantities, since one or more of the parameters disappear. E.g. G(z, 1)\, is the generating function of the total count of n-digit nonnegative integers, where left-padding with zeros is used. Indeed we have

[z^n] G(z, 1) = [z^n]\frac{10z}{1-10z} = 10^n,

which is correct. Similarly, S(z, 1)\, is the generating function of the sum of all n-digit nonnegative integers, again with left-padding by zeros. We have

[z^n] S(z, 1) = [z^n] \frac{45 z}{(1 - 10z)(1-100z)} = [z^n] \left( \frac{1}{2} \frac{1}{1-100z} - \frac{1}{2} \frac{1}{1-10z} \right).

Hence

[z^n] S(z, 1) = \frac{1}{2} 100^n - \frac{1}{2} 10^n.

This too is the correct value, since

\sum_{k=0}^{10^n-1} k = \frac{1}{2} (10^n - 1) 10^n.

Additional material concerning this example, as well as a snapshot of how it evolved, can be found among the external links.

[edit] External links