Random walk

Example of eight random walks in one dimension starting at 0. The plot shows the current position on the line (vertical axis) versus the time steps (horizontal axis).

An animated example of three Brownian motion-like random walks on a torus, starting at the centre of the image.

A random walk, sometimes denoted RW, is a mathematical formalisation of a trajectory that consists of taking successive random steps. The results of random walk analysis have been applied to computer science, physics, ecology, economics, psychology and a number of other fields as a fundamental model for random processes in time. For example, the path traced by a molecule as it travels in a liquid or a gas, the search path of a foraging animal, the price of a fluctuating stock and the financial status of a gambler can all be modeled as random walks. The term random walk was first introduced by Karl Pearson in 1905^[1].

Various different types of random walks are of interest. Often, random walks are assumed to be Markov chains or Markov processes, but other, more complicated walks are also of interest. Some random walks are on graphs, others on the line, in the plane, or in higher dimensions, while some random walks are on groups. Random walks also vary with regard to the time parameter. Often, the walk is in discrete time, and indexed by the natural numbers, as in $X_0,X_1,X_2,\dots$ . However, some walks take their steps at random times, and in that case the position $X_t$ is defined for the continuum of times $t\ge 0$ . Specific cases or limits of random walks include the drunkard's walk and Lévy flight. Random walks are related to the diffusion models and are a fundamental topic in discussions of Markov processes. Several properties of random walks, including dispersal distributions, first-passage times and encounter rates, have been extensively studied.

1 Lattice random walk
2 Random walk on graphs
- 2.1 Relation to Wiener process
3 Self-interacting random walks
4 Applications
5 Probabilistic interpretation
6 Properties of random walks
- 6.1 Simple random walk
- 6.2 Non-reversal random walk
7 See also
8 References
- 8.1 Bibliography
9 External links

Lattice random walk

A popular random walk model is that of a random walk on a regular lattice, where at each step the walk jumps to another site according to some probability distribution. In simple random walk, the walk can only jump to neighbouring sites of the lattice. In simple symmetric random walk on a locally finite lattice, the probabilities of walk jumping to any one of its neighbours are the same. The most well-studied example is of random walk on the d-dimensional integer lattice (sometimes called the hypercubic lattice) $\mathbb Z^d$ .

One-dimensional random walk

Imagine a one-dimensional length of something, a 'line'. Now imagine this line has numbers on it, spaced apart equally. A particularly elementary and concrete random walk is the random walk on the integer number line, $\mathbb Z$ , which starts at $S_0 = 0\,\!$ and at each step moves by ±1 with equal probability. To define this walk formally, take independent random variables $Z_1,Z_2,\dots$ , where each variable is either 1 or −1, with a 50% probability for either value, and set $S_n:=\sum_{j=1}^nZ_j.$ The series $\{S_n\}\,\!$ is called the simple random walk on $\mathbb Z$ . This series (the sum of the sequence of -1's and 1's) gives you the length you have 'walked', if each part of the walk is of length one.

This walk can be illustrated as follows. Say you flip a fair coin. If it lands on heads, you move one to the right on the number line. If it lands on tails, you move one to the left. So after five flips, you have the possibility of landing on 1, −1, 3, −3, 5, or −5. You can land on 1 by flipping three heads and two tails in any order. There are 10 possible ways of landing on 1. Similarly, there are 10 ways of landing on −1 (by flipping three tails and two heads), 5 ways of landing on 3 (by flipping four heads and one tail), 5 ways of landing on −3 (by flipping four tails and one head), 1 way of landing on 5 (by flipping five heads), and 1 way of landing on −5 (by flipping five tails). See the figure below for an illustration of this example.

Five flips of a fair coin

What can we say about the position $S_n\,\!$ of the walk after $n\,\!$ steps? Of course, it is random, so we cannot calculate it. But we may say quite a bit about its distribution. It is not hard to see that the expectation $E(S_n)\,\!$ of $S_n\,\!$ is zero. That is, the more you flip the coin, the closer the mean of all your -1's and 1's will be to zero. For example, this follows by the finite additivity property of expectation: $E(S_n)=\sum_{j=1}^n E(Z_j)=0$ . A similar calculation, using the independence of the random variables $Z^n\,\!$ , shows that $E(S_n^2)=n$ . This hints that $E(|S_n|)\,\!$ , the expected translation distance after n steps, should be of the order of $\sqrt n$ . In fact,

$\lim_{n\to\infty} \frac{E(|S_n|)}{\sqrt n}= \sqrt{\frac 2{\pi}}.$

Suppose we draw a line some distance from the origin of the walk. How many times will the random walk cross the line if permitted to continue walking forever? The following, perhaps surprising theorem is the answer: simple random walk on $\mathbb Z$ will cross every point an infinite number of times. This result has many names: the level-crossing phenomenon, recurrence or the gambler's ruin. The reason for the last name is as follows: if you are a gambler with a finite amount of money playing a fair game against a bank with an infinite amount of money, you will surely lose. The amount of money you have will perform a random walk, and it will almost surely, at some time, reach 0 and the game will be over.

If a and b are positive integers, then the expected number of steps until a one-dimensional simple random walk starting at 0 first hits b or −a is ab. The probability that this walk will hit b before -a steps is $a/(a+b)$ , which can be derived from the fact that simple random walk is a martingale.

Some of the results mentioned above can be derived from properties of Pascal's triangle. The number of different walks of n steps where each step is +1 or −1 is clearly 2ⁿ. For the simple random walk, each of these walks are equally likely. In order for S_n to be equal to a number k it is necessary and sufficient that the number of +1 in the walk exceeds those of −1 by k. Thus, the number of walks which satisfy $S_n=k$ is precisely the number of ways of choosing (n + k)/2 elements from an n element set (for this to be non-zero, it is necessary that n + k be an even number), which is an entry in Pascal's triangle denoted by $n \choose (n+k)/2$ . Therefore, the probability that $S_n=k$ is equal to $2^{-n}{n\choose (n+k)/2}$ . By representing entries of Pascal's triangle in terms of factorials and using Stirling's formula, one can obtain good estimates for these probabilities for large values of $n$ .

This relation with Pascal's triangle is easily demonstrated for small values of n. At zero turns, the only possibility will be to remain at zero. However, at one turn, you can move either to the left or the right of zero, meaning there is one chance of landing on −1 or one chance of landing on 1. At two turns, you examine the turns from before. If you had been at 1, you could move to 2 or back to zero. If you had been at −1, you could move to −2 or back to zero. So there is one chance of landing on −2, two chances of landing on zero, and one chance of landing on 2.

n	-5	-4	-3	-2	-1	0	1	2	3	4	5
$P[S_0=k]$						1
$2P[S_1=k]$					1		1
$2^2P[S_2=k]$				1		2		1
$2^3P[S_3=k]$			1		3		3		1
$2^4P[S_4=k]$		1		4		6		4		1
$2^5P[S_5=k]$	1		5		10		10		5		1

The central limit theorem and the law of the iterated logarithm describe important aspects of the behavior of simple random walk on $\mathbb Z$ .

Gaussian random walk

A random walk having a step size that varies according to a normal distribution is used as a model for real-world time series data such as financial markets. The Black-Scholes formula for modeling equity option prices, for example, uses a gaussian random walk as an underlying assumption.

Here, the step size is the inverse cumulative normal distribution $\Phi^{-1}(z,\mu,\sigma)$ where 0 ≤ z ≤ 1 is a uniformly distributed random number, and μ and σ are the mean and standard deviations of the normal distribution, respectively.

For steps distributed according to any distribution with a finite variance (not necessarily just a normal distribution), the root mean squared expected translation distance after n steps is

$E|S_n| = \sigma \sqrt{n}$ .

Higher dimensions

Random walk in two dimensions.

Random walk in two dimensions with more, and smaller, steps. In the limit, for very small steps, one obtains Brownian motion.

Imagine now a drunkard walking randomly in a city. The city is realistically infinite and arranged in a square grid, and at every intersection, the drunkard chooses one of the four possible routes (including the one he came from) with equal probability. Formally, this is a random walk on the set of all points in the plane with integer coordinates. Will the drunkard ever get back to his home from the bar? It turns out that he will. This is the high dimensional equivalent of the level crossing problem discussed above. The probability of returning to the origin decreases as the number of dimensions increases. In three dimensions, the probability decreases to roughly 34%. A derivation, along with values of p(d) are discussed here: http://mathworld.wolfram.com/PolyasRandomWalkConstants.html.

The trajectory of a random walk is the collection of sites it visited, considered as a set with disregard to when the walk arrived at the point. In one dimension, the trajectory is simply all points between the minimum height the walk achieved and the maximum (both are, on average, on the order of √n). In higher dimensions the set has interesting geometric properties. In fact, one gets a discrete fractal, that is a set which exhibits stochastic self-similarity on large scales, but on small scales one can observe "jaggedness" resulting from the grid on which the walk is performed. The two books of Lawler referenced below are a good source on this topic.

Three random walks in three dimensions.

Random walk on graphs

Assume now that our city is no longer a perfect square grid. When our drunkard reaches a certain junction he picks between the various available roads with equal probability. Thus, if the junction has seven exits the drunkard will go to each one with probability one seventh. This is a random walk on a graph. Will our drunkard reach his home? It turns out that under rather mild conditions, the answer is still yes. For example, if the lengths of all the blocks are between a and b (where a and b are any two finite positive numbers), then the drunkard will, almost surely, reach his home. Notice that we do not assume that the graph is planar, i.e. the city may contain tunnels and bridges. One way to prove this result is using the connection to electrical networks. Take a map of the city and place a one ohm resistor on every block. Now measure the "resistance between a point and infinity". In other words, choose some number R and take all the points in the electrical network with distance bigger than R from our point and wire them together. This is now a finite electrical network and we may measure the resistance from our point to the wired points. Take R to infinity. The limit is called the resistance between a point and infinity. It turns out that the following is true (an elementary proof can be found in the book by Doyle and Snell):

Theorem: a graph is transient if and only if the resistance between a point and infinity is finite. It is not important which point is chosen if the graph is connected.

In other words, in a transient system, one only needs to overcome a finite resistance to get to infinity from any point. In a recurrent system, the resistance from any point to infinity is infinite.

This characterization of recurrence and transience is very useful, and specifically it allows us to analyze the case of a city drawn in the plane with the distances bounded.

A random walk on a graph is a very special case of a Markov chain. Unlike a general Markov chain, random walk on a graph enjoys a property called time symmetry or reversibility. Roughly speaking, this property, also called the principle of detailed balance, means that the probabilities to traverse a given path in one direction or in the other have a very simple connection between them (if the graph is regular, they are just equal). This property has important consequences.

Starting in the 1980s, much research has gone into connecting properties of the graph to random walks. In addition to the electrical network connection described above, there are important connections to isoperimetric inequalities, see more here, functional inequalities such as Sobolev and Poincaré inequalities and properties of solutions of Laplace's equation. A significant portion of this research was focused on Cayley graphs of finitely generated groups. For example, the proof of Dave Bayer and Persi Diaconis that 7 riffle shuffles are enough to mix a pack of cards (see more details under shuffle) is in effect a result about random walk on the group S_n, and the proof uses the group structure in an essential way. In many cases these discrete results carry over to, or are derived from Manifolds and Lie groups.

A good reference for random walk on graphs is the online book by Aldous and Fill. For groups see the book of Woess. If the transition kernel $p(x,y)$ is itself random (based on an environment $\omega$ ) then the random walk is called a "random walk in random environment". When the law of the random walk includes the randomness of $\omega$ , the law is called the annealed law; on the other hand, is $\omega$ is seen as fixed, the law is called a quenched law. See the book of Hughes or the lecture notes of Zeitouni.

We can think about choosing every possible edge with the same probability as maximizing uncertainty (entropy) locally. We could also do it globally – in maximal entropy random walk (MERW) we want all paths to be equally probable, or in other words: for each two vertexes, each path of given length is equally probable. This random walk has much stronger localization properties.

Relation to Wiener process

Simulated steps approximating a Wiener process in two dimensions.

A Wiener process is a stochastic process with similar behaviour to Brownian motion, the physical phenomenon of a minute particle diffusing in a fluid. (Sometimes the Wiener process is called "Brownian motion", although this is strictly speaking a confusion of a model with the phenomenon being modeled.)

A Wiener process is the scaling limit of random walk in dimension 1. This means that if you take a random walk with very small steps you get an approximation to a Wiener process (and, less accurately, to Brownian motion). To be more precise, if the step size is ε, one needs to take a walk of length L/ε² to approximate a Wiener process walk of length L. As the step size tends to 0 (and the number of steps increases proportionally) random walk converges to a Wiener process in an appropriate sense. Formally, if B is the space of all paths of length L with the maximum topology, and if M is the space of measure over B with the norm topology, then the convergence is in the space M. Similarly, a Wiener process in several dimensions is the scaling limit of random walk in the same number of dimensions.

A random walk is a discrete fractal (a function with integer dimensions; 1, 2, ...), but a Wiener process trajectory is a true fractal, and there is a connection between the two. For example, take a random walk until it hits a circle of radius r times the step length. The average number of steps it performs is r². This fact is the discrete version of the fact that a Wiener process walk is a fractal of Hausdorff dimension 2 ^[2]. In two dimensions, the average number of points the same random walk has on the boundary of its trajectory is r^4/3. This corresponds to the fact that the boundary of the trajectory of a Wiener process is a fractal of dimension 4/3, a fact predicted by Mandelbrot using simulations but proved only in 2000 (Science, 2000).

A Wiener process enjoys many symmetries random walk does not. For example, a Wiener process walk is invariant to rotations, but random walk is not, since the underlying grid is not (random walk is invariant to rotations by 90 degrees, but Wiener processes are invariant to rotations by, for example, 17 degrees too). This means that in many cases, problems on random walk are easier to solve by translating them to a Wiener process, solving the problem there, and then translating back. On the other hand, some problems are easier to solve with random walks due to its discrete nature.

Random walk and Wiener process can be coupled, namely manifested on the same probability space in a dependent way that forces them to be quite close. The simplest such coupling is the Skorokhod embedding, but other, more precise couplings exist as well.

The convergence of a random walk toward the Wiener process is controlled by the central limit theorem. For a particle in a known fixed position at t = 0, the theorem tells us that after a large number of independent steps in the random walk, the walker's position is distributed according to a normal distribution of total variance:

$\sigma^2 = \frac{t}{\delta t}\,\varepsilon^2,$

where t is the time elapsed since the start of the random walk, $\varepsilon$ is the size of a step of the random walk, and $\delta t$ is the time elapsed between two successive steps.

This corresponds to the Green function of the diffusion equation that controls the Wiener process, which demonstrates that, after a large number of steps, the random walk converges toward a Wiener process.

In 3D, the variance corresponding to the Green's function of the diffusion equation is:

$\sigma^2 = 6\,D\,t$

By equalizing this quantity with the variance associated to the position of the random walker, one obtains the equivalent diffusion coefficient to be considered for the asymptotic Wiener process toward which the random walk converges after a large number of steps:

$D = \frac{\varepsilon^2}{6 \delta t}$ (valid only in 3D)

Remark: the two expressions of the variance above correspond to the distribution associated to the vector $\vec R$ that links the two ends of the random walk, in 3D. The variance associated to each component $R_x$ , $R_y$ or $R_z$ is only one third of this value (still in 3D).

Self-interacting random walks

There are a number of interesting models of random paths in which each step depends on the past in a complicated manner. All are more difficult to analyze than the usual random walk — some notoriously so. For example

The self-avoiding walk. See the Madras and Slade book.
The loop-erased random walk. See the two books of Lawler.
The reinforced random walk. See the review by Robin Pemantle.
The exploration process.

Applications

Antony Gormley's Quantum Cloud sculpture in London was designed by a computer using a random walk algorithm.

The following are the applications of random walk:

In economics, the "random walk hypothesis" is used to model shares prices and other factors. Empirical studies found some deviations from this theoretical model, especially in short term and long term correlations. See share prices.
In population genetics, random walk describes the statistical properties of genetic drift
In physics, random walks are used as simplified models of physical Brownian motion and the random movement of molecules in liquids and gases. See for example diffusion-limited aggregation. Also in physics, random walks and some of the self interacting walks play a role in quantum field theory.
In mathematical ecology, random walks are used to describe individual animal movements, to empirically support processes of biodiffusion, and occasionally to model population dynamics.
In polymer physics, random walk describes an ideal chain. It is the simplest model to study polymers.
In other fields of mathematics, random walk is used to calculate solutions to Laplace's equation, to estimate the harmonic measure, and for various constructions in analysis and combinatorics.
In computer science, random walks are used to estimate the size of the Web. In the World Wide Web conference-2006, bar-yossef et al. published their findings and algorithms for the same. (This was awarded the best paper for the year 2006).
In image segmentation, random walks are used to determine the labels (i.e., "object" or "background") to associate with each pixel^[3]. This algorithm is typically referred to as the random walker segmentation algorithm.

In all these cases, random walk is often substituted for Brownian motion.

In brain research, random walks and reinforced random walks are used to model cascades of neuron firing in the brain.
In vision science, fixational eye movements are well described by a random walk.
In psychology, random walks explain accurately the relation between the time needed to make a decision and the probability that a certain decision will be made. (Nosofsky, 1997)
Random walk can be used to sample from a state space which is unknown or very large, for example to pick a random page off the internet or, for research of working conditions, a random worker in a given country.

When this last approach is used in computer science it is known as Markov Chain Monte Carlo or MCMC for short. Often, sampling from some complicated state space also allows one to get a probabilistic estimate of the space's size. The estimate of the permanent of a large matrix of zeros and ones was the first major problem tackled using this approach.

In wireless networking, random walk is used to model node movement.
Motile bacteria engage in a biased random walk.
Random walk is used to model gambling.
In physics, random walks underlying the method of Fermi estimation.
During World War II a random walk was used to model the distance that an escaped prisoner of war would travel in a given time.

Probabilistic interpretation

A one-dimensional random walk can also be looked at as a Markov chain whose state space is given by the integers $i=0,\pm 1,\pm 2,\dots$ , for some number $\,0 < p < 1$ , $\,P_{i,i+1}=p=1-P_{i,i-1}$ . We can call it a random walk because we may think of it as being a model for an individual walking on a straight line who at each point of time either takes one step to the right with probability p or one step to the left with probability 1 − p.

A random walk is a simple stochastic process.

Properties of random walks

Where R is the average end-to-end distance, R² is the average square of the end-to-end distance, N is the length of the walk, and b is the step size.

Simple random walk

Dimension	R²	Transient
1	Nb²	No
2	Nb²	No
3	...	Yes

Non-reversal random walk

Dimension	R	R²
2	?	2Nb²
3	?	(3/2)Nb²

References

↑ Pearson, K. (1905). The problem of the Random Walk. Nature. 72, 294.
↑ Hence the drunkard's random walk would eventually cover all of the city streets (2 Euclidean dimensions) and he will eventually return home, whereas the bird taking a 'random walk' flight through the air (3 Euclidean dimensions) will not cover all space, and will not return to their starting point.
↑ Leo Grady (2006): "Random Walks for Image Segmentation", IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1768-1783, Vol. 28, No. 11

Bibliography

David Aldous and Jim Fill, Reversible Markov Chains and Random Walks on Graphs, http://stat-www.berkeley.edu/users/aldous/RWG/book.html
Doyle, Peter G.; Snell, J. Laurie (1984), Random walks and electric networks, Carus Mathematical Monographs, 22, Mathematical Association of America, MR 920811, ISBN 978-0-88385-024-4, http://arxiv.org/abs/math.PR/0001057
William Feller (1968), An Introduction to Probability Theory and its Applications (Volume 1). ISBN 0-471-25708-7

Chapter 3 of this book contains a thorough discussion of random walks, including advanced results, using only elementary tools.

Barry D. Hughes (1996), Random walks and random environments, Oxford University Press. ISBN 0-19-853789-1
Gregory Lawler (1996), Intersection of random walks, Birkhäuser Boston. ISBN 0-8176-3892-X
Gregory Lawler, Conformally Invariant Processes in the Plane, http://www.math.cornell.edu/~lawler/book.ps
Neal Madras and Gordon Slade (1996), The Self-Avoiding Walk, Birkhäuser Boston. ISBN 0-8176-3891-1
James Norris (1998), Markov Chains, Cambridge University Press. ISBN 0-5216-3396-6
Springer Pólya (1921), "Über eine Aufgabe der Wahrscheinlichkeitstheorie betreffend die Irrfahrt im Strassennetz", Mathematische Annalen, 84(1-2):149–160, March 1921.

Robin Pemantle (2007), A survey of random processes with reinforcement.

Pal Révész (1990), Random walk in random and non-random environments, World Scientific Pub Co. ISBN 981-02-0237-7
Wolfgang Woess (2000), Random walks on infinite graphs and groups, Cambridge tracts in mathematics 138, Cambridge University Press. ISBN 0-521-55292-3
The XScreenSaver has a hack wander that shows random walk on the plane with the color changing with time.
Mackenzie, Dana, "Taking the Measure of the Wildest Dance on Earth", Science, Vol. 290, 8 December 2000.
"Numb3rs Blog." Department of Mathematics. 29 April 2006. Northeastern University. 12 December 2007 http://www.atsweb.neu.edu/math/cp/blog/?id=137&month=04&year=2006&date=2006-04-29.