Post correspondence problem

Not to be confused with the other Post's problem on the existence of incomparable r.e. Turing degrees.
Not to be confused with PCP theorem.

The Post correspondence problem is an undecidable decision problem that was introduced by Emil Post in 1946.[1] Because it is simpler than the halting problem and the Entscheidungsproblem, it is often used in proofs of undecidability.

Definition of the problem

The input of the problem consists of two finite lists \alpha_{1}, \ldots, \alpha_{N} and \beta_{1}, \ldots, \beta_{N} of words over some alphabet A having at least two symbols. A solution to this problem is a sequence of indices (i_k)_{1 \le k \le K} with K \ge 1 and  1 \le i_k \le N for all k, such that

\alpha_{i_1} \ldots \alpha_{i_K} = \beta_{i_1} \ldots \beta_{i_K}.

The decision problem then is to decide whether such a solution exists or not.
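Checking a proposed solution is straightforward, even though deciding whether one exists is not. The following Python sketch shows such a check. The function name `is_solution` and the tiny two-pair instance are illustrative assumptions, not part of Post's formulation.

```python
from typing import Sequence


def is_solution(alpha: Sequence[str], beta: Sequence[str],
                indices: Sequence[int]) -> bool:
    """Return True if `indices` (1-based, as in the definition) is a solution."""
    if len(alpha) != len(beta):
        raise ValueError("both lists must have the same length N")
    n = len(alpha)
    # A solution needs K >= 1 and every index in the range 1..N.
    if len(indices) < 1 or any(not 1 <= i <= n for i in indices):
        return False
    top = "".join(alpha[i - 1] for i in indices)
    bottom = "".join(beta[i - 1] for i in indices)
    return top == bottom


# Hypothetical instance, chosen for illustration only.
alpha = ["a", "ab"]
beta = ["aa", "b"]
print(is_solution(alpha, beta, [1, 2]))  # True: a + ab == aa + b
print(is_solution(alpha, beta, [2, 1]))  # False: aba != baa
```

The empty sequence is rejected explicitly, since the definition requires K \ge 1; otherwise every instance would trivially have a solution.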

Example instances of the problem

Example 1

Consider the following two lists:

i    1    2    3
αi   a    ab   bba
βi   baa  aa   bb

A solution to this problem would be the sequence (3, 2, 3, 1), because

\alpha_3 \alpha_2 \alpha_3 \alpha_1 = bba + ab + bba + a = bbaabbbaa = bb + aa + bb + baa = \beta_{3} \beta_{2} \beta_{3} \beta_{1}.

Furthermore, since (3, 2, 3, 1) is a solution, so are all of its "repetitions", such as (3, 2, 3, 1, 3, 2, 3, 1), etc.; that is, when a solution exists, there are infinitely many solutions of this repetitive kind.
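The claim about repetitions can be checked mechanically. This short Python sketch concatenates the words of Example 1 for (3, 2, 3, 1) repeated one, two and three times; the helper name `concat` is an assumption made for illustration.

```python
alpha = ["a", "ab", "bba"]
beta = ["baa", "aa", "bb"]


def concat(words, indices):
    """Concatenate the words selected by 1-based indices."""
    return "".join(words[i - 1] for i in indices)


base = [3, 2, 3, 1]
for repeats in range(1, 4):
    indices = base * repeats
    top = concat(alpha, indices)
    bottom = concat(beta, indices)
    # Each repetition appends the same matched string to both sides.
    print(repeats, top == bottom, top)
```

Repeating a solution appends the same string, here bbaabbbaa, to both the top and the bottom, so equality is preserved for any number of repetitions.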

However, if the two lists had consisted of only \alpha_2, \alpha_3 and \beta_{2}, \beta_{3}, then there would have been no solution: every concatenation of these α words ends in two different letters (ab or ba), whereas every concatenation of these β words ends in a doubled letter (aa or bb).
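A bounded search illustrates both cases. The sketch below is a breadth-first search over the unmatched "overhang" between the top and bottom strings. The function name `find_solution` and the bound `max_len` are assumptions made for illustration. Because the problem is undecidable, a result of `None` generally means only that no solution was found within the bound, not that none exists.

```python
from collections import deque
from typing import List, Optional, Sequence, Tuple


def find_solution(alpha: Sequence[str], beta: Sequence[str],
                  max_len: int) -> Optional[List[int]]:
    """Breadth-first search for a shortest solution of length <= max_len.

    Returns 1-based indices, or None if nothing is found within the bound.
    """
    # A state is the unmatched overhang (top_excess, bottom_excess);
    # at most one of the two strings is non-empty.
    start: Tuple[str, str] = ("", "")
    queue = deque([(start, [])])
    seen = set()  # overhangs already reached by an equally short sequence
    while queue:
        (top, bottom), path = queue.popleft()
        if len(path) == max_len:
            continue  # bound reached, do not extend further
        for i, (a, b) in enumerate(zip(alpha, beta), start=1):
            new_top, new_bottom = top + a, bottom + b
            # One string must be a prefix of the other, else a dead end.
            if new_top.startswith(new_bottom):
                state = (new_top[len(new_bottom):], "")
            elif new_bottom.startswith(new_top):
                state = ("", new_bottom[len(new_top):])
            else:
                continue
            if state == start:
                return path + [i]  # top and bottom match exactly
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [i]))
    return None


print(find_solution(["a", "ab", "bba"], ["baa", "aa", "bb"], 10))  # [3, 2, 3, 1]
print(find_solution(["ab", "bba"], ["aa", "bb"], 50))              # None
```

For the restricted instance the search runs out of reachable overhangs after a few steps, which agrees with the argument above. In general the set of overhangs is unbounded, which is why the bound is needed.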

A convenient way to view an instance of a Post correspondence problem is as a collection of blocks of the form

αi
βi

there being an unlimited supply of each type of block. Thus the above example is viewed as

a
baa

i = 1

ab
aa

i = 2

bba
bb

i = 3

A solution corresponds to a way of laying blocks next to each other so that the string read along the top cells equals the string read along the bottom cells. The solution (3, 2, 3, 1) to the above example then corresponds to:

bba
bb

i1 = 3

ab
aa

i2 = 2

bba
bb

i3 = 3

a
baa

i4 = 1

Example 2

Again using blocks to represent an instance of the problem, the following is an example that has infinitely many solutions in addition to the kind obtained by merely "repeating" a solution.

bb
b

1

ab
ba

2

c
bc

3

In this instance, every sequence of the form (1, 2, 2, ..., 2, 3) is a solution (in addition to all their repetitions); for example, (1, 2, 2, 2, 3):

bb
b

1

ab
ba

2

ab
ba

2

ab
ba

2

c
bc

3
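The family of solutions in Example 2 can be checked for several lengths at once. In this Python sketch, `n` is the number of copies of block 2; the helper name `concat` is an assumption made for illustration.

```python
alpha = ["bb", "ab", "c"]
beta = ["b", "ba", "bc"]


def concat(words, indices):
    """Concatenate the words selected by 1-based indices."""
    return "".join(words[i - 1] for i in indices)


for n in range(1, 5):
    indices = [1] + [2] * n + [3]
    top = concat(alpha, indices)
    bottom = concat(beta, indices)
    # top is bb (ab)^n c and bottom is b (ba)^n bc, which are the same string.
    print(indices, top, top == bottom)
```

The top string is bb(ab)^n c and the bottom string is b(ba)^n bc; regrouping the letters shows that the two coincide for every n.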

Proof sketch of undecidability

The most common proof for the undecidability of PCP describes an instance of PCP that can simulate the computation of a Turing machine on a particular input. A match will occur if and only if the input would be accepted by the Turing machine. Because deciding if a Turing machine will accept an input is a basic undecidable problem, PCP cannot be decidable either. The following discussion is based on Michael Sipser's textbook Introduction to the Theory of Computation.[2]

In more detail, the idea is that the string along the top and bottom will be a computation history of the Turing machine's computation. This means it will list a string describing the initial state, followed by a string describing the next state, and so on until it ends with a string describing an accepting state. The state strings are separated by some separator symbol (usually written #). According to the definition of a Turing machine, the full state of the machine consists of three parts: the current contents of the tape, the current state of the finite control that operates the tape head, and the current position of the tape head on the tape.

Although the tape has infinitely many cells, only some finite prefix of these will be non-blank. We write these down as part of our state. To describe the state of the finite control, we create new symbols, labelled q1 through qk, for each of the finite state machine's k states. We insert the correct symbol into the string describing the tape's contents at the position of the tape head, thereby indicating both the tape head's position and the current state of the finite control. For the alphabet {0,1}, a typical state might look something like:

101101110q700110.

A simple computation history would then look something like this:

q0101#1q401#11q21#1q810.
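The encoding just described can be sketched in a few lines of Python. The names `encode_config` and `encode_history` are assumptions, and the tape contents and head positions below are read off the example history rather than taken from a given machine description.

```python
from typing import List, Tuple

# (tape contents, state name, head position counted from 0)
Config = Tuple[str, str, int]


def encode_config(tape: str, state: str, head: int) -> str:
    """Insert the state symbol into the tape string at the head position."""
    if not 0 <= head <= len(tape):
        raise ValueError("head position outside the written part of the tape")
    return tape[:head] + state + tape[head:]


def encode_history(configs: List[Config], sep: str = "#") -> str:
    """Join the encoded states with the separator symbol."""
    return sep.join(encode_config(*c) for c in configs)


# The four states of the example history above.
history = [("101", "q0", 0), ("101", "q4", 1), ("111", "q2", 2), ("110", "q8", 1)]
print(encode_history(history))  # q0101#1q401#11q21#1q810
```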

We start out with this block, where x is the input string and q0 is the start state:

 
q0x#

The top starts out "lagging" the bottom by one state, and keeps this lag until the very last stage. Next, for each symbol a in the tape alphabet, as well as #, we have a "copy" block, which copies it unmodified from one state to the next:

a
a

We also have a block for each possible transition the machine can make, showing how the tape head moves, how the state of the finite control changes, and what happens to the surrounding symbols. For example, here the tape head is over a 0 in state 4, and then writes a 1 and moves right, changing to state 7:

q40
1q7

Finally, when the top reaches an accepting state, the bottom needs a chance to finally catch up to complete the match. To allow this, we extend the computation so that once an accepting state is reached, each subsequent machine step will cause a symbol near the tape head to vanish, one at a time, until none remain. If qf is an accepting state, we can represent this with the following transition blocks, where a is a tape alphabet symbol:

qfa
qf

aqf
qf

There are a number of details to work out, such as dealing with boundaries between states, making sure that our initial block goes first in the match, and so on, but this shows the general idea of how a static tile puzzle can simulate a Turing machine computation.
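The block types described above can be generated from a transition table. The Python sketch below is a simplified illustration under stated assumptions: the name `build_blocks` and the dictionary layout of `delta` are hypothetical, and the left-move pattern (one block per symbol c to the left of the head) follows the usual textbook construction, which the text above does not spell out. Boundary handling and forcing the start block to come first are deliberately omitted.

```python
from typing import Dict, List, Sequence, Tuple

Block = Tuple[str, str]  # (top, bottom)
# (state, symbol read) -> (new state, symbol written, "L" or "R")
Delta = Dict[Tuple[str, str], Tuple[str, str, str]]


def build_blocks(tape_alphabet: Sequence[str], delta: Delta,
                 start: str, accept: str, word: str) -> List[Block]:
    """Return start, copy, transition and erasing blocks (simplified)."""
    blocks: List[Block] = [("", start + word + "#")]  # start block
    # Copy blocks for every tape symbol and for the separator.
    blocks += [(s, s) for s in list(tape_alphabet) + ["#"]]
    for (q, a), (r, b, move) in delta.items():
        if move == "R":
            blocks.append((q + a, b + r))
        elif move == "L":
            # Assumed textbook pattern: the symbol c left of the head is
            # carried along so that the new state can be placed before it.
            blocks += [(c + q + a, r + c + b) for c in tape_alphabet]
        else:
            raise ValueError("move must be 'L' or 'R'")
    # Erasing blocks: the accepting state consumes neighbouring symbols.
    for a in tape_alphabet:
        blocks.append((accept + a, accept))
        blocks.append((a + accept, accept))
    return blocks


# Transitions read off the example history q0101#1q401#11q21#1q810.
delta = {
    ("q0", "1"): ("q4", "1", "R"),
    ("q4", "0"): ("q2", "1", "R"),
    ("q2", "1"): ("q8", "0", "L"),
}
for top, bottom in build_blocks(["0", "1"], delta, "q0", "q8", "101"):
    print(repr(top), "/", repr(bottom))
```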

The previous example

q0101#1q401#11q21#1q810.

is represented as the following solution to the Post correspondence problem:

 
q0101#

q01
1q4

0
0

1
1

#
#

1
1

q40
1q2

1
1

#
#

1
1

1q21
q810

#
#

1q8
q8

1
1

0
0

#
#

q81
q8

0
0

#
#

q80
q8

#
#

q8
 

#
#

...
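The lag between top and bottom can be traced through the listed blocks. The following Python sketch covers the blocks up to and including the pair q80 over q8 and the # copy block that follows it; the remaining blocks are left out because the listing ends with an ellipsis. The helper name `overhang` is an assumption made for illustration.

```python
# (top, bottom) pairs in the order listed above.
blocks = [
    ("", "q0101#"),
    ("q01", "1q4"), ("0", "0"), ("1", "1"), ("#", "#"),
    ("1", "1"), ("q40", "1q2"), ("1", "1"), ("#", "#"),
    ("1", "1"), ("1q21", "q810"), ("#", "#"),
    ("1q8", "q8"), ("1", "1"), ("0", "0"), ("#", "#"),
    ("q81", "q8"), ("0", "0"), ("#", "#"),
    ("q80", "q8"), ("#", "#"),
]


def overhang(prefix):
    """Return the part of the bottom string not yet matched by the top."""
    top = "".join(t for t, _ in prefix)
    bottom = "".join(b for _, b in prefix)
    if not bottom.startswith(top):
        raise ValueError("top is not a prefix of bottom")
    return bottom[len(top):]


# After every block the top is a prefix of the bottom.
for k in range(1, len(blocks) + 1):
    overhang(blocks[:k])

print(overhang(blocks[:5]))  # 1q401#  (one full state ahead)
print(overhang(blocks))      # q8#     (what remains to be closed)
```

During the simulation the bottom stays exactly one state ahead of the top. The erasing blocks then shrink this lead by one symbol per pass, until only q8# is left for the final blocks to close.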

Variants

Many variants of PCP have been considered. One reason is that, when one tries to prove undecidability of some new problem by reducing from PCP, it often happens that the first reduction one finds is not from PCP itself but from an apparently weaker version.

References

  1. E. L. Post (1946). "A variant of a recursively unsolvable problem". Bull. Amer. Math. Soc. 52.
  2. Michael Sipser (2005). "A Simple Undecidable Problem". Introduction to the Theory of Computation (2nd ed.). Thomson Course Technology. pp. 199–205. ISBN 0-534-95097-3.
  3. Salomaa, Arto (1981). Jewels of Formal Language Theory. Pitman Publishing. pp. 74–75. ISBN 0-273-08522-0. Zbl 0487.68064.
  4. V. Halava; M. Hirvensalo; R. de Wolf (2001). "Marked PCP is decidable". Theor. Comp. Sci. (Elsevier Science) 255: 193–204. doi:10.1016/S0304-3975(99)00163-2.
  5. K. Ruohonen (1983). "On some variants of Post's correspondence problem". Acta Informatica (Springer) 19 (4): 357–367. doi:10.1007/BF00290732.
  6. Michael R. Garey; David S. Johnson (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman. p. 228. ISBN 0-7167-1045-5.
  7. Y. Gurevich (1991). "Average case completeness". J. Comp. Sys. Sci. (Elsevier Science) 42 (3): 346–398. doi:10.1016/0022-0000(91)90007-R.
  8. P. Chambart; Ph. Schnoebelen (2007). "Post embedding problem is not primitive recursive, with applications to channel systems". Lecture Notes in Computer Science (Springer) 4855: 265–276. doi:10.1007/978-3-540-77050-3_22. ISBN 978-3-540-77049-7.
  9. Paul C. Bell; Igor Potapov (2010). "On the Undecidability of the Identity Correspondence Problem and its Applications for Word and Matrix Semigroups". International Journal of Foundations of Computer Science (World Scientific) 21.6: 963–978. doi:10.1142/S0129054110007660.
