 # Honesty Is Needed

Computations have no choice but to manipulate representations of objects rather than the objects themselves. Most often, strings of symbols taken from some finite alphabet are used for the purpose. Numbers, for example, are usually denoted by sequences of decimal symbols, or binary bits, or unary strokes (like the tally numbers of Paleolithic times). In logic, one therefore distinguishes between numbers n, which reside in an ideal Platonic world, and numerals n̄, their symbolic representations as (first-order) terms. Similarly, graphs, which are set-theoretic objects, are typically represented either as lists of edges (pairs of nodes) or as binary adjacency matrices.

Given that representation is an inescapable necessity, some natural questions arise immediately:

• How much of a difference can the choice of representation make to computability or complexity measurements?

Answer: It can make all the difference between computable and incomputable, or between tractable and intractable.

• Who gets to choose the representation: Abe who formulates the queries, or Cay who designs the program to answer them?

Our answer: Cay may reinterpret Abe’s formulation any way she sees fit, but the reinterpretation is part and parcel of the process of answering.

• What is wrong with a representation of graphs that lists nodes in the order of a Hamiltonian path, if there is one (in which case deciding the question takes linear time)?

Answer: Cay will only be able to quickly answer the specific question whether there is a Hamiltonian path, whereas she would have a much harder time performing basic graph operations, such as adding an edge.

• Is it legitimate to say that the parity of an integer (that is, whether it is even or odd) can be determined in constant time, when that is the case only for very specific representations of numbers (namely, least-significant-first binary, as opposed to ternary, say)?
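To make the last question concrete, here is a minimal sketch (not from the text; the function names are ours) contrasting parity tests under three representations: least-significant-first binary, where one symbol suffices; unary, where every stroke must be counted; and ternary, where every digit contributes to parity.

```python
# Illustrative sketch: parity is O(1) under least-significant-first binary,
# but requires reading the entire input under unary or ternary.

def parity_lsb_binary(bits: str) -> str:
    """bits[0] is the least significant bit; one lookup decides parity."""
    return "odd" if bits[0] == "1" else "even"

def parity_unary(strokes: str) -> str:
    """Unary tally: every stroke must be counted (linear time)."""
    return "odd" if len(strokes) % 2 == 1 else "even"

def parity_ternary(digits: str) -> str:
    """Since 3^k is odd, the number's parity equals the parity of its
    digit sum -- so every digit must be read."""
    return "odd" if sum(int(d) for d in digits) % 2 == 1 else "even"

# The number 6: "011" in least-significant-first binary, six strokes
# in unary, "20" in ternary.
assert parity_lsb_binary("011") == "even"
assert parity_unary("|" * 6) == "even"
assert parity_ternary("20") == "even"
```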

Garey and Johnson [15, pp. 9-10] address the questions of representation and computational models as they impact the measurement of computational complexity. They assert upfront that it matters little, as long as one sticks to what is considered “reasonable”:

The intractability of a problem turns out to be essentially independent of the particular encoding scheme and computer model used for determining time complexity.

They go on to explain why at length:

Let us first consider encoding schemes. Suppose for example that we are dealing with a problem in which each instance is a graph.... Such an instance might be described by simply listing all the vertices and edges, or by listing the rows of the adjacency matrix for the graph, or by listing for each vertex all the other vertices sharing a common edge with it (a “neighbor” list). Each of these encodings can give a different input length for the same graph. However, it is easy to verify that the input lengths they determine differ at most polynomially from one another, so that any algorithm having polynomial time complexity under one of these encoding schemes also will have polynomial time complexity under all the others. In fact, the standard encoding schemes used in practice for any particular problem always seem to differ at most polynomially from one another. It would be difficult to imagine a “reasonable” encoding scheme for a problem that differs more than polynomially from the standard ones.
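The polynomial relationship among the three encodings Garey and Johnson mention can be made concrete with a small sketch (ours, not theirs; the length formulas below are simple symbol counts under obvious conventions):

```python
# Illustrative sketch: three standard encodings of the same graph on n
# vertices, whose input lengths differ at most polynomially in one another.

def edge_list_length(n, edges):
    # list the vertices, then each edge as a pair: O(n + |E|) symbols
    return n + 2 * len(edges)

def adjacency_matrix_length(n, edges):
    # n rows of n bits: O(n^2) symbols
    return n * n

def neighbor_lists_length(n, edges):
    # each vertex heads a list; each edge appears in two lists: O(n + |E|)
    return n + 2 * len(edges)

# A 4-cycle on vertices 0..3
n, edges = 4, [(0, 1), (1, 2), (2, 3), (3, 0)]
sizes = [edge_list_length(n, edges),
         adjacency_matrix_length(n, edges),
         neighbor_lists_length(n, edges)]
# Each size is bounded by a (quadratic) polynomial of any other,
# so polynomial-time under one encoding implies it under the others.
assert max(sizes) <= min(sizes) ** 2
```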

This discussion is followed by a caveat:

Although what we mean here by “reasonable” cannot be formalized, the following two conditions capture much of the notion:

• (1) the encoding of an instance I should be concise and not “padded” with unnecessary information or symbols, and
• (2) numbers occurring in I should be represented in binary (or decimal, or octal, or in any fixed base other than 1).

If we restrict ourselves to encoding schemes satisfying these conditions, then the particular encoding scheme used should not affect the determination of whether a given problem is intractable.

The main concern expressed in the above is that the input size should faithfully reflect the complexity of the input object. The choice of size can make a big difference, of course:

The computational complexity of a problem should not be obscured by a particular representation scheme. Many problems are “fast” under the unary representation, as many computationally (probably) intractable problems in number theory are also “fast” under unary representation, such as factoring, discrete logarithm. But that is not honest complexity theory. The time is really exponential, compared to a more “reasonable” representation scheme of the information, such as in binary. [Italics ours.]
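The exponential gap the quote refers to is easy to exhibit: the unary representation of n has length n, which is exponential in the length of its binary representation (about log₂ n). The following sketch (ours) makes the comparison explicit.

```python
# Sketch: unary length is exponential in binary length, so an algorithm
# that is "polynomial in the input length" under unary may be exponential
# under binary -- the dishonesty the quoted passage warns about.

def unary_length(n: int) -> int:
    return n                   # n strokes

def binary_length(n: int) -> int:
    return n.bit_length()      # about log2(n) bits

n = 2 ** 20
assert binary_length(n) == 21
# unary length = 2^(binary length - 1) for powers of two
assert unary_length(n) == 2 ** (binary_length(n) - 1)
```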

There are other ways in which a choice of representation may be unreasonable, besides being unnecessarily large. It could give away the answer (even if the sizes differ only polynomially), or it may harbor hints that make the task easier than it really is. That is the problem with a representation of graphs that lists nodes in Hamiltonian order, for example; it puts the solution, when there is one, right in front of one’s nose. Our proposal for measuring complexity honestly will solve this problem by taking into consideration both the cost of computing a given function and the cost of generating the function’s inputs (the nodes and edges of a graph in the Hamiltonian case).
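How such a representation gives the answer away can be seen in a short sketch (ours; the function name is hypothetical): if the encoding promises that the node list is in Hamiltonian order whenever a Hamiltonian path exists, then “deciding” the question reduces to a single linear scan checking that consecutive nodes are adjacent.

```python
# Sketch: under a representation that lists nodes in Hamiltonian order
# (when a Hamiltonian path exists), deciding the question is a linear scan.

def has_hamiltonian_path(nodes, edge_set):
    """Check that consecutive nodes in the given order are adjacent."""
    return all((u, v) in edge_set or (v, u) in edge_set
               for u, v in zip(nodes, nodes[1:]))

# Path graph 0-1-2-3, with nodes listed in path order by the representation
edges = {(0, 1), (1, 2), (2, 3)}
assert has_hamiltonian_path([0, 1, 2, 3], edges) is True
# The same graph with nodes listed in a different order fails the scan
assert has_hamiltonian_path([0, 2, 1, 3], edges) is False
```

The scan is honest only if producing the node list in that order was itself cheap; charging for the cost of generating the input is exactly what the proposal above is about.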

This chapter looks at questions of “honesty” of representation in various contexts. We begin with what we feel is the underlying problem posed by representations, namely, the camouflaging of extra information (Sect. 6.2). After proposing a solution (Sects. 6.3 and 6.4), we consider how it resolves the problem of honest computability (Sects. 6.5 and 6.6) and relate honesty to Martin Davis’s definition of universality (Sect. 6.7). Then we turn to see how this proposal also solves the problem of honest complexity (Sect. 6.8). With a solution in place, we analyze why considering formal languages, rather than functions, does not work (Sect. 6.9).