Prove that every sorting algorithm must make at least lg(n!) comparisons
P.S. My proof attempt/rewrite based on the solution discussed, uploaded mainly for my own reference.(warning: possibly erroneous)
Claim: Every sorting algorithm must make at least lg(n!) comparisons
- the log here is of base 2
- Prove by making an adversary argument
- When we talk about running time, input size is an important consideration.
- The adversary wants to make life difficult for the algorithm, hence it will make choices that are against the algorithm's interest.
- Suppose we are given an array of n integers (input size = n), there are n! unique permutations of the set
{1, ..., n}
.- They are unique because if they are the same, then sorting the same array will take the same steps.
- The adversary could chose any of the permutations. It does not fix the input in advance in the hope that if the algorithm makes too few comparisons, there will be an example of how the algorithm fails to sort the array. In particular, there will be two inputs that are treated the same from the algorithm's perspective. And yet, one of them will not be correctly sorted.
- What the (sorting) algorithm can do is to issue a comparison query to ask whether the integer at the ith position is larger than the integer at the jth position.
- The adversary takes the following strategy to respond:
- It looks through the current set of permutations(P), makes the comparison as required and takes note of the number(X) of permutations in P that satisfy the query i.e the ith integer is indeed larger than the jth integer.
- It takes n comparisons in the very first run, since there are n permutations in P.
- Suppose the number(X) of permutations that satisfy the query is larger or equal to the total size of the permutations / 2, the adversary returns YES and update the permutation set(P) to the satifying set.
- Else (meaning less than half of the permutations satisfy the comparison query, more than half does not satisfy the comparison query), the adversary replies NO and set the permutations(P) to the larger set of permutations that does not satisfy the comparison query.
- Either way the set of permutations to be examined next decreases. With the above strategy, the adversary is able to decrease at a slower rate, as it always try to keep at least half of the permutations. This makes sense because the goal of the adversary as mentioned earlier, is to come to a situation where there are more than one permutations left.
- Since the permutations decreases by at most half for every comparison query issued by the algorithm, there are at least two distinct permutations left. The algorithm will re-order both of the permutations in the same way, and must make one mistake. As all permutations are distinct, it is not possible to sort and re-order two different permutations in the same way to get the correct output.
Concrete Example
- n = 3
- current set of permutations =
{[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]}
- 6 different permutations
- The algorithm will make less than lg(3!) = lg(6) = 2.584963 = ceil(2.584963) = 3 comparisons.
- meaning it asks at most 2 comparison queries
- trying to sort 3 items with 2 comparisons is not always possible (but ok for finding maximum)
- The algorithm asks(0 indexed): is 0th > 1st?
- The adversary checks the current set:
- [1, 2, 3] No
- [1, 3, 2] No
- [2, 1, 3] Yes
- [2, 3, 1] No
- [3, 1, 2] Yes
- [3, 2, 1] Yes
- X = 3 (count of permutations that satisfy the comparison query)
- The adversary answers Yes (Or No, but it does not matter)
- The current set becomes
{[2, 1, 3], [3, 1, 2], [3, 2, 1]}
- The algorithm asks(0 indexed): is 1st > 2nd?
- The adversary checks the current set:
- [2, 1, 3] No
- [3, 1, 2] No
- [3, 2, 1] Yes
- X = 1
- The adversary answers No (to go against the algorithm's interest)
- The current set becomes
{[2, 1, 3], [3, 1, 2]}
- The algorithm now makes a decision about how to sort the array, given the response.
- To recall,
- 0th > 1st is True
- 1st > 2nd is False
- So the algorithm does not know whether 0th is larger or smaller than 2nd. Suppose the algorithm choose 0th to be larger and arrange as such: 0th > 2nd > 1st
- In the available set, [3,1,2] will be sorted correctly. However, the set [2, 1, 3] does not.
- [2, 1, 3] becomes [2, 3, 1], which is still not sorted!
- The algorithm fails at sorting with less than lg(n!) comparisons.