An algorithm is correct if it always returns the correct result.
Assertions are our main tool for proving algorithms are correct; they state that some condition will be true whenever a particular line of code is reached. The goal is to assert that, when the algorithm finishes running, the result is correct. If this assertion never fails, then the algorithm will never return a wrong result.
An unfortunate problem
Consider the following sorting algorithm; assume the
is_sorted function tests whether
lst is sorted or not.
def infinite_loop_sort(lst): while not is_sorted(lst): # do nothing pass
This loop does nothing! Certainly, it makes no progress towards sorting the list. Yet we can be absolutely sure it never returns a wrong result, because it never returns a result at all!
This is really a problem.
If we write a
return statement, and
assert that the result is correct, the techniques we developed in the previous post actually allow us to “prove” that this clearly incorrect algorithm is “correct”:
def infinite_loop_sort(lst): while not is_sorted(lst): # do nothing pass assert is_sorted(lst) return lst
The “proof” is as follows.
At the end of the loop, the assertion cannot fail, because if
is_sorted(lst) is false, the loop would not have stopped.
So, the algorithm never returns a wrong result.
What we want to prove is that the algorithm always returns a correct result. But assertions only let us say that if the algorithm returns a result, then the result will be correct.(1)
We also need to prove that the algorithm will return a result at all — i.e. that it will eventually terminate, or finish running. In this post, we’ll explore the idea of variants, which are needed to prove that algorithms terminate. The good news is that variants are simpler and easier to use than invariants.
In the game of chess, a pawn can only move forwards. If you keep moving the same pawn, eventually it must reach the end of the board.
The y coordinate gets smaller on every move, so it must eventually reach 0. This proves that the pawn will reach the end of the board in finitely many moves — it can’t get stuck in an “infinite loop”.
In other words, the y coordinate is a variant.
The following algorithm computes the factorial of
def factorial(n): if n < 0: raise ValueError('n must not be negative') r = 1 while n > 0: r *= n n -= 1 return r
In this loop,
n is a variant — it gets smaller on each iteration, and the loop stops when
n is 0.
Therefore, this algorithm will always terminate.
Example: insertion sort
Consider the insertion sort algorithm.
def insertion_sort(lst): n = len(lst) for i in range(n): j = i while j > 0 and lst[j-1] > lst[j]: tmp = lst[j] lst[j] = lst[j-1] lst[j-1] = tmp j -= 1
The inner while loop must terminate, because
j gets smaller on each iteration, and the loop terminates when
j is 0.
If the other condition
lst[j-1] > lst[j] is false, then the loop might terminate sooner — but it will terminate.
j is a loop variant.
In the previous post we proved that the following algorithm never returns a wrong result:
def power(x, n): if n < 0: raise ValueError('n must not be negative') r = 1 i = 0 while i < n: r *= x i += 1 return r
Let’s also prove that the loop must terminate.
This example is a bit different, since there is no variable “counting down”, and the loop stops when
n, not when something reaches 0.
This loop terminates because
i keeps getting bigger, so it must eventually reach
Put another way, the difference
n - i keeps getting smaller.
For example, consider the trace from computing
|n||i||n - i|
n - i gets smaller on every iteration, because
i gets bigger but
n stays the same.
n - i reaches 0, the condition
i < n becomes false and so the loop terminates.
n - i is a loop variant.
Example: binary search
Consider the binary search algorithm:
def binary_search(lst, target): left = 0 right = len(lst) - 1 while left <= right: middle = (left + right) // 2 value = lst[middle] if value == target: return middle elif value < target: left = middle + 1 else: right = middle - 1 return -1
This example is a bit more complicated, because there’s no variable counting up or down to a fixed limit.
Intuitively, binary search should terminate because the “range of possibilities” keeps getting smaller — this range is from
right - left could be a loop variant.
On each iteration, one of three things happens:
- The algorithm might terminate immediately — so, no problem.
middle + 1.
middle - 1.
The loop condition is
left <= right, and
middle is halfway between them.
left gets bigger, or
right gets smaller — either way,
right - left gets smaller.
To prove the loop terminates, we need to show that
left <= right eventually becomes false.
This happens when
right - left reaches −1, so the proof is complete.(2)
After these examples, hopefully it’s clear what loop variants are, and how we can use them to prove algorithms terminate. But let’s also give a proper definition.
A loop variant is an expression which:
- Has an integer value.
- Gets at least 1 smaller on every iteration of the loop.
- When it gets below some threshold (usually 0), the loop condition becomes false.
Any loop with a variant is guaranteed to terminate, because an integer can’t keep getting smaller forever without falling below the threshold. Once it reaches the threshold, the loop condition becomes false, so the loop terminates.
A note on ‘for’ loops
In this post we’ve only analysed while loops. On the other hand, many algorithms use for loops:
for i in range(n): # ...
for loop iterates over a finite collection, such as a list, set or dictionary, or a finite range of integers, then the number of iterations is strictly limited to the size of that collection or range, so it can’t be an infinite loop.(3)
If necessary, we can still use variants to prove that a
for loop terminates, by writing an equivalent
This example iterates over the same range, and has the variant
n - i.
i = 0 while i < n: # ... i += 1
- Fundamentally, an assertion says that when a line of code is reached, the condition will be true. There is no way to assert that a line of code will be reached.
- The variant is guaranteed to reach −1 for the same reason it’s guaranteed to reach 0.
Alternatively, we could choose
right - left + 1as a variant.
- This assumes the size doesn’t change during the loop.
It’s generally a mistake to make a collection larger while you’re iterating over it with a
forloop, and many languages will raise an error if they detect this happening.
There are no published comments.