Exercises: Algorithm Correctness
May 142018Before attempting these exercises, you should read the posts on algorithm correctness, invariants and variants.
Before attempting these exercises, you should read the posts on algorithm correctness, invariants and variants.
An algorithm is correct if it always returns the correct result.
Assertions are our main tool for proving algorithms are correct; they state that some condition will be true whenever a particular line of code is reached. The goal is to assert that, when the algorithm finishes running, the result is correct. If this assertion never fails, then the algorithm will never return a wrong result.
An assertion is a claim about the state of a running program — it says that some logical condition should be true whenever a particular line of code is reached. In the previous post, we only considered assertions in code where each line is executed once, in sequence.
Of course, real algorithms are more complicated, because loops and recursion cause some code to be executed multiple times, not in a simple sequence. In this post we’ll explore invariants, which are needed to prove correctness of non-trivial algorithms.
When we write an algorithm to solve a problem, the algorithm is only correct if it does actually solve that problem. That should be obvious! But how do we know whether an algorithm is correct?
This is a genuine concern because algorithms often aren’t correct, and we need to debug them. We need a way to analyse an algorithm to decide if it really does solve the problem or not.
Before attempting these exercises, you should read the posts on dynamic analysis and static analysis.
In an earlier post we saw the importance of choosing the right data types when designing an algorithm. But we don’t just choose a type — whether we’re aware of it or not, we are also choosing a data structure which implements that type.
Both choices can have a big impact on how efficient an algorithm is; let’s consider some examples.
By making a series of assumptions and considering only “large” inputs, we can analyse how efficient an algorithm is without actually running it. The result of this analysis is a mathematical formula called the complexity (or time complexity) of the algorithm. For example, we derived the formula n2 for the selection sort algorithm.
Using standard notation, we would say that selection sort’s complexity is O(n2), or that selection sort is an O(n2) algorithm.
This formula says, very roughly, how much “work” the algorithm has to do as a function of n, which represents the “input size”. In this post, we’ll see how to make sense of this formula, and how to derive it with as little algebra as possible.
In the previous post we analysed how efficient the insertion sort algorithm is. We did dynamic analysis — we analysed the algorithm by actually running it.
Technically, we analysed an implementation of the algorithm; perhaps a cleverer implementation could be more efficient. From one perspective, the implementation is what we really care about:
On the other hand, it’s hard to do an accurate dynamic analysis:
In this post, we’ll get around these problems by making a series of simplifications. We’ll do a kind of static analysis called asymptotic analysis — this means we’ll analyse algorithms without running them, and we’ll assume the inputs are large.
Efficiency is important; people don’t like waiting for computers to slowly do things that ought to be fast.
So far we’ve been thinking about this in vague terms; we know, for example:
Now it’s time to go into detail: how do we know how fast an algorithm is?
Before attempting these exercises, you should read the posts on algorithm strategies and choosing data types.