Depth-first search (DFS) is a fundamental algorithm for working with trees, because it gives us a way to “visit” every node in the tree, once each.(1) That’s important, because visiting every node is a subproblem of many other problems on trees.
DFS explores the full depth of one subtree before moving onto the next one. If that’s the order we want to visit the nodes in, or the order doesn’t matter, then DFS is fine; otherwise, we need a different algorithm. We’ll see one in this post.
When we need a search algorithm, the data structure we want to search is more important than what we want to find. We could even say that linear search and binary search work on different data structures: linear search uses a list, and binary search uses a list which must be kept in order.(2)
Recursion is one of the most difficult ideas in programming, but sometimes it can be simple. A recursive function is a function which can call itself. Some recursive functions are very puzzling, but others are so natural that you can understand them even without knowing what recursion is.
A tree is an abstract idea, a common kind of structure which we find across many different applications. A tree models a relation between pairs of things:
- A section of an article contains another section,
- A directory contains another directory,
- An HTML tag contains other HTML tags,
- An employee manages another employee,
- A process spawns another process.
A “relation between pairs of things” is very abstract and very general — in almost any context, there are relations between pairs of things which we might want to model in a computer program. However, not every relation defines a tree structure.
Abstract data types define what data can be represented and what operations can be performed on it, but not how the data should be stored or structured in a program. Data structures are concrete implementations of ADTs which fill in these details — they define how the data should be stored in memory, and what instructions should be followed to perform the operations.
Trees are both; we have an abstract idea of what a tree is, but there are also data structures for them, which we’ll look at in this post. In fact, we can even use tree data structures to implement other ADTs, such as sets or dictionaries.(1)
Trees in computing are an abstract idea with many diverse applications. Visual representations are especially important here, because a lot of ideas about trees are easiest to convey visually. Having more ways to think about trees also makes us more likely to recognise where they’re applicable.
There are two kinds of ‘tree’ that everyone knows about: actual trees made of wood, and family trees. Family trees are closer to how we think about trees in computing; a family tree is an abstract diagram of the parent/child relationships within a family. There are many kinds of hierarchical relationships which need to be modelled in computer programs.