## Tree Data Structures

Feb 262018Abstract data types define what data can be represented and what operations can be performed on it, but not how the data should be stored or structured in a program. Data structures are concrete implementations of ADTs which fill in these details — they define how the data should be stored in memory, and what instructions should be followed to perform the operations.

Trees are both; we have an abstract idea of what a tree is, but there are also data structures for them, which we’ll look at in this post. In fact, we can even use tree data structures to implement other ADTs, such as sets or dictionaries.(1)

Actually, our idea of trees is even more abstract than an ADT, because there is no common set of operations which can be performed on all kinds of tree. We might want some or all of the following:

- Get the tree’s
*root*node. *Get*or*set*some data stored at a node.(2)- Get the
*size*of the tree (the number of nodes). - Get the
*parent*of a node. - Get the
*children*of a node, or*iterate*over them. - Check whether a node is a
*leaf*. *Insert*a new node.(3)*Remove*a node.(4)*Iterate*over all the nodes in the tree.- Get the
*siblings*of a node, or*iterate*over them. - Get the
*ancestors*of a node, or*iterate*over them. *Iterate*over the descendants of a node.

This seems like a lot, but most tree data structures don’t support all of these operations.

### List-of-lists

Trees can be represented as outlines, which are nested list structures.
In Python, we can write lists with square brackets, and nested lists with nested brackets.
This gives us a simple data structure for trees, called a *list-of-lists*.

Step through the animation below to see how a tree (shown here as a node/link diagram) is represented as a list-of-lists:

The list-of-lists data structure is particularly useful when each leaf node holds one data value, and the internal nodes hold no data. Then, leaf nodes are represented simply by their data values, and internal nodes are represented by lists. Here’s the same data structure as a box-and-pointer diagram:(5)

The values in any particular list are that node’s children (or pointers to them). To access a particular node in the tree, we need to ‘get’ from the appropriate index in each nested list.

### List-of-children

The *list-of-lists* data structure is a simple way to represent a tree if only the leaf nodes hold data — then an internal node *is* its list of children.

Otherwise, if internal nodes should hold data too, then each node *has* both a data value and a *list-of-children*.(6)
To do this, we define a `Node`

class with fields for a `value`

and a list of `children`

.
Here’s the class in Python:

```
class Node:
def __init__(self, value, children=None):
self.value = value
self.children = children
```

Using this data structure, `Node`

objects represent both the leaf and internal nodes; a leaf node has no children, so it has no `children`

list.(7)

Step through the animation below to see how a tree is represented using `Node`

objects:

Here’s the box-and-pointer diagram for this data structure — it contains `Node`

objects and lists:

To access a child node, we need to ‘get’ from the appropriate index in `children`

.

### Parent pointer tree

The *list-of-lists* and *list-of-children* data structures give us easy access to a node’s children, but not its parent.
One solution is to give the `Node`

class an additional field for the `parent`

.
However, in most applications we don’t need to access both children and parents, so it’s usually only worth storing one or the other.
If we store a node’s parent instead of its children, then we have a parent pointer tree.

For example, in the tree structure of “fastest routes to London”, you can find the next city in the route by finding the parent of whichever city you’re currently in.

Here’s a class for a parent-pointer tree node in Python:

```
class PPNode:
def __init__(self, parent, value):
self.parent = parent
self.value = value
```

Here’s a box-and-pointer diagram for the same tree as above, using the *parent pointer tree* data structure:

The root node has a null pointer for its `parent`

, because it’s the only node with no parent.
Since we can’t access an object unless we have a pointer to it, to use this data structure we’d need a separate collection (e.g. a list or a dictionary) to hold pointers to all of the nodes in the tree.

#### Footnotes

- Compare with stacks; a stack is not a list, but we can implement a stack using an array-list or linked list data structure. Similarly, sets and dictionaries are not trees, but we can implement them using tree data structures.
- We could store a data value (or multiple data values) at each node, or at each leaf node.
- We might specify what it’s parent should be, or the data structure might choose for us.
- The node’s descendants could also be removed, or moved elsewhere in the tree.
- To keep the box-and-pointer diagrams simple, we’ll show lists as just sequences of values, rather than representing an actual list data structure.
- This is a special case of the adjacency list data structure for a graph.
- This way we won’t need to inspect the type of a node to determine if it’s a leaf; we can just check whether
`children`

is a null pointer. Alternatively we could make`children`

an empty list.

## There are no published comments.