DS&A

Data Structures and Algorithms

Recursion on Trees

Mar 122018

Recursion is one of the most difficult ideas in programming, but sometimes it can be simple. A recursive function is a function which can call itself. Some recursive functions are very puzzling, but others are so natural that you can understand them even without knowing what recursion is.

A recursive function

For example, suppose we have a tree containing numbers, represented using the list-of-children data structure. How can we find the sum of every number in the tree? Here’s some Python code for an example:

class Node:
    def __init__(self, value, children=None):
        # a leaf node has an empty list of children
        if children is None:
            children = []
        self.value = value
        self.children = children

root = Node(1, [
    Node(2, [
        Node(3, [
            Node(4),
            Node(5)
        ]),
        Node(6)
    ]),
    Node(7, [
        Node(8, [
            Node(9)
        ])
    ]),
    Node(10)
])

To simplify the problem, we’ll give each leaf node an empty list of children, instead of None.(1) Here’s a node/link diagram of the example tree:

We want a function to calculate the sum of the numbers in the tree; the function’s input will be the root node. To begin with, let’s use a variable t to keep track of the “result so far”:

def tree_sum(node):
    t = node.value
    # ...
    return t

The missing code is the hard part — adding every descendant’s value to t. The root node has a list of children, but it doesn’t have a list of descendants; the best we can do right now is iterate over the children.

def tree_sum(node):
    t = node.value
    for child in node.children:
        t += # ...
    return t

For each child, we must add its value — and its descendants’ values. But this is exactly the same problem that tree_sum is meant to solve! So, we can simply call the tree_sum function to get the required result for child:

def tree_sum(node):
    t = node.value
    for child in node.children:
        t += tree_sum(child)
    return t

This function works — test it in Python, and find the sum by hand to check that the result is correct.

A recursive structure

This problem was easy to solve recursively, because the problem naturally has a recursive structure. The Node class is a recursive type: a Node object has a list of children, which are Node objects. The type’s definition refers to itself, just like a recursive function calls itself.

Intuitively, a recursive structure has “parts” which resemble the “whole”. A more visual way to see that trees are recursive structures is to consider subtrees: each node in the tree can be thought of as a “root node” for a smaller part of the tree. Here’s the subtree defined by the node with value 2:

The tree_sum function is an example of structural recursion. The problem has a recursive structure where the “parts” resemble the “whole”; the solution is a function which takes the “whole” as an input, and calls itself on the “parts”.

We have seen several recursive structures already:

A non-solution

Suppose we didn’t notice that tree_sum could call itself recursively to find the sum of the child subtree. In this case we might have written something like this:

def tree_sum(node):
    t = node.value
    for child in node.children:
        t += child.value
        # ...
    return t

This calculates the sum up to the first level of the tree, giving 1 + 2 + 7 + 10 = 20. To go further we should add the values from the second level, by iterating over child’s children:

def tree_sum(node):
    t = node.value
    for child in node.children:
        t += child.value
        for child2 in child.children:
            t += child2.value
            # ...
    return t

This function should return 1 + 2 + 3 + 6 + 7 + 8 + 10 = 37. To get to the third level of the tree, we need yet another nested loop to iterate over child2’s children:

def tree_sum(node):
    t = node.value
    for child in node.children:
        t += child.value
        for child2 in child.children:
            t += child2.value
            for child3 in child2.children:
                t += child3.value
                # ...
    return t

This now gives the correct result for our example tree, but it won’t work for trees with more levels.

Image credit: Warner Bros. Pictures

Clearly, however many nested loops we add, we’ll never finish writing this function. If you ever find yourself needing infinitely many nested blocks of code, that’s a clue that you should use structural recursion instead.

Footnotes

  1. For technical reasons, we can’t give the children parameter a default value of [], so we make it None and check inside the constructor.

There are no published comments.

New comment

Your comment will not appear until a moderator approves it.

Atom

hosted on werp.site