Wednesday, May 28, 2025

Intro to Recursion


 To get any deeper into algorithms on lists of numbers like we’ve seen, we have to at some point tackle the elephant in the room that is left unexplained in the last article: how we sort. But to get to sorting, we need to undergo a fundamental shift in how we think. Normally, a teacher will say it’s very bad form to define a thing in terms of itself; that may certainly be true in English class, but not in math or computer science. In math and computer science, we define problems in terms of themselves all the time with no issues at all: no wrong answers on account of that style (although mistakes can happen), no angry teachers, nothing like that.

We call these self-referential problems “recursive.”

In math, for instance, the function:

  • f(0) = 0
  • f(1) = 7
  • f(2) = 27
  • f(n) = f(n-1) / 2 + 3 * f(n-2)
is perfectly fine because it’s recursive.

In explaining recursion, I’ll point to math for two very famous sequences which are both classically defined this way: using addition, the Fibonacci sequence; and using multiplication, the factorial.

You have probably heard of Fibonacci in the context of breeding rabbits: given that you start with a pair of newborn rabbits, and they never die; and pregnancies last one month, and rabbits can breed after their first month, how do you figure out how many rabbits will exist in a certain amount of time?

  • At time 0, there is 1 pair, the original immature pair
  • At time 1, there’s still 1 pair, but now they’re mature and have just bred
  • At time 2, there are 2 pairs
    • The original
    • Their baby offspring
  • At time 3, there are 3 pairs
    • The original
    • Their first offspring
    • Their new babies
  • At time 4, there are 5 pairs
    • The original
    • Their first offspring pair
    • The offspring of this pair
    • Their second offspring pair
    • The offspring of this pair
  • And so on

How do you formulaically state how many pairs exist?

First, you have the initial conditions that at months 0 and 1, rabbit(x) = 1; there is 1 pair in month 0, and one pair in month 1. Then, notice that for months 2 and beyond, the number of rabbits is rabbit(x-1) + rabbit(x-2)—that is, how many pairs there were last month, plus how many there were two months ago. This is because a rabbit is pregnant for one month, and then baby rabbits can breed once they’re one month old (and they never die). Classically, the problem is about a year later. Let’s check it out!

  • When we start, there’s 1 pair
  • A month later, there’s 1 pair
  • 2 months later, 2 pairs
  • 3 months later, 3 pairs
  • 4 months later, 5 pairs
  • 5 months later, 8 pairs
  • 6 months later, 13 pairs
  • 7 months later, 21 pairs
  • 8 months later, 34 pairs
  • 9 months later, 55 pairs
  • 10 months later, 89 pairs
  • 11 months later, 144 pairs
  • A year later, 233 pairs!

This method of calculating the number of pairs works, but it’s inefficient for a reason we’ll cover in a later article.

Now, let’s do something similar, but with multiplication.

Let’s say a class includes no students. How many ways are there to form a line including no students? One way—[]. Now add Alice; how many ways can Alice stand in line? One way—[Alice]. Now add Bob. Two ways—[Alice, Bob] or [Bob, Alice]. Now add Charlie. How many ways? 6—[A, B, C], [A, C, B], [B, A, C], [B, C, A], [C, A, B], [C, B, A].  If Dave joins the class, there are 24 ways; if Ellie joins the class, 120; if Finn joins, 720; if George joins, 5040. By the time Jenny joins the class, there are more than 3 million ways to stand in line.  By the time Zayden joins, the class, there are about 2000 times as many ways to stand in line as there are total cells in all people alive.

Mathematically, the function that counts the number of orderings is called the factorial function, written with ! (so n! is read as “n factorial” or “the factorial of n”) and means 1 times 2 times 3 times 4 times… up to n. We can define this as n! = n*(n-1)!

Notice how in both the additive Fibonacci case and the multiplicative factorial, we define the problem (factorial of 7, 12th Fibonacci, etc.) in terms of smaller versions of that same problem. Factorial of 7 is 7 times factorial of 6; and Fibonacci of 12 is the sum of Fibonacci of 10 and Fibonacci of 11. Factorial of 6, in turn, depends on knowing factorial of 5; and Fibonacci of 10 requires knowing F(8) and F(9), and Fibonacci of 11 requires knowing F(9) and F(10). We keep building and solving self-similar problems, until we get to a problem we solve without this self-similarity because it’s small or easy enough.

For Fibonacci, it’s the initial-month conditions:

  • F(0) is 1, the initial baby rabbits
  • F(1) is still 1, the now-grown rabbits

For Factorial, either of the following works:

  • 0! Is 1
  • 1! Is 1

One place where beginning coders fail—and then get welcomed as “real developers” because you’ve suffered through this rite of passage—is in forgetting that all recursive problems need to have at least one stopping condition for which the answer is known without recursion, and that each successive problem (Fact(10) needs Fact(9) needs Fact(8) needs Fact(7) and so on, until Fact(1) or Fact(0) is given as 1) must in some way get closer to the stopping condition(s) than problems that came before it. Failure to both define and approach a stopping condition will produce the ceremonial first “StackOverflowError,” which every programmer must experience. (This is, by the way, the reason the website has that name.)

Make sure you have a great understanding of this concept before you ever move on to sorting. Call your friends or family into the room and explain it. Until they understand it, don’t move on. (This is because a lot of sorting is, at a high level, “to sort, divide, sort each piece, and put them together”—that is, the definition of, say, “sort(10)” requires you to “sort(5)” twice and put them together. In other words, a lot of sorting is, fundamentally, recursive.)

No comments:

Post a Comment

Switch

 Other than if/if-else/if-else if-else and the ternary operator, there is yet another common and important conditional expression in Java th...