Annotated Codewars Kata 1: A "Simple" Counting Exercise

This Codewars kata regards a simple mathematical exercise: count the zeroes at the end of n!, where by “!” I mean the mathematical function factorial.

The factorial of a natural number n is the product of all the numbers from 1 up to and including n: 1 * 2 * 3 * 4 * ... * n. By convention, 0! = 1, only because we need that to use factorial to solve counting problems from probability and statistics. Typically, I treat factorial as a function defined only for n > 0.

The Play-By-Play

I start reading https://www.codewars.com/kata/number-of-trailing-zeros-of-n and as soon as I read the headline of the exercise, my mind starts generating ideas, so I start to put them in my inbox.

count the number of 2s and 5s of factors
for every factor of 5 there exists a unique 2 that “comes before” that 5, so just count the 5s

First, I get this information out of my head, then I can take a moment to articulate what I mean and clarify it so that those not living directly inside my mind will likely understand. When a flood of ideas becomes the bottleneck, then I focus on writing down just enough to remind me of each idea, confident that I’ll come back later to add useful details. (Scroll to the bottom if you can’t stand the suspense, then come back here.)
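To make the "just count the 5s" idea concrete, here is a minimal sketch in plain Java. The class and method names are hypothetical, chosen only for illustration. It counts how often a prime divides each number from 1 to n and sums the results, which lets me compare the supply of 2s against the supply of 5s in n!.

```java
final class FactorCounts {
    // Counts how many times the prime p divides k (assumes k >= 1).
    static int multiplicity(int k, int p) {
        int count = 0;
        while (k % p == 0) {
            count++;
            k /= p;
        }
        return count;
    }

    // Sums the multiplicity of p over 1..n, which gives the exponent of p in n!.
    static int exponentInFactorial(int n, int p) {
        int total = 0;
        for (int k = 1; k <= n; k++) {
            total += multiplicity(k, p);
        }
        return total;
    }
}
```

For 10!, this reports eight factors of 2 and only two factors of 5, so the 5s limit how many 10s we can form.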

This coding kata seems pretty easy, so it feels to me like confirming my understanding of elementary arithmetic, rather than writing code. So why do it? Perhaps it becomes a useful kata for practising with unfamiliar programming languages. For now, I try this with Java 10. I know Java, but not necessarily Java 10, and I don’t yet know Vavr very well, so I will focus my energy there. I hope that, by reading others’ solutions, I will pick up a few things about modern Java.

I have already set up a project with Gradle, Java 10, and Vavr in IntelliJ IDEA, so I start a new branch in order to start work on this exercise.

$ cd $PROJECT_ROOT
$ git status -s
[We're clean.]
$ git checkout -b number-of-trailing-zeros-of-n
[We're on this new branch.]

Next, I read the exercise description more thoroughly to check whether I have understood the exercise the way they intend it. I have my inbox nearby with a pen to make notes. The description doesn’t say much: it points to an article that explains the factorial function and mentions two examples: 6! ends in 1 zero and 12! ends in 2 zeroes. It helpfully adds “You’re not meant to calculate the factorial.” Good! As they point out, 1000! has 2568 digits. Sure, Ruby’s built-in number library can handle it, but it defeats the purpose of the exercise to merely compute the factorial and then count the trailing zeroes. From what I can tell, it involves more work, anyway.

I can use my familiarity with Ruby to help me find examples for this kata, which I can use to check my Java code. I could also open LibreOffice Calc and create a spreadsheet, but I don’t think Calc can handle numbers with 2000 digits the way that Ruby’s number library can. I’ll come back to this when I want to add my own tests.

Start Programming

I think I know what to do, so I press Train, then copy the sample tests into my programming environment as a starting point.

@Test
public void testZeros() throws Exception {
  assertThat(Solution.zeros(0), is(0)); 
  assertThat(Solution.zeros(6), is(1)); 
  assertThat(Solution.zeros(14), is(2));    
}

Ugh, no. One action per test, please. On the bright side, this gives me a chance to learn how JUnit 5 supports the Parameterized Test Case pattern.

The Codewars kata refers to 6! and 12! as examples, but uses 0!, 6! and 14! as the example tests for you to start from. As far as I can tell, there exists no significance here: the coincidence of the 6s might prompt you to expect 12 instead of 14 here. This tripped up at least one reader, so I assure you that it means nothing.

JUnit 5 and Parameterized Test Cases

In JUnit, one typically implements a test as a method. Sometimes we want to run the same test with several different input/output pairs. We could write a loop inside a single JUnit test method, but then when one test fails, the rest don’t run. In JUnit 3, the community developed the Parameterized Test Case pattern, which took advantage of the test class constructor to run the same test “engine” with multiple data “rows” (I picture this as rows in a table) and have the JUnit 3 test runner run the tests as expected and report sensible results. If the 4th example out of 7 fails, then JUnit 3 would run all 7 examples and report the 4th as failing. One can even differentiate the tests with names that make it easier to pinpoint the example that failed. I have read that JUnit 5 makes this easier, so I’d like to learn how to do that.
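The core of the pattern, "run every row and report each failure separately", can be sketched without any test framework. This does not show how JUnit implements it; TableDrivenCheck and its method are names I made up for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

final class TableDrivenCheck {
    // Runs every {input, expected} row against the function under test and
    // returns one message per failing row, instead of stopping at the first.
    static List<String> failures(int[][] rows, IntUnaryOperator functionUnderTest) {
        List<String> messages = new ArrayList<>();
        for (int[] row : rows) {
            int actual = functionUnderTest.applyAsInt(row[0]);
            if (actual != row[1]) {
                messages.add(String.format(
                        "input %d: expected %d but was %d", row[0], row[1], actual));
            }
        }
        return messages;
    }
}
```

With the rows {0, 0}, {6, 1}, {14, 2} and a function that always returns 0, the first row passes and the other two each produce their own message.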

When I demonstrate TDD in Java, I tend to use Spock and Groovy for this kind of test, because I really like the syntax, particularly of “unrolling” the test engine. Maybe JUnit 5 provides something “nearly as nice” as Spock, so that I don’t have to confuse an audience by introducing Spock and Groovy when I’d rather focus on the fundamentals of TDD in Java.

With just a little searching, I find https://junit.org/junit5/docs/current/user-guide/#writing-tests-parameterized-tests and this shows me how to paste Codewars’ starter examples into my code as Parameterized Tests.

Add Build-Time Dependencies

I note https://junit.org/junit5/docs/current/user-guide/#writing-tests-parameterized-tests-setup which tells me how to add the necessary dependencies to the build.

# $PROJECT_ROOT/build.gradle
[...]
dependencies {
    [...]
    testCompile('org.junit.jupiter:junit-jupiter-api:5.2.0')
    testCompile('org.junit.jupiter:junit-jupiter-params:5.2.0') // New!
    testRuntime('org.junit.jupiter:junit-jupiter-engine:5.2.0')
}
[...]

Translate the Examples into Parameterized Tests

It took several minutes to learn how to do this. I started with a ValueSource, but settled on a MethodSource, during which time I also learned about Arguments. I dislike positional parameters—I have flashbacks of SQL prepared statements and keeping track of all those ? placeholders—but in the interest of learning here, I try not to fight them. I end up with the following JUnit 5 test class.

package ca.jbrains.math.test;

import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.MethodSource;

public class CountTheTrailingZeroesInAFactorialTest {
    private static java.util.stream.Stream<Arguments> examples() {
        // SMELL Positional parameters.
        // 0: a natural number
        // 1: the expected number of trailing zeroes in [0]'s factorial
        return java.util.stream.Stream.of(
                Arguments.of(0, 0),
                Arguments.of(6, 1),
                Arguments.of(14, 2)
        );
    }

    @ParameterizedTest
    @MethodSource("examples")
    void checkTheTrailingZeroesInTheFactorialOfANaturalNumber(int n, int expectedTrailingZeroes) {
        Assertions.assertEquals(
                expectedTrailingZeroes,
                countTrailingZeroesInTheFactorialOfANaturalNumber(n),
                String.format("Wrong number of trailing zeroes for factorial(%d)", n));
    }

    private int countTrailingZeroesInTheFactorialOfANaturalNumber(int n) {
        return -762;
    }
}

Over the years I’ve adopted some conventions that I explain in my introductory TDD training course.

I almost always put tests in a package separate from production code.
I almost always name the test class for the behavior and not the class I intend to implement. (Indeed, there exists no production class yet.)
I typically start by implementing the production code as a method in the test class, which I intend to extract into production code later.
When a function returns a number, I like to return -762 as a placeholder. The specific number doesn’t matter, but 0 might coincidentally pass, so I don’t use 0.

If you object to the name n, then I usually agree with you, but I genuinely don’t believe that aNaturalNumber communicates much more than n in this context. If you paired with me and insisted, then I’d fight you for only 5 seconds before giving in.

I typically don’t bother with failure messages in my assertions, but I know from experience that they help specifically with parameterized tests: I can more easily identify which example has failed when one fails.

We now have the examples that Codewars provides, and they fail in a way that I understand.

Interlude: Generate Examples

I want to create more examples and I suspect that I would find that annoying in Java, so I choose to do it in Ruby. I know Ruby well enough and irb makes it easy to just play around and get a quick answer.

# Let's be all new and current, even though the number library probably hasn't changed in decades.
$ rvm use 2.5.1
$ irb

[...]
2.5.1 :009 > def factorial(n)
2.5.1 :010?>   product = 1
2.5.1 :011?>   (2..n).each { |i|
2.5.1 :012 >       product = product * i
2.5.1 :013?>     }
2.5.1 :014?>   return product
2.5.1 :015?>   end
 => :factorial 
2.5.1 :016 > factorial(1)
 => 1 
2.5.1 :017 > factorial(2)
 => 2 
2.5.1 :018 > factorial(3)
 => 6 
2.5.1 :019 > factorial(5)
 => 120 
2.5.1 :020 > factorial(19)
 => 121645100408832000 
2.5.1 :021 > factorial(1000)
 => 4023872600770937735437024339230039857193748642107146325437999104299385123986290205920442084869694048004799886101971960586316668729948085589013238296699445909974245040870737599188236277271887325197795059509952761208749754624970436014182780946464962910563938874378864873371191810458257836478499770124766328898359557354325131853239584630755574091142624174743493475534286465766116677973966688202912073791438537195882498081268678383745597317461360853795345242215865932019280908782973084313928444032812315586110369768013573042161687476096758713483120254785893207671691324484262361314125087802080002616831510273418279777047846358681701643650241536913982812648102130927612448963599287051149649754199093422215668325720808213331861168115536158365469840467089756029009505376164758477284218896796462449451607653534081989013854424879849599533191017233555566021394503997362807501378376153071277619268490343526252000158885351473316117021039681759215109077880193931781141945452572238655414610628921879602238389714760885062768629671466746975629112340824392081601537808898939645182632436716167621791689097799119037540312746222899880051954444142820121873617459926429565817466283029555702990243241531816172104658320367869061172601587835207515162842255402651704833042261439742869330616908979684825901254583271682264580665267699586526822728070757813918581788896522081643483448259932660433676601769996128318607883861502794659551311565520360939881806121385586003014356945272242063446317974605946825731037900840244324384656572450144028218852524709351906209290231364932734975655139587205596542287497740114133469627154228458623773875382304838656889764619273838149001407673104466402598994902222217659043399018860185665264850617997023561938970178600408118897299183110211712298459016419210688843871218556461249607987229085192968193723886426148396573822911231250241866493531439701374285319266498753372189406942814341185201580141233448280150513996942901534830776445690990731524332782882698646027898643211390835062170950025973898
63554277196742822248757586765752344220207573630569498825087968928162753848863396909959826280956121450994871701244516461260379029309120889086942028510640182154399457156805941872748998094254742173582401063677404595741785160829230135358081840096996372524230560855903700624271243416909004153690105933983835777939410970027753472000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

Here, I just write the obvious implementation of factorial and convince myself that I got it right. I choose the iterative version over the recursive version to avoid stack overflow errors, which would only distract me. I see a bunch of zeroes at the end of 1000!.

How do I count these zeroes? I could try to pick them off at the end of the string, but I don’t know the string manipulation library of Ruby well enough to do that, so I cheat and reverse the string, then use a regular expression to match the now-leading zeroes. Then it becomes easy to count them.

# First, turn the number into a string so that I can use regexes on it.
2.5.1 :022 > factorial(1000).to_s
 => 
[long string ending in zeroes]
# Next, reverse the string so that the trailing zeroes become easier/faster to count.
2.5.1 :023 > factorial(1000).to_s.reverse
 => 
[long string starting with zeroes]
# Regex! Starts with '0' characters, followed by not a '0'.
2.5.1 :024 > /^(0+)[^0]/.match(factorial(1000).to_s.reverse)
 => #<MatchData "0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002" 1:"000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"> 
[Aha! The match behaves like an Array whose 2nd item is just the zeroes.]
2.5.1 :025 > /^(0+)[^0]/.match(factorial(1000).to_s.reverse)[1]
 => "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000" 
# How many zeroes are there?
2.5.1 :026 > /^(0+)[^0]/.match(factorial(1000).to_s.reverse)[1].length
 => 249 
[...]
2.5.1 :039 > /^(0+)[^0]/.match(factorial(999).to_s.reverse)[1].length
 => 246 
[Now I put this in a function to make it easier to reuse. I also handle the "no zeroes" case, albeit in a clunky way.]
2.5.1 :046 > def g(n); (/^(0+)[^0]/.match(factorial(n).to_s.reverse) || ["", ""])[1].length; end;
 => :g 
2.5.1 :047 > g(1)
 => 0 
2.5.1 :048 > g(5)
 => 1 
2.5.1 :049 > g(10)
 => 2 
2.5.1 :050 > g(25)
 => 6 
2.5.1 :051 > g(26)
 => 6 
2.5.1 :052 > g(30)
 => 7

That seems to work. It also gives me a clue about how to count the trailing zeroes: remember that when we multiply by 25, we introduce two factors of 5, and not just one, which explains why g(25) is 6 and not 5. This goes into the inbox.

count the number of 2s and 5s of factors
for every factor of 5 there exists a unique 2 that “comes before” that 5, so just count the 5s
remember that some numbers have many factors of 5, such as multiples of 25, 125, 625, and so on.

Now I can build examples by using function g (a silly name, but this function will not survive the day, so it suffices) and paste the expected result into my JUnit parameterized tests. That will help me build up the solution. This last observation about what happens at 25 tells me about some “interesting” special cases: 24, 25, 26, 124, 125, 126, 624, 625, 626, but also 49, 50, 51 and 74, 75, 76 and 99, 100, 101 and so on. Checking the numbers “on either side” of the “interesting” numbers (the ones with more than one 5 factor in them) comes from an old heuristic for choosing test cases: don’t check only the boundaries, but also “close to” the boundaries.
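If you would rather stay in Java, a brute-force oracle comparable to g might look like the following sketch. It uses java.math.BigInteger from the JDK. The class name BruteForceOracle is hypothetical, and I intend it only for generating expected values, not as a solution to the kata.

```java
import java.math.BigInteger;

final class BruteForceOracle {
    // Computes n! exactly; the loop does nothing for n < 2, so 0! = 1! = 1.
    static BigInteger factorial(int n) {
        BigInteger product = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            product = product.multiply(BigInteger.valueOf(i));
        }
        return product;
    }

    // Counts the '0' characters at the end of the decimal digits of n!.
    static int trailingZeroes(int n) {
        String digits = factorial(n).toString();
        int count = 0;
        for (int i = digits.length() - 1; i >= 0 && digits.charAt(i) == '0'; i--) {
            count++;
        }
        return count;
    }
}
```

This walks backwards from the end of the string instead of reversing it and matching a regular expression. It should agree with the irb session at 25, 26, 30, 999, and 1000.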

Interlude: Two Kinds of Tests

I want to draw your attention to the two kinds of tests that I have, which have different purposes.

Tests that check the correctness of the code.
Tests that check the correctness of the algorithm.

I distinguish this way: the first set of tests relates to counting factors of 5 correctly (what the code does), while the second set of tests relates to computing the number of zeroes at the end of n! correctly (the problem that I intend this code to solve). More generally, the correctness of the algorithm refers to solving the intended problem, whereas the correctness of the code refers to writing code that behaves as expected. Even for a problem this tiny, we can have Programmer Tests for correct code and Customer Tests for solving the intended problem.

When I generate an example with my g Ruby function, I generate a Customer Test. I have in mind a Java function like countTheFactors() for which I’ll write Programmer Tests. The expected results for these two kinds of tests will look the same at first. This happens often. I still distinguish the tests because they have different audiences and gradually diverge over time.

Write Production Code

In the interest of simplifying the test results, I remove the existing Customer Tests and add code using a variation of acceptance-test-driven development (ATDD) that allows me to sidestep the more-complicated setup of multiple test suites. I don’t feel like learning that much Gradle today.

Choose a Customer Test, run it, then see it fail.
Remove the Customer Test, see the green bar.
Test-drive enough code to make the Customer Test pass.
Put the Customer Test back, run all the tests, see the green bar.
Loop until I can’t think of a Customer Test that would probably fail.

The Simplest Cases: 1-4

I start with the Customer Tests for 1 through 4, because they all expect the same result: 0.

private static java.util.stream.Stream<Arguments> examples() {
    // SMELL Positional parameters.
    // 0: a natural number
    // 1: the expected number of trailing zeroes in [0]'s factorial
    return Stream.of(1, 2, 3, 4).map(n -> Arguments.of(n, 0))
        .toJavaStream();
}

Vavr has its own Stream library, which I want to learn, but JUnit 5 uses the Java Stream library for parameterized tests, so I need to invoke toJavaStream() in order to expose the right type to JUnit 5.

Over time, I’ll probably want to extract a function for creating the Vavr Stream in order to hide the detail of converting to a Java Stream.

I make the 1 case pass, then add the 2-4 cases and they already pass. I don’t need any Programmer Tests yet, because the implementation doesn’t require it.

private int countTrailingZeroesInTheFactorialOfANaturalNumber(int n) {
    return 0;
}

We can count the trailing zeroes in the factorial of a natural number, as long as there are no trailing zeroes.

The First Trailing Zero: 5

I check the Customer Test for 5, even though I “know” what to expect.

2.5.1 :058 > g(5)
 => 1

I write this in Java.

private static java.util.stream.Stream<Arguments> examples() {
    return Stream.of(1, 2, 3, 4).map(n -> Arguments.of(n, 0))
        .append(Arguments.of(5, 1))  // New!
        .toJavaStream();
}

The test fails. I make it pass with my standard trick: treat the new input like a special case, otherwise do “the old thing”.

private int countTrailingZeroesInTheFactorialOfANaturalNumber(int n) {
    if (n == 5)
        return 1;
    else
        return 0;
}

The tests pass. Now I clean the kitchen. Well… I find nothing to clean. Although I find this code verbose, I don’t like the alternatives: I like the conciseness of the ternary operator, but I plan to add more special cases in a moment that will affect my feelings about that; if I keep the if, then I prefer to keep the else, even though I don’t need it, because I prefer how it expresses my intent. I move on.

Still Only One Trailing Zero: 6-9

I continue as I have gone so far, starting with 6. First, I check the Customer Test.

2.5.1 :063 > g(6)
 => 1

Next, I implement it in Java.

private static java.util.stream.Stream<Arguments> examples() {
    return Stream.of(1, 2, 3, 4).map(n -> Arguments.of(n, 0))
        .appendAll(Stream.of(5, 6).map(n -> Arguments.of(n, 1))) // New!
        .toJavaStream();
}

private int countTrailingZeroesInTheFactorialOfANaturalNumber(int n) {
    if (n >= 5) // New!
        return 1;
    else
        return 0;
}

What works for 6 works for 7-9—I checked in irb—but now I want a function that produces the list of numbers from 5-9 instead of enumerating them. Vavr must have that. It does: Stream.range(). I have to remember that range() excludes the end point of the range.
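The JDK's own IntStream follows the same half-open convention, which makes for a quick sanity check. This sketch uses java.util.stream.IntStream, not Vavr, and the class name is hypothetical.

```java
import java.util.stream.IntStream;

final class HalfOpenRanges {
    // range(5, 10) yields 5, 6, 7, 8, 9: the end point is excluded.
    static int[] fiveToNine() {
        return IntStream.range(5, 10).toArray();
    }

    // rangeClosed(5, 9) yields the same numbers by including its end point.
    static int[] fiveToNineClosed() {
        return IntStream.rangeClosed(5, 9).toArray();
    }
}
```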

I refactor the existing tests to use range().

private static java.util.stream.Stream<Arguments> examples() {
    return Stream.range(1, 5).map(n -> Arguments.of(n, 0))
        .appendAll(Stream.range(5, 7).map(n -> Arguments.of(n, 1)))
        .toJavaStream();
}

Replaced enumerated lists with ranges.

Next, I add the new tests all at once, only because I feel very certain that they will already pass.

Whenever I tell myself that I feel “very certain” about something, I immediately think about how I would recover from getting it wrong. This reminds me to commit frequently, so that I routinely have the option to go back to the most recent committed version of the code, which probably works.

Indeed, they pass.

private static java.util.stream.Stream<Arguments> examples() {
    return Stream.range(1, 5).map(n -> Arguments.of(n, 0))
        .appendAll(Stream.range(5, 10).map(n -> Arguments.of(n, 1)))
        .toJavaStream();
}

We can count the trailing zeroes in a factorial of a natural number, up to 9.

I look at the tests and notice a delightful pattern: 1-5 (excluding 5) maps to 0, 5-10 (excluding 10) maps to 1. By chance, could 10-15 map to 2 and 15-20 map to 3 and so on forever? No, but I want to point out that refactoring tests has, if nothing else, the benefit of making patterns easier to spot, which explains why I refactor tests almost exactly the same way that I refactor production code. (Trust me: the few exceptions don’t bear on this example.)

Boring Examples: 10-24

I call these examples “boring” only because they follow the pattern I’d spotted so far: once we get to 5n, the number of trailing zeroes becomes n. I check this in Ruby.

2.5.1 :068 > Hash[(0...25).map { |n| [n, g(n)] }]
 => {0=>0, 1=>0, 2=>0, 3=>0, 4=>0, 5=>1, 6=>1, 7=>1, 8=>1, 9=>1, 10=>2, 11=>2, 12=>2, 13=>2, 14=>2, 15=>3, 16=>3, 17=>3, 18=>3, 19=>3, 20=>4, 21=>4, 22=>4, 23=>4, 24=>4}

I implement it in Java. I would normally want to remove duplication and take advantage of the arithmetic pattern, but I need to remember the audience of my tests. For Customer Tests, I need to think about what the Customer would find easy to read and understand at a glance. In this case, I imagine the Customer would have significant mathematical experience, but not necessarily significant experience reading Java 10’s Stream interfaces. For that reason, I write the tests “the long way” and sit with them for a while before trying to remove the obvious duplication.

At first, I write this.

private static java.util.stream.Stream<Arguments> examples() {
    return Stream.range(1, 5).map(n -> Arguments.of(n, 0))
        .appendAll(Stream.range(5, 10).map(n -> Arguments.of(n, 1)))
        .appendAll(Stream.range(10, 15).map(n -> Arguments.of(n, 2)))
        .appendAll(Stream.range(15, 20).map(n -> Arguments.of(n, 3)))
        .appendAll(Stream.range(20, 25).map(n -> Arguments.of(n, 4)))
        .toJavaStream();
}

Then I notice that I can at least remove the details of the Java code to leave behind the arithmetic sequence, which the Customer would probably care more about.

private static java.util.stream.Stream<Arguments> examples() {
    return Stream.range(1, 5).map(expectThisManyTrailingZeroes(0))
        .appendAll(Stream.range(5, 10).map(expectThisManyTrailingZeroes(1)))
        .appendAll(Stream.range(10, 15).map(expectThisManyTrailingZeroes(2)))
        .appendAll(Stream.range(15, 20).map(expectThisManyTrailingZeroes(3)))
        .appendAll(Stream.range(20, 25).map(expectThisManyTrailingZeroes(4)))
        .toJavaStream();
}

@NotNull
private static Function<Integer, Arguments> expectThisManyTrailingZeroes(int expectedTrailingZeroes) {
    // SMELL Positional parameters.
    // 0: a natural number
    // 1: the expected number of trailing zeroes in [0]'s factorial
    return n -> Arguments.of(n, expectedTrailingZeroes);
}

Named a complicated anonymous expression.

I try a few design options and I prefer this for its balance of hiding details with grouping the data appropriately: keeping the bounds of the range of inputs separate from the expected result.

But then I see that I could move the procedural appendAll() to the end where it feels more out of the programmer’s way.

private static java.util.stream.Stream<Arguments> examples() {
    return Stream.of(groupsOfExamples()).reduce(Stream::appendAll).toJavaStream();
}

@NotNull
private static Stream[] groupsOfExamples() {
    return new Stream[]{
        Stream.range(1, 5).map(expectThisManyTrailingZeroes(0)),
        Stream.range(5, 10).map(expectThisManyTrailingZeroes(1)),
        Stream.range(10, 15).map(expectThisManyTrailingZeroes(2)),
        Stream.range(15, 20).map(expectThisManyTrailingZeroes(3)),
        Stream.range(20, 25).map(expectThisManyTrailingZeroes(4))
    };
}

Now both the Programmer and the Customer can focus on reading groupsOfExamples() and see almost a table summarizing the examples: 1! to 4! (not 5) each have 0 trailing zeroes, 5! to 9! (not 10) each have 1, and so on. Again, if I wrote this with Spock, Gherkin, or Fit, then I would format all this in a literal table. I could also negotiate with the Customer—if there existed one other than me—regarding whether to express this as 5 rows or as a single expression that generates 5 rows. I have the option, and that suffices for the moment.

Hid the uninteresting details of the expressions that generate the Customer Tests.

Next, I want to make the tests pass. Since all the examples starting at 10 fail, I temporarily withdraw the tests for 11 and up while I write code for the 10 case.

Here, I look ahead and see the pattern: I think I want the sum of the number of 5s as factors of each number from 1 to n. From there, I even believe that I can improve execution time by taking the same sum from 5 to n by 5. For example: if n=34, then I mean for 5, 10, 15, 20, 25, 30. I think I know how to do that with the usual functional programming libraries, but I’d rather not jump there now. I can simplify the expression because, up to and including 24, the number of 5s in each natural number is either 0 or 1. I can generalize later. I quickly write down the idea in my inbox in order to get back to the task at hand.

count the number of 2s and 5s of factors
for every factor of 5 there exists a unique 2 that “comes before” that 5, so just count the 5s
remember that some numbers have many factors of 5, such as multiples of 25, 125, 625, and so on.
take the sum of the number of 5s in the factors of 1 up to and including n, then optimize by only looking at the multiples of 5.
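The last inbox item can be sketched with the JDK's IntStream (Java 9 or later for the three-argument iterate). The names here are hypothetical and the sketch uses java.util.stream rather than Vavr; it only illustrates that both sums agree.

```java
import java.util.stream.IntStream;

final class SumOfFives {
    // Counts the factors of 5 in k (assumes k >= 1).
    static int fivesIn(int k) {
        int count = 0;
        while (k % 5 == 0) {
            count++;
            k /= 5;
        }
        return count;
    }

    // Sums the factors of 5 over every number from 1 up to and including n.
    static int overEveryNumber(int n) {
        return IntStream.rangeClosed(1, n).map(SumOfFives::fivesIn).sum();
    }

    // Same sum, visiting only 5, 10, 15, ... up to n, since the other
    // numbers contribute nothing.
    static int overMultiplesOfFiveOnly(int n) {
        return IntStream.iterate(5, k -> k <= n, k -> k + 5)
                .map(SumOfFives::fivesIn)
                .sum();
    }
}
```

For n = 34, the second version visits only 5, 10, 15, 20, 25, and 30, and both versions arrive at 7.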

I don’t mind looking ahead, but I prefer not to code ahead. In the meantime, I can map every multiple of 5 to 1 and every non-multiple of 5 to 0, then sum the result. Once again, I prefer not to write that directly, but rather to go “the long way” up to 24, then try the cleverer solution and see that it still works. (Make it work before making it “right”.) As I spend a few seconds going “the long way” I might see that the cleverer solution will work, in which case I’d find it easy to switch to that solution. For now, I make the 10 case pass.

private int countTrailingZeroesInTheFactorialOfANaturalNumber(int n) {
    if (n >= 10)
        return 2;
    else if (n >= 5)
        return 1;
    else
        return 0;
}

We can count the trailing zeroes in a factorial of a natural number, up to 10.

I think I see the pattern, so since I’ve just committed, I jump to the 24 case right away.

private int countTrailingZeroesInTheFactorialOfANaturalNumber(int n) {
    return n / 5;
}

Clever. Maybe too clever. Fortunately, the next test will challenge the assumption in this “solution”.

We can count the trailing zeroes in a factorial of a natural number, up to 24.
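To see how far n / 5 can be trusted, here is a small sketch that compares it against an exact count of the 5s and reports the first disagreement. The class and method names are hypothetical, and exactFives is just one straightforward way to count, assumed for the comparison.

```java
final class DivideByFiveCheck {
    // Counts the factors of 5 in n! by visiting only the multiples of 5.
    static int exactFives(int n) {
        int total = 0;
        for (int multiple = 5; multiple <= n; multiple += 5) {
            for (int k = multiple; k % 5 == 0; k /= 5) {
                total++;
            }
        }
        return total;
    }

    // Returns the smallest n in 0..limit where n / 5 disagrees with the
    // exact count, or -1 when they agree over the whole range.
    static int firstMismatch(int limit) {
        for (int n = 0; n <= limit; n++) {
            if (n / 5 != exactFives(n)) {
                return n;
            }
        }
        return -1;
    }
}
```

Up to a limit of 24 it finds no mismatch; raise the limit to 100 and the first one appears at 25.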

An Interesting Example: 25

At 25, things change. I check this in Ruby.

2.5.1 :069 > g(25)
 => 6

Of course! At 25, there are two 5s (25 = 5 × 5), and so there are six 5s in 25!, meaning 6 factors of 10, and therefore 6 trailing zeroes.
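One standard way to tally this is Legendre's formula, which this walkthrough has not derived and which I quote here only as a cross-check. Each multiple of 5 contributes one 5, and each multiple of 25 contributes one more.

```latex
\nu_5(25!) = \left\lfloor \frac{25}{5} \right\rfloor + \left\lfloor \frac{25}{25} \right\rfloor = 5 + 1 = 6
\qquad
\nu_5(1000!) = \left\lfloor \frac{1000}{5} \right\rfloor + \left\lfloor \frac{1000}{25} \right\rfloor + \left\lfloor \frac{1000}{125} \right\rfloor + \left\lfloor \frac{1000}{625} \right\rfloor = 200 + 40 + 8 + 1 = 249
```

Both totals match what irb reported for g(25) and for 1000!.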

In a TDD training class, I would add the special case and then write more tests to “force” me to generalize, but since I already see the pattern coming, and since I trust my refactoring discipline and skill to simplify the code later, I write code that expresses my intent more directly, rather than strive relentlessly for economy of space.

First, I write the test, and then I make it pass.

private static Stream[] groupsOfExamples() {
    return new Stream[]{
        Stream.range(1, 5).map(expectThisManyTrailingZeroes(0)),
        Stream.range(5, 10).map(expectThisManyTrailingZeroes(1)),
        Stream.range(10, 15).map(expectThisManyTrailingZeroes(2)),
        Stream.range(15, 20).map(expectThisManyTrailingZeroes(3)),
        Stream.range(20, 25).map(expectThisManyTrailingZeroes(4)),
        Stream.of(expectThisManyTrailingZeroes(6).apply(25))  // New!
    };
}
[...]
private int countTrailingZeroesInTheFactorialOfANaturalNumber(int n) {
    // Remember: range(1, n + 1) means 1 up to and including n.
    return Stream.range(1, n + 1).map(this::factorsOfFiveIn).sum().intValue();
}

private int factorsOfFiveIn(int n) {
    return n >= 25 ? 2 : (n % 5 == 0 ? 1 : 0);
}

We can count the trailing zeroes in a factorial of a natural number, up to 25.

Now I look at factorsOfFiveIn() and I want to write Programmer Tests for it. If I did that, then the Customer Tests would surely pass. I trust my confidence, and so I commit, then start a timer for 10 minutes, then do exactly that.

You might find it strange to set a timer for 10 minutes for something so simple. I would, too, if I hadn’t gone down so many rabbit holes for things that I expected to take “2 minutes” or “5 minutes”. I’ve learned that I have trouble judging short time periods, and so I set a timer.

[...]
// I must make this nested class public, otherwise JUnit will not run the tests.
public static class CountFactorsOfFiveInANaturalNumberTest {
    private static java.util.stream.Stream<Arguments> examples() {
        return Stream.of(groupsOfExamples()).reduce(Stream::appendAll).toJavaStream();
    }

    @NotNull
    private static Stream[] groupsOfExamples() {
        return new Stream[]{
                Stream.of(5, 10, 15, 20).map(expectThisManyFactorsOfFive(1)),
                Stream.of(25).map(expectThisManyFactorsOfFive(2)),
                Stream.of(
                        Stream.range(1, 5),
                        Stream.range(6, 10),
                        Stream.range(11, 15),
                        Stream.range(16, 20),
                        Stream.range(21, 25)
                ).reduce(Stream::appendAll).map(expectThisManyFactorsOfFive(0))
        };
    }

    @NotNull
    private static Function<Integer, Arguments> expectThisManyFactorsOfFive(int expected) {
        return n -> Arguments.of(n, expected);
    }

    @ParameterizedTest
    @MethodSource("examples")
    void checkTheNumberOfFactorsOfFiveInANaturalNumber(int n, int expected) {
        Assertions.assertEquals(
            expected,
            factorsOfFiveIn(n),
            String.format("Wrong number of factors of 5 in %d", n));
    }

    public static int factorsOfFiveIn(int n) {
        return n >= 25 ? 2 : (n % 5 == 0 ? 1 : 0);
    }
}

Here, I organize the tests slightly differently: I emphasize the special cases first, then add the “boring” cases at the end.

We now check the core of our algorithm more directly.

The Next Interesting Example: 125

I jump to 125, confident that I’ll remember the cases in between. (Even I can count from 1 to 125 without forgetting.) I try writing a “functional” implementation, but it feels unnecessarily circuitous and I can’t get it to work, so I settle for a procedural implementation for now.

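public static int factorsOfFiveIn(int n) {
    int factorsOfFive = 0;
    // Stop at the first non-multiple of 5; otherwise 26 / 5 = 5 would
    // count a factor of 5 that 26 does not have.
    while (n > 1 && n % 5 == 0) {
        factorsOfFive++;
        n /= 5;
    }
    return factorsOfFive;
}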

We can now count the factors of 5 in a number up to 125.

I post this as a gist so that the world can teach me how to write it better. I find this technique invaluable. Whenever I want to learn something about code, I post my code and ask the world to beat that code to death. So far, nobody has made me cry.

I don’t always trust the entire world with my questions, so I have a few communities that I turn to for help when I need it. I prefer to open my questions to as large a community as possible, but I would act foolishly to trust everyone with deeper, more personal questions. For that, I turn to smaller communities of people I trust. I don’t always have a trusted community that fits my question of the moment. In those cases, I pick one person and ask them for help.

From here, the examples become less interesting, so I just “fill in the gaps”.

I start by picking a few powers of 5 to spot-check my algorithm. I suppose that, if my algorithm works up to 5^10, then it works above that. The probability of success seems high, so I feel confident.

For this, I reorganize the tests to isolate the powers of 5 from the rest.

private static Stream[] groupsOfExamples() {
    return new Stream[]{
        Stream.range(1, 10).map(expectFiveToTheNToHaveNFactorsOfFive()),
        Stream.of(10, 15, 20).map(expectThisManyFactorsOfFive(1)),
        Stream.of(
            Stream.range(1, 5),
            Stream.range(6, 10),
            Stream.range(11, 15),
            Stream.range(16, 20),
            Stream.range(21, 25)
        ).reduce(Stream::appendAll).map(expectThisManyFactorsOfFive(0))
    };
}
[...]
private static Function<Integer, Arguments> expectFiveToTheNToHaveNFactorsOfFive() {
    return n -> Arguments.of(raiseToPower(5, n), n);
}

// REFACTOR Move to generic math library
private static int raiseToPower(int base, int power) {
    return (int)(Math.pow((double) base, power));
}

We can now count the factors of 5 for powers of 5 up to 5^10.

Next, I add the cases up to 124 before fear transforms into boredom, and then I sample a few values near the boundaries. And when I do this, I find that I have written incorrect code. First, the new tests.

private static Stream[] groupsOfExamples() {
    return new Stream[]{
        Stream.range(1, 10).map(expectFiveToTheNToHaveNFactorsOfFive()),
        Stream.of(50, 75, 100).map(expectThisManyFactorsOfFive(2)),
        Stream.of(10, 15, 20,
                  30, 35, 40, 45,
                  55, 60, 65, 70,
                  80, 85, 90, 95,
                  105, 110, 115, 120).map(expectThisManyFactorsOfFive(1)),
        Stream.of(
            Stream.range(1, 125).filter(n -> n % 5 > 0)
        ).reduce(Stream::appendAll).map(expectThisManyFactorsOfFive(0))
    };
}

I like the tests. They express my intent fairly directly:

For n from 1 to 10, 5^n has n factors of 5. (I choose 10 somewhat arbitrarily as a “reasonable” sample. Strictly, Stream.range(1, 10) excludes its upper bound, so these tests stop at 5^9.)
For the remaining multiples of 5^2, there are 2 factors of 5.
For the remaining multiples of 5, there is 1 factor of 5.
For everything that isn’t a multiple of 5, there are 0 factors of 5.

We now have better tests for counting the factors of 5 from n=1 to 125 and a handful of powers of 5 above that.

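Sadly, the current implementation is wrong, specifically with regard to cases 26-29. This has to do with integer division and me not thinking clearly enough. For 26, the loop divides by 5 even though 26 is not a multiple of 5, arrives at 5, and then counts a factor of 5 that 26 does not have. I should continue counting factors of 5 only as long as the interim n is a multiple of 5.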

public static int factorsOfFiveIn(int n) {
    if (n % 5 == 0) return 1 + factorsOfFiveIn(n / 5);
    else return 0;
}

I really like this for its conciseness and clarity. Later I find out that I could have written it with tail-recursion, but I won’t let that spoil my past enjoyment.

We now actually count correctly the factors of 5 up to 125, and probably beyond that.
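To make the difference between the two versions concrete, here is a small sketch that runs the earlier loop and the recursive version side by side and prints every n from 1 to 125 on which they disagree. The class name FactorsOfFiveComparison and the method names buggyLoop and recursive are mine, chosen only for illustration; the method bodies copy the two implementations shown above.

```java
import java.util.stream.IntStream;

class FactorsOfFiveComparison {
    // The earlier loop: it keeps dividing even after n stops being a multiple of 5.
    static int buggyLoop(int n) {
        int factorsOfFive = 0;
        while (n > 1) {
            if (n % 5 == 0) factorsOfFive++;
            n /= 5;
        }
        return factorsOfFive;
    }

    // The recursive version: it stops at the first non-multiple of 5.
    static int recursive(int n) {
        if (n % 5 == 0) return 1 + recursive(n / 5);
        else return 0;
    }

    public static void main(String[] args) {
        // Print only the inputs on which the two versions disagree.
        IntStream.rangeClosed(1, 125)
                .filter(n -> buggyLoop(n) != recursive(n))
                .forEach(n -> System.out.printf(
                        "%d: loop says %d, recursion says %d%n",
                        n, buggyLoop(n), recursive(n)));
    }
}
```

The disagreements are not limited to 26-29: any n that is not a multiple of 5, but whose integer quotient n / 5 is, trips the loop in the same way.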

Just For Safety: Up to 625

Since I got the code wrong once, I want to check it further. I decide to check it up to 625, with the idea that if I get it right up to 5^4, then I can’t possibly get it wrong in general. I understand your hesitance. Go with me, anyway.

I add the tests up to 625, hoping that none of them fail.

private static Stream[] groupsOfExamples() {
    return new Stream[]{
        Stream.range(1, 10).map(expectFiveToTheNToHaveNFactorsOfFive()),
        Stream.of(250, 375, 500).map(expectThisManyFactorsOfFive(3)),
        Stream.of(50, 75, 100,
                  150, 175, 200, 225,
                  275, 300, 325, 350,
                  400, 425, 450, 475,
                  525, 550, 575, 600).map(expectThisManyFactorsOfFive(2)),
        Stream.of(10, 15, 20,
                  30, 35, 40, 45,
                  55, 60, 65, 70,
                  80, 85, 90, 95,
                  105, 110, 115, 120,
                  130, 135, 140, 145,
                  155, 160, 165, 170,
                  180, 185, 190, 195,
                  205, 210, 215, 220,
                  230, 235, 240, 245,
                  255, 260, 265, 270,
                  280, 285, 290, 295,
                  305, 310, 315, 320,
                  330, 335, 340, 345,
                  355, 360, 365, 370,
                  380, 385, 390, 395,
                  405, 410, 415, 420,
                  430, 435, 440, 445,
                  455, 460, 465, 470,
                  480, 485, 490, 495,
                  505, 510, 515, 520,
                  530, 535, 540, 545,
                  555, 560, 565, 570,
                  580, 585, 590, 595,
                  605, 610, 615, 620).map(expectThisManyFactorsOfFive(1)),
        Stream.of(
            Stream.range(1, 625).filter(n -> n % 5 > 0)
        ).reduce(Stream::appendAll).map(expectThisManyFactorsOfFive(0))
    };
}

Sometimes one simply needs to crank out all the cases. I could try to generate the tests with some expression, but I doubt that generated tests would give other programmers much confidence. Once the generating expressions become complicated enough, I have the same (lack of) confidence in them as in the implementation. This last point also describes a weakness in Design by Contract and why I would supplement it with some Specification by Example.
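For comparison, here is a rough sketch of what generating those groups might look like. It uses the standard library's IntStream rather than Vavr so that it stands alone, and the names GeneratedExamples and numbersWithExactlyThisManyFactorsOfFive are hypothetical. Notice that the filter amounts to a second implementation of the code under test, which illustrates the worry described above.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class GeneratedExamples {
    // Numbers from 1 to upperBound that are divisible by 5^k but not by 5^(k+1).
    // Assumes k is small enough that 5^(k+1) fits in an int.
    static List<Integer> numbersWithExactlyThisManyFactorsOfFive(int k, int upperBound) {
        int fiveToTheK = 1;
        for (int i = 0; i < k; i++) fiveToTheK *= 5;
        final int divisor = fiveToTheK;
        return IntStream.rangeClosed(1, upperBound)
                .filter(n -> n % divisor == 0 && n % (divisor * 5) != 0)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The numbers up to 625 with exactly 3 factors of 5.
        System.out.println(numbersWithExactlyThisManyFactorsOfFive(3, 625));
        // How many numbers up to 625 have exactly 1 factor of 5.
        System.out.println(numbersWithExactlyThisManyFactorsOfFive(1, 625).size());
    }
}
```

The trade-off cuts both ways: a generator cannot skip a row by accident, but a reader has to trust the generator.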

We now actually count correctly the factors of 5 up to 625, and almost certainly beyond that.

To 625 and Beyond

I see that I could simplify the tests by duplicating some of them, but then I worry that I would run the risk of duplicating the implementation, making the tests less useful as pools of change detectors. I could, for example, generate tests for 5^n, then for multiples of 5, multiples of 25, multiples of 125, … and then at the end for non-multiples of 5. I choose to put this in the inbox, then come back to it later.

count the number of 2s and 5s of factors
for every factor of 5 there exists a unique 2 that “comes before” that 5, so just count the 5s
remember that some numbers have many factors of 5, such as multiples of 25, 125, 625, and so on.
take the sum of the number of 5s in the factors of 1 up to and including n, then optimize by only looking at the multiples of 5.
generate tests for multiples of 5, then multiples of 5^2, then 5^3… then eventually for non-multiples of 5.

For now, at least, I think the algorithm works for all integers n, so I can try Customer Tests for as many n as my Customer cares about.

private static Stream[] groupsOfExamples() {
    return new Stream[]{
        Stream.range(1, 5).map(expectThisManyTrailingZeroes(0)),
        Stream.range(5, 10).map(expectThisManyTrailingZeroes(1)),
        Stream.range(10, 15).map(expectThisManyTrailingZeroes(2)),
        Stream.range(15, 20).map(expectThisManyTrailingZeroes(3)),
        Stream.range(20, 25).map(expectThisManyTrailingZeroes(4)),
        Stream.range(25, 30).map(expectThisManyTrailingZeroes(6)),
        Stream.range(30, 35).map(expectThisManyTrailingZeroes(7)),
        Stream.range(35, 40).map(expectThisManyTrailingZeroes(8)),
        Stream.range(40, 45).map(expectThisManyTrailingZeroes(9)),
        Stream.range(45, 50).map(expectThisManyTrailingZeroes(10)),
        Stream.range(50, 55).map(expectThisManyTrailingZeroes(12)),
        Stream.range(55, 60).map(expectThisManyTrailingZeroes(13)),
        Stream.range(60, 65).map(expectThisManyTrailingZeroes(14)),
        Stream.range(65, 70).map(expectThisManyTrailingZeroes(15)),
        Stream.range(70, 75).map(expectThisManyTrailingZeroes(16)),
        Stream.range(75, 80).map(expectThisManyTrailingZeroes(18)),
        Stream.range(80, 85).map(expectThisManyTrailingZeroes(19)),
        Stream.range(85, 90).map(expectThisManyTrailingZeroes(20)),
        Stream.range(90, 95).map(expectThisManyTrailingZeroes(21)),
        Stream.range(95, 100).map(expectThisManyTrailingZeroes(22)),
        Stream.range(100, 105).map(expectThisManyTrailingZeroes(24)),
        Stream.range(105, 110).map(expectThisManyTrailingZeroes(25)),
        Stream.range(110, 115).map(expectThisManyTrailingZeroes(26)),
        Stream.range(115, 120).map(expectThisManyTrailingZeroes(27)),
        Stream.range(120, 125).map(expectThisManyTrailingZeroes(28)),
        Stream.of(Arguments.of(125, 31)),
        Stream.of(Arguments.of(5 * 125, 31 * 5 + 1)),
        Stream.of(Arguments.of(5 * 5 * 125, (31 * 5 + 1) * 5 + 1)),
        Stream.of(Arguments.of(5 * 5 * 5 * 125, ((31 * 5 + 1) * 5 + 1) * 5 + 1 )),
    };
}

This seems to suffice. On an industrial-strength project I might spend 15 more minutes adding more examples, but for me for now, this suffices.

We seem to count correctly the factors of 5 and the number of trailing zeroes as far as we want to go.
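One way to cross-check expected values like these is to compute n! “the long way” and count the zeroes directly. This is only a sketch of such an oracle, with class and method names of my own choosing, and it slows down quickly as n grows, so it suits spot checks on small n only.

```java
import java.math.BigInteger;

class TrailingZeroesOracle {
    // Multiplies out n! with arbitrary-precision integers.
    static BigInteger factorial(int n) {
        BigInteger product = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            product = product.multiply(BigInteger.valueOf(i));
        }
        return product;
    }

    // Counts trailing zeroes by dividing by 10 for as long as that leaves no remainder.
    static int trailingZeroesTheLongWay(int n) {
        BigInteger remaining = factorial(n);
        int zeroes = 0;
        while (remaining.mod(BigInteger.TEN).signum() == 0) {
            remaining = remaining.divide(BigInteger.TEN);
            zeroes++;
        }
        return zeroes;
    }

    public static void main(String[] args) {
        for (int n : new int[] {24, 25, 124, 125, 625}) {
            System.out.println(n + "! ends in " + trailingZeroesTheLongWay(n) + " zeroes");
        }
    }
}
```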

How much can we clean the inbox?

~~count the number of 2s and 5s of factors~~
~~for every factor of 5 there exists a unique 2 that “comes before” that 5, so just count the 5s~~
~~remember that some numbers have many factors of 5, such as multiples of 25, 125, 625, and so on.~~
~~take the sum of the number of 5s in the factors of 1 up to and including n, then optimize by only looking at the multiples of 5.~~
generate tests for multiples of 5, then multiples of 5^2, then 5^3… then eventually for non-multiples of 5.

I intend to reorganize my solution, then paste it into Codewars. If it fails, then I will keep working, otherwise I will declare victory.

Codewars Doesn’t Speak Vavr

In order to paste my solution into Codewars, I have to unwind the Vavr part. I end up pasting this into Codewars.

import java.util.stream.*;

public class Solution {
    public static int zeros(int n) {
        return countTrailingZeroesInTheFactorialOfANaturalNumber(n);
    }

    public static int countTrailingZeroesInTheFactorialOfANaturalNumber(int n) {
        // Remember: range(1, n / 5 + 1) means 1 up to and including n / 5, so 5 * x visits every multiple of 5 up to n.
        return IntStream.range(1, n / 5 + 1).map(x -> 5 * x).map(Solution::factorsOfFiveIn).sum();
    }

    public static int factorsOfFiveIn(int n) {
        if (n % 5 == 0) return 1 + factorsOfFiveIn(n / 5);
        else return 0;
    }
}

I try to improve execution speed by counting the factors of 5 in only the numbers that have them, which reduces the number of invocations of factorsOfFiveIn() by about 80%. This seems like it should help. I don’t immediately understand the purpose of that code when I read it, so I wouldn’t keep that code unless I hid it behind an explaining method. It doesn’t seem to make much difference in execution time on Codewars, so I revert to the clearer code that computes the factors of 5 in every number from 1 up to n.
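A sketch of what such an explaining method might look like follows. The names SolutionWithExplainingMethod and multiplesOfFiveUpTo are hypothetical; this illustrates the idea and is not the code submitted to Codewars.

```java
import java.util.stream.IntStream;

class SolutionWithExplainingMethod {
    static int zeros(int n) {
        return multiplesOfFiveUpTo(n)
                .map(SolutionWithExplainingMethod::factorsOfFiveIn)
                .sum();
    }

    // Only multiples of 5 contribute factors of 5, so visit 5, 10, 15, ... up to n.
    static IntStream multiplesOfFiveUpTo(int n) {
        return IntStream.rangeClosed(1, n / 5).map(x -> 5 * x);
    }

    static int factorsOfFiveIn(int n) {
        if (n % 5 == 0) return 1 + factorsOfFiveIn(n / 5);
        else return 0;
    }
}
```

The name carries the intent that the inline range-and-multiply expression left for the reader to work out.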

Codewars Comments

I look through the Codewars comments to try to poke holes in my solution…

A comment from user @dokwork (to someone else’s solution) reminds me to try Integer.MAX_VALUE. Of course, this causes a problem, because it takes a long time for Ruby to compute this answer “the long way”. I can certainly try this value to see whether my implementation at least halts, but it takes too long to compute the answer “the long way” to check it. (I tried.) In the end, I choose to ignore this case, trusting that if I compute the right answer up to 5^10, then it will remain right even around 2 billion.

Comments From the Public

By asking about this code on GitHub, I learn about Stream.iterate(), which provides an abstraction that I can use for factorsOfFiveIn(). Doing so replaces a recursive function with an iterative version without surrendering all the typical expressiveness of recursion.

After making this change, I feel that “functional programming” feeling about the resulting code: someone accustomed to procedural code would probably hate it, while someone accustomed to functional code would probably love it.

package ca.jbrains.math;

import io.vavr.collection.Stream;
import org.jetbrains.annotations.NotNull;

import java.util.function.Function;
import java.util.function.Predicate;

public class NumberTheory {
    public static int factorsOfFiveIn(int n) {
        return Stream.iterate(n, divideBy(5))
                .map(isAMultipleOf(5))
                .takeWhile(isTrue())
                .length();
    }

    @NotNull
    private static Predicate<Boolean> isTrue() {
        return aBoolean -> aBoolean == true;
    }

    @NotNull
    private static Function<Integer, Integer> divideBy(int divisor) {
        return x -> x / divisor;
    }

    @NotNull
    private static Function<Integer, Boolean> isAMultipleOf(int divisor) {
        return x -> x % divisor == 0;
    }

    public static int countTrailingZeroesInTheFactorialOfANaturalNumber(int n) {
        // Remember: range(1, n + 1) means 1 up to and including n.
        return Stream.range(1, n + 1).map(NumberTheory::factorsOfFiveIn).sum().intValue();
    }
}

Yes, I can replace isTrue() with the identity function, but I prefer this version for its clarity. It more closely matches how I would describe the computation in words: “count the number of times we can divide n by 5 until the result is no longer a multiple of 5”. Hm. Maybe I really prefer takeUntil(isFalse()). I can easily change that.

public static int factorsOfFiveIn(int n) {
    return Stream.iterate(n, divideBy(5))
        .map(isAMultipleOf(5))
        .takeUntil(isFalse())
        .length();
}

I generally prefer to change what I type to match what I say. This tends to result in code that more people can understand, since it better reflects how I try to explain it. Then I might not need to explain it at all!

With this, I replace the problematic implementation that could have run out of stack space with an iterative version that can (probably) handle any integer value that Java will allow.

Improved an algorithm, making it less likely to blow up.
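For environments without Vavr, roughly the same pipeline can be sketched with the standard library, assuming Java 9 or later for takeWhile(). The class name is hypothetical, and the guard against n < 1 is my addition: since 0 is a multiple of 5 and 0 / 5 is 0, the pipeline would otherwise never stop for n = 0.

```java
import java.util.stream.Stream;

class FactorsOfFiveWithStandardStreams {
    static int factorsOfFiveIn(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("Expected a natural number, but got " + n);
        }
        // n, n/5, n/25, ... for as long as each value is still a multiple of 5.
        return (int) Stream.iterate(n, x -> x / 5)
                .takeWhile(x -> x % 5 == 0)
                .count();
    }
}
```

This version filters with takeWhile() directly instead of first mapping each value to a Boolean, which trades a little of the “say it in words” style for fewer helper methods.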

Trying To Improve Execution Time

Since I know that factorsOfFiveIn() will return 0 for inputs that aren’t a multiple of 5, I could change the code that invokes it so as to send it only multiples of 5, as I did when I pasted my first solution into Codewars. I try this, but it doesn’t significantly improve the execution time, at least as “measured” by repeatedly running the tests and observing the execution time. I find the extra code a bit distracting, so I leave it out.

Later Insight

While not really thinking about the exercise, I suddenly notice a pattern that might drastically simplify the implementation, but at the cost of losing the link between the code and the algorithm.

Counting the zeroes at the end of n! amounts to counting the number of factors of 5 in all the numbers from 1 up to n. That doesn’t change. So far, I’ve approached this by computing the number of factors of 5 in each natural number, but what if I regrouped the numbers another way? This might allow me to arrive at the same result differently. Many mathematical results regarding series—especially infinite series—have come from regrouping the terms and seeing new patterns.

First, how many multiples of 5 are there from 1 up to n? There are n/5, where we interpret “/” as “integer division”, or in other words, floor(n/5). That means that we have (at least) n/5 factors of 5 in the product of 1 up to n. But 25 has two 5s, so it needs to count as 2 factors of 5 or one extra factor of 5. This remains true of all multiples of 25. How many multiples of 25 are there from 1 to n? Certainly, there are n/25.

But now, if we list the multiples of 5 and the multiples of 25, we see that 25 appears in both lists. If we counted the numbers in both those lists together, then 25 would count as 2 factors of 5. So would 50, 75, 100… and sadly, 125.

But wait! If this works for 25, then it works for 125, then 625, then 3125, and so on. When we enumerate the multiples of 125, then they all count for 3 factors of 5, or one extra factor of 5 when compared to the 2 factors that we’ve already counted. When we enumerate the multiples of 625, then they all count for 4 factors of 5, or one extra factor of 5 when compared to the 3 factors that we’ve already counted. And so on and so on. The number of factors of 5 in n! must then be the sum of the number of multiples of 5, 25, 125, 625… and so on through the powers of 5. At some point, we have to stop. We can stop counting multiples of powers of 5 once we reach a power of 5 larger than n, because that power of 5 has no multiples among the terms we multiply to compute n!.
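Written as a formula, the argument above says the following. The notation Z(n), for the number of trailing zeroes of n!, is mine and is not used elsewhere in this article.

```latex
Z(n) = \sum_{i \ge 1,\; 5^i \le n} \left\lfloor \frac{n}{5^i} \right\rfloor
\qquad \text{for example} \qquad
Z(125) = \left\lfloor \frac{125}{5} \right\rfloor
       + \left\lfloor \frac{125}{25} \right\rfloor
       + \left\lfloor \frac{125}{125} \right\rfloor
       = 25 + 5 + 1 = 31
```

The value 31 agrees with the Customer Test for 125 shown earlier.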

I see immediately how to type this in, so I don’t need to build it up incrementally. Since I have all these tests, I can try typing the “obvious implementation” into the computer, knowing that I can roll back quickly and easily if I get it wrong. In this situation, I usually give myself 3 chances to get it right before getting up and walking away from the computer.

public static int countTrailingZeroesInTheFactorialOfANaturalNumber(int n) {
    return takeUpToUpperBound(powersOf(5), n)
        .map(ithPowerOf5 -> n / ithPowerOf5)
        .sum().intValue();
}

// REFACTOR Integer could be anything Comparable.
private static Stream<Integer> takeUpToUpperBound(Stream<Integer> integers, int inclusiveUpperBound) {
    return integers.takeUntil(each -> each > inclusiveUpperBound);
}

private static Stream<Integer> powersOf(int base) {
    return Stream.iterate(base, previousPower -> previousPower * base);
}

This time, I get it right the first time. Good!

This almost transcribes the mathematical notation I’d use to express the answer: it’s the sum of n/(5^i) as i goes from 1 up to the largest i for which 5^i is at most n. Here, I write ithPowerOf5 because fiveToTheIthPower doesn’t seem any better and I can’t start an identifier with a digit.

I no longer need the code nor the tests for factorsOfFiveIn(), because I am counting those factors of 5 another way. I decide to keep the code, largely arbitrarily, until it threatens to slow me down. The tests now run on average in about 20-25% less time than they did with the previous implementation.
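One caveat, offered as a hedged observation rather than something seen in the Codewars run: powersOf(5) multiplies int values, and 5^14 does not fit in an int, so for n close to Integer.MAX_VALUE the stream of powers could wrap around instead of ever exceeding n. A minimal sketch that sidesteps this by shrinking n instead of growing a power of 5 might look like the following. The class name is hypothetical, and it uses a plain loop rather than Vavr.

```java
class TrailingZeroesByRepeatedDivision {
    // Adds n/5 + n/25 + n/125 + ... by dividing repeatedly, so no power of 5 is ever computed.
    static int countTrailingZeroesInTheFactorialOfANaturalNumber(int n) {
        int zeroes = 0;
        for (int quotient = n / 5; quotient > 0; quotient /= 5) {
            zeroes += quotient;
        }
        return zeroes;
    }

    public static void main(String[] args) {
        System.out.println(countTrailingZeroesInTheFactorialOfANaturalNumber(125));
        System.out.println(countTrailingZeroesInTheFactorialOfANaturalNumber(Integer.MAX_VALUE));
    }
}
```

This relies on the fact that, for non-negative integers, dividing by 5 twice with integer division gives the same result as dividing by 25 once, and so on for higher powers.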

Summary

From this very simple exercise, I learn the following:

Writing parameterized tests with JUnit 5 seems quite pleasant.
Some differences between Vavr’s Stream library and Java’s standard Stream library.
Stream.iterate().

This kata involves almost no design; it focuses on computing the right answer. I have noticed a pattern in how katas with very few design decisions proceed: I typically implement a very naive version for the first few special cases, then I notice a pattern, then I try to generalize, then I find one annoying counterexample, then I refine my generalization, then I finally add enough special cases tests to satisfy myself that I got the implementation right. It usually feels quite orderly and safe, but not very “test-driven”. I don’t mind: I don’t have to test-drive everything.

Addendum

Remember those details that I planned to add to my inbox when I had a moment? They consisted of articulating why I could justify the claims I made about counting 2s and 5s.

count the number of 2s and 5s of factors, because 2 × 5 = 10 and that’s the only way to get a trailing zero—multiplying by any other factors never adds an “unexpected” trailing zero
for every factor of 5 there exists a unique 2 that “comes before” that 5, so just count the 5s
- for all n > 0, since 2 < 5, then 2n < 5n, and so we always multiply by 2n before we multiply by 5n, so there’s always a unique 2 to go with each 5; and the same is true for 2^2 and 5^2, then 2^3 and 5^3, then 2^4 and 5^4… so we always have enough 2s to go with all the 5s

Feel free to criticize my reasoning. I’d like to get it right.
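One way to probe the claim that there are always enough 2s is to count the factors of each prime in n! and compare the counts. The sketch below uses hypothetical names and the sum-of-quotients counting from the Later Insight section, applied to any prime p. It checks examples and proves nothing by itself.

```java
import java.util.stream.IntStream;

class EnoughTwosForTheFives {
    // Counts how many times the prime p divides n!, as n/p + n/p^2 + n/p^3 + ...
    static int factorsOfPrimeInFactorial(int p, int n) {
        int count = 0;
        for (int quotient = n / p; quotient > 0; quotient /= p) {
            count += quotient;
        }
        return count;
    }

    static boolean hasEnoughTwos(int n) {
        return factorsOfPrimeInFactorial(2, n) >= factorsOfPrimeInFactorial(5, n);
    }

    public static void main(String[] args) {
        boolean allGood = IntStream.rangeClosed(1, 100_000)
                .allMatch(EnoughTwosForTheFives::hasEnoughTwos);
        System.out.println("Enough 2s for every n up to 100000? " + allGood);
    }
}
```

The same counting also suggests a tighter argument: for every i, there are at least as many multiples of 2^i as multiples of 5^i from 1 up to n, so the total number of 2s can never fall below the total number of 5s.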
