  Annotated Exercise: The First Non-Repeating Number in a List

One of my clients pointed me to this exercise when we discussed the fundamentals of behavior-driven development. We used it to talk through specific techniques for dealing with the flood of distracting thoughts that often hits us when we start thinking about a new task.

The Exercise

Find the first non-repeating number in a list of numbers.

Assumptions and Questions

I find a set of abstract assumptions and questions swimming around in my mind, so I’d like to get them out.

  • The list is not sorted in any way.
  • The repeating numbers can be spread out in the list.
  • The numbers are any integers, and not floating-point numbers.
  • The list is finite. (A common unstated assumption that could lead to problems later if someone expects to be able to work on a lazy-but-infinite sequence.)

After writing these down, I let about 20 seconds pass without a new idea coming to my mind, so I move on.

Dump Core on Examples

Before writing this, I had worked through part of this exercise with my client. We had started with one example, which itself led to the first design decision: a function that turns a list of integers into an integer.

[1, 1, 2, 2, 5, 5, 6, 8, 8, 9] -> 6

I used this as a starting point, which then triggered a handful of ideas in my head, so I furiously scribbled down all the examples that I could think of.

Design Decision 2

With the third example that we worked through together, we bumped into a design problem.

[1, 1, 2, 2, 5, 5] -> ???

In this case, there exists no “first” non-repeating number in the list, so I need to decide how to handle this case. (If you’re thinking “Maybe monad!”, then so am I. One thing at a time.) The usual options include:

  • null
  • an exception
  • a sentinel value or error code
  • Maybe

I choose Maybe because I like it and it’s cool as I write these words in 2018. This changes the meaning of the notation in my examples. When I write “-> 6”, I should interpret that as “returns Just 6”, but when I have no value to return, then I can write “-> (X)”, which I interpret as “returns Nothing”.

I use a flexible notation for examples. An X inside a circle means “no answer” or “blow up”, which I can model as an exception, a null return value, or something more sophisticated, all without changing the notation of the examples. It’s just a trick I learned somewhere along the way.

A Typical Edge Case

Now that I’m thinking about inputs with no answer, how many interesting variations of that can I find? I write down a handful of these, even though I probably don’t need them all. (It feels easier to write them down now and ignore them later.) I guess I’ll know when I try to implement the function and see that new examples don’t require new code.

# All these result in "no answer"
[]
[-37, -37, -37]
[1, 2, 5, 1, 2, 5]
[4, 9, 38, 63, 38, 9, 63, 4]
[8, 18, 28, 48, 38, 28, 18, 38, 8, 48]
[14, 28, 42, 28, 42, 14, 42, 14, 28, 14, 28, 42]
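These inputs can be written down as plain data and run against any candidate implementation. What follows is only a sketch: it assumes nil stands for "no answer", and first_nonrepeating_by_counting is a hypothetical counting-based reference function, not the implementation developed later in this article.

```ruby
# Hypothetical reference function (illustrative name): count how often each
# number occurs, then pick the first one that occurs exactly once.
# Returns nil when every number repeats, including for the empty list.
def first_nonrepeating_by_counting(numbers)
  counts = Hash.new(0)
  numbers.each { |n| counts[n] += 1 }
  numbers.find { |n| counts[n] == 1 }
end

no_answer_examples = [
  [],
  [-37, -37, -37],
  [1, 2, 5, 1, 2, 5],
  [4, 9, 38, 63, 38, 9, 63, 4],
  [8, 18, 28, 48, 38, 28, 18, 38, 8, 48],
  [14, 28, 42, 28, 42, 14, 42, 14, 28, 14, 28, 42]
]

no_answer_examples.each do |numbers|
  result = first_nonrepeating_by_counting(numbers)
  # Print "(X)" for "no answer", matching the notation used in the examples.
  puts "#{numbers.inspect} -> #{result.nil? ? '(X)' : result}"
end
```

Every line printed should end in (X). If one does not, either the example or the implementation deserves a second look.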

Avoid Accidental Invariants

As I try to write more interesting examples, I notice that I use mostly monotonically increasing lists. I don’t know why, but I notice the potential blind spot. I have to remind myself not to choose only monotonic lists, because accidental invariants in the test data might lead to writing an implementation that only works for data that satisfies those accidental invariants, and not all possible (well, reasonable) inputs.
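To make the risk concrete, here is a hypothetical implementation (the name is illustrative only) that quietly assumes equal numbers always sit next to each other, which happens to hold for sorted lists. It is a sketch of the failure mode, not code from this exercise.

```ruby
# Deliberately flawed: only compares each number with its immediate neighbours,
# so it is correct only when repeated numbers are adjacent (e.g. sorted input).
def first_nonrepeating_assuming_adjacent_repeats(numbers)
  numbers.each_with_index do |n, i|
    differs_from_left = i.zero? || numbers[i - 1] != n
    differs_from_right = i == numbers.length - 1 || numbers[i + 1] != n
    return n if differs_from_left && differs_from_right
  end
  nil
end

# Monotonic example: the hidden assumption holds, so the answer is right.
p first_nonrepeating_assuming_adjacent_repeats([1, 1, 2, 2, 5, 5, 6, 8, 8, 9]) # => 6

# Repeats spread out: the function returns 1, but the right answer is "no answer".
p first_nonrepeating_assuming_adjacent_repeats([1, 2, 5, 1, 2, 5]) # => 1
```

A test suite made only of monotonic lists would never catch this.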

20 Seconds Passes…

As a rule of thumb, when I go (about) 20 seconds without thinking of a new example that might fail for a different reason than all the preceding examples, then I know that I have emptied my mind. Then I look at the set of examples and judge whether I have enough to start. I have 10 examples that look like enough to start and maybe enough even to finish. Let’s see!

Implementation

I choose to implement this in Ruby, mostly because I know the basics reasonably well, I won’t need any particularly complicated libraries, and I find its syntax for lists of numbers easy to use. (I certainly prefer its syntax to Java’s.) This seems like a good exercise to use to learn the basic libraries in other languages, so I would also like to do it in Elm, Java-with-Vavr, Python, Racket, whatever cool JavaScript collections library was built last week, Clojure, and even Haskell. (I don’t intend this as an exhaustive list, but it suffices to start.)

Set Up the Ruby Environment

I choose Ruby, RSpec, a git repository, and to follow my own advice in http://blog.thecodewhisperer.com/permalink/relative-include-paths-and-the-slow-certain-march-towards-legacy-code regarding using the load path. I also want to try out new online Pomodoro-style timers, so arbitrarily I choose https://lanes.io. Of course, I use vim for editing text. Shut up. No, you shut up.

Maybe for Ruby

I’ll admit that I’ve gone a little off the deep end here, but I’ve done it in the name of learning, and because I have some spare capacity to invest. In Ruby I could just return nil when there is no non-repeating number in the list, but I’d rather learn something, so I take advantage of the opportunity and look for an implementation of Maybe for Ruby. I’ll try https://github.com/bhb/maybe first and see what happens. This happens to be the gem I get when I ask for maybe in my Gemfile. How nice!

Maybe Not

What?! No. Comparing Maybe values definitely returns a boolean and not maybe a boolean. No, thank you.

I ask my social networks to propose Ruby gems that implement Maybe correctly, but in the meantime, I choose to fall back to nil to represent “no value”. I don’t like it, but I don’t feel like implementing Maybe in Ruby today.

Nothing Much Happened

After settling on an overall strategy (settling on a type signature for the function, really), everything proceeds more-or-less smoothly. I notice partway through that I need to look at what happens when the list itself contains nil values, but since Array supports a function to strip nil values out (compact), this doesn’t cause any consternation. I add a few examples, but otherwise I simply make them pass.
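For readers who haven’t met it, Array#compact returns a new array with the nil entries removed. A minimal sketch of stripping nils before searching follows. The function name and the counting approach are assumptions made for illustration, not the code in the repository.

```ruby
# Array#compact drops nil entries and leaves everything else in order.
p [nil, 3, nil, 3, 7, nil].compact # => [3, 3, 7]
p [nil].compact                    # => []

# Hypothetical wrapper: ignore nils, then find the first number seen only once.
def first_nonrepeating_ignoring_nil(numbers)
  cleaned = numbers.compact
  cleaned.find { |n| cleaned.count(n) == 1 }
end

p first_nonrepeating_ignoring_nil([nil, 3, nil, 3, 7, nil]) # => 7
p first_nonrepeating_ignoring_nil([nil])                    # => nil
```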

Another Maybe Library

While I take a break from this task, Steven Solomon (@ssolo112 on Twitter, https://twitter.com/ssolo112) suggests another Maybe library for Ruby. His, of course. Since it looks promising, I decide to try it. I want to know whether, at least, it treats Maybe(6) and Nothing as unequal and whether the equality test itself returns a boolean and not Maybe a boolean.

Oops! This is why I check. That looks like two versions of the same gem, but no: the 0.1.0 version is Steven Solomon’s gem, while the 1.1.0 version is the one I tried earlier. Remove!

Next, I try the gem out.

Ugh. Not quite what I expect. I expect the first of these two statements to return 7, not 6. Fortunately, Steven shows a willingness (even eagerness!) to help, so he patches the behavior and I move forward with it. Thank you, Steven!

Reintroducing Maybe

In order to introduce Maybe into this code, I see an easy, mechanical refactoring. The new function, which returns Maybe a number, invokes the old function and wraps the return value in a Maybe. Since I want the new function to have the name of the old function, I add one preparatory step to the three steps that I learned from Kent Beck all those years ago.

  1. Rename the old thing in order to make room for a new thing with the old name.
  2. Add the new thing (with the now-old name that I want).
  3. Migrate clients from the old thing to the new thing.
  4. Remove the old thing.

Wait. I don’t even need to do all this! Steven’s version of Maybe doesn’t create Maybe objects, but rather implements only map().orElse(). Hm. I don’t know how I feel about that, specifically, but since I don’t plan to use this function in a more industrial context, I decide that I can “go with it” for now. In a more industrial context, I’d log the risk and look for an early opportunity to challenge my assumptions or explore my concerns.

So I continue to use nil to represent “no answer”, but at least now I have a way of invoking map().orElse() that reveals intent better than checking for nil. That satisfies me for the moment.
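To show what “reveals intent better than checking for nil” can look like, here is a hypothetical stand-in. It is not the API of Steven’s gem or of any other library. It also wraps the value in a small object, which the gem is described as not doing, purely to keep the sketch self-contained. The names MaybeValue, map, and or_else are all assumptions.

```ruby
# Hypothetical helper, NOT a real gem's API: treats nil as "no value" and
# offers map/or_else in the spirit of map().orElse().
class MaybeValue
  def initialize(value)
    @value = value
  end

  # Apply the block only when a value is present; "no value" passes through.
  def map
    @value.nil? ? self : MaybeValue.new(yield(@value))
  end

  # Unwrap, substituting the fallback when there is no value.
  def or_else(fallback)
    @value.nil? ? fallback : @value
  end
end

def describe(result)
  MaybeValue.new(result).map { |n| "first non-repeating: #{n}" }.or_else("no answer")
end

puts describe(6)   # prints "first non-repeating: 6"
puts describe(nil) # prints "no answer"
```

The equivalent nil check would be a conditional on result.nil?. The chained form keeps the “value present” and “value absent” paths side by side in one expression.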

Back To The Feature

Since I had a break, my mind worked on the exercise, and in the process, I had an idea about how to approach it. Now I want to know whether my idea works and whether I might find something “better” in the process of writing it that way (for some value of “better” that I might not manage to define just yet). My strategy goes like this:

  • Split the list into head (a number) and rest (a list of numbers), except if the list is empty, in which case return nil.
  • If head is not in rest, then head is the first non-repeating number in the list; otherwise, run this same algorithm on rest. Recursion!

In general, for the past year or two, I’ve been trying to retrain myself to think in terms of functional programming concepts. This explains why a recursive implementation came to mind. I realize that I could improve execution speed a little by tracking the numbers that I’ve already seen and then skipping the iterations that check the repeated instances of those numbers. Meh. I’m not worried about execution speed right now, and if I were, then I would run an execution speed test before I bothered implementing that improvement. Even if I never implemented it, I’d document the idea in case someone needs to improve execution speed later.

Laziness in Ruby?

I have another idea to build a lazy sequence of all the non-repeating numbers in the list, of which I can take the first one, if it exists. I don’t want to try it at the moment, but it merits investigation. I plan to read more about this later at https://rossta.net/blog/infinite-sequences-in-ruby.html. In the meantime, I get back to the action.
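As a rough sketch of that idea, Ruby’s Enumerable#lazy can build the sequence of non-repeating numbers on demand. This is an untested-in-anger illustration rather than the design I ended up with. The function name is hypothetical, and it still assumes a finite Array, because count scans the whole list for each candidate.

```ruby
# Hypothetical sketch: a lazy sequence of the numbers that occur exactly once.
# Only as many candidates are examined as the caller actually asks for.
def nonrepeating_numbers(numbers)
  numbers.lazy.select { |n| numbers.count(n) == 1 }
end

p nonrepeating_numbers([1, 1, 2, 2, 5, 5, 6, 8, 8, 9]).first # => 6
p nonrepeating_numbers([1, 1, 2, 2, 5, 5]).first             # => nil
p nonrepeating_numbers([3, 1, 3, 4]).first(2)                # => [1, 4]
```

With no argument, first returns nil when the sequence is empty, which lines up with using nil for “no answer”.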

Back to the Action

After making a few more tests pass, I notice that I want easier feedback, so I install guard to run the tests more often and with less effort.

I Didn’t Need Maybe

So I remove it. Next time.

My Algorithm Is Wrong!

I find out that I need to track the previously-seen-as-repeated numbers for the correctness of the algorithm, so I do that. I discover this from a failing example: [1, 1, 2]. Let me trace this example to illustrate the problem.

  1. head = 1, rest = [1, 2]. Since head is in rest, try again with just [1, 2].
  2. head = 1, rest = [2]. Since head is not in rest, it’s not repeated, so return 1.

Fail.
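The trace can be reproduced with a direct transcription of the head-and-rest strategy. This is a sketch of the flawed version for illustration, with a hypothetical name; the actual commits may differ in detail.

```ruby
# Flawed on purpose: once the first 1 is dropped, the second 1 looks unique.
def first_nonrepeating_naive(numbers)
  return nil if numbers.empty?

  head, *rest = numbers
  return head unless rest.include?(head)

  first_nonrepeating_naive(rest)
end

p first_nonrepeating_naive([1, 1, 2]) # => 1, but the expected answer is 2
p first_nonrepeating_naive([1, 2, 1]) # => 2, correct only by luck of ordering
```

The recursion forgets that it has already seen head, so the last occurrence of any repeated number always looks non-repeating.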

I change the algorithm to check whether head has been previously marked as “repeating”, in which case I don’t bother looking in rest, but instead just go to the next iteration. With that, I rescue my algorithm.
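One way to express that fix, as a sketch under my own naming (first_nonrepeating_tracking and the repeated parameter are illustrative, not necessarily what the repository uses):

```ruby
require "set"

# Recursive search that remembers which numbers are already known to repeat.
def first_nonrepeating_tracking(numbers, repeated = Set.new)
  return nil if numbers.empty?

  head, *rest = numbers
  if repeated.include?(head)
    # Already known to repeat: skip it without searching rest.
    first_nonrepeating_tracking(rest, repeated)
  elsif rest.include?(head)
    # Newly discovered repeat: remember it, then move on.
    first_nonrepeating_tracking(rest, repeated | [head])
  else
    head
  end
end

p first_nonrepeating_tracking([1, 1, 2])                      # => 2
p first_nonrepeating_tracking([1, 1, 2, 2, 5, 5, 6, 8, 8, 9]) # => 6
p first_nonrepeating_tracking([1, 1, 2, 2, 5, 5])             # => nil
```

Passing a new set down each call, rather than mutating a shared one, keeps the function free of side effects at the cost of some allocation.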

The rest proceeds in a pretty boring fashion. Read the commits if you want to see the play-by-play action.

The Final List of Tests

Find the first non-repeating number in a list

  • empty list
  • there are non-repeating numbers
    • only one number in the list
    • there is only one non-repeating number
      • it’s the first number in the list
      • not the first number in the list
    • several numbers are not repeating
      • not the first number in the list, but the list is monotonic
      • not the first number in the list, and the list is not monotonic
      • all the repeating numbers come before several non-repeating numbers
  • there are only repeated numbers
    • smallest case
    • a single repeating number
    • a few repeating numbers
    • a few repeating numbers, but a varying number of them
    • several repeat numbers interleaved with one another
      • a repeating sequence of numbers
      • some more-interesting lists of numbers
        • every number twice, but interleaved
        • every number repeated, but an unequal number of times
  • there are no repeated numbers
    • the only number in the list
    • a list with more than one number
    • a longer list
  • ignoring nil
    • a list with only 1 nil
    • a list with 1 nil and 1 number
    • a list starting with several nils
    • a list with several nils and several numbers interleaved

The Code

https://gitlab.com/jbrains/first-nonrepeating-number

References

J. B. Rainsberger, “Getting Started With Getting Things Done”. Get things out of your head so that you can focus; while programming, those things include the tests you intend to write.

J. B. Rainsberger, “Avoiding Distractions While Programming”.

Steven Solomon, https://github.com/steven-solomon/maybe. A dead-simple function that implements map().orElse(), interpreting nil as “no value”/Nothing/None.

Ross Kaffenberger, “Infinite Sequences in Ruby”. How to implement lazy infinite sequences in Ruby. If I wanted to improve this design, I would explore changing it to compute a lazy sequence of non-repeating numbers in the list, of which we could take(1).