The FizzBuzz Kata

Intro

I didn’t know about this one. Then again, I’m not “actually” a programmer, so I guess I have an excuse… Or do I? I do spend lots of time programming in R 🙂

It’s been a while since I last wrote here. Holidays, a Master’s and a full-time job (and recently, the cybersecurity world has been busy…): all in all, I simply had little spare time. Apologies.

The Kata

So reading LinkedIn posts today, I came across someone mentioning the “FizzBuzz” exercise, with which I wasn’t familiar.

The example there was about code being “easily transferable” from R to Python and vice versa.

But then I thought: Yes, it’s good that one can program in different languages. But isn’t it also good to be reasonably good at one of them?

And so I decided I should see whether I could implement a better version of the exercise…

The code

The basic version is something like so:

for(each in 1:100) {
  # check the combined case first, otherwise it would never be reached
  if(each %% 3 == 0 && each %% 5 == 0) print("FIZZBUZZ")
  else if(each %% 3 == 0) print("fizz")
  else if(each %% 5 == 0) print("buzz")
  else print(each)
}

Right there, the for loop tells me: Go “vectorized”. And so:

# faster option: Base, vectorized:
sapply(1:100, function(x) {
  if(x %% 3 + x %% 5 == 0) return("FIZZBUZZ") # Sum of the modulus == 0
  if(x %% 3 == 0) return("fizz")
  ifelse(x %% 5 == 0, "buzz", x)
})

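Two advantages here: fewer “else” branches, which feels cleaner to me somehow, and of course it’s much faster (see the timings further down). Strictly speaking, sapply() still calls the function once per element, so this is not truly vectorized yet. A detail (for fun): instead of an AND, I used a SUM. Same result, and little difference really.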
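To make the “same result” claim checkable, here is a small sketch comparing both tests over 1:100. The names x, by_and and by_sum are mine, just for illustration. It works because the remainders x %% 3 and x %% 5 are never negative here, so their sum is zero only when both are zero.

```r
x <- 1:100

# AND of the two divisibility tests
by_and <- x %% 3 == 0 & x %% 5 == 0

# SUM of the two remainders: zero only if both remainders are zero
by_sum <- x %% 3 + x %% 5 == 0

identical(by_and, by_sum)  # both tests flag exactly the same numbers
which(by_sum)              # positions flagged by either test
```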

Then, I thought further about it: we have three strings that match certain numbers… Couldn’t we go the other way around? Say, pick the right string out of a vector by indexing, instead of walking through the conditions one at a time… Let me explain with the code, which is easier:

unlist(lapply(1:100, function(x) {
  # keep the strings whose divisor matches x, then take the first one (NA if none)
  t <- c("FIZZBUZZ", "fizz", "buzz")[x %% c(15, 3, 5) == 0][1]
  if(!is.na(t)) return(t)
  x
}))

I thought this was an original option. Plus, it turns out it’s faster than the former one, even though the unlist() + lapply() combination adds a bit of overhead (not relevant in this case).
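To see what the indexing trick does for a single number, here is a small walk-through sketch. The helper pick_label() and the vectors labels and divisors are hypothetical names I introduce for illustration; they are not part of the code above.

```r
labels   <- c("FIZZBUZZ", "fizz", "buzz")
divisors <- c(15, 3, 5)

# Logical indexing keeps the labels whose divisor matches; [1] takes the first
pick_label <- function(x) labels[x %% divisors == 0][1]

pick_label(15)  # all three divisors match, the first label wins
pick_label(9)   # only 3 matches
pick_label(10)  # only 5 matches
pick_label(7)   # nothing matches: indexing an empty vector with [1] gives NA
```

The order of the labels matters: the 15 entry has to come first, otherwise a multiple of 15 would be labelled as a plain multiple of 3.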

Also, a mathematical detail: Using modulus 15 seems about right, mathematically 🙂 I would say: LCM(3, 5)… Anyway, so one operation less, I guess.
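A quick sketch to back the LCM remark: gcd() and lcm() below are small helpers I define here for illustration (they are not base R functions), and the last line checks that testing against 15 flags the same numbers as testing against 3 and 5 together.

```r
# Euclid's algorithm for the greatest common divisor
gcd <- function(a, b) if (b == 0) a else gcd(b, a %% b)

# Least common multiple derived from the gcd
lcm <- function(a, b) a * b / gcd(a, b)

lcm(3, 5)  # the single divisor that replaces the two separate tests

x <- 1:100
all((x %% lcm(3, 5) == 0) == (x %% 3 == 0 & x %% 5 == 0))
```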

Then this next option was the fastest so far (but not “elegant”). Please note that my R interpreter is on v4.0 in this particular ARM-compatible RStudio container… R 4.1 would have given me the native |> pipe and the \(x) shorthand for anonymous functions, which would look cleaner (even for the sapply() calls). But it is what it is. So I fall back to the magrittr library, and since ifelse() is a vectorized operation itself, I can do this:

library(magrittr) # provides the %>% pipe

1:100 %>% {ifelse(. %% 15 == 0, "FIZZBUZZ",
  ifelse(. %% 3 == 0, "fizz",
   ifelse(. %% 5 == 0, "buzz", .)))
}
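For reference, this is roughly how the same idea might look on R 4.1 or later, without magrittr. It is only a sketch that I could not run in the 4.0 container mentioned above; res_pipe is a name I made up for the result.

```r
# Requires R >= 4.1: native pipe |> and the \(x) anonymous function shorthand
res_pipe <- 1:100 |>
  (\(x) ifelse(x %% 15 == 0, "FIZZBUZZ",
        ifelse(x %% 3 == 0, "fizz",
        ifelse(x %% 5 == 0, "buzz", x))))()

head(res_pipe, 5)
```

The extra pair of parentheses around the anonymous function, followed by (), is needed because the right-hand side of |> has to be a function call.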

Now let’s see if we can be even more… Creative.

What if we played with two dimensions and used row numbers, maybe?

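{
  # 5 columns of 20 rows: every 5th row holds the multiples of 5
  t1 <- matrix(1:100, ncol=5)
  t1[which(row(t1) %% 5 == 0)] <- "buzz"
  t1 <- as.vector(t1)[1:100]
  # 3 rows: the 3rd row holds the multiples of 3 (this is where the warning comes from)
  t1 <- matrix(t1, nrow=3)
  t1[which(row(t1) %% 3 == 0)] <- "fizz"
  t1 <- as.vector(t1)[1:100]
  # multiples of 15 were marked twice, so overwrite them last
  t1[which(1:100 %% 15 == 0)] <- "FIZZBUZZ"
  t1
}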

This last one was (mostly) the fastest option in my tests (although not by much…). It does however throw a warning, as 100 is not a multiple of 3, and forces us to cut the results to the first 100 entries because of it…
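One possible way around the warning, as a sketch only: pad the vector with NA up to the next multiple of the row count before reshaping, then cut the padding off again. The helper mark_every() is a hypothetical name of mine, and I have not timed this variant, so it may well be slower than the version above.

```r
# Label every k-th element of x by reshaping it into a matrix with k rows
mark_every <- function(x, k, label) {
  n <- length(x)
  # pad so that the length is a multiple of k (no recycling, hence no warning)
  padded <- c(x, rep(NA_character_, (-n) %% k))
  m <- matrix(padded, nrow = k)
  m[k, ] <- label          # the last row holds positions k, 2k, 3k, ...
  as.vector(m)[seq_len(n)] # back to a vector, dropping the padding
}

res <- as.character(1:100)
res <- mark_every(res, 5, "buzz")
res <- mark_every(res, 3, "fizz")
res <- mark_every(res, 15, "FIZZBUZZ")
res
```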

The timing of each option, in order of appearance (the unit is not stated in the output; presumably microseconds):

option            min       lq         mean        median     uq         max       neval
for loop          4676.668  4833.0840  5214.04485  4994.6670  5213.5635  9304.584  100
sapply()          137.459   142.7510   172.60865   149.7925   157.4385   2348.293  100
unlist(lapply())  113.501   118.1460   137.29438   121.6670   127.1255   1571.167  100
nested ifelse()   51.334    54.0215    60.39899    57.6255    64.8760    89.084    100
matrix            42.043    46.1465    58.91734    59.2505    70.0840    88.584    100

Conclusion

Even for a very basic program or script, there are probably many alternatives out there to implement it. Depending on how much you’re willing to think about it (considering the effort vs value, I guess)…

EDIT: After the fact…

It only makes sense…

So I just found out, after doing it myself… The exercise is supposed to be Fizz, Buzz or FizzBuzz… Not what I was doing (fizz, buzz, FIZZBUZZ, and with some error somewhere in my code, as cherry on the cake).

This of course opens new possibilities: multiples of 15 no longer need a case of their own, since “FizzBuzz” is just the string for multiples of 3 (Fizz) followed by the string for multiples of 5 (Buzz). A “paste()” solution then comes to mind.
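The paste() idea could look something like the sketch below (using paste0(), which is paste() without a separator). This is just one way to write it, with variable names of my own choosing.

```r
n <- 1:100

# Build each half independently: empty string when the divisor does not match
fizz <- ifelse(n %% 3 == 0, "Fizz", "")
buzz <- ifelse(n %% 5 == 0, "Buzz", "")

# Gluing the halves gives "Fizz", "Buzz", "FizzBuzz" or ""
out <- paste0(fizz, buzz)

# Where nothing matched, fall back to the number itself
plain <- out == ""
out[plain] <- n[plain]

out[c(1, 3, 5, 15)]
```

No test against 15 appears anywhere: the combined string falls out of the concatenation.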

And then, while looking into this exercise, I found (of course) many, MANY references. One solution I didn’t even consider is the dplyr function “case_when()”. My bad; indeed, it looks cleaner, I guess. As is often the case with dplyr.
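For completeness, a case_when() version might look like the sketch below. It assumes the dplyr package is installed; fizzbuzz_cw() is a name I made up, and this is my own take rather than the code from the post I found.

```r
library(dplyr)

fizzbuzz_cw <- function(x) {
  # Conditions are tried top to bottom, so the 15 case has to come first
  case_when(
    x %% 15 == 0 ~ "FizzBuzz",
    x %% 3 == 0  ~ "Fizz",
    x %% 5 == 0  ~ "Buzz",
    TRUE         ~ as.character(x)  # all outputs must share one type
  )
}

fizzbuzz_cw(1:15)
```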

I’ll put one nice reference with other alternative implementations in the references. If nothing else, it’s clear there are probably even more options out there. To my point, one can think of many ways of doing the same thing.

That, and read the requirements carefully :S

References

A link to my code, on my GitHub account

Looking for alternatives after the fact, I found that Post with nice options