Ruby’s Permutation and Heap’s Algorithm

Rubix Cube Recently, I’ve been practicing more Leetcode/technical challenges in order to keep my skills fresh. While practicing, I encountered a problem that required performing a permutation on a data set.

It had been awhile since I’d worked through a problem that required a permutation, and in vain, I consulted Claude. Claude recommended utilizing Ruby’s permutation method, which I had never used before.

Sure enough, implementating permutation solved the problem, in a concise, single line. This was a great example of how Ruby is so wonderful at providing built-in methods that significantly reduce lines of code.

For as great as Ruby is at providing simple, clean syntax, it is important for developers understand what is happening behind the scenes or “under-the-hood.”

This is especially important because “under-the-hood” a shorthand Ruby method may actually not be the most optimal choice in a technical challenge, despite how clean, elegant, and readable the solution may appear.

Additional Research and Heap’s Algorithm

Since my initial question had now turned into an exploration of the permutation method and its origins, I first attempted to answer what the method did on a broader-level.

This research led me to understanding that the permutation method was loosely based on an algorithm that I was familiar with many years ago, Heap’s Algorithm.

This made me curious as to whether a pure implementation of Heap’s Algorithm would be more optimal than the permutation method, so I dug in further…

The Actual Problem

For clarity purposes, I’m removing additional requirements in the problem I was solving, and instead, offering a more simplified version:

Given an integer array [1,2,3,4], find the total number of all possible 2 digit combinations.

The Solution–Permutation

Solving this using permutation is literally a single line.

some_array.permutation(2).count

# Output
# 12

The Nuance–Permutation

In this case, we wanted to know the total number of two digit numbers that could be created from [1,2,3,4], but imagine instead that the question asked us to return an array of arrays of each two-digit number, rather than the count..

Using permutation alone will not suffice. This is because permutation returns an Enumerator object, rather than an array of arrays, which is what we’re asking for in this case. For example:

# This will return an Enumerator
some_array.permutation(2)

# Output
# #<Enumerator: [1, 2, 3, 4]:permutation(2)>

To return an array of arrays, we’d need to do something like this:

# This will return an Array of Arrays
some_array.permutation(2).to_a

# Output
# [[1, 2], [1, 3], [1, 4], [2, 1]...]

Enumerators?

You might be wondering at this point what exactly an Enumerator object is…

An Enumerator is a memory-efficient class that allows you to iterate over individual elements and stop processing at any time. Enumerators only generate values when needed and hold one value at a time in memory.

Enumerators can be chained together quite beautifully in Ruby.

For example, I could write a simple one-liner to map an array to it’s index values:

%w[foo bar baz].map.with_index{ |w, i| "#{i}:#{w}" }  

Another way to think about it is like this:

%w[foo bar baz].map
# Returns an Enumerator
               .with_index
# Chains another Enumerator           
               { |w, i| "#{i}:#{w}" }
# Finally executes with the block  

Using Enumerators in the problem above prevents us from storing the transformed data sets in memory until after the final block has executed.

Back to the Solution–Permutation

In the problem we looked at earlier, the resultant Enumerator object did not store the permutations in memory when we first called permutation on the array.

However, if we chose to break apart the array into a set of arrays, we’d be storing all the values in memory. For large-data sets, this choice can have significant repercussions.

Now, we’re starting to see promising evidence that the built-in permutation method may be a more optimal solution than the implementation of Heap’s Algorithm based on memory allocation, but let’s explore Heap’s Algorithm next to see if this is true.

Solution–Heap’s Algorithm

Heap’s Algorithm was first proposed by B.R. Heap in 1963 and uses recursion to generate all possible permutations where k represents the size of the array we’re working with.

What is Recursion? Recursion is when a function calls itself with a smaller subset of the original problem until it reaches a base case.

For example:

def factorial(n)
  return 1 if n == 0 # Base case
  n * factorial(n - 1) # Recursive case
end

# Example
# a = 2
# factorial(2) => 2 * factorial(2 -1) => 2

In Heap’s Algorithm we observe recursion applied like so:

k = 4, arr = [1,2,3,4]

def generate_combinations(arr)
  @permutations = []  # Initialize here
  heap_permute(arr.length, arr) # Call with initial array length
  
  # Extract 2-digit numbers
  two_digits = @permutations.map do |perm|
    (0..perm.length-2).map { |i| "#{perm[i]}#{perm[i+1]}".to_i }
  end.flatten.uniq.count
end

def heap_permute(k, arr)
  if k == 1  # Base case: when k=1, we have a valid permutation
    @permutations << arr.clone
  else
    heap_permute(k - 1, arr) # First recursive call
    
    (0...k-1).each do |i| # Swap and generate more permutations
      if k.even?
        arr[i], arr[k-1] = arr[k-1], arr[i]
      else
        arr[0], arr[k-1] = arr[k-1], arr[0]
      end
      heap_permute(k - 1, arr) # Second recursive call
    end
  end
end

In this approach, we create all the two-digit combinations of the original array and map and count them at the end.

On the surface, this approach is significantly more difficult to read and understand than using permutation but is it actually more optimal than relying on the permutation method?

Breaking Down Heap’s Algorithm

Let’s think back to the Enumerator class we were generating using permutation and see how that compares with the implementation of Heap’s Algorithm above.

Here are the most-expensive parts of Heap’s Algorithm:

def heap_permute(k, arr)
  if k == 1
    @permutations << arr.clone # 💰 Expensive: Creates new array
  else
    heap_permute(k - 1, arr) # 💰 Expensive: Deep recursion 
    
    (0...k-1).each do |i| # 💰 Expensive: Additional iteration
      if k.even?
        arr[i], arr[k-1] = arr[k-1], arr[i]
      else
        arr[0], arr[k-1] = arr[k-1], arr[0]
      end
      heap_permute(k - 1, arr) # 💰 Expensive: More recursion
    end
  end
end

def generate_combinations(arr)
  @permutations = [] # 💰 Expensive: Stores all in memory
  heap_permute(arr.length, arr)
  
  # 💰 Expensive: Additional array transformations
  two_digits = @permutations.map do |perm|
    (0..perm.length-2).map { |i| "#{perm[i]}#{perm[i+1]}".to_i }
  end.flatten.uniq.count
end

Why is it more expensive than Permutation?

Money As we noted earlier, permutation creates an Enumerator object that does not store the results as arrays in memory like Heap’s Algorithm does. The Enumerator generates values on demand, rather than storing all permutations at once.

Furthermore, Ruby’s built-in methods are already built and compiled in C. If you truly want to understand what is happening in C, take a look at C implementation that is actually compiled.

For simplicity purposes, just know that by calling permutation, we’re closer to the hardware, as opposed to our Ruby implementation which needs to run through Ruby’s interpreter and does not directly access the lower-level optimizations that the C has access to.

An additional added expense is that our pure Ruby implementation of Heap’s Algorithm has additional overhead in the form of recursion, array copying, etc.

An Analogy

Road with Mountain

Ruby’s permutation: Direct highway to the computer’s processor (C code)

Our Heap’s Algorithm: Taking local streets with many stops (Ruby interpreter)

Conclusion

This was a rabbit hole of all-sorts. We started with a simple Ruby method and touched on Enumerators and Recursion and ended up discussing the C compiler and memory management. Rabbit

Never forget that Ruby, “under-the-hood”, is just C.

Not all languages provide the luxury of concise, built-in methods that are already optimized and designed to solve specific types of problems.

This is both and advantage and disadvantage of Ruby. With each built-in method, I believe it is important to understand the underlying concepts behind the built-in method and explore algorithmic alternatives.

With all that being said, if I was presented with a Leetcode problem that could be succinctly solved with permutation, I’d most certainly use it –however– I’d be prepared to explain why I chose it and understand the advantages and disadvantages of alternative approaches.