Concurrency Doesn’t Mean Parallelism

In Part 1, we discussed how concurrency can take many forms, including, but not limited to, parallelism.

In Ruby and Ruby on Rails, there are many approaches to concurrency. In this post, we’ll look at several.

The GVL

Before examining the different approaches in Ruby, it is important to understand a related topic: the GVL.

The GVL, or Global VM Lock, is built into CRuby (the Ruby most of us use). JRuby and TruffleRuby do not have it.

The GVL allows only one thread to execute Ruby code at a time.

This means that while one thread is executing Ruby code, all other threads are prevented from running Ruby code until the GVL is released.

I point this out because as we look at different approaches to concurrency, it is important to understand the limitation of the GVL and its potential impact.

Threads (Multi-Threading)

One popular way to achieve multithreading in Ruby on Rails is through Ruby’s built-in Thread class.

Here’s a basic example:

t1 = Thread.new do
  3.times do
    puts "Thread 1 is working"
    sleep(1)
  end
end

t2 = Thread.new do
  3.times do
    puts "Thread 2 is working"
    sleep(1)
  end
end

# Wait for both threads to finish; otherwise the script may exit first
[t1, t2].each(&:join)

# Output (exact interleaving may vary):
# Thread 1 is working
# Thread 2 is working
# Thread 1 is working
# Thread 2 is working
# Thread 1 is working
# Thread 2 is working

The key limitation here is the GVL. Although we are multithreading, only a single thread executes Ruby code at a time.

In Ruby 3.0+, the GVL has been optimized to improve performance. However, it is still a relevant limitation: only one thread executes Ruby code at once, no matter how many threads you’ve created.
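One nuance is worth a quick check: CRuby releases the GVL while a thread is blocked in sleep or waiting on I/O, so waiting can overlap even though Ruby code cannot. The sketch below is a minimal illustration using the standard Benchmark module; the 0.2 second delay and the variable names are arbitrary choices for the example.

```ruby
require "benchmark"

# Two threads that each wait 0.2 seconds. sleep releases the GVL,
# so the two waits overlap instead of adding up.
elapsed = Benchmark.realtime do
  threads = 2.times.map do
    Thread.new { sleep(0.2) }
  end
  threads.each(&:join)
end

puts format("Elapsed: %.2fs", elapsed)
```

On a typical machine the elapsed time lands close to 0.2 seconds rather than 0.4. A CPU-bound loop in each thread would not show the same overlap, because that work needs the GVL.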

Fibers (Single-Threading)

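Ruby also lets you achieve concurrency through Fibers. Fibers have been part of Ruby since 1.9, and Ruby 3.0 added a Fiber scheduler interface that makes them practical for non-blocking I/O. Think of Fibers as lightweight, cooperatively scheduled units of work running within a single thread.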

Remember, the GVL limits multi-threading to a single thread executing Ruby code at once. With Fibers, you are working within a single thread to begin with. That thread can contain multiple Fibers, but only one Fiber runs at a time, and it keeps running until it yields.

Here’s an example:

# Create first fiber
fiber1 = Fiber.new do
  puts "Fiber 1 starting"
  Fiber.yield
  puts "Fiber 1 resuming"
  Fiber.yield
  puts "Fiber 1 finishing"
end

# Create second fiber
fiber2 = Fiber.new do
  puts "Fiber 2 starting"
  Fiber.yield
  puts "Fiber 2 resuming"
  Fiber.yield
  puts "Fiber 2 finishing"
end

# Resume fibers alternately
fiber1.resume
fiber2.resume
fiber1.resume
fiber2.resume
fiber1.resume
fiber2.resume

puts "All done!"

# Output:
# Fiber 1 starting
# Fiber 2 starting
# Fiber 1 resuming
# Fiber 2 resuming
# Fiber 1 finishing
# Fiber 2 finishing
# All done!

Here, Fiber.yield pauses execution and returns control to the point where resume was called.

The Fibers run in the order we specify, and we do not need sleep or synchronization because everything runs on a single thread.

With Fibers, the developer controls when and where to start and stop the Fiber, providing more fine-tuned control over the single thread. This level of control contrasts with Threads, where the operating system makes this decision.

Although this manual control can make the code more complex, Fibers can offer performance benefits over multiple Threads when used properly.
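Fiber.yield can also hand a value back to the caller of resume, which is what makes Fibers a natural fit for generators. Here is a minimal sketch of that idea; the Fibonacci sequence and the names fib and first_five are only illustrative choices.

```ruby
# A generator built on a Fiber: each resume produces the next value.
fib = Fiber.new do
  a, b = 0, 1
  loop do
    Fiber.yield a      # pause here and return `a` to the caller of resume
    a, b = b, a + b    # runs only when the fiber is resumed again
  end
end

first_five = 5.times.map { fib.resume }
puts first_five.inspect

# Output:
# [0, 1, 1, 2, 3]
```

The fiber keeps its local state (a and b) between calls, so the caller decides how many values to pull and when.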

Async (Single-Threaded Using Fibers)

Another option is the Async library. Async is a gem rather than part of Ruby itself, and its current versions build on the Fiber scheduler interface introduced in Ruby 3.0. It is built on top of Fibers but handles scheduling for you, so you do not need to control task switching manually. This reduces the complexity of implementation.

Additionally, Async uses an event loop, like JavaScript, and is well suited to I/O operations. Here’s an example:

require "async" # provided by the async gem

Async do
  # First async task
  Async do
    puts "Starting task 1"
    sleep(2)
    puts "Finished task 1"
  end

  # Second async task
  Async do
    puts "Starting task 2"
    sleep(2)
    puts "Finished task 2"
  end
end

puts "All done!"

# Output will look like:
# Starting task 1
# Starting task 2
# Finished task 1
# Finished task 2
# All done!

In this example, each task runs asynchronously, starts almost immediately, and runs concurrently (not in parallel).
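To make the idea of a scheduler concrete, here is a toy round-robin loop over plain Fibers. It is only a sketch of the concept and not how Async is implemented: a real event loop resumes a task when its I/O or timer is ready, while this one simply takes turns. The names log and tasks and the :done marker are assumptions made for the example.

```ruby
log = []

# Each task is a Fiber that pauses once, where a real task would wait on I/O.
tasks = [1, 2].map do |id|
  Fiber.new do
    log << "Starting task #{id}"
    Fiber.yield                  # hand control back to the loop below
    log << "Finished task #{id}"
    :done                        # marker telling the loop this task is complete
  end
end

# The "event loop": resume every task in turn and drop the finished ones.
until tasks.empty?
  tasks.reject! { |task| task.resume == :done }
end

puts log

# Output:
# Starting task 1
# Starting task 2
# Finished task 1
# Finished task 2
```

The tasks never call resume themselves; the loop decides who runs next, which is the job Async takes off your hands.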

Ractors (Parallelism)

Ractors are available in Ruby 3.0+ (still marked experimental) and offer parallel execution, unlike threads, which are concurrent but not parallel in CRuby due to the GVL.

Ractors run independently of each other, do not share mutable state, and communicate with one another by passing messages. Here’s an example:

# Create first ractor
r1 = Ractor.new do
  3.times do
    puts "Ractor 1 working..."
    sleep(1)
  end
  "Ractor 1 done!"
end

# Create second ractor
r2 = Ractor.new do
  3.times do
    puts "Ractor 2 working..."
    sleep(1)
  end
  "Ractor 2 done!"
end

# Get results from both ractors
# (Ractor#take is available in Ruby 3.0 to 3.4; newer releases use Ractor#value)
result1 = r1.take
result2 = r2.take

puts result1
puts result2

puts "All done!"

# Output might look like (approximate elapsed time in parentheses):
# Ractor 1 working... (0s)
# Ractor 2 working... (0s)
# Ractor 1 working... (1s)
# Ractor 2 working... (1s)
# Ractor 1 working... (2s)
# Ractor 2 working... (2s)
# Ractor 1 done! (3s)
# Ractor 2 done! (3s)
# All done!

Here, each Ractor.new creates a new isolated environment for code to run in. The code inside each Ractor runs independently, and Ractors start executing as soon as they’re created. Additionally, both Ractors can run at the same time on different CPU cores.

The key benefit of Ractors over Threads and Fibers is true parallelism: they can actually run at the same time on different CPU cores!
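For CPU-bound work, a common pattern is to split the input, hand each piece to its own Ractor, and combine the results. The sketch below assumes a small hypothetical helper, ractor_result, that papers over an API difference: Ractor#take exists in Ruby 3.0 to 3.4, while newer releases are assumed to provide Ractor#value instead. The ranges and the sum of squares are arbitrary example work.

```ruby
# Hypothetical helper: read a Ractor's final result on old and new Rubies.
def ractor_result(ractor)
  ractor.respond_to?(:value) ? ractor.value : ractor.take
end

ranges = [(1..1_000), (1_001..2_000)]

# Arguments given to Ractor.new are passed into the block,
# so the block does not touch any outside variables.
ractors = ranges.map do |range|
  Ractor.new(range) do |r|
    r.sum { |n| n * n }
  end
end

partial_sums = ractors.map { |r| ractor_result(r) }
total = partial_sums.sum

puts total
```

Ruby may print a warning that Ractor is experimental when this runs. With inputs this small the overhead of creating Ractors outweighs the gain; the pattern pays off only when each piece is a substantial amount of computation.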

Use Cases

Consider the following criteria to guide your choice of concurrency approach:

Threads:

Threads are best when:
1. Running background jobs
2. Integrating with blocking APIs
3. Managing long-running tasks
4. Needing shared state with synchronization
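As a minimal sketch of shared state with synchronization, the example below has several threads update one counter through a Mutex. The thread count, the iteration count, and the names counter and lock are arbitrary choices for illustration.

```ruby
counter = 0
lock = Mutex.new

threads = 4.times.map do
  Thread.new do
    1_000.times do
      # Only one thread at a time may run this block,
      # so the read-modify-write on counter is never interleaved.
      lock.synchronize { counter += 1 }
    end
  end
end

threads.each(&:join)
puts counter

# Output:
# 4000
```

The GVL does not make compound operations like counter += 1 atomic, so a Mutex (or another synchronization tool) is still the safe way to guard shared data.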

Fibers:

Fibers are best when:
1. Implementing generators
2. Building custom iteration patterns
3. Needing fine-grained control

Async:

Async is best when:
1. Making multiple API calls
2. Handling websockets
3. Processing streams
4. Running many concurrent I/O operations

Ractors:

Ractors are best when:
1. Processing large datasets
2. Running parallel calculations
3. Needing true CPU parallelism
4. Requiring memory isolation

Conclusion

Understanding the different approaches to concurrency and parallelism in Ruby is crucial for building efficient applications. Each tool - Threads, Fibers, Async, and Ractors - serves a specific purpose and comes with its own trade-offs.

Threads, while limited by the GVL, remain useful for background tasks and blocking operations.

Fibers provide fine-grained control over concurrency within a single thread.

The Async library simplifies concurrent I/O operations by managing Fibers for us.

And Ractors, introduced in Ruby 3.0, finally bring true parallelism to Ruby: each Ractor has its own lock instead of sharing a single GVL.

Toolkit

The key is choosing the right tool for your specific use case:

- Need to handle multiple I/O operations efficiently? Consider Async.

- Want precise control over task switching? Fibers might be your best bet.

- Running CPU-intensive calculations? Ractors could provide the parallel processing you need.

- Managing background jobs with shared state? Threads might still be the way to go.

Remember, concurrency isn’t always about performance - it’s about structuring your program to handle multiple tasks effectively. Whether you need true parallelism or just concurrent operation management will guide your choice among these powerful tools in modern Ruby.