Reducing JRuby Startup by Ditching Bundler
November 14, 2010 @ 08:11 PM
UPDATE 2:
Amazingly, I was unaware of the Crown RubyGem when I wrote this article. It would clearly simplify creating the monolithic 'lib' directory described toward the end of this post.
UPDATE:
The introduction of --standalone to Bundler (head) largely invalidates this post, and hurray for that! At the risk of repeating myself: Bundler is a fantastic dev-time tool, but should not be a run-time dependency. I haven't yet had the time to play with --standalone to see how it changes the steps specified in this post, but as soon as I do I'll post an update.
I'd like to think that the timing of --standalone's introduction into Bundler was influenced by this post. I have no evidence to back that up. :-)
Prelude
Has it really been over two years since I last blogged? I certainly *started* a lot of blog posts in that time but usually ran out of steam before finishing them. Turns out that Twitter and Facebook are a lot better suited to my writing style.
Nevertheless, a post such as this simply won't fit on Twitter and won't reach many folks on Facebook, so here I am.
I Love JRuby, but...
This shouldn't surprise most people: JRuby kicks ass; JRuby startup sucks(1). This is not an indictment of JRuby, @headius or any of the other JRuby devs. JRuby is, in a sense, booting up a VM on top of a VM, so it's going to take a little time.
Turns out, though, you can cut JRuby startup times significantly for many applications -- over 50% on several projects I've worked on.
I Love Bundler, but...
I really do. But, Bundler is -- or should be -- a dev-time tool, not a runtime tool. This is not a new concept (http://tomayko.com/writings/require-rubygems-antipattern) and I'm a little surprised at how quickly the Ruby community in general has accepted 'require "bundler"' in their code even as they were starting to reject 'require "rubygems"'.
It also turns out to be a huge performance drain during startup on JRuby. I'm guessing this is because the JIT can't really optimize that startup code in time to make a difference.
In any case, Bundler -- at runtime -- has to go.
The Project and Initial Benchmark
Let's start with a standard Rails 3 application and run some performance tests(2).
To assist with benchmarking, I've modified the Rakefile and wrapped the standard header lines in individual benchmarked blocks. I've also created a task that executes Rails' "routes" task -- this is just to ensure that the Rails environment is loaded and hit once.
Rakefile
require 'benchmark'
Benchmark.bm(20) do |x|
  x.report("application:") {
    require File.expand_path('../config/application', __FILE__)
  }
  x.report("rake:") {
    require 'rake'
  }
  x.report("load_tasks:") {
    Warehouse::Application.load_tasks
  }
end
task :report do
  Benchmark.bm(20) do |x|
    x.report("Routes:") {
      Rake::Task[:routes].invoke()
    }
  end
  puts "Bundler : #{defined?(Bundler) ? 'YES' : 'NO'}"
end
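If you want to try the same timing approach outside of a Rails project, here is a minimal sketch using only the standard library. Note that `time_step` is a hypothetical helper name of my own, and 'set' and 'json' simply stand in for whatever you would really be loading; treat it as an illustration of the idea rather than the exact setup used for the numbers in this post.

```ruby
require 'benchmark'

# Hypothetical helper (not part of the original Rakefile): time a block
# and return the label together with the elapsed wall-clock seconds.
def time_step(label)
  seconds = Benchmark.realtime { yield }
  [label, seconds]
end

# 'set' and 'json' are stand-ins for the requires you actually care about.
step_results = []
step_results << time_step('set:')  { require 'set' }
step_results << time_step('json:') { require 'json' }

# Print one line per step, label padded to 20 columns like Benchmark.bm(20).
step_results.each do |label, seconds|
  puts format('%-20s %.6f', label, seconds)
end
```

Collecting the timings in an array rather than printing them immediately makes it easy to compare runs or dump them to a file later.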
Because I will be changing them later, here are the original config/boot.rb and config/application.rb files for our Rails 3 application:
boot.rb
require 'rubygems'
# Set up gems listed in the Gemfile.
gemfile = File.expand_path('../../Gemfile', __FILE__)
begin
  ENV['BUNDLE_GEMFILE'] = gemfile
  require 'bundler'
  Bundler.setup
rescue Bundler::GemNotFound => e
  STDERR.puts e.message
  STDERR.puts "Try running `bundle install`."
  exit!
end if File.exist?(gemfile)
application.rb
require File.expand_path('../boot', __FILE__)
require 'rails/all'
Bundler.require(:default, Rails.env) if defined?(Bundler)
module Warehouse
  class Application < Rails::Application
    config.encoding = "utf-8"
    config.filter_parameters += [:password]
  end
end
Now let's run some benchmarks; first, for comparison, Ruby 1.9.2:
Ruby 1.9.2
brasten@SilverBook ~/D/P/S/warehouse> time ruby -I. -S rake report
(in /Users/brasten/Development/Projects/Scratch/warehouse)
user system total real
application: 0.680000 0.180000 0.860000 ( 0.870297)
rake: 0.000000 0.000000 0.000000 ( 0.000031)
load_tasks: 0.030000 0.010000 0.040000 ( 0.045394)
user system total real
Routes: 0.720000 0.130000 0.850000 ( 0.853033)
Bundler : YES
2.03 real 1.62 user 0.37 sys
2 seconds, not horrible ... now JRuby:
JRuby 1.5.3
brasten@SilverBook ~/D/P/S/warehouse> time rake report
(in /Users/brasten/Development/Projects/Scratch/warehouse)
user system total real
application: 4.851000 0.000000 4.851000 ( 4.851000)
rake: 0.000000 0.000000 0.000000 ( 0.000000)
load_tasks: 0.410000 0.000000 0.410000 ( 0.410000)
user system total real
Routes: 2.118000 0.000000 2.118000 ( 2.118000)
Bundler : YES
10.15 real 14.61 user 0.75 sys
Ouch, 10 seconds. So this is our starting point.
The Great Bundler Removal
Next, let's remove Bundler. The steps for removing Bundler from your Rails 3 application are as follows:
- Install your gems somewhere in your project.
One way to do this is simply to run "bundle install --path=vendor/gems".
For my projects, I have a Rake task that packages gems in "deps/development" and "deps/runtime" to handle gem groups. The following examples assume an install path of deps/development (via "bundle install --path=deps/development").
- Create a file that adds your gems' lib dirs to the load path. In my case, we'll assume a file called "deps/development.rb".
deps/development.rb
platform = (defined?(JRUBY_VERSION) ? 'jruby/1.8' : 'ruby/1.9.1')
deps_dir = File.expand_path("deps/development/#{platform}/gems")
Dir["#{deps_dir}/**/lib"].each { |d| $:.unshift(d) }
- Replace the default boot.rb code with a 'require "deps/development"':
boot.rb
require 'deps/development'
- Remove or comment out the Bundler.require line in application.rb:
application.rb
require File.expand_path('../boot', __FILE__)
require 'rails/all'
# BUNDLER is commented out!
# Bundler.require(:default, Rails.env) if defined?(Bundler)
module Warehouse
  class Application < Rails::Application
    config.encoding = "utf-8"
    config.filter_parameters += [:password]
  end
end
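The load-path step above can be sketched in a self-contained way. This is only an illustration under a few assumptions: `gem_lib_dirs` and `prepend_to_load_path` are hypothetical helper names, the gem directories are fake ones created in a temp dir, and the glob is deliberately one level deep. A recursive "**/lib" glob would also pick up nested directories such as a gem's test/lib, which you may or may not want on your load path.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: return the top-level "lib" directory of every gem
# installed under gems_dir (one level deep, unlike a recursive "**/lib").
def gem_lib_dirs(gems_dir)
  Dir.glob(File.join(gems_dir, '*', 'lib')).select { |d| File.directory?(d) }.sort
end

# Prepend the directories to the load path, keeping their order and
# skipping any that are already present.
def prepend_to_load_path(dirs, load_path = $LOAD_PATH)
  dirs.reverse_each { |d| load_path.unshift(d) unless load_path.include?(d) }
  load_path
end

# Build a fake gem install tree so the sketch runs anywhere.
demo_root = Dir.mktmpdir
at_exit { FileUtils.remove_entry(demo_root) }
gems_dir = File.join(demo_root, 'gems')
FileUtils.mkdir_p(File.join(gems_dir, 'alpha-1.0', 'lib'))
FileUtils.mkdir_p(File.join(gems_dir, 'beta-2.1', 'lib'))
FileUtils.mkdir_p(File.join(gems_dir, 'beta-2.1', 'test', 'lib')) # nested, ignored here
File.write(File.join(gems_dir, 'alpha-1.0', 'lib', 'alpha_demo.rb'),
           "ALPHA_DEMO_LOADED = true\n")

lib_dirs = gem_lib_dirs(gems_dir)
prepend_to_load_path(lib_dirs)

# The fake gem is now loadable without RubyGems or Bundler being involved.
require 'alpha_demo'
puts lib_dirs.map { |d| d.sub(demo_root + '/', '') }
```

One thing to keep in mind: Bundler.require is what automatically requires the gems listed in your Gemfile, so once it is gone, any gem your application relied on being auto-required needs an explicit require somewhere.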
You're done! To the benchmarks:
JRuby 1.5.3
brasten@SilverBook ~/D/P/S/warehouse> time rake report
(in /Users/brasten/Development/Projects/Scratch/warehouse)
user system total real
application: 1.696000 0.000000 1.696000 ( 1.696000)
rake: 0.000000 0.000000 0.000000 ( 0.000000)
load_tasks: 0.471000 0.000000 0.471000 ( 0.471000)
user system total real
Routes: 2.744000 0.000000 2.744000 ( 2.744000)
Bundler : NO
7.28 real 11.05 user 0.73 sys
Voilà! An almost 30% reduction in startup time, and this is essentially an empty project. In my experience, the more gems your application depends on, the more dramatic a startup reduction you'll see. I was able to reduce the startup time of a recent large-ish project by over 50%.
For what it's worth, you get a barely noticeable improvement in 1.9.2 as well:
Ruby 1.9.2
brasten@SilverBook ~/D/P/S/warehouse> time ruby -I. -S rake report
(in /Users/brasten/Development/Projects/Scratch/warehouse)
user system total real
application: 0.300000 0.180000 0.480000 ( 0.513892)
rake: 0.000000 0.000000 0.000000 ( 0.000045)
load_tasks: 0.050000 0.020000 0.070000 ( 0.078305)
user system total real
Routes: 0.840000 0.210000 1.050000 ( 1.075324)
Bundler : NO
1.92 real 1.38 user 0.46 sys
Even More Awesomeness
If your project requires even faster startup times, you can go one step further.
This is not a trivial change, but it can accomplish some impressive startup time reductions.
Apparently, scanning the load_path for each required file is a relatively expensive operation in JRuby. One solution to this is simply to flatten your load path as much as possible.
Not all gems behave well this way, so it can take some trial and error. You should be able to combine the load paths from *most* of your dependencies and keep the misbehaving gems separate.
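To get a feel for why a long load path hurts, here is a deliberately simplified model. It assumes that resolving a feature means checking each load-path entry in order until the file turns up; `probes_to_resolve` is a hypothetical helper, and a real `require` does more work per entry than this single check. The point is only how the number of filesystem checks scales.

```ruby
require 'tmpdir'
require 'fileutils'

# Simplified model of load-path resolution: look for "<feature>.rb" in each
# directory in order and count how many filesystem checks were needed.
# Returns nil if the feature is not found anywhere.
def probes_to_resolve(feature, load_path)
  probes = 0
  load_path.each do |dir|
    probes += 1
    return probes if File.file?(File.join(dir, "#{feature}.rb"))
  end
  nil
end

probe_root = Dir.mktmpdir
at_exit { FileUtils.remove_entry(probe_root) }

# 50 hypothetical gem lib directories, each holding a single file.
nested_path = (1..50).map do |i|
  dir = File.join(probe_root, "gem#{i}", 'lib')
  FileUtils.mkdir_p(dir)
  File.write(File.join(dir, "feature#{i}.rb"), '')
  dir
end

# The same 50 files living together in one flat directory.
flat_dir = File.join(probe_root, 'flat')
FileUtils.mkdir_p(flat_dir)
(1..50).each { |i| File.write(File.join(flat_dir, "feature#{i}.rb"), '') }

nested_total = (1..50).sum { |i| probes_to_resolve("feature#{i}", nested_path) }
flat_total   = (1..50).sum { |i| probes_to_resolve("feature#{i}", [flat_dir]) }

# prints "nested: 1275 checks, flat: 50 checks"
puts "nested: #{nested_total} checks, flat: #{flat_total} checks"
```

With one directory per gem, the checks grow roughly with the square of the number of gems; with a single directory they grow linearly. Whether that fully explains the JRuby numbers is my guess, not something this sketch proves.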
The steps are:
- Copy all files from each gem's "lib" directory into a single directory in your project. My directory is called "deps/development/rap" for irrelevant reasons.
- Create a bin/rake file. You'll use this to run rake:
bin/rake
#!/usr/bin/env ruby
require 'deps/development'
require 'rake'
Rake.application.run
- Modify deps/development.rb to add your combined library directory to the load_path:
deps/development.rb
deps_dir = File.expand_path("deps/development/rap")
$:.unshift(deps_dir)
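The copy step is the tedious one, so here is a rough sketch of how it could be scripted. Everything named here is an assumption for illustration: `flatten_libs` is a hypothetical helper, the two "gems" are fake directories in a temp dir, and the conflict handling (skip the second file and report it) is just one possible policy. It is not how I built my own directory, and it does nothing about gems that misbehave for other reasons.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: copy the contents of each lib directory into
# target_dir, preserving relative paths. A file whose relative path already
# exists in target_dir is skipped and reported, since a clash like that
# suggests the gem should stay on its own load-path entry.
def flatten_libs(lib_dirs, target_dir)
  conflicts = []
  lib_dirs.each do |lib|
    Dir.glob(File.join(lib, '**', '*')).sort.each do |src|
      next unless File.file?(src)
      rel  = src.delete_prefix(lib + '/')
      dest = File.join(target_dir, rel)
      if File.exist?(dest)
        conflicts << rel
        next
      end
      FileUtils.mkdir_p(File.dirname(dest))
      FileUtils.cp(src, dest)
    end
  end
  conflicts
end

# Two fake gems that both ship a top-level util.rb.
flatten_root = Dir.mktmpdir
at_exit { FileUtils.remove_entry(flatten_root) }
alpha_lib = File.join(flatten_root, 'gems', 'alpha-1.0', 'lib')
beta_lib  = File.join(flatten_root, 'gems', 'beta-2.1', 'lib')
FileUtils.mkdir_p(File.join(alpha_lib, 'alpha'))
FileUtils.mkdir_p(beta_lib)
File.write(File.join(alpha_lib, 'alpha.rb'), '')
File.write(File.join(alpha_lib, 'alpha', 'version.rb'), '')
File.write(File.join(alpha_lib, 'util.rb'), '')
File.write(File.join(beta_lib, 'beta.rb'), '')
File.write(File.join(beta_lib, 'util.rb'), '')

combined_dir   = File.join(flatten_root, 'combined')
flatten_clashes = flatten_libs([alpha_lib, beta_lib], combined_dir)

puts "copied into #{File.basename(combined_dir)}, clashes: #{flatten_clashes.inspect}"
```

Any path that shows up in the clash list is a hint that the gem in question belongs in the "keep separate" pile.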
And that's it. Benchmarks:
JRuby 1.5.3
brasten@SilverBook ~/D/P/S/warehouse> time bin/rake report
(in /Users/brasten/Development/Projects/Scratch/warehouse)
user system total real
application: 1.328000 0.000000 1.328000 ( 1.328000)
rake: 0.000000 0.000000 0.000000 ( 0.000000)
load_tasks: 0.173000 0.000000 0.173000 ( 0.173000)
user system total real
Routes: 2.376000 0.000000 2.376000 ( 2.376000)
Bundler : NO
5.53 real 8.07 user 0.40 sys
Also interesting, the startup times for Ruby 1.9.2 are noticeably faster, too:
Ruby 1.9.2
brasten@SilverBook ~/D/P/S/warehouse> time ruby -I. bin/rake report
(in /Users/brasten/Development/Projects/Scratch/warehouse)
user system total real
application: 0.210000 0.100000 0.310000 ( 0.324459)
rake: 0.000000 0.000000 0.000000 ( 0.000013)
load_tasks: 0.030000 0.010000 0.040000 ( 0.037808)
user system total real
Routes: 0.750000 0.170000 0.920000 ( 0.908417)
Bundler : NO
1.40 real 1.08 user 0.31 sys
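For anyone who wants the percentages spelled out, the arithmetic on the "real" times from the runs above looks like this. The numbers are copied from the benchmark output in this post; `reduction_percent` is just a throwaway helper name.

```ruby
# Wall-clock ("real") seconds copied from the benchmark runs in this post.
TIMINGS = {
  'JRuby 1.5.3' => { bundler: 10.15, no_bundler: 7.28, flattened: 5.53 },
  'Ruby 1.9.2'  => { bundler: 2.03,  no_bundler: 1.92, flattened: 1.40 }
}

# Percentage reduction going from `before` to `after`, to one decimal place.
def reduction_percent(before, after)
  ((before - after) / before * 100).round(1)
end

TIMINGS.each do |runtime, t|
  puts "#{runtime}: without Bundler -#{reduction_percent(t[:bundler], t[:no_bundler])}%, " \
       "flattened load path -#{reduction_percent(t[:bundler], t[:flattened])}%"
end
```

That works out to roughly 28% and 46% for JRuby, and roughly 5% and 31% for Ruby 1.9.2, each measured against the original Bundler-based run.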
Conclusion
So, that's that. You'll have to decide for yourself whether or not the extra effort is worth the reduced startup time.
Ideally, gem authors would ensure that their gems only depend on files on the load_path and would properly namespace their code. Then Bundler or some other tool could more easily automate packaging libraries into combined directories.
Anyway, something to think about.
(1) Where "sucks" is roughly equivalent to a compile and relaunch of a Java app. Not necessarily prohibitive, but as Ruby folk we've gotten used to instant gratification.
(2) Benchmarks were run on JRuby 1.5.3 and Ruby 1.9.2. In general, I ran each benchmark ~10 times, tossed the first few and picked the most common result. This is entirely informal, but I did run tests and select examples in good faith. In any case, feel free to run the scenario yourself.
Nice article. I am working on a JRuby on Rails 3 project myself and I find myself disliking the long JRuby startup time.
A "rake spec" (JRuby Rake / RSpec-Rails) takes 4-5x more time than running the same tests on MRI.
For me, JAVA_OPTS="-d32 -Xms256m -Xmx512m -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000" got the running time for 'rake spec' down by 50%.