codahale.com٭blog

Coda Hale lives in Berkeley, CA, where he writes about Ruby on Rails, usability, web design and development, and the occasional bit about bicycles.

Pound vs. Pen: Because You Need A Load-Balancing Proxy

Update 11/9/06: I just realized that pen handles servers in the order they’re listed on the command line. This means my code can cause all the Mongrel processes on a single server to be used while the other server is idle. I’ve updated the code to interweave the various hosts together, so that the load is evenly distributed, even under less than near-maximum loads. So, download again!
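
For the curious, the interweaving idea can be sketched in a few lines of Ruby. This is an illustration only, not the code that ships in pen-service: the `interleave` helper and the example addresses are made up for the purpose.

```ruby
# Hypothetical helper, not taken from pen-service: take one list of
# "host:port" strings per backend machine and interleave them so that
# consecutive entries land on different machines.
def interleave(*server_lists)
  longest = server_lists.map(&:length).max || 0
  (0...longest).flat_map do |i|
    # Take the i-th entry of every list, skipping lists that have run out.
    server_lists.map { |list| list[i] }.compact
  end
end

# Example addresses are invented for illustration.
app1 = %w[10.0.0.1:8000 10.0.0.1:8001 10.0.0.1:8002]
app2 = %w[10.0.0.2:8000 10.0.0.2:8001]

puts interleave(app1, app2)
```

Since pen works through the servers in the order given, a list built this way alternates between machines instead of filling up the first one before touching the second.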


(Holy crap! I haven’t posted since August?!? Time to throw down on some technical issues, lest I be consigned to the Dustbin Of History! Lemme get this mad knowledge out of the way, and I’ll let you know what I’ve been up to. Meet you at the bottom!)

A Mongrel cluster walks into a bar and orders a bowl of water…

So. You’re looking to set up a Mongrel cluster, but the idea of compiling Apache 2.2 for each of your machines or moving over to the experimental branch doesn’t sit well with the part of you that doesn’t like hanging out in pickup truck beds on the freeway. Perfectly reasonable. You’ve decided to stick with Apache 2.0 because the nice folks at Debian manage the packages. (Don’t have Apache configured yet? Check this out!)

Now you’ve got your Apache config all lined up (knock on wood), and it’s time to make the decision: what the hell is Apache going to hand requests off to? A single Mongrel process? No. Hah. No. This is a business, not a blog in a bottle. Lighty and FastCGI? Oh hells no. Zed isn’t answering the same questions over and over again just so’s we can punk out and not use his amazingly cool HTTP server. mod_rewrite and some crazy-pantsed text file rotation? Not so much. No, the choice seems to come down to either Pound or Pen. Which to use? Oh, which to use?

Pound: Where stray Mongrels go to die

Pound is nice. Pound is great. Pound smells like fresh popcorn. Pound is not what you should use, and I’ll try to be brief about it. Pound replaces the X_FORWARDED_FOR header on every request, which for its usual use (an HTTP front-end) is great–it means the server cluster can still tell where the requests are coming from. But if it’s behind an Apache configuration, it will change the headers on each request to show that each request is coming from Apache–127.0.0.1. And Rails, even in production mode, will show a debug screen to local requests. This combination means that Rails will treat all requests as local, even if they originated with Evils McHackerson von Hideypants III. Awwwwkward.

Also? This means that your exception notification code will never fire, so not only are you handing out debug information to folks who can cause unhandled exceptions, but unless you’re camping out on your production log, you’ll have no idea that your app is bombing.

Worst of all, there’s no way to disable this functionality short of hacking the source and recompiling. Pound, regardless of what it needs to do, will always change the X_FORWARDED_FOR, which makes it pretty crap at what we need it to do.
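
Here is a toy model in Ruby of the kind of check that gets fooled. It is not Rails’ actual code: `client_ip` and `local_request?` are simplified stand-ins, and the header values are invented.

```ruby
# Toy model, NOT Rails' real implementation: decide whether a request is
# "local" by trusting the forwarded-for header when one is present.
LOCALHOST = "127.0.0.1"

def client_ip(env)
  forwarded = env["HTTP_X_FORWARDED_FOR"]
  # Use the first address in the header if there is one, else the socket peer.
  forwarded ? forwarded.split(",").first.strip : env["REMOTE_ADDR"]
end

def local_request?(env)
  env["REMOTE_ADDR"] == LOCALHOST && client_ip(env) == LOCALHOST
end

# Header as the front-end set it: the real client address survives.
preserved   = { "REMOTE_ADDR" => LOCALHOST, "HTTP_X_FORWARDED_FOR" => "203.0.113.9" }
# Header rewritten by a proxy sitting behind Apache: it now names Apache itself.
overwritten = { "REMOTE_ADDR" => LOCALHOST, "HTTP_X_FORWARDED_FOR" => LOCALHOST }

puts local_request?(preserved)    # false
puts local_request?(overwritten)  # true
```

Both requests arrive over a local socket, so the header is the only thing separating a real visitor from localhost. Once it is overwritten, every visitor looks local.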

Bottom line: Apache + Pound + Mongrel = NO BAD BAD NO NAUGHTY BAD

So what’s left?

Pen: Mongrels don’t have thumbs

Well, there’s Pen. It’s also a load-balancing proxy, but handily it lets us choose whether or not the X_FORWARDED_FOR header gets changed. It’s about as fast as Pound, and about as stable. Its SSL support is still experimental, but that’s OK, since we’re proxying between local services, not serving as an external front-end. (If we were, we’d probably use Pound.)

The only downside is that it doesn’t come handily packaged to run as a service. That’s ok, folks, because we’re programmers. Right?

pen-service: Life’s too short to start services manually

Here you go: pen-service_0.2.tar.gz

Installation

server:~$ wget http://blog.codahale.com/wp-content/pen-service_0.2.tar.gz
server:~$ tar -xzvf pen-service_0.2.tar.gz
server:~$ cd pen-service
server:~$ sudo mkdir /etc/pen
server:~$ sudo mv pen.conf /etc/pen
server:~$ sudo mv pen-service /etc/init.d/pen
server:~$ sudo chmod +x /etc/init.d/pen
server:~$ sudo update-rc.d pen defaults
server:~$ sudo vi /etc/pen/pen.conf

Configuration

A simple three-Mongrel cluster, starting at port 8000 and ending on port 8002. It listens on port 8080, doesn’t keep track of which clients go with which server, and has a PID file in /var/run. Each Mongrel process can take a maximum of one connection at a time, since Rails is not multi-threaded. You can bump this up, but Mongrel will just queue the requests, defeating the purpose of load-balancing.

/etc/pen/pen.conf

hosts:
- 127.0.0.1:
    ports: 8000..8002
    max_connections: 1
port: 8080
disable_tracking: true
pid_file: /var/run/pen.pid
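
To give a rough idea of what a config like this turns into, here is a sketch that loads the YAML and builds a pen command line. It is not the pen-service source: `pen_args` is a hypothetical helper, and the `-r` (round-robin), `-p` (PID file), and `host:port:max` bits follow my reading of pen’s man page, so double-check them against `man pen` for your version.

```ruby
require "yaml"

# The simple config from above, inlined so the sketch is self-contained.
CONF = <<~CONFIG
  hosts:
  - 127.0.0.1:
      ports: 8000..8002
      max_connections: 1
  port: 8080
  disable_tracking: true
  pid_file: /var/run/pen.pid
CONFIG

# Hypothetical helper: turn the parsed config into pen arguments.
def pen_args(config)
  servers = config["hosts"].flat_map do |entry|
    entry.flat_map do |host, opts|
      # "8000..8002" becomes 8000..8002; a single port becomes a one-port range.
      first, last = opts["ports"].to_s.split("..").map { |p| Integer(p, 10) }
      (first..(last || first)).map { |port| "#{host}:#{port}:#{opts['max_connections']}" }
    end
  end

  args = []
  args << "-r" if config["disable_tracking"]               # assumed flag: plain round-robin
  args += ["-p", config["pid_file"]] if config["pid_file"] # assumed flag: PID file
  args + [config["port"].to_s] + servers
end

puts "pen " + pen_args(YAML.load(CONF)).join(" ")
```

Each Mongrel ends up as its own `host:port:max` entry, which is how the one-connection-per-Mongrel limit gets applied per process.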

Full explanation:

# This is YAML.
#
# The backend hosts to be proxied to. Ports can be a single port (8000) or a
# range (8000..8002).
hosts:
- 127.0.0.1:
    ports: 8000..8002
    max_connections: 1

# Used to send Pen commands.
# Default: none
control_port: 8081

# Attach X_FORWARDED_FOR headers. (Don't.)
# Default: false
proxy_headers: true

# Use poll() instead of select().
# Default: false
use_poll: true

# Dump any debug data in ASCII format.
# Default: false
debug_ascii: true

# Put servers that don't respond on a blacklist for 5 seconds.
# Default: 0
blacklist_time: 5

# Maximum number of HTTP clients.
# Default: 2048
max_clients: 1024

# Log debug information.
# Default: false
debug: false

# If all servers are down, direct to this host.
emergency_host: backup.railsapp.com:80

# Hash the IP before assigning it to a server.
# Default: false
use_ip_hash: true

# chroot Pen to this directory.
# Default: none
chroot: /var/chroot/pen/

# Log Pen info to this file.
# Default: none
log_file: /var/log/pen.log

# Use blocking sockets. (Don't.)
# Default: false
disable_asynchronous_sockets: false

# Use simple round-robin assignments instead of sticky sessions.
# Default: false
disable_tracking: true

# Don't load balance. (Don't?)
# Default: false
disable_failover: false

# Connection timeout, in seconds.
# Default: 5
timeout: 10

# Run Pen as this user.
# Default: none (whoever runs pen-service, usually root.)
user: www-data

# Log Pen's pid to this file.
# Default: /var/run/pen.pid
pid_file: /var/run/railsapp/cluster1.pid

# Maximum number of simultaneous connections.
# Default: 256
max_connections: 500

# Log HTML-formatted usage statistics to this file.
# Default: none
stats_file: /var/www/stats/pen-stats.html

Ok, more about me

I’ve been working like crazy with an amazing team on an amazing application: Wesabe. Everything you hate about your bank website? We fix. Everything you need to know about your money, past, present, and future? You get it. We’re going live reeeeeaally soon, so check us out!

Also, I’m going to be a panel member for this month’s Ruby Tuesday, the San Francisco Bay Area’s Ruby user group. The topic is “Putting Rails To Work,” and I’ll get to squabble with folks like Josh Susser, with whom I’ve only squabbled bloggo-a-bloggo. I’m pretty sure we’ll have launched by then, so I’ll probably be all strung out on developer post-partum depression, or covered in sweat and soot from putting out fires. Expect to see folding chairs fly, folks. (I can’t help but feel a little out of place, seeing my name next to Mr. Rails Associations, Rabble, Florian frickin’ Weber, and Chris “Err The Blog” Wanstrath. Who wants to bet that someone’ll just point to me and shout: “Hey! Who the hell is that guy!” If that happens, I’ll spin around and point at Josh and yell “Yeah, who the hell are you?!”) I’m sure it’ll be a blast–Josh and I owe each other reconciliatory beers from way back when, and I really want to talk with Rabble about his social activism (little known fact–my degree was in Peace & Conflict Studies, with an emphasis in Nonviolent Social Movements).

(This month’s Ruby Tuesday is at Odeo, which means it’s all full. So, uh, crash it. And bring some beer–Daddy’s gotta speak in public.)

26 Responses to “Pound vs. Pen: Because You Need A Load-Balancing Proxy”

  1. Daniel Bristot Says:

    Packet Filter + Free/Open|BSD + CARP

  2. Coda Says:

    Daniel, I like pf as much as the next guy, but I’m not switching to FreeBSD/OpenBSD just so’s I can proxy requests from Apache to localhost and have pf round-robin that to the app servers. The point of having Apache is to have it serve up static files quickly and efficiently while passing on dynamic requests to the Mongrel cluster. CARP doesn’t come into the picture at all–if I wanted to spend money on more hardware, I’d buy a hardware load-balancer. If I wanted a hot failover option, I’d buy two and use VRRP.

  3. Ezra Says:

    Man It’s all about nginx. No need for anything besides nginx+mongrel. nginx is running on many many nodes balancing to many many more mongrels on our servers right now. Nginx is probably the most solid and easiest to manage part of our stack. It just works with simple setup and does everything you need from a rails front end. Really can’t say enough good things about nginx.

  4. Coda Says:

    Ezra–If it were my own box, I’d totally be down for trying out some Russian Craziness. I hear good things from people I trust, you included. As it is, the system I developed this for is protecting bank data, among other things, and I’m totally not interested in using something I have to hand-compile. When nginx makes it into Debian Stable, I’m all over it. Until then, I’ll deal with Pen.

  5. court3nay Says:

    Why is mongrel behind apache? You have some complex rewrite rules in there? If possible, you should have pound going to apache OR your mongrels, not to apache -> mongrels.

    http://blog.tupleshop.com/2006/7/8/deploying-rails-with-pound-in-front-of-mongrel-lighttpd-and-apache

    With a simple pound->mongrel and pound->lighttpd setup (where the filename determines whether mongrel or lighty get the request) the IP comes in to rails just fine (the requests don’t go through lighttpd at all if it’s a dynamic request).

    http://mongrel.rubyforge.org/docs/pound.html

  6. Coda Says:

    court3nay: Apache is up front for SSL, compression, logs, security, some minor PHP, etc. We serve a lot of static files, and having to go through Pound would be that much more load. As it is, Apache can use sendfile(2) and throw static files as quickly as our machine can handle. In a few months we’ll probably have a hardware load-balancer, so this goes out the window and we’ll probably have a Pound+Mongrel/Apache split.

  7. Matt Says:

    Way cool… I’ve been putting pen behind Apache just like this for a while, for the exact same reasons. I’m curious, though… Would you consider it useful to have your script process a directory full of config files instead of just one, like mongrel_cluster does? And then be able to manage them each specifically by specifying a config file?

    I’m thinking that it would be useful because then you could deal with a “pack of pens”, as it were… I run multiple apps off of one machine, so I need to be able to run multiple pens.

    I might take a whack at it and see how it goes…

  8. Zak Mandhro Says:

    Has anyone noticed problems with lighttpd and mod_proxy since version 1.4.13? I was using lighttpd and mongrel on Ubuntu Edgy Eft (6.10). I didn’t run into any problems for a week. I even tried httperf with 100% success.

    After reading Zed’s note on lighttpd + mod_proxy, I got concerned that there might be problems that my httperf’ed URL is not covering. So I jumped over to pound and now the whole Rails application seems noticeably slower.

    Has anyone else noticed mod_proxy issues with 1.4.13? Or noticed slower response with Pound?

  9. Coda Says:

    Matt–Feel free to hack away. The functionality you’re talking about makes a lot of sense, it’s just not something we’ve needed. I don’t think I said it specifically, but pen-service is under the MIT license, so have at it.

    Zak–Did you use the --rate flag with httperf? That’s the key to overloading your server and seeing where things break.

  10. armando Says:

    I love your blog! :-)

    Seriously, it’s really comforting to go back to apache-2.0. Down the road, apache-2.2 and the proxy balancer may be the way to go, but keeping with the generally stable apache packaged for various distros keeps things quite simple.

    Thanks!

  11. Zak Mandhro » Edgy on Rails? Get compiling! Says:

    [...] Pound is a decent standalone load balancer. You can learn more about using Pound with Mongrel here. It is quite configurable and comes standard with Ubuntu. Well, sort of. The Ubuntu Edgy package doesn’t work. In my case, it kept crashing without any warnings. Coda Hale doesn’t like Pound because Pound makes Mongrel think the request came from the local machine. You may want to consider Pen instead. One advantage here is that you can use lighttpd as a web server and proxy and let Pound/Pen do the balancing. If you do want Pound, you will have to compile from source. Rob Orsini’s explains how to setup lighty, Pound and Mongrel here. Long story short, download Pound and untar (tar xfz) it. PLAIN TEXT CODE: [...]

  12. Jason Says:

    OK, took me several hours but I found out how to stop rails thinking requests from pound behind apache are from localhost!

    In your environment.rb:

    module ActionController::Rescue
    def local_request?
    false
    end
    end

  13. infotage.net » Blog Archive » The poor state of mongrel_cluster frontend web servers Says:

    [...] the proxy requests and that is the proxy solution pen (0.15.0, plus pen-service taken from “Pound vs. Pen“). Pen can correctly be configured to only send 1 request to each mongrel instance and queue [...]

  14. al Says:

    Hi,

    have you looked into haproxy (http://haproxy.1wt.eu/) as a 3 lb?

    BR

    Aleks

  15. grant Says:

    hmm.

    - doesn’t mongrel use sendfile as well, now, for statics?

    - bank data? yeah, i’d wait for a vendor’s binary release too. liability issues?

    - nginx has a smaller footprint and consumes less on the fly {than pound} due to not using a thread-per-connection model.

  16. Angry, Manchester Says:

    mod_rewrite and some crazy-pantsed text file rotation?

    What exactly is wrong with this method?

  17. Pete Says:

    I’ve been working to get a soon-to-be production server up and running, and this has been very helpful, thanks! Apache 1.3 -> pen -> mongrels looks like it’s going to work quite well for us.

    I did run into one issue. I was doing some load testing to see how things held up, and when handing lots of requests I was getting proxy errors from apache. The culprit was pen, it was dropping connections when trying to handle too many because it had been told that each mongrel only wanted 1 connection at a time. So, max_connections: 1 looks like a bad idea (unless there’s some way to configure pen to queue the connections?).

    After bumping it up, the problem went away. One could also leave it out, but the pen-service script above would need to be tweaked to leave it out as well (it defaults to 1 if empty).

    Thanks again,
    -pete

  18. Wishlisting Blog » Blog Archive » Deploying Ruby on Rails, Mongrel, and Pen on a MediaTemple (dv) 3.0 Server Says:

    [...] Because this server comes with Apache pre-installed, and it’s not 2.2 (which supports mod_proxy_balancer), I decided to use the Pen load balancer. (Coda Hale sold me on this) [...]

  19. Pete Says:

    Answering my own question here, in case anyone else runs into the same issues I did. I asked above

    …unless there’s some way to configure pen to queue the connections?

    There is! If pen is configured to only accept the same number of connections as there are mongrel servers, then pen will do the right thing: queue extra connections instead of dropping them, and hand them to the first free mongrel.

    This is better than bumping up the number of connections allowed to each mongrel because that approach can cause pen to add connections to a mongrel that is currently running a slow task, rather than waiting for the first mongrel to free up.

    If invoking pen directly, add -x to the command line. If using Coda’s startup script, set max_connections: (this is the root level max_connections option, not the one under the hosts config).

  20. Pete Says:

    Oops, bad markup. Last paragraph again:

    If invoking pen directly, add -x [number-of-mongrels] to the command line. If using Coda’s startup script, set max_connections: [number-of-mongrels] (this is the root level max_connections option, not the one under the hosts config).

  21. Jason Says:

    Great article!

    Anyone interested in the “pack of pens” concept (to use Coda’s script with multiple config files/pen instances) who doesn’t feel like coding it themselves, check out http://code.google.com/p/packofpens/

  22. Jason Frankovitz Says:

    I’m interested in using the pen startup script, but when I run it I get this error:


    me@foo (2499) $ sudo /etc/init.d/pen start
    Password:
    /etc/init.d/pen:52:in `to_s': undefined method `each' for -2:Fixnum (NoMethodError)
    from /etc/init.d/pen:48:in `to_s'
    from /etc/init.d/pen:90:in `start'
    from /etc/init.d/pen:113

    The lines it’s complaining about are these:


    me@foo (2503) $ cat -n /etc/init.d/pen | egrep "48|52|90|113"
    48 @hosts = @options['hosts'].map do |host_options|
    52 for port in eval(host_options.values.first['ports'])
    90 system "pen #{@options}"
    113 Pen.send(ARGV.map{ |x| x.to_s.downcase.intern }.first || :usage)

    I’m using the pen.conf from above. Any idea what’s going wrong here? Thanks for any tips!
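
    One possible explanation, offered as a guess rather than a diagnosis: line 52 runs the ports value through eval, and a value written with a hyphen, such as 8000-8002, evaluates to the integer -2, which has no each. A stricter parser that avoids eval could look like the sketch below. parse_ports is a hypothetical helper, not part of pen-service.

```ruby
# Hypothetical replacement for eval-ing the ports value. Accepts 8000,
# "8000", or "8000..8002" and rejects anything else with a clear message.
def parse_ports(value)
  case value.to_s.strip
  when /\A(\d+)\z/
    [$1.to_i]
  when /\A(\d+)\.\.(\d+)\z/
    ($1.to_i..$2.to_i).to_a
  else
    raise ArgumentError, "ports must look like 8000 or 8000..8002, got #{value.inspect}"
  end
end

p parse_ports("8000..8002")   # [8000, 8001, 8002]
p eval("8000-8002")           # -2, plain subtraction

begin
  parse_ports("8000-8002")
rescue ArgumentError => e
  puts e.message
end
```

    With a check like this, a mistyped range fails with a readable message instead of a NoMethodError from deep inside the init script.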

  23. Tomasz N Says:

    Just from a brief look, it looks like you dropped `pound` for one reason: it changes X-Forwarded-For?
    Too lazy to hack the source and recompile? Or were there other reasons you didn’t mention?

  24. Ivo Taseff Says:

    Hi, I`m using pound and I can see the real source of the requests coming to the apache server. All you need is that module compiled and loaded in apache.

    LoadModule rpaf_module modules/mod_rpaf-2.0.so

  25. Gavin Conway Says:

    You write articles like you’re talking to a friend? Be more concise!

  26. johny b bad Says:

    like talking to a retarded friend.

    Thank you though, as painful as it is to read, insightful, useful info.