Ask YC: Capacity planning question
8 points by goodgoblin on Jan 21, 2008 | 16 comments
How many simultaneous mongrels am I going to need to serve 120K users? 120K mongrels?

Assume the Mongrels are sitting at the front of a Rails app - any idea of a good metric for this? I know they're not all going to be hitting the site simultaneously, but I believe each request to Rails blocks.

So at 4 Mongrels per virtualized server I'd need 250 servers to serve 1K simultaneous users. Is there a good metric for estimating how many simultaneous users I should plan for based on the total number of users? Comments, links, ridicule welcome. TIA



120K users a day? That's a _lot_ for a Rails app. In fact, looking around, I'd be surprised if there were any existing Rails applications that serve that many. 43things.com seems to max out around 50K users per day on average. The first question I have to ask is: do you really need to handle that kind of traffic? If you don't, don't bother trying to scale to that much yet.

If you're talking hits, that's a different story. You'll definitely want to know how many concurrent requests you're getting. For reference, a TechCrunch post generates between 20-50 concurrent hits on an application that renders in ~1s/page (as of Nov 07). Plan for 2x-3x that for digg/reddit traffic. That TechCrunch load was served with 4 or 5 Mongrels; 6-8 is the most I'd recommend running on a dual-core box.

I haven't seen a 1000 concurrent user app since I worked at UPS... so if you have this problem, why are you asking us? You should be hiring someone with the huge amount of money you have. ;)


Thanks - the 120K is more likely the total number of users for the app - so chances are slim of them all using the site at once - but I want to plan for the worst. I might be putting together a license deal and just want to make sure that I don't bankrupt myself.

So - a metric might be 30 concurrent users for 4 mongrels - 4 mongrels per EC2 server.

120 CCU would be 16 mongrels on 4 EC2 servers, etc.

Not that I'd use EC2 - just a convenient constant - load balancing would be tricky.
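As a back-of-the-envelope sketch of that arithmetic (every number below is an assumed example, not a measurement): since each Mongrel serves one request at a time, a Mongrel's throughput is roughly 1 / average response time, and you can size the cluster from there.

```ruby
# Rough capacity sizing; all inputs are assumed example values.
# A Mongrel handles one request at a time, so its throughput is
# 1 / avg_response_time requests per second.

def mongrels_needed(peak_requests_per_sec, avg_response_sec)
  (peak_requests_per_sec * avg_response_sec).ceil
end

def servers_needed(mongrel_count, mongrels_per_server = 4)
  (mongrel_count.to_f / mongrels_per_server).ceil
end

m = mongrels_needed(100, 0.2)  # 100 req/s at 200 ms each => 20 Mongrels
s = servers_needed(m)          # at 4 Mongrels per server  => 5 servers
```

The point of the sketch is that the server count is driven by request rate and response time, not by the raw user count.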


We do 400K+ users a day on our Rails app, 10MM pageviews, and 200 requests a second.

YellowPages.com, which is probably the largest Rails site (yes larger than Twitter) does a lot more than that.

I used to buy into the whole 'scale later' philosophy from 37signals, but after getting burnt by a fast-growing application and being down for half a month, I've changed my mind - capacity planning is definitely worth it over being caught with your pants down right when you're growing.


Cool, that's pretty good. Is that for your page builder or your facebook app?


It's just the Facebook app.


Here are some numbers from our application, let me know if this helps:

We have 400K daily active users, doing 200 requests a second and around 10MM page views per day. All requests are dynamic and hit the full Rails stack. We're probably easily in the top 5 Rails sites on the net based on load.

We run all of this on 5x 4core 8GB application servers and 2x 4core 32GB db servers in master-master replication. We run 16 mongrels on each app server for a total of 80. Our average response time per request is around 100-200ms.

We host on Softlayer and pay around $6000 a month.

Also, the number of mongrels you will need is directly dependent on how fast your requests are, and how you are load balancing across these mongrels. We use the nginx proxy with the fair load balancing patch. http://brainspl.at/articles/2007/11/09/a-fair-proxy-balancer...
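For reference, the patched balancer is switched on with a single `fair` directive in the upstream block. This is only a sketch, assuming the patch is compiled in; the names and ports are examples:

```nginx
# Sketch of an nginx upstream using the fair load-balancing patch.
# The `fair` directive comes from the patch linked above; everything
# else (cluster name, ports) is an example.
upstream mongrel_cluster {
  fair;
  server 127.0.0.1:8000;
  server 127.0.0.1:8001;
  server 127.0.0.1:8002;
}

server {
  listen 80;
  location / {
    proxy_pass http://mongrel_cluster;
  }
}
```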


Thank you sir - this kind of info is pure gold. Softlayer - do they manage the servers or is that your datacenter and you manage the machines yourselves? Was looking at Engine Yard - they start at $17k for a cluster.


Yeah we looked at EngineYard. They are good if you don't ever want to deal with deployment at all, but I really can't justify the premium. The 17K price quoted is probably for their basic cluster of 3 machines - we run on 10 now so it's probably going to be a lot more than that.

SoftLayer is unmanaged, but they do have staff that can help you with sysadmin stuff for a fee.


Each request to Rails blocks a Mongrel while it serves it, but even on my home server I am getting upwards of 50 requests per second on my app (all logged-in pages, so no caching). That means you could have 50 people all requesting the page at the same time and each of them would get it back inside a second, even with a single Mongrel.

You need to be careful that you don't tie up some of your Mongrels doing long running tasks - if you have actions that cause tasks to run that take on the order of seconds, consider queuing them up to be serviced by some other background process (which is what I decided to do).
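A minimal sketch of that hand-off, using a plain in-process queue for illustration (in production the worker would usually be a separate process polling a jobs table or similar, not a thread):

```ruby
# Hand slow work to a background worker so the request cycle (the
# Mongrel) returns immediately. Thread + Queue stand in here for a
# real out-of-process worker.
JOBS = Queue.new

worker = Thread.new do
  while (job = JOBS.pop) != :shutdown
    job.call   # the slow part (image resize, report generation, ...)
  end
end

results = []
# In the request handler: enqueue and return right away.
JOBS << lambda { results << :resized }   # stands in for a 6-second task
JOBS << :shutdown
worker.join
```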

As someone else mentioned here, try to cache as much as possible - cached full pages take the load off Rails completely, and cached fragments reduce the time to serve a request inside Rails, so you can get more out of each Mongrel. Make sure not to cache logged-in pages, though!
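The fragment-caching win can be sketched in a few lines of plain Ruby (the Rails `cache` view helper does the equivalent against its fragment store; the names below are made up):

```ruby
# Toy fragment cache: render the expensive fragment once, then serve
# the stored copy on every later request.
FRAGMENTS = {}

def cache_fragment(key)
  FRAGMENTS[key] ||= yield   # only run the block on a cache miss
end

render_count = 0
expensive_render = lambda do
  render_count += 1
  "<ul>...product list...</ul>"
end

html1 = cache_fragment("product_list") { expensive_render.call }
html2 = cache_fragment("product_list") { expensive_render.call }
# render_count is 1: the second request never paid the render cost
```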

Other general advice for a database application - hit the database as little as possible. In Rails, don't do things like:

  @user = User.find(params[:id])
  @products = @user.products
  @profile  = @user.profile
That would result in 3 database queries, while this will do it in 1:

  @user = User.find(params[:id], :include => [:products, :profile])
etc ...


I'm really trying to figure out how much money serving 120K users is going to cost me in servers and bandwidth. Thanks for the tips - I've farmed the image uploads out to merbs running in EC2 - there are some other long running (6+ second) tasks that users could perform frequently - I'll start to farm those out as well.


> So at 4 Mongrels per virtualized server I'd need 250 to serve 1K simultaneous users.

Depends on how simultaneous they really are. Are we talking 1000 hits per second? Or are we talking about 1000 unique people viewing some portion of your site/app at a given time?

If it's the latter, you can get away with a lot less.

Also, if you have shared-anything, it will become a bottleneck long before the mongrels. Your database especially will have to be replicated (for read-mostly apps), or sharded (for heavy read-write apps).

If it's a read-mostly app, consider aggressively caching fragments or even pages. 1K users hitting static pages will just hit Apache, given the right set of mod_rewrite rules, and you can run a lot more Apache processes (or threads; is Rails threadsafe these days?) on a given server than Mongrels (which, when I last used Rails, were very resource-hungry).

Consider also ways to extend the functionality of cached/static pages. You could have mod_rewrite check to see if the user has a login cookie and only then hit the non-cached app, OR you could have client-side javascript on the static cached page check for the same cookie and only then display the login name or do an XMLHttpRequest to the server (which then may cache a static html subpage named for that username, which can then be checked by mod_rewrite as well).
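A sketch of that mod_rewrite check (the cookie name and cache path are hypothetical):

```apache
# Serve the cached static page unless the visitor carries a login
# cookie; otherwise fall through to the Rails app.
RewriteEngine On
RewriteCond %{HTTP_COOKIE} !login_token=
RewriteCond %{DOCUMENT_ROOT}/cache%{REQUEST_URI}.html -f
RewriteRule ^(.*)$ /cache/$1.html [L]
```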

Just don't trust non-signed user cookies for looking up private information, or for making any database writes. Signed cookies, however, are a great alternative to centralized sessions (just remember to encrypt anything that you want the user to store, but not see: signing just protects against tampering). Jam the user's IP address into the signed cookie text and guard against replay attacks, as well!
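A sketch of cookie signing with HMAC-SHA1 via Ruby's OpenSSL bindings (the secret and field layout are assumptions; note that signing only detects tampering, it does not hide the value):

```ruby
require 'openssl'

SECRET = 'change-me'  # example secret; keep the real one out of source

def sign(value)
  sig = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('SHA1'), SECRET, value)
  "#{value}--#{sig}"
end

def verify(cookie)
  value, sig = cookie.split('--', 2)
  expected = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('SHA1'), SECRET, value)
  value if sig == expected   # nil when the cookie was tampered with
end

cookie = sign("user=42&ip=10.0.0.1")  # IP jammed in per the advice above
verify(cookie)                        # => "user=42&ip=10.0.0.1"
verify(cookie.sub("42", "43"))        # => nil (tampered)
```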


Thanks - I'll have to spend some time to grok your suggestions, but I appreciate them.

Re: 1K simultaneous requests - I meant concurrent requests - I just didn't want Rails to block.


I don't think your question has enough data to answer properly. We need to know: what's the service time for each request (how many seconds does the average request take)? What does the arrival rate of your requests look like (120K per second, or per minute)? What's the limit on your request queue (how many requests can be held until they get served)? And finally, what's your tolerance for the final response time (you can support 120K users by serialising across 10 servers, but that drives up the response time for end users)?

Anyways, you can probably read some books here: http://www.cs.gmu.edu/faculty/menasce.html

or, if you are in a hurry, take a quick glimpse at the tactical paper at http://www.cmg.org/measureit/issues/mit04/m_4_7.html

Hope this gets you started, if it doesn't answer your question thoroughly.
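Those quantities plug into standard queueing formulas. A sketch with made-up numbers, under single-server M/M/1 assumptions:

```ruby
# Utilisation rho = arrival_rate * service_time. For an M/M/1 queue
# the average response time is service_time / (1 - rho): it blows up
# as the server approaches saturation.
def avg_response_time(arrival_rate_per_sec, service_time_sec)
  rho = arrival_rate_per_sec * service_time_sec
  raise "overloaded: add servers or cut service time" if rho >= 1.0
  service_time_sec / (1.0 - rho)
end

avg_response_time(2.0, 0.2)  # ~0.33 s at 40% utilisation
avg_response_time(4.5, 0.2)  # ~2 s at 90% utilisation: queueing dominates
```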


Hi - yes - I'm looking into how to price a potential large-scale licensing deal - but unlike something like MS Word, the cost per instance isn't zero with hosted software. I'm trying to come up with a per-user or per-1K-users metric for the server costs. I'll take a look at your links and post back when I get some data - thx


I don't mean to be an ass, but you are not going to have 1k simultaneous users. If you did, you would be able to pay someone who had a better understanding of how webservers work.

One decent machine running 10 mongrels on a reasonably well-designed Rails app will easily be able to handle 100 requests per second. That is more traffic than you will ever get, I guarantee you.


How many users would 100 requests per second translate to? I know it depends on the app, but figuring an average render, think, click cycle - is that 1,000 logged-in users? 10,000?
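One hedged way to run that arithmetic, treating it as Little's law with assumed numbers: if each logged-in user requests a page, waits ~0.5 s for the render, then thinks for ~10 s before clicking again, the supported user count is just the request rate times that cycle time.

```ruby
# Concurrent users supported = request rate * (think time + render time).
# All inputs are assumptions for illustration, not measurements.
def concurrent_users_supported(req_per_sec, think_time_sec, render_time_sec)
  (req_per_sec * (think_time_sec + render_time_sec)).floor
end

concurrent_users_supported(100, 10.0, 0.5)  # => 1050 logged-in users
```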



