What kind of plan will handle this server load?
JasonN
03-17-2006, 12:04 AM
Update: I've reduced my query count per page to the 3-25 range rather than 5-80, and I've upgraded to a VPS plan.
I'm going to bring a very fancy new version of my website online within a month or two, and I want to make sure my launch goes smoothly. I'm developing it on an inexpensive shared plan with a quality host.
When I transfer my live domain to the new site, I'll be getting about 5000 pageviews/day right away. Each pageview will have anywhere from 5-80 MySQL queries (average about 20) and some fairly intensive PHP processing. I expect an early traffic spike up to maybe 25K pageviews/day, so during the busiest parts of the busiest days, I might be averaging 10-15 MySQL queries/second.
Right now when I load my most server-intensive page, with 99 queries, I get an average execution time of 0.85 seconds. The shared plan accommodates my bandwidth, disk space, etc, so my decision is strictly about server performance, and I'm a newbie in that area.
My host seems to think I'll need a better plan. They have a VPS plan with a 256 MB memory quota, 40 GB storage space, 150 GB monthly transfer, on a 1000 Mbit NIC. This is much cheaper ($50/month) than a full dedicated server ($229/month). I'll be talking to them about this soon and I want to make sure they don't sell me more than I need.
My basic questions for you guys:
Should my good shared service be sufficient for these needs, or am I right to be looking at this VPS stuff?
What questions should I ask my host's tech support?
AndyH
03-17-2006, 03:05 AM
Well firstly, are your SQL queries optimized as much as possible?
If the issue is just performance, you don't necessarily need to go to a better plan. However, your host will try to get more money out of you because you're a resource hog, and if you slow the server down too much they may even force you to leave.
Also, can these pages be cached?
Chris
03-17-2006, 06:38 AM
With that many queries I would seriously look into a caching solution. There are two articles on this site about caching, both can be found in the php programming section.
JasonN
03-17-2006, 07:27 AM
Hey guys,
Thanks for the tips. I just read those two caching articles Chris, and they're both very good, but it doesn't seem like the solution for me. My pages are all pretty dynamic:
- Many incorporate recent posts from certain forum sections on the main page.
- All the pages display differently depending on the user's session permissions/preferences.
- All have search engine friendly URLs (parsing fake directory strings, not mod_rewrite).
- Almost every page can take a varying number of parameters to perform different tasks, and this too depends on the user settings.
- Much of the content is dynamically rated, so what appears on "page 1" may change if somebody performs an action on one item that boosts its rating, etc.
- My query results are mostly arrays (often based on user settings) rather than single variables, so trickier to cache.
- Some of the queries on each page are insert/update/deletes, to record the performance of various records (which is incorporated into ratings).
I don't know much about caching, except for what I just read, but it seems to me like caching wouldn't work without seriously dropping lots of features from my site.
My queries are not perfectly optimized, but they're not incredibly sloppy either. I have figured out how to trim several more, but I think that at best I'll be able to get it down to 3-40 queries per page. Another option to consider is that if I upgrade to VPS with my host, I can use MySQL 4.1, which will save me many queries by allowing subqueries. (I know how to simulate subqueries with joins, but that doesn't work for the subqueries I'd need.)
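For example, a query along these lines (the table and column names here are just made up for illustration) needs a real subquery; without 4.1 I have to run the inner SELECT separately and plug its result into a second query:

<?php
// Hypothetical example: find items rated above the overall average.
// MySQL 4.1 can do it in one statement with a subquery...
$sql = "SELECT item_id, title, rating
        FROM items
        WHERE rating > (SELECT AVG(rating) FROM items)
        ORDER BY rating DESC";
$result = mysql_query($sql);

// ...whereas on 4.0 it takes two round trips:
// 1) SELECT AVG(rating) FROM items
// 2) SELECT ... WHERE rating > <value from step 1>
?>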
By the way Chris, thanks for your "How to Build an Effective Website" section. Well done!
I wrote a caching system some time ago. It was mainly to stop sites like Slashdot killing the server, but then as traffic grew it became a requirement!
The way I do it is I cache part of the page where the actual content from the database is located. I let the rest of the page generate as normal. Within these cached sections I add unique tags... like say... [this_is_the_poll_1234], then after the cache system has done its thing, I do an str_replace for the tags and that's how any highly dynamic content ends up within a cached page.
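A rough sketch of the idea, with made-up file paths and helper functions (build_article_body_from_db() and render_poll() just stand in for whatever actually generates the content):

<?php
$cache_file = '/path/to/cache/article_1234.html'; // illustrative path
$cache_ttl  = 600; // rebuild the cached fragment every 10 minutes

if (!file_exists($cache_file) || (time() - filemtime($cache_file)) > $cache_ttl) {
    // Expensive part: build the database-driven block once,
    // leaving a unique tag wherever highly dynamic content belongs.
    $html  = build_article_body_from_db(1234); // placeholder for your own code
    $html .= '[this_is_the_poll_1234]';

    $fp = fopen($cache_file, 'w'); // fopen/fwrite keeps it PHP 4 friendly
    fwrite($fp, $html);
    fclose($fp);
}

// Cheap part: every request just reads the fragment back...
$page = file_get_contents($cache_file);

// ...and swaps the tags for the content that has to stay live.
$page = str_replace('[this_is_the_poll_1234]', render_poll(1234), $page);

echo $page;
?>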
A slightly different method is to include PHP in the cached version rather than tags, such as function calls, and then use eval() on the cached page. I don't like this option, though, because if someone somehow figured out a way of writing to the cache directory (which will of course be writable) they could mess up the pages. This option is sometimes faster, though, because with the other option you need to run str_replace for each dynamic tag.
ramprage
04-07-2006, 08:51 AM
I would get away from shared hosting with this type of site for a few reasons. 1) The host restricts the amount of CPU/memory usage on all accounts, whether they tell you or not. If you're at the top of the list using all the server's memory, chances are they'll suspend you without warning and force you to upgrade.
A VPS should be able to do what you want now, though I'm not sure what kind of growth it would allow for. Make sure they offer an uptime guarantee, ask how many other clients are on the VPS node, and ask what the host system's hardware is.
Find out how many staff they have and get some reviews or references from them.
ramprage
04-07-2006, 09:48 AM
An easy solution would be Turck MMCache:
http://turck-mmcache.sourceforge.net/index_old.html
dc dalton
04-08-2006, 10:46 PM
Why in HEAVEN'S name would you have ANY page that's running 5-80 separate queries? OH MY GOD!
I think you seriously need to look at how you are doing this, because this can suck the life out of a server! Great example: photographers.com. This guy "cheaped out" when they first built the site and outsourced the job to a shop that didn't have clue one! They were running about the same number of queries you are talking about.... well, as things got rolling and the members came in, the site became a slug and crashed constantly! He has had to pay to damn near overhaul the whole site just to keep it up and running.
I can't for the life of me understand why anyone would build a page with so many queries when you can set up some amazing things in the db to handle this... and please, god, don't even tell me you are trying to do this with MySQL, because if you are, you are guaranteed to have some MAJOR problems as the site grows..... I would personally suggest something more enterprise level, like PostgreSQL.
This is probably my biggest gripe with most programmers today (again, I said most)... they know NOTHING about real database design and about using the database to handle the work it is supposed to handle! I see way too many folks who think a database is nothing more than a place to "put stuff".
Query optimization is one thing, but if you are looking at the kind of traffic you are talking about and the number of queries you mention per page, you have the potential for a "server killer" here, even if it's a dedicated one! Remember the old days of eBay, because that was the same problem!
JasonN
04-09-2006, 08:55 AM
It's easy to see how photographers.com could be done in just a handful of queries per page. It's a pretty simple format as far as the database is concerned.
Mine is way, way more complicated than that, in the interest of producing something fairly simple for the user from a complicated hierarchy of scientific information. Information from several scientific objects on each page is cross-referenced with various different types of illustrations and explanations and such. The end result is a very crisp, readable, interactive bunch of pages, but it's not the kind of thing that can be reduced to pocket change for the database.
I did overestimate the number of queries I would need per page. I initially made this post before I optimized my queries, so I was just taking a guess. I've done that now, and it looks like I'll be doing 3-25 queries per page rather than 5-80. I ended up upgrading to a VPS plan, and I am using MySQL 4.1 with its built-in query caching turned on to further lighten the load.
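For anyone else going this route, here's roughly how I sanity-check that the query cache is actually on and getting hits (standard SHOW VARIABLES / SHOW STATUS run from PHP; the connection details are obviously placeholders):

<?php
$link = mysql_connect('localhost', 'dbuser', 'dbpass'); // placeholder credentials

// query_cache_type should be ON and query_cache_size non-zero.
$vars = mysql_query("SHOW VARIABLES LIKE 'query_cache%'", $link);
while ($row = mysql_fetch_assoc($vars)) {
    echo $row['Variable_name'] . ' = ' . $row['Value'] . "\n";
}

// Qcache_hits climbing over time means the cache is earning its keep.
$stats = mysql_query("SHOW STATUS LIKE 'Qcache%'", $link);
while ($row = mysql_fetch_assoc($stats)) {
    echo $row['Variable_name'] . ' = ' . $row['Value'] . "\n";
}
?>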
Also, I read somewhere that most of the *nukes use 40ish queries per page, and most forums are in the neighborhood of 15, using MySQL. So I think I should be fine with 3-25.
Dalton if you still think I should be panicking, can you please explain why? I haven't found anybody who can explain why doing lots of queries is bad for the server. If I take my execution times and multiply them by the expected traffic, it seems I would pretty rarely be asking the server to run more than one query at a time. What is the bottleneck on the server I should be worried about? I would like to have some sort of quantitative understanding of how my queries affect server load, but apart from "more/bigger queries = more server load" I haven't found any information about that. (I've found plenty about optimizing individual queries, but nothing about how to judge if I'm asking too much of the server from the entire page of queries.)
Here's another thing I don't understand. I recently overhauled an old page I wrote at the start of this project, when I was just learning to use PHP/MySQL, and I rewrote something that used 400 simple queries in a loop to just use 1 pretty simple join query. The execution time didn't change at all. It was fast both ways. Why is that? Are those 400 just as easy on the server as that 1? If not, why not?
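For what it's worth, the before/after looked roughly like this (simplified, with made-up table names):

<?php
// Old version: one query per related row, run inside a loop (~400 round trips).
$parents = mysql_query("SELECT id FROM species");
while ($p = mysql_fetch_assoc($parents)) {
    $children = mysql_query("SELECT name FROM specimens WHERE species_id = " . (int) $p['id']);
    while ($c = mysql_fetch_assoc($children)) {
        // ... build the page from each tiny result ...
    }
}

// New version: one join pulls the same rows in a single round trip.
$all = mysql_query("SELECT s.id, sp.name
                    FROM species s
                    INNER JOIN specimens sp ON sp.species_id = s.id");
while ($row = mysql_fetch_assoc($all)) {
    // ... build the page from one result set ...
}
?>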
The New Guy
04-09-2006, 09:01 AM
The database is usually the first bottleneck. Queries add a lot of load to the server. That's why, even at 15 queries per page, forums are hardly ever allowed on shared accounts, or at least not at high load.
I think the question I would have is: why are your queries in a range? Why would the query count change?
JasonN
04-09-2006, 09:28 AM
I have a different number of queries per page because I have several very different types of pages, some with more components than others. For example, I have forums which are about 3 queries per page and don't incorporate all the different kinds of data seen on the main pages. Article pages, the FAQ page, various simple picture gallery pages, and others are all very simple few-query deals compared to the main pages.
The main type of page varies within a smaller range, which will probably eventually be about 14-18 for guests and 15-20 for users. The reason there is that some pages don't have some components. For example, I only query the glossary or bibliography database if my custom BBcode parser encounters those tags in the page's text. And for some of the components which are paginated, I have to first do a query for the count, and then another query to actually retrieve the needed rows, but when the count is 0 I don't need to query for any rows so I don't.
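As a concrete (made-up) example, the paginated components work roughly like this:

<?php
$per_page = 10;
$page     = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;
$offset   = ($page - 1) * $per_page;

// First query: how many rows are there at all?
$count_res = mysql_query("SELECT COUNT(*) AS total FROM comments WHERE item_id = 42");
$total     = (int) mysql_result($count_res, 0, 'total');

if ($total > 0) {
    // Second query: only fetch the rows for this page, and only if there are any.
    $rows = mysql_query("SELECT author, body
                         FROM comments
                         WHERE item_id = 42
                         ORDER BY posted_at DESC
                         LIMIT $offset, $per_page");
    while ($row = mysql_fetch_assoc($rows)) {
        // ... render the row ...
    }
}
?>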
Here's a question for you: What exactly is "load"? What does it physically mean, how is it quantified, and how can I measure it? I want to be able to select between two alternate ways of doing things based on how much "load" each will create, and I haven't found anything that can help me do that. Here's an example tradeoff I haven't found any useful advice about:
Say I have a 500-row 3-column table, and I'll need anywhere from 0-15 rows from that table in my page, but I have no way to query in advance for only the rows I'll need because they're triggered by tags written into text fields being pulled from several different tables. Is it better to do one query for all 500 rows and store them in a PHP array for use whenever I need them? Or is it better to do up to 15 individual queries whenever one is encountered in the text parsing?
What if it was a 5000-row, 40-column join of 3 tables, of which up to 40 rows are used per page?
Right now for each of those two cases I'm doing a single query and storing my whole table in a PHP data structure while the page loads. Is this better/worse than doing many individual queries, and why?
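Right now the "preload" version looks roughly like this ("glossary" is just a stand-in for whichever table it happens to be):

<?php
// One query up front: the whole table goes into a PHP array keyed by term.
$glossary = array();
$res = mysql_query("SELECT term, definition FROM glossary");
while ($row = mysql_fetch_assoc($res)) {
    $glossary[$row['term']] = $row['definition'];
}

// Later, while parsing the page text, each lookup is just an array access:
// if (isset($glossary[$term])) { ... render the definition ... }

// The alternative would be a small point query every time a tag is encountered:
// $res = mysql_query("SELECT definition FROM glossary WHERE term = '"
//                    . mysql_real_escape_string($term) . "'");
?>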
dc dalton
04-09-2006, 08:13 PM
As the new guy said, the db CAN be the bottleneck for a site (if it's not used properly)..... I am more apt to use a more powerful db (like Postgres or Oracle) for the back end and use stored procedures and/or functions so that I can let the db do what it does best.... let me explain.
Let's say I have an app that has to pull an extreme amount of related data for, say, an employee. If the tables are normalized properly (note I said properly, not overly normalized) you may have the data in anywhere from 5-15 tables. Now the average Joe might do a major join or, even worse, query one table (or two) at a time..... WAY WRONG. In this case creating a view over ALL of the tables is far superior and faster than ANYTHING you could do in your app... just a small example again:
employee_info table (20 fields)
payroll_info table (40 fields - relates to employee)
x_table (50 fields - relates to employee)
y_table (40 fields - relates to employee)
etc etc .....
Now a join as such:
SELECT DISTINCT
e.*,
p.*,
x.*,
y.*
FROM employee_info e, payroll_info p, x_table x, y_table y
WHERE e.employee_id = p.id
AND e.employee_id = x.id
AND e.employee_id = y.id
Will literally bring you back all the data in one shot, AND an internal view is about 100 times faster than anything you can do programmatically by running the joins from your app (and less CPU and memory intensive).
Now if you are talking complex math, you can build a function and/or procedures in the db that run internally and can handle extremely complex calcs in amazing times.... that's what dbs are SUPPOSED to do!
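To make that concrete, in PostgreSQL the join above can be defined once as a view, and heavy calculations can live in a function. The column names and the rating formula below are made up purely for illustration (a view needs distinct output column names, so you list columns rather than using four .* lists):

CREATE VIEW employee_full AS
SELECT e.employee_id,
       e.first_name,
       e.last_name,
       p.salary,
       p.pay_grade,
       x.some_field,
       y.other_field
FROM employee_info e
JOIN payroll_info  p ON p.id = e.employee_id
JOIN x_table       x ON x.id = e.employee_id
JOIN y_table       y ON y.id = e.employee_id;

-- The application then just runs one simple query:
SELECT * FROM employee_full WHERE employee_id = 123;

-- Complex math can be pushed into the database the same way:
CREATE FUNCTION weighted_rating(votes integer, score numeric) RETURNS numeric AS '
    SELECT ($1 * $2 + 25 * 3.0) / ($1 + 25)
' LANGUAGE SQL;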
What bothers me about your post is the load factor:
x users times x queries per page ....... that is exponentially disastrous!
Knowing how to use a database properly can be the difference between a slug of an app and an amazingly fast app, AND it can mean the difference between server crashes and no server crashes!
I would seriously consider having a DB admin or guru look at what you are doing and make suggestions on how to make it put less of a load on your servers...... it can mean the difference between a great app and a crap app!
JasonN
04-09-2006, 09:39 PM
Thanks Dalton. That was a very informative post... although I'm still wondering about the questions I asked above it.
I'm using MySQL 4.1 and I don't think I'll be able to change that, so I don't have access to stored procedures, views, or triggers. Also, this is a site I'm creating by myself with a budget of $0, so hiring a DB expert to look over everything is out of the question. And I need to take the site live in a month or two, so I don't have time to rewrite all the database interactions to utilize those features... I would have to dramatically revamp tens of thousands of lines of PHP.
I realize my situation is far from the ideal from a DB perfection standpoint, but do you think I've really got something to worry about if I'm averaging 15 queries per page and 3-10K pageviews/day on a VPS plan? It seems quite a few forums do that and more all the time without enterprise SQLs.
ramprage
04-13-2006, 04:24 PM
Interesting post we have here ;)
DC is right on the money about how a poorly written application can bring even the biggest servers down with only one or a few users running it. I've seen some real crap PHP and Perl scripts that have done just that: load the one .cgi file and it instantly uses 70% of the CPU.
Think of your server load like a car's horsepower. A little Honda Civic is a very cost-effective car to drive and it gets you around for daily stuff. You would never hook up a boat and trailer to a Honda Civic. The car isn't designed for this and isn't strong enough. You'd get yourself a nice Ram or equivalent pickup with lots of horsepower so you can tow more and handle more weight.
Think of every new visitor to your website as a passenger getting into the car. Soon your car will be full and you start loading your visitors into the trailer. Your car will be very slow and unresponsive... eventually it will get so loaded down that you can't even move. A server can only handle so much; only so many applications can run that use so much of its power. Linux systems use the load average to show how busy the system is, or how backed up the processing is. Usually systems with a load under 10 are still usable; try to keep the load around 0-5 at best. That means your CPU has extra breathing room and can handle new jobs quickly, like your car still having room for more passengers. If you keep loading the car with people, or your website gets a massive traffic hit, everything will just get swamped.
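To answer Jason's "how do I measure it" question: on Linux the current load averages are in /proc/loadavg (they are also the last numbers printed by the uptime command). A quick, rough way to read them from PHP:

<?php
// /proc/loadavg starts with the 1, 5 and 15 minute load averages --
// roughly, the average number of processes waiting for the CPU (or disk).
$parts = explode(' ', trim(file_get_contents('/proc/loadavg')));
echo 'Load: ' . $parts[0] . ' (1 min), ' . $parts[1] . ' (5 min), ' . $parts[2] . " (15 min)\n";
?>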
It's important to have a well-optimized application because you want it to handle as many people as it can, just like you want to keep your car tuned well so it gets the best gas mileage. A more poorly written script is like not changing your oil for 12 months: it can eventually fry your engine. At the end of the day you need to find a car you can afford that can handle the things you need. Just because you want to get a trailer and boat doesn't mean you can't later on down the road.
john190
04-17-2006, 06:32 AM
I agree!
Using caching is very good. I use it on most of my sites, and I love it.
Quoting Chris: "With that many queries I would seriously look into a caching solution. There are two articles on this site about caching, both can be found in the php programming section."