Showing posts with label benchmarking.

Thursday, July 2, 2009

Benchmarking MongoDB VS. Mysql

EDIT November 2011: Please take these benchmarks with a grain of salt. They are completely non-scientific, and when choosing a data store raw speed is probably not all you care about. Try to understand the trade-offs of each database, and learn about how your data will be handled if you lose a server. What happens if there's data corruption? How does replication work in your specific database… and so on. After all… don't choose one database over another just because of this blog post.

One of the projects I work on at the company has a message system with more than 200 million entries in a MySQL database. We've started to think about how we can scale this further, so in my research for alternatives I looked at NoSQL databases. In this post I would like to share some benchmarks I ran against MongoDB, a document-oriented database, compared to MySQL.

To perform the tests I installed MongoDB as explained here. I also installed PHP, MySQL and Apache2 from MacPorts. I did no special configuration on any of them.

The hardware used for the tests is my good ol' Macbook White:

Model Name: MacBook
Model Identifier: MacBook2,1
Processor Name: Intel Core 2 Duo
Processor Speed: 2 GHz
Number Of Processors: 1
Total Number Of Cores: 2
L2 Cache: 4 MB
Memory: 4 GB
Bus Speed: 667 MHz
Boot ROM Version: MB21.00A5.B07
SMC Version (system): 1.13f3

Because I don't have enough space to store the MongoDB database on my local hard drive, I launched the server with this command:
./bin/mongod --dbpath '/Volumes/alvaro/data/db/'

which tells MongoDB to use my USB hard drive. YES, my USB hard drive :-P

The MySQL server stored its data on the local hard drive.

What was the test?

I loaded 2 million records from the real data of our message system into both databases. Every record has 28 columns, holding information about the sender and the recipient of the message, plus the subject, date, etc. For MySQL I used mysqldump. For MongoDB I used the following:

// stream every row from MySQL and insert it as a document into MongoDB
$query = "SELECT * FROM message";
$result = mysql_query($query);
while ($row = mysql_fetch_assoc($result))
{
    $collection->insert($row);
}

Of course, for the real data loading I added some pagination; I didn't retrieve the 2M records at once. And there was some code to initialize the MySQL connection and Mongo to get that $collection object.
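
For completeness, here is a minimal sketch of what that paginated load could look like. The connection credentials and the batch size are assumptions for illustration, not the exact code I ran:

// rough sketch: copy the MySQL table into MongoDB in batches instead of one huge result set
$link = mysql_connect('localhost', 'user', 'password');   // credentials assumed
mysql_select_db('msg', $link);

$m          = new Mongo();
$collection = $m->selectDB('msg')->selectCollection('msg_old');

$batchSize = 10000;   // assumed batch size
$offset    = 0;

do {
    $result = mysql_query(sprintf('SELECT * FROM message LIMIT %d, %d', $offset, $batchSize));

    $rows = 0;
    while ($row = mysql_fetch_assoc($result)) {
        $collection->insert($row);   // every MySQL row becomes one MongoDB document
        $rows++;
    }

    $offset += $batchSize;
} while ($rows === $batchSize);   // a short batch means we reached the end of the table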
The MySQL database had indexes on the sid and tid fields (sender id and target id), so I added them to the MongoDB collection as well.
$m = new Mongo();
$collection = $m->selectDB("msg")->selectCollection("msg_old");
// index the fields used to filter the mailbox queries
// (note: the usual compound-index form in the PHP driver is array("sid" => 1, "tid" => 1))
$collection->ensureIndex( array("sid", "tid") );

Then I wrote some simple code that selects up to 20 records filtered by sid. In the real application this corresponds to viewing the first 20 messages of my outbox.
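
As a rough illustration of the shape of that script (not the exact code: the html rendering and the "subject" column are assumptions on my part, and the MySQL version runs the equivalent SELECT * FROM message WHERE sid = ... LIMIT 20):

// rough sketch of the MongoDB-backed script requested by siege
$sid = (int) $_GET['id'];   // user id taken from the url

$m      = new Mongo();
$cursor = $m->selectDB("msg")->selectCollection("msg_old")
            ->find(array("sid" => $sid))   // filter by sender id
            ->limit(20);                   // first 20 messages of the outbox

// render the result as a plain html table
echo "<table>";
foreach ($cursor as $msg) {
    echo "<tr><td>" . htmlspecialchars($msg["subject"]) . "</td></tr>";   // "subject" column assumed
}
echo "</table>";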

EDIT - 2009/07/03

Due to some confusion I have to make something clear. What I'm benchmarking is not the data loading into both databases, nor traversing the data, etc., but the code that you can find here.

This is similar to what a user's message outbox (or inbox) could be on a production website. The user accesses his inbox, we retrieve up to 20 of his messages, and they are then displayed in an HTML table. What siege accessed was a script serving the HTML generated from the query results.

So the idea is: if MongoDB or MySQL were the backend of this message system, which one would be faster for this specific use case? This benchmark is not about whether MongoDB is better than MySQL for every use case out there. We use MySQL a lot in production and we will keep using it as far as I can tell. And yes, I know that MySQL and MongoDB are two totally different technologies that probably only share the word database in their descriptions.

END EDIT - 2009/07/03

I wrote the code for MongoDB and MySQL. Then my idea was to launch siege, pick some random user ids from a text file, and do the stress tests.

Here's an extract from the url text file:
http://mongo.al/index.php?id=96
http://mongo.al/index.php?id=105
http://mongo.al/index.php?id=108
http://mongo.al/index.php?id=113
http://mongo.al/index.php?id=116
http://mongo.al/index.php?id=117
http://mongo.al/index.php?id=127
http://mongo.al/index.php?id=129
http://mongo.al/index.php?id=130
http://mongo.al/index.php?id=134

This means that siege will pick a random url and hit the server, requesting the outbox of that user id.

Then I increased the ulimit to be able to run this test:
siege -f ./stress_urls.txt -c 300 -r 10 -d1 -i

With that command I launch siege, telling it to load the urls to visit from the text file. It will simulate 300 concurrent users, each doing 10 repetitions, with a random delay between 0 and 1 second. The last option tells siege to work in internet mode, so it will pick urls randomly from the text file.

When I launched the test with MongoDB as the backend it worked without problems. With MySQL as the backend it crashed quite often. Below are the results I obtained for both of them.

MongoDB test results:

siege -f ./stress_urls.txt -c 300 -r 10 -d1 -i

Transactions: 2994 hits
Availability: 99.80 %
Elapsed time: 11.95 secs
Data transferred: 3.19 MB
Response time: 0.26 secs
Transaction rate: 250.54 trans/sec
Throughput: 0.27 MB/sec
Concurrency: 65.03
Successful transactions: 2994
Failed transactions: 6
Longest transaction: 1.47
Shortest transaction: 0.00

MySQL test results:
siege -f ./stress_urls_mysql.txt -c 300 -r 10 -d1 -i

Transactions: 2832 hits
Availability: 94.40 %
Elapsed time: 23.53 secs
Data transferred: 2.59 MB
Response time: 0.74 secs
Transaction rate: 120.36 trans/sec
Throughput: 0.11 MB/sec
Concurrency: 89.43
Successful transactions: 2832
Failed transactions: 168
Longest transaction: 16.36
Shortest transaction: 0.00

As we can see, MongoDB performed more than 2X better than MySQL for this specific case. And remember, MongoDB was reading the data from my USB hard drive ;-).

Tuesday, March 3, 2009

Symfony Speed and Hello World Benchmarks

After reading some posts showing that "my blah blah framework" is way faster than symfony for a Hello World application, I decided to explain why: because symfony is extensible and can adapt to your needs. That's easy to say, you may think; in fact, every framework out there claims that. So what makes symfony so special?


The following list names some of the features provided by symfony.

  • Factories
  • The Filter Chain 
  • The Configuration Cascade
  • The Plugins System
  • Controller adaptability
  • View adaptability


Factories


Since version 1.0 symfony has provided a configuration file called factories.yml. This file controls the configuration of the application's core classes. There you can override symfony's default classes with your own. This means you can set up a custom front controller, web request, cache classes, session storage, etc.


But this comes at an extra price: when a symfony application bootstraps, it reads the configuration file from the filesystem. If the yml file was parsed before, it loads a PHP file (which can also be cached with APC); if not, it parses the yml file, stores the parsed result in the cache folder and loads the configuration from there.
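
Conceptually, that caching step boils down to something like the following sketch. This is not symfony's actual code, just the general idea, using sfYaml for the parsing step:

// conceptual sketch of a yml config cache (not symfony's real implementation)
function loadConfig($ymlFile, $cacheDir)
{
  $cacheFile = $cacheDir.'/'.basename($ymlFile).'.php';

  // recompile only when the yml source is newer than the cached PHP file
  if (!file_exists($cacheFile) || filemtime($ymlFile) > filemtime($cacheFile))
  {
    $config = sfYaml::load($ymlFile);   // parse the yml file
    file_put_contents($cacheFile, '<?php return '.var_export($config, true).';');
  }

  return include $cacheFile;   // plain PHP, so APC can cache the opcodes
}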


Why would I need that flexibility, you may ask? In a project I'm involved with we needed Memcached. This means we overrode the whole symfony default cache mechanism with our own custom classes. How? By setting up our classes in an easy-to-read yml file. So for sure, when you benchmark your Hello World application, symfony will be slower.


Filter Chain


One of the patterns from the Core J2EE Patterns book that impressed me the most is the Intercepting Filter. This pattern teaches how to modify request processing without the need to change controller or model code. The idea is that in a configuration file you plug in a class that will take care of pre- or post-processing the request. These classes are called filters. As an example, you can add a filter that checks whether the user has the proper credentials to execute the action she wants. Another filter can cache the response, etc.


Symfony has the filters.yml file, which can be specified per application or per module. This means we can set filters to be executed for the whole application and then disable them for specific modules, let's say for Ajax actions. Does your framework provide this flexibility without resorting to some kind of monkey-patching technique? No? Well, symfony does. Say hello to the Filter Chain. So for sure, when you benchmark your Hello World application, symfony will be slower, because it adds flexibility to the process. You want to get rid of this behavior? Sure: set up a custom controller in the factories.yml file, and in your new class override the loadFilters method.
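
To give an idea of what such a filter looks like, here is a minimal sketch of a symfony 1.x filter; the class name and the timing logic are made up for illustration, and the filter still has to be declared in filters.yml:

// hypothetical example of an intercepting filter (symfony 1.x style)
class myTimerFilter extends sfFilter
{
  public function execute($filterChain)
  {
    // pre-processing: runs before the action is executed
    $start = microtime(true);

    // hand control to the next filter in the chain (and eventually the action)
    $filterChain->execute();

    // post-processing: runs after the action and the view
    error_log(sprintf('Request served in %.4f seconds', microtime(true) - $start));
  }
}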


Configuration Cascade


As explained in the configuration chapter of the symfony book, symfony lets you modify its behavior through several yml files. For example, we have the view.yml that tells symfony which CSS and JavaScript files it should load for the current request. We can have an application-level view.yml configuration and then override the settings per module. When symfony processes the request it checks all these files; that's why it is slow in the Hello World benchmark.


Plugins System


Symfony has a very powerful and easy-to-use plugin system. Its more than 400 plugins, built by a community of 200+ developers, speak for themselves about its success.


Plugins can contain modules of their own and also a config.php file, similar to the project config.php or a module's config.php. When symfony processes a request it checks for the settings in those files, which means they will be read from disk. So in a plugin we can provide a custom logger that is fired up when bootstrapping the application. The plugin user doesn't need to care how the logger is activated; she just knows that it will work.


The same applies to plugin modules. How do you think symfony knows that a certain module/action should be called from a plugin? If you enable the plugin module in the settings.yml file, symfony will check inside the plugin to see if the requested action should be executed there.


Controller adaptability


For each action the user calls, symfony will execute a page controller. Inside our modules symfony lets us use either a generic myModuleActions class that extends sfActions, or a class specific to the action requested by the user, which as an example could be called indexAction and would extend the sfAction class. When a request is processed, symfony first checks for the existence of the latter. If it doesn't exist, it tries to load the generic actions class for that module. Of course you don't need this kind of flexibility for Hello World apps.
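
As a quick sketch of both styles (the "mailbox" module name is made up and the method signatures follow the symfony 1.2 conventions):

// generic actions class: apps/frontend/modules/mailbox/actions/actions.class.php
class mailboxActions extends sfActions
{
  public function executeIndex(sfWebRequest $request)
  {
    // handles the "index" action of the "mailbox" module
  }
}

// one class per action: apps/frontend/modules/mailbox/actions/indexAction.class.php
class indexAction extends sfAction
{
  public function execute($request)
  {
    // symfony looks for this class first and falls back to mailboxActions otherwise
  }
}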


View Adaptability


For rendering the response symfony uses the sfPHPView class by default. If a certain module in your application requires a different view, there are at least three ways to accomplish this, as explained here.


Conclusion


Symfony is a professional web application framework built to cope with real-world needs. In a large project with more than a simple salutation feature, sooner or later you will need the flexibility provided by the framework. This will save you time and prevent headaches, because when you have built a whole system with a framework and the business needs start to push in a direction where you have to extend it, you will thank yourself for having chosen symfony in the first place.

In case your client requires a Hello World! application, you can use the following hyper-fast framework code: die("Hello World!"); ;-)

Tuesday, September 2, 2008

We started something!!

It seems that our benchmarking example has pushed people to do their own useful benchmarks.
You can check this framework benchmarks page and try to guess what they are actually benchmarking. If you can find the point of that benchmark, please drop a comment here, because I want to sleep easy tonight.

So I want to leave here just a few remarks about this kind of stuff:
  1. Stop benchmarking your just-created framework against symfony, Zend Framework, Cake, or whatever.
  2. When are we going to realize that the point of a framework is not to run as fast as assembly code, but to improve developer productivity and save money in developer time?
  3. I'm not pissed off, I just can't get the point of those benchmarks.  
If you have more examples of this kind of useless benchmark, please add them in the comments.

Monday, September 1, 2008

Benchmarking die("Hello world!"); VS. exit("Hello world!"); VS. echo "Hello World!"; on PHP

After doing some useful and problem-solving benchmarking with Siege, we can scream to every corner of the world that exit() is faster than die() and echo.

Show me the facts! I hear you screaming. Let there be facts!:

echo "Hello world!";
Transactions:         250 hits
Availability:      100.00 %
Elapsed time:        7.10 secs
Data transferred:        0.00 MB
Response time:        0.01 secs
Transaction rate:       35.21 trans/sec
Throughput:        0.00 MB/sec
Concurrency:        0.20
Successful transactions:         250
Failed transactions:           0
Longest transaction:        0.05
Shortest transaction:        0.00


die "Hello world!";
Transactions:         250 hits
Availability:      100.00 %
Elapsed time:       10.04 secs
Data transferred:        0.00 MB
Response time:        0.01 secs
Transaction rate:       24.90 trans/sec
Throughput:        0.00 MB/sec
Concurrency:        0.17
Successful transactions:         250
Failed transactions:           0
Longest transaction:        0.04
Shortest transaction:        0.00


exit "Hello world!";
Transactions:         250 hits
Availability:      100.00 %
Elapsed time:        6.05 secs
Data transferred:        0.00 MB
Response time:        0.01 secs
Transaction rate:       41.32 trans/sec
Throughput:        0.00 MB/sec
Concurrency:        0.27
Successful transactions:         250
Failed transactions:           0
Longest transaction:        0.04
Shortest transaction:        0.00


The previous tests were performed simulating 25 concurrent users with 10 repetitions each. Now just imagine for one minute (or two if you need more) if 20,000 concurrent users hit your echo "Hello World" website! That would be a mess! Just imagining this scenario I can't stop hearing the sirens in my mind, so please, grep through your code and preg_replace() all those echo "Hello world!" calls you may have there!