Jaslabs: High performance Software

High Performance Software

Archive for March, 2006

The early days of slashdot

By Justin Silverton

I have been a regular reader (and occasional poster) of the popular tech site http://www.slashdot.org since 2000. Recently, I searched archive.org for the earliest version of the site I could find.

Here is a link: http://web.archive.org/web/19980113191222/http://slashdot.org/
It shows slashdot.org as it appeared on January 13th, 1998.

Some interesting archived article summaries taken from the above link:

IE Takes the Lead?

Contributed by CmdrTaco on Fri Jan 09 at 2:09PM EST
From the fun-with-numbers dept

David Fagan wrote in to tell us about this article where it is reported that recent statistics show that IE4 is out on top in the browser battle with 63% of the traffic at various high traffic sites. I don’t put a lot of weight in these stats, but this is a pretty significant number.

Intel Releases 266 Pentium

Contributed by CmdrTaco on Fri Jan 09 at 12:29PM EST
From the pushing-but-not-hard dept

Intel is releasing the 266Mhz version of the possibly immortal Pentium line of CPUs. Supposedly this chip is primarily for the Laptop world. Interesting timing considering how hard Intel is pushing the P2 lines of chips, and the next generation of that line that is due out soon. Thanks to David Fagan for alerting us.

A direct link to an article posted by CmdrTaco about the state of Slashdot at the time:

http://web.archive.org/web/19980113194013/slashdot.org/slashdot.cgi?mode=article&artnum=00000425


Flickr.com - PHP/MySQL case study

Introduction

Cal Henderson of Flickr.com, a very popular photo blogging service, has released a PDF (I am not sure exactly when) detailing the issues Flickr faced in running a high-traffic website.

The original PDF can be downloaded here

Some interesting points taken from the PDF are below.

Classes, libraries, and systems used

1) Smarty for templating
2) PEAR and XML for email parsing
3) Perl for controlling
4) ImageMagick for image processing
5) MySQL (4.0/InnoDB)
6) Java, for node service
7) Apache 2 and Red Hat Linux

8) 60,000 lines of PHP code
9) 60,000 lines of templates
10) 70 custom smarty functions/modifiers
11) 25,000 DB transactions/second at peak
12) 1000 pages per second at peak

Unicode support

1) UTF-8 pages
2) CJKV support

Tips: don’t use htmlentities(). Also, JavaScript has patchy Unicode support.

Why PHP was used

1) Everything can be stored in the database, including the Smarty cache
2) a “shared nothing” approach (as long as PHP sessions were not used)

MySQL usage

SELECTs: 44,220,588
INSERTs: 1,349,234
UPDATEs: 1,755,503
DELETEs: 318,439
About 13 SELECTs for every INSERT, UPDATE, or DELETE
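
The ratio in the last line follows directly from the four counts. A quick check in Python (the variable names are mine, not from the PDF):

```python
# Query counts as quoted in the list above.
selects = 44_220_588
inserts = 1_349_234
updates = 1_755_503
deletes = 318_439

# Every statement that modifies data counts as a write.
writes = inserts + updates + deletes
ratio = selects / writes

print(f"writes: {writes:,}")
print(f"selects per write: {ratio:.2f}")
```

The result rounds to 13, which matches the figure quoted from the PDF.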

Tips: many of the tables that needed to be full-text searched were de-normalized. This does waste space, but because it allowed for few or no joins, it made searching much faster.
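
To make the de-normalization tip concrete, here is a minimal sketch using Python's built-in sqlite3 module in place of MySQL. The table and column names (photos, tags, photo_search) are hypothetical and do not come from the Flickr PDF, and a plain LIKE stands in for a real full-text search.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized layout (hypothetical): tags live in their own table.
    CREATE TABLE photos (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (photo_id INTEGER, tag TEXT);

    -- De-normalized layout (hypothetical): searchable text copied into one row.
    CREATE TABLE photo_search (photo_id INTEGER PRIMARY KEY, search_text TEXT);
""")

photos = [
    (1, "Sunset", ["beach", "sky"]),
    (2, "Lunch", ["food"]),
    (3, "Harbor", ["sky", "boats"]),
]
for photo_id, title, tags in photos:
    conn.execute("INSERT INTO photos VALUES (?, ?)", (photo_id, title))
    conn.executemany("INSERT INTO tags VALUES (?, ?)",
                     [(photo_id, tag) for tag in tags])
    # The same data is stored a second time: this is the wasted space.
    conn.execute("INSERT INTO photo_search VALUES (?, ?)",
                 (photo_id, " ".join([title] + tags)))

# Normalized: finding photos by tag needs a join.
joined = [row[0] for row in conn.execute(
    "SELECT DISTINCT p.id FROM photos p JOIN tags t ON t.photo_id = p.id "
    "WHERE t.tag = 'sky' ORDER BY p.id")]

# De-normalized: the same search touches a single table.
flat = [row[0] for row in conn.execute(
    "SELECT photo_id FROM photo_search "
    "WHERE search_text LIKE '%sky%' ORDER BY photo_id")]

print(joined, flat)
```

Both queries return the same photos. The second one pays for its simplicity with duplicated data that has to be kept in sync on every write.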


The Zend Framework

By Justin Silverton

Introduction

The Zend Framework is a recently released (still in alpha) set of open source tools for PHP, designed for developing applications and web services.

Included Functionality

Zend_Controller and Zend_View

These components provide the base for a simple MVC website and are already used on this site and several others. A front controller dispatches requests to page controllers. It is as minimalist as possible and we’re working to make it even simpler. The Zend_View component provides encapsulation for view logic. It can use templates written in PHP or can be combined with a third-party template engine.

Zend_Db

Database access is a very light layer on top of PDO. Solutions for existing systems not using PDO (such as mysqli or oci8) are presently under development. Included are adapters, a profiler, a tool to assist with building everyday SELECT statements, and simple objects for working with table row data.

Zend_Feed

The links on the sidebars of our home page are generated using Zend_Feed. This component provides a very simple way to consume RSS and Atom data from feeds. It also includes utilities for discovering feed links, importing feeds from different sources, and feeds can even be modified and saved back as valid XML.

Zend_HttpClient

This component provides a client for the HTTP protocol and does not require any PHP extensions. It drives our web services components. In time, we will develop support for extension-based backends such as cURL.

Zend_InputFilter

The input filtering component encourages the development of secure websites by providing the basic tools necessary for input filtering and validation.

Zend_Json

Easily convert PHP structures into JSON for use in AJAX-enabled applications.

Zend_Log

Log data to the console, flat files, or a database. Its no-frills, simple, procedural API reduces the hassle of logging to one line and is perfect for cron jobs and error logs.

Zend_Mail and Zend_Mime

Almost every internet application needs to send email. Zend_Mail, assisted by Zend_Mime, creates email messages and sends them. It supports attachments and does all the MIME dirty work.

Zend_Pdf

Portable Document Format (PDF) from Adobe is the de facto standard for cross-platform rich documents. Now, PHP applications can create PDF documents on the fly, without the need to call utilities from the shell, depend on PHP extensions, or pay licensing fees. Zend_Pdf can even modify existing PDF documents. Create a sharp customer invoice in Adobe Photoshop, fill in the order from Zend_Pdf, and send it with Zend_Mail.

Zend_Search_Lucene

The Apache Lucene engine is a powerful, feature-rich Java search engine that is flexible about document storage and supports many complex query types. Zend_Search_Lucene is a port of this engine written entirely in PHP 5, allowing PHP-powered websites to leverage powerful search capabilities without the need for web services or Java. Zend_Search_Lucene’s file format is fully binary compatible with its Java counterpart.

Zend_Service: Amazon, Flickr, and Yahoo!

Web services are becoming increasingly important to the PHP developer as mashups and composite applications become the standard for next generation web applications. The Zend Framework provides wrappers for service APIs from three major providers to make them as simple to use as possible. We’re working on more and engaging API vendors directly to make PHP the premier platform for consuming web services.

Zend_XmlRpc

PHP 5’s SOAP extension dramatically lowered the bar for communicating with SOAP services from PHP. Zend_XmlRpc brings the same capabilities to XML-RPC, mimicking the SOAP extension and making these services easier to use than ever from PHP 5.

Conclusion

The Zend Framework looks promising, but I think that in its current state, it is more of a set of classes than an actual framework. Currently, PEAR is a much better choice in terms of community support and component availability. I am glad that Zend is continuing to embrace the open source community and I will be curious to see the future builds of this framework.

Download

It can be downloaded Here


10 database speed tests

By Justin Silverton

I came across the following 10 benchmark tests covering:

SQLite version 3.3.3 (sync)
SQLite version 3.3.3 (nosync)
SQLite version 2.8.17 (sync)
SQLite version 2.8.17 (nosync)
PostgreSQL version 8.1.2
MySQL version 5.0.18
FirebirdSQL version 1.5.2

About the hardware/database settings used:

All databases were installed with default settings.
Tests were run on a 1.6GHz Sempron with 1GB of RAM and a 7200rpm SATA disk, running Windows 2000 SP4 with all updates applied.

Test 1: 1000 INSERTs

CREATE TABLE t1(a INTEGER, b INTEGER, c VARCHAR(100));
INSERT INTO t1 VALUES(1,13153,'thirteen thousand one hundred fifty three');
INSERT INTO t1 VALUES(2,75560,'seventy five thousand five hundred sixty');
… 995 lines omitted
INSERT INTO t1 VALUES(998,66289,'sixty six thousand two hundred eighty nine');
INSERT INTO t1 VALUES(999,24322,'twenty four thousand three hundred twenty two');
INSERT INTO t1 VALUES(1000,94142,'ninety four thousand one hundred forty two');

SQLite 3.3.3 (sync): 3.823
SQLite 3.3.3 (nosync): 1.668
SQLite 2.8.17 (sync): 4.245
SQLite 2.8.17 (nosync): 1.743
PostgreSQL 8.1.2: 4.922
MySQL 5.0.18 (sync): 2.647
MySQL 5.0.18 (nosync): 0.329
FirebirdSQL 1.5.2: 0.320

Test 2: 25000 INSERTs in a transaction

BEGIN;
CREATE TABLE t2(a INTEGER, b INTEGER, c VARCHAR(100));
INSERT INTO t2 VALUES(1,298361,'two hundred ninety eight thousand three hundred sixty one');
… 24997 lines omitted
INSERT INTO t2 VALUES(24999,447847,'four hundred forty seven thousand eight hundred forty seven');
INSERT INTO t2 VALUES(25000,473330,'four hundred seventy three thousand three hundred thirty');
COMMIT;

SQLite 3.3.3 (sync): 0.764
SQLite 3.3.3 (nosync): 0.748
SQLite 2.8.17 (sync): 0.698
SQLite 2.8.17 (nosync): 0.663
PostgreSQL 8.1.2: 16.454
MySQL 5.0.18 (sync): 7.833
MySQL 5.0.18 (nosync): 7.038
FirebirdSQL 1.5.2: 4.280
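
Tests 1 and 2 differ mainly in whether each INSERT is committed on its own or all of them are wrapped in a single transaction. The sketch below reproduces that difference on a small scale with Python's built-in sqlite3 module. It is not the script used for the benchmark above: the function name, file names, and row count are my own, and the timings will vary from machine to machine.

```python
import os
import sqlite3
import tempfile
import time


def timed_inserts(path, rows, one_transaction):
    """Insert `rows` rows into a fresh database file and return (seconds, row count)."""
    # isolation_level=None puts sqlite3 in autocommit mode, so transactions
    # are controlled only by the explicit BEGIN/COMMIT below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE t(a INTEGER, b INTEGER, c VARCHAR(100))")
    start = time.perf_counter()
    if one_transaction:
        conn.execute("BEGIN")
    for i in range(1, rows + 1):
        conn.execute("INSERT INTO t VALUES (?, ?, ?)", (i, i * 7, f"row {i}"))
    if one_transaction:
        conn.execute("COMMIT")
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT count(*) FROM t").fetchone()[0]
    conn.close()
    return elapsed, count


with tempfile.TemporaryDirectory() as tmp:
    # One commit per INSERT (the situation in Test 1).
    slow, n_slow = timed_inserts(os.path.join(tmp, "autocommit.db"), 200, False)
    # All INSERTs inside one transaction (the situation in Test 2).
    fast, n_fast = timed_inserts(os.path.join(tmp, "transaction.db"), 200, True)

print(f"autocommit:      {n_slow} rows in {slow:.3f}s")
print(f"one transaction: {n_fast} rows in {fast:.3f}s")
```

On a disk-backed database the single-transaction run is usually much faster, because the database only has to flush to disk once instead of once per row.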

Test 3: 25000 INSERTs into an indexed table

BEGIN;
CREATE TABLE t3(a INTEGER, b INTEGER, c VARCHAR(100));
CREATE INDEX i3 ON t3(c);
… 24998 lines omitted
INSERT INTO t3 VALUES(24999,442549,'four hundred forty two thousand five hundred forty nine');
INSERT INTO t3 VALUES(25000,423958,'four hundred twenty three thousand nine hundred fifty eight');
COMMIT;

SQLite 3.3.3 (sync): 1.778
SQLite 3.3.3 (nosync): 1.832
SQLite 2.8.17 (sync): 1.526
SQLite 2.8.17 (nosync): 1.364
PostgreSQL 8.1.2: 19.236
MySQL 5.0.18 (sync): 11.524
MySQL 5.0.18 (nosync): 12.427
FirebirdSQL 1.5.2: 6.351

Test 4: 100 SELECTs without an index

SELECT count(*), avg(b) FROM t2 WHERE b>=0 AND b<…;
SELECT count(*), avg(b) FROM t2 WHERE b>=100 AND b<…;
SELECT count(*), avg(b) FROM t2 WHERE b>=200 AND b<…;
…
SELECT count(*), avg(b) FROM t2 WHERE b>=9700 AND b<…;
SELECT count(*), avg(b) FROM t2 WHERE b>=9800 AND b<…;
SELECT count(*), avg(b) FROM t2 WHERE b>=9900 AND b<…;

Test 5: 100 SELECTs on a string comparison

SELECT count(*), avg(b) FROM t2 WHERE c LIKE '%one%';
SELECT count(*), avg(b) FROM t2 WHERE c LIKE '%two%';
SELECT count(*), avg(b) FROM t2 WHERE c LIKE '%three%';
… 94 lines omitted
SELECT count(*), avg(b) FROM t2 WHERE c LIKE '%ninety eight%';
SELECT count(*), avg(b) FROM t2 WHERE c LIKE '%ninety nine%';
SELECT count(*), avg(b) FROM t2 WHERE c LIKE '%one hundred%';

SQLite 3.3.3 (sync): 4.853
SQLite 3.3.3 (nosync): 4.868
SQLite 2.8.17 (sync): 4.511
SQLite 2.8.17 (nosync): 4.500
PostgreSQL 8.1.2: 6.565
MySQL 5.0.18 (sync): 3.424
MySQL 5.0.18 (nosync): 2.090
FirebirdSQL 1.5.2: 5.803

Test 6: INNER JOIN without an index

SELECT t1.a FROM t1 INNER JOIN t2 ON t1.b=t2.b;

SQLite 3.3.3 (sync): 14.473
SQLite 3.3.3 (nosync): 14.445
SQLite 2.8.17 (sync): 47.776
SQLite 2.8.17 (nosync): 47.750
PostgreSQL 8.1.2: 0.176
MySQL 5.0.18 (sync): 3.421
MySQL 5.0.18 (nosync): 3.443
FirebirdSQL 1.5.2: 0.141

Test 7: Creating an index

CREATE INDEX i2a ON t2(a);
CREATE INDEX i2b ON t2(b);

SQLite 3.3.3 (sync): 0.552
SQLite 3.3.3 (nosync): 0.526
SQLite 2.8.17 (sync): 0.650
SQLite 2.8.17 (nosync): 0.605
PostgreSQL 8.1.2: 0.276
MySQL 5.0.18 (sync): 1.159
MySQL 5.0.18 (nosync): 0.275
FirebirdSQL 1.5.2: 0.264

Test 8: 5000 SELECTs with an index

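SELECT count(*), avg(b) FROM t2 WHERE b>=0 AND b<…;
SELECT count(*), avg(b) FROM t2 WHERE b>=100 AND b<…;
SELECT count(*), avg(b) FROM t2 WHERE b>=200 AND b<…;
…
SELECT count(*), avg(b) FROM t2 WHERE b>=499700 AND b<…;
SELECT count(*), avg(b) FROM t2 WHERE b>=499800 AND b<…;
SELECT count(*), avg(b) FROM t2 WHERE b>=499900 AND b<…;

Test 9: 1000 UPDATEs without an index

BEGIN;
UPDATE t1 SET b=b*2 WHERE a>=0 AND a<…;
UPDATE t1 SET b=b*2 WHERE a>=10 AND a<…;
…
UPDATE t1 SET b=b*2 WHERE a>=9980 AND a<…;
UPDATE t1 SET b=b*2 WHERE a>=9990 AND a<…;

Test 10: 25000 UPDATEs with an index

BEGIN;
UPDATE t2 SET b=271822 WHERE a=1;
UPDATE t2 SET b=28304 WHERE a=2;
… 24996 lines omitted
UPDATE t2 SET b=442549 WHERE a=24999;
UPDATE t2 SET b=423958 WHERE a=25000;
COMMIT;

SQLite 3.3.3 (sync): 1.883
SQLite 3.3.3 (nosync): 1.894
SQLite 2.8.17 (sync): 1.994
SQLite 2.8.17 (nosync): 1.973
PostgreSQL 8.1.2: 23.933
MySQL 5.0.18 (sync): 16.348
MySQL 5.0.18 (nosync): 17.383
FirebirdSQL 1.5.2: 15.542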

Test 9: 1000 UPDATEs without an index

BEGIN;
UPDATE t1 SET b=b*2 WHERE a>=0 AND a<…;
UPDATE t1 SET b=b*2 WHERE a>=10 AND a<…;
…
UPDATE t1 SET b=b*2 WHERE a>=9980 AND a<…;
UPDATE t1 SET b=b*2 WHERE a>=9990 AND a<…;

Test 10: 25000 UPDATEs with an index

BEGIN;
UPDATE t2 SET b=271822 WHERE a=1;
UPDATE t2 SET b=28304 WHERE a=2;
… 24996 lines omitted
UPDATE t2 SET b=442549 WHERE a=24999;
UPDATE t2 SET b=423958 WHERE a=25000;
COMMIT;

SQLite 3.3.3 (sync): 3.153
SQLite 3.3.3 (nosync): 3.088
SQLite 2.8.17 (sync): 3.993
SQLite 2.8.17 (nosync): 3.983
PostgreSQL 8.1.2: 5.740
MySQL 5.0.18 (sync): 2.718
MySQL 5.0.18 (nosync): 1.641
FirebirdSQL 1.5.2: 2.976

If you want to see some more information on the above and 10 more tests, you can go Here


optimizing mysql tables

By Justin Silverton

Slow access to a MySQL database is often the result of badly defined or non-existent indexes, and fixing these can often lead to better performance. Here is an example table:

CREATE TABLE address_book (
contact_number char(10) NOT NULL,
firstname varchar(40),
surname varchar(40),
address text,
telephone varchar(25)
);

Example query: SELECT firstname FROM address_book WHERE contact_number = '12312';

This retrieves the firstname of a person in the address_book table, based on the contact number.

Without any index on this table, MySQL has to search through every row to find the one you are looking for, which is very inefficient.

Optimizing your table

There is a built-in command called EXPLAIN that shows which indexes, if any, are being used to retrieve results.

Example:

EXPLAIN SELECT firstname FROM address_book WHERE contact_number = '12312';

This returns a set of results that tells you how MySQL is processing the query:

table: The table the output is about (there will be several rows if you have joins)
type: The type of join being used. From best to worst the types are: system, const, eq_ref, ref, range, index, ALL
possible_keys: Shows which possible indexes apply to this table
key: And which one is actually used
key_len: The length of the key used. The shorter the better.
ref: The column, or constant, that is compared against the key
rows: The number of rows MySQL believes it must examine to get the data
extra: You don’t want to see “using temporary” or “using filesort”

An index can be added to the above example table using the following command:

ALTER TABLE address_book ADD INDEX(contact_number);
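
The effect of adding the index can be sketched end to end with Python's built-in sqlite3 module. Note the assumptions: SQLite stands in for MySQL here, its equivalent of EXPLAIN is EXPLAIN QUERY PLAN (with different output columns from the MySQL ones listed above), it uses CREATE INDEX rather than ALTER TABLE ... ADD INDEX, and the index name idx_contact_number and the helper plan() are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE address_book (
        contact_number CHAR(10) NOT NULL,
        firstname VARCHAR(40),
        surname VARCHAR(40),
        address TEXT,
        telephone VARCHAR(25)
    )
""")

query = "SELECT firstname FROM address_book WHERE contact_number = '12312'"


def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


before = plan(query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_contact_number ON address_book(contact_number)")
after = plan(query)   # the new index is used for the lookup

print("before:", before)
print("after: ", after)
```

Before the index exists the plan describes a scan of the whole table. Afterwards it describes a search that uses idx_contact_number, which is the same change you would look for in the type and key columns of MySQL's EXPLAIN output.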

You can also add an index on only the leading part of a char or varchar column. In the following, I add an index on only the first 8 of the 10 characters.

ALTER TABLE address_book ADD INDEX(contact_number(8));

Why would you want to do this?

Indexes do increase performance in the right situations, but they are also a tradeoff between speed and space. The bigger an index is, the more space it will consume on your hard drive.

Using the query optimizer/analyzer

The following command analyzes and stores the key distribution for a table, which helps the query optimizer decide which indexes to use (tablename is the name of your table):

ANALYZE TABLE tablename;

Another thing to keep in mind is that, over time, update and delete operations leave gaps in the table, which causes unneeded overhead when reading data from your tables.

From time to time, it is a good idea to run the following, which fixes the above issue:

OPTIMIZE TABLE tablename;
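
The gaps left behind by deletes are easy to see on a small scale. The sketch below uses Python's built-in sqlite3 module, where the rough counterpart of MySQL's OPTIMIZE TABLE is the VACUUM command. This is an analogy under that assumption, not the MySQL behaviour itself, and the file name and row counts are made up for the example.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "gaps.db")
    conn = sqlite3.connect(path, isolation_level=None)  # autocommit mode
    # Make sure freed pages stay in the file until we ask for compaction.
    conn.execute("PRAGMA auto_vacuum = NONE")
    conn.execute("CREATE TABLE t(a INTEGER, c TEXT)")

    # Load 5000 rows in one transaction.
    conn.execute("BEGIN")
    conn.executemany("INSERT INTO t VALUES (?, ?)",
                     ((i, "x" * 200) for i in range(5000)))
    conn.execute("COMMIT")

    # Delete 90% of the rows; the file keeps its size, full of unused pages.
    conn.execute("DELETE FROM t WHERE a % 10 != 0")
    size_with_gaps = os.path.getsize(path)

    # Rebuild the file without the unused space.
    conn.execute("VACUUM")
    size_compacted = os.path.getsize(path)
    remaining = conn.execute("SELECT count(*) FROM t").fetchone()[0]
    conn.close()

print(f"after deletes: {size_with_gaps} bytes")
print(f"after VACUUM:  {size_compacted} bytes ({remaining} rows kept)")
```

The row count is unchanged by the compaction; only the space the deleted rows used to occupy is given back.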
