Thursday, October 28, 2004

Database Design
Over at Kottke, there is a great set of comments (most of which say the same thing over and over) about normalisation during database design and the use of denormalisation.

Jason asserted that normalisation/denormalisation is a tradeoff and referenced a good presentation on Flickr and their use of PHP and MySQL.

To summarise, most (90%) of the comments say that normalisation is the way to go and that denormalisation is a step you take AFTER you have normalised and then identified performance issues. That differs from what I think Jason was hinting at, which is that normalisation is an option you can use if you want to, but starting with a denormalised structure is OK. Quite rightly, most of the comments also say that the problems of database design have been well solved for many years, and while creativity is (at least in my mind) totally needed for database design, the process of normalisation itself is well and truly defined and has been for years.
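To make the "normalise first, denormalise later" order concrete, here is a minimal sketch using Python's built-in sqlite3 module. The users/photos tables, the photo_count column and the function names are all made up for illustration; this is not how Flickr or anyone in the comments actually does it.

```python
import sqlite3

# Hypothetical normalised schema: each fact is stored exactly once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE photos (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    title TEXT NOT NULL
);
INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob');
INSERT INTO photos (user_id, title) VALUES (1, 'beach'), (1, 'city'), (2, 'dog');
""")

def photo_counts_normalised(conn):
    """Step 1: derive the count with a join every time it is needed."""
    return dict(conn.execute(
        "SELECT u.name, COUNT(p.id) FROM users u "
        "LEFT JOIN photos p ON p.user_id = u.id GROUP BY u.id"
    ))

def add_denormalised_count(conn):
    """Step 2 (only once the join is a measured bottleneck):
    store a redundant copy of the count on the users row."""
    conn.execute(
        "ALTER TABLE users ADD COLUMN photo_count INTEGER NOT NULL DEFAULT 0"
    )
    conn.execute(
        "UPDATE users SET photo_count = "
        "(SELECT COUNT(*) FROM photos WHERE photos.user_id = users.id)"
    )

def photo_counts_denormalised(conn):
    """After step 2 the count is a plain column read, no join."""
    return dict(conn.execute("SELECT name, photo_count FROM users"))
```

The catch with step 2 is that every insert or delete on photos now has to keep photo_count in sync, which is exactly the kind of cost you only want to pay where you have seen a real performance problem.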

Now, I love database design, and over the last 13 years I have spent a lot of my time in both massive IBM DB2 and IMS systems, MS SQL server database driven financial systems and plenty of time with web systems running MySQL, Oracle or MS SQL Server data mining 10s of millions of rows.

Only last weekend, Phil and I were discussing database design while driving to get some take away food and we covered this very issue.

This leads me to my point (finally): with the lowering of barriers for people to develop web applications using tools such as PHP and MySQL, the overall quality of database design has dropped substantially, as has database query performance. This is mainly due to the people developing these applications not having an understanding of good database and query design.

Now this is all well and good for what Clay Shirky refers to as Situated Software, which is an emerging pattern of software creation for a specific, generally small-scale, community or solution, e.g. meeting room booking, a class note support system, etc.

But the problem arises when you don't have a good database design and then try to scale a system. What worked for a few users in testing and development, or even thousands of rows in a larger test, just doesn't work well with a significant (depends on each app) level of use. At this point, the system is well used, and changes to retrospectively normalise the database design (and then denormalise the poorly performing parts) are very, very expensive and time consuming, not to mention a pain in the ass.

Web-based systems which should perform well on a single box, given a) their user base, b) the query structure and c) the content in the db, instead run slowly, even on a massive database box with multiple application servers.

In addition to poorly designed databases (which I generally consider the root of the problem), developers who lack knowledge about how databases work often end up designing poorly performing queries which, even with a great db design, would cause the system to grind to a halt.

For example, searching for a movie in a db, e.g. The Terminator, should be a simple select statement getting back a handful of rows depending upon the text search you want to do. What it should not be is a query returning all movies in the database to the application server and then having the application server loop through each and every row from A to Z finding "The Terminator".
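Here is a rough sketch of the two approaches side by side, again using Python's built-in sqlite3 module. The movies table, its sample rows and the function names are invented for the example, and the LIKE search assumes the search text contains no % or _ wildcard characters.

```python
import sqlite3

def build_demo_db():
    """Create a tiny in-memory movies table (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO movies (title) VALUES (?)",
        [("Alien",), ("The Terminator",),
         ("Terminator 2: Judgment Day",), ("Zulu",)],
    )
    return conn

def search_in_app(conn, text):
    """What not to do: fetch every row, then filter in the application."""
    rows = conn.execute("SELECT id, title FROM movies ORDER BY title").fetchall()
    return [title for _id, title in rows if text.lower() in title.lower()]

def search_in_db(conn, text):
    """Let the database filter, so only matching rows come back."""
    rows = conn.execute(
        "SELECT title FROM movies WHERE title LIKE ? ORDER BY title",
        (f"%{text}%",),
    ).fetchall()
    return [title for (title,) in rows]
```

Both functions return the same titles for a search such as "terminator", but the first one drags the whole table across to the application server to get there, which is fine with four rows and painful with millions.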

Over the last 5 years I have developed a way of designing database driven web systems which seems to always start with doing a data model in conjunction with the functional task definition. This approach seems to work well in most cases, where the web pages are driven by select, update, insert and delete statements. It's almost a databasey object modeling approach which works well for me.

So I believe in finding a balance between ease of access to tools and the responsibility of developers to spend some time learning (what is not a difficult task really) database design methods and basic query practice, and then using this knowledge to at least create a system which, while not perfect, can start to be scaled or just maintained in the future without major rework.

Design your database reasonably well (it doesn't have to be 3rd normal form), make sure your queries are optimised (indexes used etc.) and then develop to your heart's content. The extra time spent up front will save you heaps in maintenance, bug fixing due to data inconsistencies, and scaling issues.
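One cheap way to check the "indexes used" part is to ask the database for its query plan. The sketch below does this with SQLite's EXPLAIN QUERY PLAN from Python; the movies table and the idx_movies_title index are hypothetical, and other databases (MySQL, Oracle, MS SQL Server) have their own equivalents with different syntax and output.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as one string.
    The human-readable detail is the last column of each plan row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(str(row[-1]) for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies "
    "(id INTEGER PRIMARY KEY, title TEXT NOT NULL, year INTEGER)"
)

lookup = "SELECT id, year FROM movies WHERE title = ?"

# Without an index on title, the plan is a scan of the whole table.
plan_before = query_plan(conn, lookup, ("The Terminator",))

# Hypothetical index on the column the query filters by.
conn.execute("CREATE INDEX idx_movies_title ON movies (title)")

# With the index in place, the plan should mention idx_movies_title.
plan_after = query_plan(conn, lookup, ("The Terminator",))

print(plan_before)
print(plan_after)
```

The exact wording of the plan differs between SQLite versions, so it is safer to look for the index name in the output than to match the whole string.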

Monday, October 18, 2004

IcyPole at Sloan School of Management
Woo hooo.... I'm very excited!!!

I was just checking my logs and saw a search for IcyPole from Google. So I decided to check what is referencing my leading edge wireless P2P app.

It turns out that it is being referenced in some presentations on Music Distribution and Core Edge Dynamics at Sloan School of Management, which is very very cool. They have done some lovely MBA style diagrams showing its position with other P2P and edge services which is all nice.

So if you want to see the future of Music Distribution...check it out.

Tuesday, October 12, 2004

Alert for Patent
Set it and forget it...as the audience on some infomercial yesterday were saying. This morning I came into work and got an email alerting me that there was a press release about my latest patent on the news wire.

Ages ago...so long ago that I don't remember, I must have set an alert on Yahoo to send me any news about AgentArts. So this morning in came an alert.

Very nice. And if you wish...go and read the press release.

Sunday, October 10, 2004

Shitty morning
What a crap way to wake up on a Saturday. I'm in San Francisco for the next few days and really thought that the weekend would be an activity filled fun fest.

But alas....I woke at 4 this morning and was incredibly sick...icky...and only managed to get back to bed around 6 so I'm sort of feeling like I did 10 years ago after a big night out and an early rise.....tired, hungry, dehydrated and ug..

Added to this, John Howard, or fearless fly as I like to think of our beloved leader, has just been re-elected for his 4th term. Double ug.

The only cool thing about this weekend, aside from shopping, the Apple Store, marios and breakfast tomorrow, is that it's Fleet Week here this weekend. So, big ships and lots of cool planes flying around!!!

If only I'd brought some Vegemite I'm sure I'd feel better.