Sunday, April 03, 2005

Re: Databases Part 1

Waiting on part deux.

You'll have to wait for some more time... Very busy with the current module at Ncb as reaching the submission deadline.


A table can have a primary key.

When you say "can" dyou mean a primary key is NOT necessary? I thought it would be.


Right. A primary key is NOT necessary. But searching for a particular row will be more complicated as duplicates can be added to the table.


One thing that needs to be considered is where the error checking happen? Does it happen in the "business logic" layer or directly in the data layer? I would think a lot of times the database should ONLY be conserned with storing information.

I guess the question comes down to what responsibilities the database should take on. It seems the trend is to make more and more things happen in the data layer.


Basically databases are becoming more intelligent. So more Business logic is being pushed into the databases. It seems like mixing of responsibilities, but only specific business logic could be put into the db. Like you need to retrieve some data occasionally which spans across many tables and this is done very often. Rather than have loads of SQL in your business logic, just call a stored proc.

Another trend is distributed databases, which still have to mature well.

Error handling is possible in Stored procs. But I am not very sure how errors are sent to the host language. I think a simple SQLError could be returned by the db. We returned C-style status codes from the Stored procs.


Each trigger is associated with a TABLE on some event.

This is quite cool. I didn't know this was possible. You can basically keep a log/trace of everything that's going on in your database this way.


Care has to be taken that triggers are not written on 'events' of heavily used tables. Events could be insert, delete. Now if I have 500 inserts per sec on a table and I write a trigger on that event, its going to really load the server.


I'm interested in Stored Procedures just because I've read that they are beneficial to app performance. Is that really true?


Yes its true.

Now for some explanation.

When a simple query is submitted to a database, a few steps are carried out after which the result is returned. These include checking if the query is valid and tries to reference valid tables and columns. Then optimization of the query is done. Most queries in SQL are commutative and associative, so the order of the operations in the query can be changed without changing the meaning of the query as a whole. So the db does this based on the information it has about its tables and some heuristics as well. After this, the new SQL query tree is compiled and finally executed to return a result. Now as you have noticed, this is a very complicated and lengthy process. (and trust me, db's are really smart already wrt optimization!!)

Back to Java. There are three ways to execute a query in Java SQL - A Statement, a PreparedStatement and a Callable statement.

Now in a simple statement, the query is sent from the host language, then goes through all the steps of checking and optimization mentioned above.

A prepared statement is cached into the db for some time. So it does not have to go through all the optimizations steps each time. Probably only the new arguments have to be checked which should be faster.

A Callable statement is used to run a Stored Procedure. Now a stored procedure query resides IN the db. Only arguments are passed similar to a Prepared statement. So all optimization, compilation is already done.

I guess that should explain performance effects of various methods.

Also Stored procedures have other advantages like Mohn mentioned in his Hacking Websites blog.

< Excerpt Begin >
Be paranoid and ALWAYS validate data coming from the client. And as far as possible use Stored Procedures. Besides getting a performance boost (since they are compiled vs SQL statements that are interpreted), they use typed parameters
< Excerpt End >


What db are you working with? I've had basic experience with Access and SQL Server.


I used to work on Access before leaning what a 'real' db is supposed to do. We worked on SQL Server here at Ncb. Access does not do most of the things I blogged about. Just has tables with Primary and Foreign keys. Probably something more!!

I guess this blog introduced Stored procedures well enough. An example later..

No comments: