Databases Part I

Mohn mentioned Stored Procedures which I have been working on as part of the current DBMS module. You guys interested in the topic?

Of Course, don't even have to ask. Post it.

I was just trying to buy some time.. keep the blog rolling. The Database Module (DBMS) is almost over. And like always learnt a LOT. I'll try and cover a few topics I learnt in brief. Before the module I was a DBMS dumb-ass. Now I am a novice..

There are different models for DBMS. The one all of us use and see generally are Relational DB's. These are based on Relational Algebra and Relational Calculus. In Relational DB's information is stored in tables and various realationships exist between different tables. A table consists of rows and columns. The SQL language is based on Relational math, with some additions and ommissions. Most SQL queries are easy to understand. Then you can combine, nest etc to get info from the DB.

The thing I want to focus on is Integrity and Constraints. There are different constraints which can be set up which maintain the integrity of the database. These contraints are used to make sure that data in the DB satisfies some rules.

A table can have a primary key. A primary key is used to uniquely identify a single instance within a table (ie a row). So no two rows can have the same primary key. Also no row can have a NULL in the primary key. This is known as Entity Integrity Constraint.

Like mentioned before, tables are generally related to each other. In a DB, a table has a Foreign Key to show a relationship. Consider an example with two tables Employee and Department with cols as shown below.

Table Employee { (PK) Emp_ID, Emp_Name, (FK) Dept_ID }
Table Department { (PK) Dept_ID, Dept_Name }

In this case Employee belongs to a particular Department. So the FK in the Employee Table is a Primary Key in the Department Table. This brings us to Referential Integrity. Referential Integrity means that if a FK exists in a table then the corresponding PK has to exist in the referenced table. In simpler terms suppose there is an Employee { E101, Scott, D505 } then there has to be an entry in the Department table with D505 as a PK like { D505, Management Board }. If D505 does not exist, then the DB will no allow you to enter the E101 row.

The above two contraints were simple.

Things get a bit messier in real life scenarios. Consider Age was one of the fields in the Employee table. Now Age could be modelled as an Integer data-type. An Integer can be negative, but Age can definitely not be. So such simple rules are enforced by using a CHECK's. The definition of the Table is called a Schema. This Schema is mentioned using the CREATE statement. PK's and FK's are mentioned along with the Field names and this is where CHECK's can be applied too. I'll not get into the syntax here. A simple line like CHECK ([AGE>0]) will enforce the integrity of the Age field.

And obviously things can get even more complex (read nastier) in real life scenarios. Suppose I can insert a particular entry only on the basis of checking some other tables or the table itself. This can be thought of as having a sub-query along with every INSERT. This is where TRIGGER's come into play. Consider this example.

CREATE TRIGGER reminder
ON titles
FOR INSERT, UPDATE, DELETE
AS
EXEC master..xp_sendmail 'Blah', 'Blah'
GO
Each trigger is associated with a TABLE on some event. In the example above, the TRIGGER reminder, is called whenever there is an INSERT, UPDATE or DELETE on the TABLE titles and a stored procedure is called with the EXEC statement. So based on some event some action can be taken. Wrt to business rule integrity, I could ROLLBACK if some condition was not satisfied.

In conclusion, Entity Integrity and Referential Integrity are more pertinent to the integrity of the data in the DB. Simple CHECK's and TRIGGER's are for validity of data wrt a particular business rule.

I'll post later on Stored Procedures, Prepared Statments and simple Queries. If you guys want some other info, I'll try. Was this blog too complex/easy?

codeWord - Thoughts on Software

Friday, February 11, 2005

Databases Part I

No comments: