Tuesday, November 25, 2003

Polymorphism

What is it? Why is it useful?

First consider a world WITHOUT polymorphism. This really just involves inheritance between classes. Here's a quick simple (classic) example.

// C++

class Shape
{
private:
    int m_x;
    int m_y;

public:
    Shape( int x, int y ) : m_x( x ), m_y( y ) { }

    void move( int x, int y )
    {
        this->m_x = x;
        this->m_y = y;
    }

    double area() const
    {
        return 0;
    }
};

class Circle : public Shape
{
private:
    int m_r; // radius

public:
    Circle( int x, int y, int r ) : Shape( x, y ), m_r( r ) { }

    double area() const
    {
        return ( 3.14 * this->m_r * this->m_r );
    }
};

class Rectangle : public Shape
{
private:
    int m_w; // width
    int m_h; // height

public:
    Rectangle( int x, int y, int w, int h ) : Shape( x, y ), m_w( w ), m_h( h ) { }

    double area() const
    {
        return ( this->m_w * this->m_h );
    }
};

So, pretty basic inheritance going on here. Shape is the base class. It has a move() method which changes its x and y coordinates. Shape really doesn't have any concept of area, so its area() method just returns 0.

Circle and Rectangle both inherit from Shape. This means that they both get the m_x and m_y position variables as well as the move() method.

So you could do stuff like...

Shape s( 1, 2 );
s.move( 3, 4 ); // Calls Shape's move()

Circle c( 1, 2, 3 );
c.move( 3, 4 ); // Calls Shape's move()

Rectangle r( 1, 2, 3, 4 );
r.move( 3, 4 ); // Calls Shape's move()

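Now, since both Circle and Rectangle define their own area(), each one replaces the area() it inherited from Shape (strictly speaking, in C++ a non-virtual function like this hides the base version rather than overriding it). This is a good thing, since it gives each derived class the ability to provide custom functionality.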

s.area(); // always 0

c.area(); // (3.14 * 3 * 3)

r.area(); // (3 * 4)

So far so good.

One of the ideas behind inheritance is that you can have a pointer to a base class and yet have it point to derived objects. Consider this...

Shape* p;

p = new Shape( 1, 2 );
p->area();

Pretty straightforward: this calls Shape's area().

p = new Circle( 1, 2, 3 );
p->area();

What does this do? It still calls Shape's area(). Same thing here...

p = new Rectangle( 1, 2, 3, 4 );
p->area();

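This is the big downside. It does NOT do the right thing. Why does this happen? Because area() is an ordinary (non-virtual) function, the compiler decides which area() to call at compile time, based only on the declared type of p. Since p is a Shape pointer, that is always Shape's area(), no matter what kind of object p actually points to. How is it supposed to get to Circle's or Rectangle's area()?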
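To make the problem concrete, here is a minimal sketch using trimmed-down versions of the classes above (the position members are left out for brevity, and the helper names area_via_base_pointer and check_static_binding are made up for this illustration). It shows that the same Circle gives two different answers depending on the type of the pointer used to reach it.

```cpp
#include <cassert>

// Trimmed-down Shape: no position, and area() is NOT virtual.
class Shape
{
public:
    double area() const { return 0; }
};

class Circle : public Shape
{
private:
    int m_r; // radius

public:
    explicit Circle( int r ) : m_r( r ) { }

    // Hides Shape::area(); it does not override it.
    double area() const { return ( 3.14 * m_r * m_r ); }
};

// Hypothetical helper: calls area() through a base-class pointer.
// The call is bound at compile time to Shape::area().
double area_via_base_pointer( const Shape* p )
{
    return p->area();
}

// Hypothetical check: same object, two different results.
void check_static_binding()
{
    Circle c( 3 );
    assert( c.area() > 28.0 );                    // Circle::area(), 3.14 * 3 * 3
    assert( area_via_base_pointer( &c ) == 0.0 ); // Shape::area()
}
```

Calling check_static_binding() from main() should pass both assertions: the Circle knows its own area, but the Shape pointer never finds out.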

Enter virtual functions. This is the key to polymorphism.

If you make Shape's area() virtual, the correct area() will be called. The reason is that when a class declares a virtual function, the compiler (in every mainstream implementation; the C++ standard does not mandate the mechanism) adds a hidden virtual table pointer to every object of that class and of the classes derived from it. That pointer refers to the class's virtual table, or vtable, which holds the addresses of the class's virtual functions. Not only that, but they are kept in strict order. That means that the address of Shape's area() appears in the same slot in Shape's vtable as the addresses of Circle's area() and Rectangle's area() appear in theirs. So when you run this...

Shape* p;

p = new Circle( 1, 2, 3 );
p->area();

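The compiler already knows that area() is virtual and which slot it occupies in Shape's vtable. So instead of calling Shape's area() directly, it generates code that, at runtime, follows the vtable pointer stored in the object p points to. That object is a Circle, so the pointer leads to Circle's vtable. The code reads the address in that same slot and dispatches the call there, which lands in Circle's area().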
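The slot lookup can be imitated by hand. The sketch below is NOT what any particular compiler emits; it is a rough model with made-up names (VTable, ShapeData, call_area and so on) that builds one function-pointer table per "class" and stores a pointer to the right table in each object.

```cpp
#include <cassert>

struct ShapeData;

// One slot per "virtual" function. The slot layout is the same for every class.
struct VTable
{
    double ( *area )( const ShapeData* self );
};

// Stand-in for an object: the hidden vtable pointer plus the data members.
struct ShapeData
{
    const VTable* vptr; // the pointer a compiler would add behind the scenes
    int r;              // used by the circle
    int w;              // used by the rectangle
    int h;              // used by the rectangle
};

double shape_area( const ShapeData* ) { return 0; }
double circle_area( const ShapeData* self ) { return 3.14 * self->r * self->r; }
double rectangle_area( const ShapeData* self ) { return self->w * self->h; }

// One table per class, shared by all objects of that class.
const VTable shape_vtable = { shape_area };
const VTable circle_vtable = { circle_area };
const VTable rectangle_vtable = { rectangle_area };

// Roughly what p->area() turns into: follow vptr, read the slot, call it.
double call_area( const ShapeData* p )
{
    return p->vptr->area( p );
}

// Hypothetical check: the caller only ever sees a ShapeData pointer.
void check_manual_dispatch()
{
    ShapeData c = { &circle_vtable, 3, 0, 0 };
    ShapeData r = { &rectangle_vtable, 0, 3, 4 };
    assert( call_area( &c ) > 28.0 );  // lands in circle_area
    assert( call_area( &r ) == 12.0 ); // lands in rectangle_area
}
```

call_area() never checks what kind of shape it was handed. It just reads the slot, and the table the object points to decides which function runs.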

That's how you get dynamic behavior. This comes at a price. Every object of a class that has virtual functions carries a vtable pointer, and every virtual call goes through an extra lookup at runtime (which can also keep the compiler from inlining the call). You pay a price in space as well as time, though it is usually a small one. But on the plus side, it allows you to grow your systems. You could come up with another class like Triangle, derive it from Shape, define custom behavior for area(), and it will do the right thing. This is especially useful if you have a collection of pointers to Shape objects (pointers, because a collection that stores Shape by value would slice off the derived parts). You could run through that collection and call virtual methods, and each call will do the right thing.
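Here is a minimal sketch of that idea, again with the position members left out for brevity. The Triangle class, the total_area helper, and the use of std::vector are my own choices for illustration, and the override keyword assumes a C++11 or later compiler.

```cpp
#include <cassert>
#include <vector>

class Shape
{
public:
    virtual ~Shape() { } // virtual destructor: safe to delete through a Shape*
    virtual double area() const { return 0; }
};

class Circle : public Shape
{
private:
    int m_r; // radius

public:
    explicit Circle( int r ) : m_r( r ) { }
    double area() const override { return 3.14 * m_r * m_r; }
};

class Rectangle : public Shape
{
private:
    int m_w; // width
    int m_h; // height

public:
    Rectangle( int w, int h ) : m_w( w ), m_h( h ) { }
    double area() const override { return m_w * m_h; }
};

// Added later; total_area() below does not need to change.
class Triangle : public Shape
{
private:
    int m_b; // base
    int m_h; // height

public:
    Triangle( int b, int h ) : m_b( b ), m_h( h ) { }
    double area() const override { return 0.5 * m_b * m_h; }
};

// Sums the areas through base-class pointers; each call is dispatched
// at runtime to the area() of the object's actual class.
double total_area( const std::vector<const Shape*>& shapes )
{
    double sum = 0;
    for ( const Shape* s : shapes )
    {
        sum += s->area();
    }
    return sum;
}

// Hypothetical check: 28.26 + 12 + 10 = 50.26 (up to rounding).
void check_collection()
{
    Circle c( 3 );
    Rectangle r( 3, 4 );
    Triangle t( 4, 5 );
    std::vector<const Shape*> shapes;
    shapes.push_back( &c );
    shapes.push_back( &r );
    shapes.push_back( &t );
    double total = total_area( shapes );
    assert( total > 50.2 && total < 50.3 );
}
```

The point of the design is that total_area() was written only in terms of Shape, yet it handles Triangle without being touched.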

C++, Java and C# differ in the way you get polymorphic behavior. The behavior is the same, but how you enable it is different. I'll talk about that later.
