codeWord - Thoughts on Software: April 2006

Saturday, April 29, 2006

Cool Calendar - Javascript Web Based Calendar Application

Last term, for a user interface design class, I made a web-based calendar. The calendar is about 3000 lines of Javascript. It emulates Outlook in that you can drag, drop, select and so forth. However, the calendar was built just for the user interface and has no backend, so any changes you make are held only in the client machine's memory. When you reload, they are reset. Also, it's only compatible with Firefox 1.0 and later. To see pre-populated events, go back to Dec. 2005. Try it out here. -Pawan

Monday, April 17, 2006

Phew

I have been coding in JSP for a while now and have worked with the Struts framework as well. The servlet APIs allow you to add object instances at various levels of scope within the webapp: Application scope for the entire webapp, Session scope for a user session, and Request scope for a particular request. Request scope is useful because the response to a request can be generated in parts by chaining servlets together. These methods make life pretty easy. Something that will be used throughout the webapp can be kept at application scope, and things with a narrower lifetime can be kept at the narrower scopes. For example, when a login form is submitted with correct credentials, a String can be stored in session scope.

This would always disturb me, simply because client-specific data was being stored in memory. How many users can such an architecture support? Even if only 10 bytes are stored per client, that adds up once you have, say, 10k users. Then there is the overhead of caching, maintaining the state of live data and associating it with the client.

Initially, storing stuff in scope would make me think about the size of the objects I was storing. Then I realised that frameworks like Struts store huge amounts of data at various scopes, like the action form in request scope. That made me feel this is not such an issue: if such relatively heavy objects can be stored, then that must be fine. And now with JSF etc. these objects are just getting larger.

Then I read about the PHP Share-Nothing architecture. The Share-Nothing architecture advocates... simply keeping nothing about the client in the server's memory. As simple as that. Instead of storing stuff in memory, store the details in a DB and always make DB calls. Now, I am not actually sure how performant this is. But DBs have been around for a long while and have been well tuned. Plus, DBs can easily be made to work in parallel, redundant mode and have very good features.
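To make the difference concrete, here is a minimal Java sketch of the two styles. It is only an illustration under stated assumptions: `SessionStore`, `FakeDatabaseStore`, `InMemoryServer` and `ShareNothingServer` are hypothetical names, not servlet API classes, and a shared map stands in for the real database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a database table of logged-in users.
interface SessionStore {
    void put(String sessionId, String userName);
    String get(String sessionId);
}

// A real system would issue DB calls here; a shared map plays that role.
class FakeDatabaseStore implements SessionStore {
    private final Map<String, String> rows = new ConcurrentHashMap<>();
    public void put(String sessionId, String userName) { rows.put(sessionId, userName); }
    public String get(String sessionId) { return rows.get(sessionId); }
}

// Session-scope style: each server keeps client state in its own memory.
class InMemoryServer {
    private final Map<String, String> sessions = new HashMap<>();
    void login(String sessionId, String userName) { sessions.put(sessionId, userName); }
    String whoIs(String sessionId) { return sessions.get(sessionId); }
}

// Share-nothing style: the server holds no client state between requests.
class ShareNothingServer {
    private final SessionStore store;
    ShareNothingServer(SessionStore store) { this.store = store; }
    void login(String sessionId, String userName) { store.put(sessionId, userName); }
    String whoIs(String sessionId) { return store.get(sessionId); }
}

class ShareNothingDemo {
    public static void main(String[] args) {
        // Two in-memory servers: a login on one is invisible to the other.
        InMemoryServer a = new InMemoryServer();
        InMemoryServer b = new InMemoryServer();
        a.login("s1", "alice");
        System.out.println(b.whoIs("s1")); // prints "null"

        // Two share-nothing servers behind the same store: either can answer.
        SessionStore db = new FakeDatabaseStore();
        ShareNothingServer c = new ShareNothingServer(db);
        ShareNothingServer d = new ShareNothingServer(db);
        c.login("s1", "alice");
        System.out.println(d.whoIs("s1")); // prints "alice"
    }
}
```

With the in-memory style, a request has to keep landing on the server that holds its state. With the share-nothing style, any server can take any request, which is why adding more boxes is the easier move.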

Again, I know it is premature of me to be so against in-memory state. Many large systems have been built that way, especially in the enterprise. But it just does not feel right. In the end it seems easier to scale horizontally with the Share-Nothing approach, while the in-memory approach seems to push you towards vertical scaling.

Some examples... Flickr and Yahoo are two PHP-based webapps, and webapps don't get much bigger than those. Ebay is the biggest Java-based webapp I know of. But Google for ebay architecture and you will see they too have a custom Share-Nothing-like system in place. Gmail uses Java too, but has some super optimizations in place.

Rails is one very hot webapp framework right now. This is one of the more intelligent discussions I have read. Coupled with this blog post, it makes Rails seem like something I would like to learn soon. These guys seem to be aware of both PHP and Java webapp development. Any idea what their principles are on this issue?

What do you guys think? Do any of you really stick to the Share-Nothing principle?

Friday, April 07, 2006

C# futures

It's been interesting to see the path static languages like C# and Java have taken over the past year with their 2.0 and 5.0 releases respectively. There has been a drive towards more 'staticness' with generics, i.e. more type specification at compile time. This has generally been seen as a good thing... more descriptive, more type safe and more performant (in the case of C# ;-) code. But then you have dynamic languages like Python and Ruby, which are essentially the complete opposite, with no static type specification at all. Variables carry no declared type; types belong to the values and are only checked at run time.

What's even more interesting is that, in spite of this core difference, C# has borrowed features for its 2.0 release from dynamic languages, with more on the way for 3.0.

Iterators are a well-known design pattern for traversing collections. They are a good way to loosely couple collections from the actual iterating process, so one can have multiple iterators traversing a collection in different ways. Iterators are sprinkled throughout the Java Collections Framework, and C# similarly has Enumerators with pretty much the same interface. Creating these iterators/enumerators involves writing classes which keep track of state. That's pretty much all they do... some logic to know where you are and to provide the next element. C# 2.0 introduced the yield keyword, which tells the compiler to generate an iterator that manages this state automatically. Ruby and Python both have something similar. Here's a simple example...

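using System.Collections;

class MyCollection {
private int[] myElements;

// The compiler turns this method into a state-tracking enumerator class.
public IEnumerator GetEnumerator() {
   foreach ( int i in this.myElements ) {
    yield return i;   // hand back one element, resume here on the next MoveNext()
   }
}
}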

That's it. No creating a class which implements IEnumerator. It's all generated automatically by the compiler. Huge productivity booster. It complements the foreach functionality nicely.
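For contrast, here is roughly what the hand-written version looks like in Java, which has no yield. This is just a sketch: the `MyCollection` class mirrors the C# one above, and the constructor and `IteratorDemo` class are added for illustration only.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class MyCollection implements Iterable<Integer> {
    private final int[] myElements;

    MyCollection(int... elements) {
        this.myElements = elements.clone();
    }

    @Override
    public Iterator<Integer> iterator() {
        // All of this is bookkeeping that a yield-style iterator hides.
        return new Iterator<Integer>() {
            private int position = 0; // the state we have to track by hand

            @Override
            public boolean hasNext() {
                return position < myElements.length;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return myElements[position++];
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}

class IteratorDemo {
    public static void main(String[] args) {
        // The for-each loop calls iterator(), hasNext() and next() for us.
        for (int value : new MyCollection(1, 2, 3)) {
            System.out.println(value);
        }
    }
}
```

The only interesting line is the one that returns the next element; everything else is state tracking.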

C# 3.0, which is still a ways off from being released, has a lot more in store. Probably the biggest announcement was LINQ (Language Integrated Query), which introduces new syntax within the language for working with datasets: collections, relational databases (DLINQ) and XML (XLINQ). This major feature brings with it many smaller ones, which again seem to borrow a lot from dynamic languages...

Probably, the most surprising one is Implicit Typing. You can do things like

var i = 1;
var s = "string";
var d = 1.11;
var numbers = new int[] { 0, 1, 2 };

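Surprising, since it looks like it goes against what C#/Java-type languages have been known for. The variables are still statically typed, though; the compiler just infers the type from the initializer. This feature is needed to make LINQ work, since you often can't name the final type that a query will produce.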

Another interesting one is Extension Methods. Ruby has this feature. You can add new methods to existing types without being part of any type hierarchy. You can even add methods to sealed classes like String. As an example, think about a method that checks if a string is a palindrome. Normally, one would create something like a Palindrome class with a static IsPalindrome method which takes a string... Palindrome.IsPalindrome( text ). Pretty inelegant. With extension methods, you can define a method like this

// declared inside a static class
public static bool IsPalindrome( this string text ) { ... }
And call it like this

string text = "civic";
bool palindrome = text.IsPalindrome();

It's a pretty cool feature which, again, is needed to add some LINQ functionality. But I think there is potential for abuse here, and without proper documentation it might cause some confusion.

Lambda Expressions have been available in many languages for a while. C# is finally getting this feature in 3.0. These expressions are popular when filtering datasets and as you can imagine would be an integral part of LINQ. Here's a simple example...

List<int> numbers = new List<int>();
numbers.Add( 0 );
numbers.Add( 1 );
numbers.Add( 2 );
numbers.Add( 3 );
numbers.Add( 4 );

List<int> evenNumbers = numbers.FindAll( i => ( i % 2 ) == 0 );

So FindAll() will filter the list based on the lambda expression. Syntax seems a bit strange.

A final interesting feature was Anonymous Types. Languages like Python and Ruby have this concept of a tuple which can hold multiple values. So you can have methods returning multiple values. In C# or Java this isn't possible. What many end up doing is to return an array with 2 or more values. Pretty inelegant. Or you have to actually define a type which just holds those values and return that. It's a hassle. With anonymous types you can again dynamically create types without (as the name would suggest) giving it a name...

var person = new { Name = "C Sharp", Age = 4 };
Console.WriteLine( "Name: {0}, Age: {1}", person.Name, person.Age );

Again, as you can guess this is another needed feature for LINQ.
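For comparison, the workaround described above, defining a named type just to carry a couple of values, might look like this in Java. It's only a sketch; `NameAndAge`, `HolderDemo` and `describeLanguage` are hypothetical names made up for the example.

```java
// Hypothetical holder type that exists only to carry two return values.
final class NameAndAge {
    final String name;
    final int age;

    NameAndAge(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

class HolderDemo {
    // Returns two values by wrapping them in the holder type.
    static NameAndAge describeLanguage() {
        return new NameAndAge("C Sharp", 4);
    }

    public static void main(String[] args) {
        NameAndAge person = describeLanguage();
        System.out.println("Name: " + person.name + ", Age: " + person.age);
    }
}
```

That is a whole class for what the anonymous type does in one expression.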

I haven't mentioned much about LINQ itself since I've only read a little about it and seen a video by the man, so I don't know a lot of the details myself. What would be interesting is to see all the IL that is generated to make these abstractions work.

Anyway, it's something to look out for. Maybe Java will also be including some data related features for their Dolphin release.