codeWord - Thoughts on Software: February 2005

Saturday, February 26, 2005

Re: More on C# and C++

The basic focus in creating C# was simplicity (not to the levels of Java, but close).

This is a very important point you made. The one time I picked up a C# book and read some fundamentals, I was really quite lost. I kept wondering where all this is in Java and whether it is really needed. The features did have merit, but life was simpler without them.

Regarding Coherent Libraries... MS keeps saying how they want to get the community involved with their technology. But it pales compared to the Java community. Putting everything in one gigantic framework just keeps bloating it up and you have to pay the size penalty regardless of whether you will use those APIs or not.

Is NUnit widely used? I think it's similar to the JUnit library.

Java is also a pretty huge framework. It is divided into Standard and Enterprise Editions, but both are still huge in themselves. Are the .NET APIs divided similarly? And the Java APIs are bloating too. So there doesn't seem to be much of a difference there.

There are, however, many widely used third-party Java libraries and apps, some even competing with the standard APIs. What do you think is the reason .NET lacks the same?

As an aside, have you guys read anything on Eiffel? I've heard that it's heavily contract oriented. As in, even pre and post conditions can be specified in code.

I just read something similar in a book recently: class invariants. Never thought of this stuff before. Could Mohn post an example?
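Until a proper Eiffel example turns up, here is a rough Python sketch of the idea. It is not Eiffel syntax, and the Account class, the ContractViolation exception and the require helper are all made up for illustration. The precondition is what the caller must guarantee, the postcondition is what the method promises back, and the class invariant must hold after every public operation.

```python
class ContractViolation(Exception):
    """Hypothetical exception raised when a contract clause fails."""


def require(condition, message):
    # Tiny helper (an assumption of this sketch, not a library call).
    if not condition:
        raise ContractViolation(message)


class Account:
    """Bank account whose class invariant is: balance is never negative."""

    def __init__(self, balance=0):
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self):
        require(self.balance >= 0, "invariant: balance must not be negative")

    def withdraw(self, amount):
        # Precondition: the caller must ask for a sensible amount.
        require(0 < amount <= self.balance, "precondition: 0 < amount <= balance")
        old_balance = self.balance
        self.balance -= amount
        # Postcondition: the balance dropped by exactly the amount requested.
        require(self.balance == old_balance - amount, "postcondition: balance updated")
        self._check_invariant()


acct = Account(100)
acct.withdraw(30)
try:
    acct.withdraw(500)      # breaks the precondition
    violation = None
except ContractViolation as exc:
    violation = str(exc)
print(acct.balance, violation)
```

In Eiffel these clauses are part of the language itself rather than hand-written checks, which is the part that sounds interesting.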

Sunday, February 20, 2005

New Scientist

The magazine "New Scientist" has a special issue focusing on India.

Wednesday, February 16, 2005

Movie recommendations

Had a great valentine's day!!

Saw two Hindi movies: one good and the other just simply amazing.

The good one is "Page 3". Based on high society life in Aamchi Mumbai.

The great movie is "Black". If you haven't seen the movie yet, get up and go. There is a reason why a lot of Indians think this movie deserves something at the Oscars. Definitely one of my all time favs and highly recommended.

Thank God Bollywood is moving away from the standard masala movies. Getting some variety of late. The degree of sleaze is also increasing alarmingly. It's funny how we Indians are hypocrites in a way. Aware of sexuality but yet conservative. That generalization is just what I think the general attitude is.

Friday, February 11, 2005

Google Maps

A few days back the Don't Be Evil company released another amazing web app... http://maps.google.com. Only works with IE and Firefox. And it subscribes to the map of the world after George Bush's second term (joke from slashdot).

Anyway, the blog world, being what it is, responded super fast. There's already a post demystifying how it works. As expected, it's a lot of DHTML, XML and XMLHttpRequest.

Databases Part I

Mohn mentioned Stored Procedures which I have been working on as part of the current DBMS module. You guys interested in the topic?

Of Course, don't even have to ask. Post it.

I was just trying to buy some time.. keep the blog rolling. The Database Module (DBMS) is almost over. And like always learnt a LOT. I'll try and cover a few topics I learnt in brief. Before the module I was a DBMS dumb-ass. Now I am a novice..

There are different models for DBMS. The one all of us generally use and see is the relational DB. These are based on Relational Algebra and Relational Calculus. In relational DBs, information is stored in tables, and various relationships exist between different tables. A table consists of rows and columns. The SQL language is based on relational math, with some additions and omissions. Most SQL queries are easy to understand. Then you can combine, nest etc. to get info from the DB.

The thing I want to focus on is Integrity and Constraints. There are different constraints which can be set up to maintain the integrity of the database. These constraints are used to make sure that data in the DB satisfies some rules.

A table can have a primary key. A primary key is used to uniquely identify a single instance within a table (i.e. a row). So no two rows can have the same primary key. Also, no row can have a NULL in the primary key. This is known as the Entity Integrity Constraint.
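A quick way to see entity integrity in action is a throwaway in-memory database. This is only a sketch: it uses Python's sqlite3 module rather than the DBMS from my module, and the Department table and its rows are just sample data. Both a duplicate key and a NULL key should be refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is spelled out because SQLite, unlike most engines, tolerates
# NULL in a non-integer PRIMARY KEY column unless told otherwise.
conn.execute(
    "CREATE TABLE Department (Dept_ID TEXT PRIMARY KEY NOT NULL, Dept_Name TEXT)"
)
conn.execute("INSERT INTO Department VALUES ('D505', 'Management Board')")
conn.commit()


def try_insert(sql):
    """Run one INSERT; return None on success or the error text on rejection."""
    try:
        conn.execute(sql)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


# Same primary key twice: rejected.
duplicate_error = try_insert("INSERT INTO Department VALUES ('D505', 'Another Board')")
# NULL primary key: rejected.
null_error = try_insert("INSERT INTO Department VALUES (NULL, 'Nameless')")

row_count = conn.execute("SELECT COUNT(*) FROM Department").fetchone()[0]
print(duplicate_error)
print(null_error)
print(row_count)
```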

As mentioned before, tables are generally related to each other. In a DB, a table has a Foreign Key to show a relationship. Consider an example with two tables, Employee and Department, with columns as shown below.

Table Employee { (PK) Emp_ID, Emp_Name, (FK) Dept_ID }
Table Department { (PK) Dept_ID, Dept_Name }

In this case an Employee belongs to a particular Department. So the FK in the Employee table is a Primary Key in the Department table. This brings us to Referential Integrity. Referential Integrity means that if a FK exists in a table then the corresponding PK has to exist in the referenced table. In simpler terms, suppose there is an Employee { E101, Scott, D505 }; then there has to be an entry in the Department table with D505 as a PK, like { D505, Management Board }. If D505 does not exist, then the DB will not allow you to enter the E101 row.

The above two constraints were simple.
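The E101 / D505 example can be tried out with a small sketch. Again this assumes Python's sqlite3 module as a stand-in for a real DBMS, and the insert_employee helper is made up for the demo. Note that SQLite only enforces foreign keys after the PRAGMA shown below; most other engines enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute(
    "CREATE TABLE Department (Dept_ID TEXT PRIMARY KEY NOT NULL, Dept_Name TEXT)"
)
conn.execute("""
    CREATE TABLE Employee (
        Emp_ID   TEXT PRIMARY KEY NOT NULL,
        Emp_Name TEXT,
        Dept_ID  TEXT REFERENCES Department (Dept_ID)
    )
""")


def insert_employee(emp_id, name, dept_id):
    """Return True if the row was accepted, False if the DB rejected it."""
    try:
        conn.execute("INSERT INTO Employee VALUES (?, ?, ?)", (emp_id, name, dept_id))
        return True
    except sqlite3.IntegrityError:
        return False


# D505 does not exist yet, so the employee row is refused.
first_try = insert_employee("E101", "Scott", "D505")

# Once the referenced department exists, the same insert goes through.
conn.execute("INSERT INTO Department VALUES ('D505', 'Management Board')")
second_try = insert_employee("E101", "Scott", "D505")

employee_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
print(first_try, second_try, employee_count)
```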

Things get a bit messier in real life scenarios. Suppose Age was one of the fields in the Employee table. Now Age could be modelled as an Integer data type. An Integer can be negative, but Age definitely cannot be. Such simple rules are enforced by using CHECKs. The definition of the table is called a Schema. This Schema is specified using the CREATE statement. PKs and FKs are declared along with the field names, and this is where CHECKs can be applied too. I'll not get into the syntax here. A simple line like CHECK (Age > 0) will enforce the integrity of the Age field.
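For the curious, a minimal sketch of the Age rule, assuming SQLite through Python's sqlite3 module (the exact CREATE syntax differs a little between engines, and the sample rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employee (
        Emp_ID   TEXT PRIMARY KEY NOT NULL,
        Emp_Name TEXT,
        Age      INTEGER CHECK (Age > 0)
    )
""")

# A sensible age is accepted.
conn.execute("INSERT INTO Employee VALUES ('E101', 'Scott', 42)")

# A negative age violates the CHECK and the row is refused.
try:
    conn.execute("INSERT INTO Employee VALUES ('E102', 'Tiger', -5)")
    negative_age_rejected = False
except sqlite3.IntegrityError:
    negative_age_rejected = True

ages = [row[0] for row in conn.execute("SELECT Age FROM Employee")]
print(negative_age_rejected, ages)
```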

And obviously things can get even more complex (read nastier) in real life scenarios. Suppose I can insert a particular entry only on the basis of checking some other tables or the table itself. This can be thought of as having a sub-query along with every INSERT. This is where TRIGGER's come into play. Consider this example.

CREATE TRIGGER reminder
ON titles
FOR INSERT, UPDATE, DELETE
AS
EXEC master..xp_sendmail 'Blah', 'Blah'
GO

Each trigger is associated with a TABLE on some event. In the example above, the TRIGGER reminder is called whenever there is an INSERT, UPDATE or DELETE on the TABLE titles, and a stored procedure is called with the EXEC statement. So based on some event some action can be taken. With respect to business rule integrity, I could ROLLBACK if some condition was not satisfied.
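To show the business-rule side, here is a sketch of a trigger that vetoes an INSERT after checking the table itself. It assumes SQLite via Python's sqlite3 module, so the syntax is not the same as the SQL Server example above: SQLite uses RAISE(ABORT, ...) inside the trigger where a SQL Server trigger would ROLLBACK. The rule itself (at most two employees per department) and the trigger name are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Emp_ID TEXT PRIMARY KEY NOT NULL, Dept_ID TEXT)")

# Hypothetical business rule: a department may hold at most two employees.
conn.execute("""
    CREATE TRIGGER dept_headcount_limit
    BEFORE INSERT ON Employee
    WHEN (SELECT COUNT(*) FROM Employee WHERE Dept_ID = NEW.Dept_ID) >= 2
    BEGIN
        SELECT RAISE(ABORT, 'department is full');
    END
""")

errors = []
for emp_id in ("E101", "E102", "E103"):
    try:
        conn.execute("INSERT INTO Employee VALUES (?, 'D505')", (emp_id,))
    except sqlite3.DatabaseError as exc:
        # The third insert trips the trigger and is undone by the database.
        errors.append((emp_id, str(exc)))

headcount = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
print(errors, headcount)
```

This is the "sub-query along with every INSERT" idea: the check lives in the database, so every client gets the same rule.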

In conclusion, Entity Integrity and Referential Integrity are more pertinent to the integrity of the data in the DB. Simple CHECKs and TRIGGERs are for validity of data with respect to a particular business rule.

I'll post later on Stored Procedures, Prepared Statements and simple Queries. If you guys want some other info, I'll try. Was this blog too complex/easy?

Tuesday, February 08, 2005

Managed Code Verifiability

Recently James Gosling (of Java fame) made some comments about .NET being unsafe.

As expected, there were rebuttals from the Microsoft camp. Here are two which I found particularly nice, in that they not only responded to his claims, but also explained the basic idea of managed code verifiability. The first one is particularly good.

Testing James Gosling's Thoughts on C++ in .NET

Huge Security Hole in Solaris and JVM

Re: Hacking Websites

The You suck message did feel nice for a couple of days.

I was surprised it showed up. I thought I took the precautions I mentioned in the article. I guess the blogger server is manipulating even that. Anyway, I finally settled on putting spaces between the tags, to display them.

Mohn mentioned Stored Procedures which I have been working on as part of the current DBMS module. You guys interested in the topic?

Of Course, don't even have to ask. Post it.

Sunday, February 06, 2005

Pure C++

Sorry if I'm beating this drum a little bit too often lately, but here's another nice intro article to C++/CLI in MSDN mag.

One of the main concerns about moving from a native, manually memory-managed environment to an automatic GC'd one is deterministic destruction, i.e. the lack of it. He mentions some stuff at the end about how they are trying to achieve similar functionality in .NET with C++. It's pretty interesting how they achieve it.

Re: Did Microsoft lose the API war ?

I googled "Google Desktop search" and the first few articles mention that it does indeed have a local web server. D'you think a web server is needed just because they are trying to present the results in the Google Web Search like interface, i.e. within the browser, using HTML?

Simply to provide one app both for Google-Desktop and Google-Web. The user does not need to shift between apps.

Do any of you know how Gmail works internally?
They have a LOT of data redundancy to counter failure.
Regarding the client. Yup, it's a lot of dhtml. Have you viewed the source?

The major reason why Google has huge amounts of redundancy is because they expect their hardware to fail. Seriously. Google do not use the most top-end hardware in the market. They use off-the-shelf x86 hardware. This helps in keeping costs down. So they can upgrade systems more regularly at a cheaper cost. Their software is built around the assumption that some node will fail. Amazing stuff. Every day the Java and, I think, .NET APIs are getting more and more complex. Layers of patterns over patterns are being built. You just need the most advanced hardware to run stuff for say 10,000 simultaneous connections. This stuff feels a bit awkward. I'll explain more later..

I haven't tried looking at the source. Dunno DHTML etc.

So, all old java code will work on the newer 1.5 jvm. New java code, developed with generics and newer types won't work on pre 1.5 jvms. This is exactly the case with .NET. I don't see any reason why you can't add NEW bytecodes. Don't change any existing bytecodes. How would there be a break in compatibility?

The only valid answer I have is with respect to the compatibility of all JVMs supporting a fixed set of bytecodes. The JVMs do not need to be re-written to work with Java 5.0. I too am not totally convinced by the rationale behind this.

Basically, you can do everything in ASP that you can with ASP.NET, but it was a mess to code, debugging was a nightmare and it was based on COM which was torture. ASP.NET has made the development experience much better. I don't know anything about the web tech in Java, but from what I've read, it seems similar to ASP i.e. complicated. Can Rahul provide some details about what's required to develop a dynamic site using Java tech? Is JSP enough or d'you need other things like JSF, Struts, Servlets, EJBs etc...?

I made a statement that ASP was lame compared to JSP. One reason was that JSP pages are compiled only once. For ASP it happens more often. Is it for every request?

Using JSP by itself is easy, but you'll end up mixing UI and business logic etc. It is possible to create MVC apps by writing some Servlets yourself, but it's better to use Struts. Struts is an API which helps you develop MVC based Web Apps. Servlets and JSP are technically the same thing. A JSP page is easier to code and is converted to a Servlet by the Web Container.

I read quite a bit on Struts lately. What I learnt is Struts is HARD, and JSP by itself is messy. In most apps you need to connect to some back-end database. You can use JDBC for connecting to a database. EJB and other stuff are used for advanced persistence of objects and advanced stuff like transactions etc. This area of persistence is where Java sucks at the moment compared to .NET.

Struts also internally uses Servlets. Basically Struts helps structuring the Web App really well. Obviously there is an overhead of performance. But that is not what concerns any one. I brushed through two books by Wrox on JSP and Struts. In addition to Struts, they advised using two additional layers of "design". One to manage Persistence and another to store the Business rules. Both these layers are to be included at the Model layer within the Struts MVC.

I have to admit that in the end the Web App was beautiful. I mean the design felt so damn right. But at the same time, it also felt too damn bloated. Too many things happening on a simple request. It's like, screw performance. So it was a funny feeling.

Now the usual questions!! How does stuff happen in the .NET world? How are ASP.NET pages structured? How easy is it to develop, and can it be done easily without an IDE? Also if you really want good design, how bloated does it get? How much can an ASP.NET dev get to know about the inner workings, under the hood?

That's when you see something like the Google apps, which really are efficient and are great at the same time. Doesn't it feel like Java and .NET are making life easier but making you pay somewhere else?

Re: Hacking Websites

I attended a talk recently at ADNUG (Austin .NET Users Group) which was supposed to be about Garbage Collection. Unfortunately, the dude who was supposed to give the talk couldn't make it in time so they had another guy present in his place. He didn't care much for Garbage Collection and instead gave a talk on hacking websites.

Blah blah...

Yes ... I am responding to the super blog by Mohn after so long that I need to provide a link to codeword itself!! That was one of the best blogs we've had.

The You suck message did feel nice for a couple of days.

Haven't come across more cracking methods though. You guys tried anything? Learnt anything new?

Mohn mentioned Stored Procedures which I have been working on as part of the current DBMS module. You guys interested in the topic?

Wednesday, February 02, 2005

Re: More on C# and C++

For the most part I agree with his points. The basic focus in creating C# was simplicity (not to the levels of Java, but close). So they tried to remove everything that would get in the way of that goal. Long back, in one of his posts he mentioned how, when they began the design, they started off giving each (potential) feature a -10 and then tried to make arguments for it to be included in the language. This meant it needed a really strong case to get in.

A couple of points where I feel I disagree with him.

Regarding Coherent Libraries... Yeah, the .NET framework includes everything but the kitchen sink. And yeah, it's very nice and convenient. But at the same time it seems to be shutting out other people from coming up with libraries of their own. MS keeps saying how they want to get the community involved with their technology. Towards that end, they have opened up quite a bit with their dev blogs. But it pales compared to the Java community. In the Java circle, a number of APIs have come out and risen to prominence (Rahul can mention them). They are not part of the standard Java API, but everyone knows about and uses them. I don't think I can even name one popular third-party .NET library. Putting everything in one gigantic framework just keeps bloating it up and you have to pay the size penalty regardless of whether you will use those APIs or not.

I think I actually read it on some Sun dev's blog that this was a big difference in view points of Java and .NET. Although, Sun seems to have this overarching hold on Java, it really is quite an open thriving community. MS needs to change their attitude and let others join in.

Regarding const... In this one I completely disagree with him. There is this "Principle of Least Privilege" where you only allow the absolute lowest possible access to get the job done. const enables you to make use of this principle in C++. You can make the contract clear that you won't mess with someone else's object. Why not do the same for C#? Yeah, C# is a "safer" language than C++ with less potential for abuse, but anything that makes the programmer's intentions clear in code can't be bad. In C# most arguments are reference types, so the callee receives a reference to the caller's object (value types are the exception, and even they can (unknowingly) end up behind a reference through boxing). So the callee could modify the object any way it wants. I really think that const should have been part of the language.

As an aside, have you guys read anything on Eiffel? I've heard that it's heavily contract oriented. As in, even pre and post conditions can be specified in code.

More on C# and C++

Part deux from Eric G. And check the comments for more discussions on why things missing in C# from C++ is good/bad.

Cross-Platform

When I wrote the last post, I was thinking of cases where you have a choice between languages, so I didn't think of cross-platform, since if you need to run on a platform where a language isn't present, that pretty much eliminates the language from consideration.

It's true that C++ is available on far more platforms, and if that's important in your case, C# probably isn't an option for you.

Templates, template metaprogramming, STL, Boost

I missed templates as an advantage that C++ currently has, as I forgot that Whidbey isn't there yet for C# programmers. When Whidbey is widespread, C# will have the majority of the features that I'd want related to generic types, though it won't be able to do as much as C++ does.

In my mind, that's (mostly) a good thing. While there are things that aren't in C# generics that I'd like, I think that, because of the indirection involved, generic types are something that are best enjoyed in moderation, as they're near the limit of what most programmers can easily understand. Which brings us to template metaprogramming. The discussions I've read on this topic list "power" and "optimization" as the big advantages of this technique, and I'd have to agree with that evaluation. But the code that I've looked at makes normal template code look simple and straightforward. So, I'm not sorry that you can't do this with C# generics.

Something I do miss is the ability to do Mixins, which would be a nice complement for a language without multiple inheritance. They would be helpful to add in system functionality without burning the base class.

STL isn't the kind of library that I like to use, as I think it's too baroque. Sure, you can do a *ton* of things with it and easily switch things around, but I've never found that I need to switch things around that often, so it's complexity that I don't use, but still have to deal with. So, for me, no thanks - I'd rather have foreach, which covers about 90% of my loops. Oh, and before I leave this topic, I should mention that the richness of data structures in STL is a lot greater than that in C#, though you should keep your eye out for C5 and PowerCollections when Whidbey shows up.

In the current C#, foreach only supports one way of iterating. About 3 years ago I wrote an article on some collection wrappers you could use to support other ways of iterating, though at some cost to performance. Unfortunately, I chose to call them "iterators", which, of course, is also the name of a C# 2.0 feature that allows you to make objects iterable more easily, and support multiple ways of iterating a collection.

Boost seems like an obvious C++ advantage, if you're working in an environment where you can use outside libraries.

Object Lifetime

There were a lot of comments around deterministic destruction, and there is certainly a big difference between the "programmer owns the allocations" and the "the GC owns the allocations" approaches.

I will admit that when I first started using C#, I missed that feeling that I had full control over what was going on in the system. But over time I found that while I did need to be concerned with scarce resources (db connections, file handles, and other system resources), I didn't really need to spend a lot of attention on memory resources. For scarce resources, "using" works well for me, and I prefer the scoping that "using" gives me over the scope-based lifetime that you get with smart pointer approaches in C++, and I also like that it's more explicit.

This does not mean that you can totally ignore the issues around object allocation in C#, as Rico has said, repeatedly.
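To make the explicit-scope idea concrete, here is a small sketch in Python, whose "with" statement plays a role comparable to C#'s "using". This is an analogy only, not C# code, and the Connection class is a hypothetical stand-in for a scarce resource. The resource is released at the end of the block even when an exception is thrown, without waiting for the garbage collector.

```python
class Connection:
    """Hypothetical scarce resource that records when it is opened and closed."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __enter__(self):
        self.log.append("open " + self.name)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs when the block ends, whether normally or via an exception.
        self.log.append("close " + self.name)
        return False  # do not swallow the exception


log = []
try:
    with Connection("db", log) as conn:
        log.append("query")
        raise RuntimeError("boom")
except RuntimeError:
    log.append("handled")

print(log)
```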

Oh, one other point on object lifetime. Having an environment where there is no automatic scope-based lifetime makes supporting exceptions much cheaper in C#, as there isn't the overhead of tracking what objects are live at any point that is required by C++ exceptions.

const

Which brings us to const. My experience with const is as follows:

When I used const in my projects, I always ran into situations where I needed a routine that was const to become non-const. That meant either changing that routine - and then updating all of the callers so that they were non-const - or creating non-const versions of existing routines where applicable. Neither of those is a particularly nice and/or fun thing to do, and after trying it for a while, my conclusion was that having const didn't give me enough benefits to make it worth the disadvantages.

I do agree that const can give some protection against the programmer doing the wrong thing (which, interestingly, is not really in keeping with the general C++ philosophy that programmers should be able to do whatever they want, even if it's wrong (yes, I'm being a bit extreme there)), but since it's merely a convention and not a guarantee (as I can cast const away whenever I want, or just use "mutable"), I don't see a lot of value.

I've talked to enough people to know that my opinion is not shared by all.

C# things I missed

There were a couple of notable things I missed from my C# list.

Events

Events are much much more useful than I had originally thought. While you can do a lot of similar things with interfaces, events are great for the sort of loosely-coupled components that I like to create. For me, events are a feature that work exactly the way I want them to.
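A rough sketch of that loose coupling, written in Python rather than C# and with every name (Event, Downloader, finished) invented for illustration. The publisher only knows it has an event to fire; it knows nothing about who is listening.

```python
class Event:
    """Minimal stand-in for a C#-style event: a list of subscribed callbacks."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def fire(self, *args):
        # Iterate over a copy so handlers may subscribe others while firing.
        for handler in list(self._handlers):
            handler(*args)


class Downloader:
    """Publisher: exposes an event and fires it when its work is done."""

    def __init__(self):
        self.finished = Event()

    def run(self, url):
        # The real work is elided; only the notification matters here.
        self.finished.fire(url)


log = []
downloader = Downloader()
downloader.finished.subscribe(lambda url: log.append("logged " + url))
downloader.finished.subscribe(lambda url: log.append("notified " + url))
downloader.run("http://example.com/file")
print(log)
```

Doing the same with an interface would force each listener to implement a type the publisher defines; with an event, any callable with the right shape will do.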

Data types

This is a big one that I missed.

In C#, there is one string type.

In C++, I'm currently dealing with code that uses:

CString (the ATL/WTL type)
LPCTSTR
LPTSTR
WCHAR*
TCHAR*
BSTR
_T("constant")
and needs to transform strings from one type to another fairly regularly. I also spend time making sure I have the right distinction between byte count and character count when dealing with such types.

Re: Demo Perils

I doubt any of us have actually given demos, considering we haven't really created something interesting enough for others to use (i.e. something like DTrace)! The closest I've come to a "demo peril" is when running my project for grading. It works fine on my box, but when I run it for the grader, it bombs.