Sunday, July 23, 2006

midday

I've always liked Mid-Day, so I was pretty happy to find their site. But the format they use is pretty sucky: they provide a separate pdf for each page. So to read the entire paper, each page has to be downloaded on its own. Not the most convenient thing.

I had started looking at Python a bit (going from the previous post), and this seemed like a pretty good project to use it for. The script - midday - automates the process. Given the paper you want (sunday|mumbai|vashi), it downloads that paper's pdf pages to your machine. It can also use a third party app - pdftk - to combine all the pages into one pdf. pdftk is optional: if you want the combined pdf, download it and pass the location it was installed in to the script. The script still works w/o it, but the pages stay in separate pdfs. It should work on both Windows and Linux, but I've tested it only on Linux.
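To give a rough idea of the flow described above, here is a minimal sketch in current Python 3. It is not the original midday script. The URL pattern, the page_count argument and all the function names are assumptions made up for illustration, since the real site layout and the way the script finds the number of pages aren't covered here. The only external detail relied on is pdftk's "cat output" form for joining pdfs.

```python
import subprocess
import urllib.request
from pathlib import Path

EDITIONS = ("sunday", "mumbai", "vashi")

# Hypothetical URL pattern; the real site layout is not given in the post.
URL_TEMPLATE = "http://example.com/{edition}/page{page}.pdf"


def page_url(edition, page):
    """Build the (assumed) URL of one page of the chosen edition."""
    if edition not in EDITIONS:
        raise ValueError(f"unknown edition: {edition!r}")
    return URL_TEMPLATE.format(edition=edition, page=page)


def download_pages(edition, page_count, dest_dir,
                   fetch=urllib.request.urlretrieve):
    """Download pages 1..page_count into dest_dir and return their paths.

    page_count is an assumption: the caller has to know how many pages
    the paper has. fetch can be swapped out, e.g. for testing offline.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for page in range(1, page_count + 1):
        target = dest / f"{edition}-{page:02d}.pdf"
        fetch(page_url(edition, page), str(target))
        saved.append(target)
    return saved


def pdftk_command(pdftk_path, pages, output):
    """Argument list for: pdftk p1.pdf p2.pdf ... cat output combined.pdf"""
    return [str(pdftk_path), *map(str, pages), "cat", "output", str(output)]


def fetch_paper(edition, page_count, dest_dir, pdftk_path=None):
    """Download every page; merge them only if a pdftk location is given."""
    pages = download_pages(edition, page_count, dest_dir)
    if pdftk_path is None:
        return pages  # no pdftk: leave the pages as separate pdfs
    combined = Path(dest_dir) / f"{edition}.pdf"
    subprocess.run(pdftk_command(pdftk_path, pages, combined), check=True)
    return [combined]
```

Something like fetch_paper("mumbai", 20, "papers", pdftk_path="/usr/bin/pdftk") would then leave a single papers/mumbai.pdf, while leaving out pdftk_path keeps the per-page files. Passing the command as a list rather than a shell string sidesteps quoting differences between Windows and Linux.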

Overall, I liked Python quite a bit. Definitely more productive than Java... for something like this. It needs less upfront design, but the code is still structured. All the data structures mentioned in the last post came into play. Also, being interpreted, it's pretty easy to make it work on multiple platforms. The library is pretty extensive, and with how easy it is to call other programs (especially in the Linux environment, where the culture is to create small programs that can be stitched together), Python can be used for a wide range of applications.
