Pieter’s Programming Blog

I’m a freelance Java developer based in Belgium (better known as Brussels outside Europe). Here I write about my experiences based on my daily struggle with man’s most feared enemy: computers.

Blocking bad bots

Today I blocked some bad bots that were spidering some of my sites, most notably Custo, which downloads your entire site.

An interesting solution is posted here (I used the mod_rewrite option). You can test this by changing your user agent in Firefox.

This guy seems to be following bad bots.

I added Java, Nutch, Jakarta, Vagabondo and an empty bot name to the list of bad bots.
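For readers who would rather do this check in application code, the same idea can be sketched in Java, for example as the test behind a servlet filter. This is not the mod_rewrite rule from the linked solution: the class name `BadBotFilter`, the method `isBadBot` and the case-insensitive substring match are assumptions made for illustration. Only the bot names come from the list above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class BadBotFilter {
    // Bot names from the post. Matching them as case-insensitive
    // substrings of the User-Agent header is an assumption of this sketch.
    private static final List<String> BAD_BOTS =
            Arrays.asList("custo", "java", "nutch", "jakarta", "vagabondo");

    /** Returns true when a request with this User-Agent should be refused. */
    static boolean isBadBot(String userAgent) {
        // A missing or blank User-Agent counts as the "empty bot name".
        if (userAgent == null || userAgent.trim().isEmpty()) {
            return true;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String bot : BAD_BOTS) {
            if (ua.contains(bot)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isBadBot("Java/1.6.0_03"));
        System.out.println(isBadBot(""));
        System.out.println(isBadBot("Mozilla/5.0 (X11; U; Linux i686) Firefox/2.0"));
    }
}
```

A substring match on a short word like "java" is deliberately broad, so it is worth checking it against your own access logs before refusing requests with it.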

GWT: follow-up

This is a follow-up post on Why I dumped GWT.

First of all, I had a really long week + weekend and I was tired when I wrote the post. I apologize for the rather in-your-face title. I should have chosen a more subtle wording, especially since I really appreciate all the work that developers donate in their free time to open source projects.

I didn’t expect my blog post would end up as the main article on ongwt.com. I use my blog mainly to communicate with colleagues about my work. It probably has something to do with my recent switch to feedburner. Hooray for feedburner!

Like I wrote, I like GWT and its approach. It’s really nice to see how intelligently the development team approached and solved the problem at hand. The image handling (sprites), js compression and http round-trip optimisations are really clever.

I’ll start by describing how my site came to what it is now. The site started as a playground for me. I wanted to try the latest new thing (ajax!) and so I first started with scriptaculous. I didn’t succeed in getting the layout right with pure css (after all, I’m only a Java developer) and I stumbled upon GWT. The mail app demo is really nice and this gave me the idea to start a site that searches on-line marketplaces and lets users treat the classifieds as e-mail: with the possibility to delete, mark items as read/unread and star them. Much like Google Reader.

For this, GWT was the perfect match. Everything went as I expected, sometimes with some cursing about why my onclick events were not fired and why a non-existing background image in css stopped hosted mode from working, but all-in-all it was very good.

After some (positive) discussion with my other half, I wanted the work I put in it to give me some return-on-investment (money!). It turned out after some basic user testing (she sitting at the keyboard and I shouting “why would you do that?” and “that’s not meant to be used like that”) that the whole idea was too complex for a standard user (no offense to my super-intelligent girlfriend) who stumbles upon my site. So week after week I removed some of the functionality to make the page less overwhelming. Until I finally found myself using GWT only for the autocompleter, which clearly wasn’t the intention of the GWT framework. This, together with the remarks I gave in the previous post (adsense, analytics and seo) made me decide to temporarily stop developing with GWT.

I expect to start again with GWT once the site “gains some momentum”, and then I will re-enable those more complex features which should be easier than with mootools. I’ll probably ask some advice from a usability expert about how to design the page with all this functionality without overwhelming first-time users. And I will check out MyGWT and GWT-Ext more thoroughly.

Why I dumped GWT

I’ve used GWT for over half a year now on koopjeszoeker.be. Two weeks ago I decided to stop development with GWT and go with plain HTML and mootools for the autocompleter. I’ve already used mootools a lot and I’m really getting the hang of it.

Why? Why did I spend all this time developing in GWT and why did I decide to stop?

First of all, GWT is a fantastic framework for doing web development. I think it’s the best tool at the moment if you want to build the next GMail or an intranet application. For all those slow and lousy web interfaces (for timesheets, CMS, …), GWT could come to the rescue. But my site is completely different.

Some of the reasons below are not really related to GWT, but more to using ajax in general. It is my opinion however, that these problems are easier to solve with ‘standard’ javascript libraries like mootools, prototype, dwr or scriptaculous since these have a nice way to add some ajax to certain DOM elements. For example, in GWT I had to subclass the autocompleter textbox so I could attach it to an input field that already existed in the HTML. Maybe all of this could be solved if GWT had constructors that accept a DOM id too.

SEO

I’m entering a highly competitive segment where SEO is really important. Since most of the html is built with GWT, you end up with a pretty empty page for Google. I added some noscript tags, but this was not really helpful.

Adsense

Another problem was my adsense banners. Since I didn’t have a lot of content on the page, the banners were sometimes off topic. An even bigger problem was that the banners stayed the same when people searched for different keywords (since the ajax refresh didn’t trigger an adsense refresh). I solved this by doing the search with a page refresh instead of an ajax call. The ajax part of the site was limited to sorting, faceting, i18n and displaying tips.

Google Analytics

I’m also using Google Analytics. Although no real evidence exists, it would be naive to think that Google isn’t using this data. But because of the ajax calls, I don’t get as many pageviews as a static version of competing sites. Every visitor is seen as doing 1 page visit, while he may have browsed several pages. This makes my bounce rate in Google Analytics really high. This can’t be good for my Google rankings.
In Belgium we have CIM Metriweb, a kind of archaic tracking system that is used when marketeers look for sites that have many hits. I’m not currently using this, but this thing depends on pageviews if you want the big guys to donate to your site.

What now?

I wanted a fully functional HTML version, where GWT was injected in some places to replace the full page loads with ajax calls. However, I couldn’t find an easy way to do this. And once I succeeded, I found that I had almost no code left in GWT that was worth using it instead of mootools. So now, after a lot of research and experimenting, I decided that I’ll go for the plain-old html way and spice up some parts with ajax (like the “so 2007” textbox autocompleter).

I discovered the Blueprint CSS framework (version 0.7 now has semantic classes) and CSS sprites. I’ve used Kuler and read a lot about CSS tips and tricks. I even read a bit about usability.

And since I spend 3 hours a day on the train, I have time to redesign the site. Using blueprint, it really was easy and the result is a much better looking, stable, fast site. Check the homepage: it only has 1 css, 1 javascript, 1 gif and 1 jpeg, but there are 25 images! Ah, the magic of blueprint, sprites and jawr…

Update: please see GWT follow-up

Compress Javascript and CSS with Jawr

Today I used the nice Jawr taglib which compresses javascript and css files. There’s enough information on the Jawr website about how to configure everything, so I won’t write about this.

Things to remember are:

  • Better structuring / versioning of your development javascript and css versions while still publishing them as 1 file
  • Gzip support for compliant browsers
  • Give the css and js files cache headers ‘until the sun explodes’
  • When you deploy a new version of your site, a new css and js version will be downloaded by the browser
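The last two points work together: if the URL of the bundle changes whenever its content changes, the file can be cached forever and a new deployment still reaches the browser. A minimal sketch of that idea, assuming a content hash is used as the URL prefix. This is not Jawr's API or its actual implementation, and `BundleVersioner`, `bundle` and `versionedUrl` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

class BundleVersioner {
    /** Joins the separate development files into one published file. */
    static String bundle(List<String> sources) {
        return String.join("\n", sources);
    }

    /** Builds a URL whose prefix changes whenever the bundle content changes. */
    static String versionedUrl(String bundleContent, String path) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(bundleContent.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return "/" + hex + path;
        } catch (NoSuchAlgorithmException e) {
            // MD5 must be available on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Two hypothetical development stylesheets published as one file.
        String css = bundle(Arrays.asList("body { margin: 0; }", ".navbar { float: left; }"));
        System.out.println(versionedUrl(css, "/css/all.css"));
    }
}
```

Because the prefix is derived from the content, an unchanged bundle keeps its URL across deployments and stays in the browser cache.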

Net result: our YSlow score went from 49 to 69!

2 Things I want in CSS 3

I’ve done some html/css restyling lately and there are some things I would like to see added to CSS 3. The process to request some changes to be incorporated into CSS 3 is a bit overwhelming to me, so I just post them here and hope they will be picked up by someone.

CSS variables

CSS variables would be nice. I want a way so I can easily change all colors in my CSS with one adjustment, not by searching for the color in the file and replacing it with the new value. I also think this would increase readability of the file.
This would make it possible to define recurring parts of the layout in the CSS file like this:

var backgroundcolor : #FFFFFF;
var border: 1px solid #CCCCCC;

.container {
background-color: $backgroundcolor;
border: $border;
}

.navbar{
border: $border;
}

Path variables

If I could rename a path selector to a variable, I could remove a lot of classes from the html and still be able to easily change the css.
Take this example:

.navbar ul li a {
text-decoration: none;
}

.navbar ul li a:hover {
text-decoration: underline;
}

This would become:

var listItem: .navbar ul li;

$listItem a {
text-decoration: none;
}

$listItem a:hover {
text-decoration: underline;
}

These examples look simple, but my experience is that a lot of the same values can be found in many CSS files. Wouldn’t it be nice if we had a block of variable definitions at the top so we only have to specify once that the color of all borders should be changed?

This would also make it easier to let users (on a blog for example) override the colors with their own stylesheet, which overrides the variables with their settings.

The only way I know of to achieve this at the moment is by generating the CSS with a templating framework like JSP or Velocity (or why not PHP), but this seems like overkill to me.
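To show how little is needed for the two proposals above, here is a rough sketch of a preprocessor in Java that expands the `var name: value;` definitions and `$name` references used in the examples. The syntax is only the wish from this post, not real CSS, and the class `CssVariables` and its `expand` method are made up for illustration. It assumes one definition per line and that a variable is defined before it is used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CssVariables {
    // Matches a whole line such as "var border: 1px solid #CCCCCC;"
    private static final Pattern DEFINITION =
            Pattern.compile("^\\s*var\\s+(\\w+)\\s*:\\s*(.+?)\\s*;\\s*$");
    // Matches a reference such as "$border" or "$listItem"
    private static final Pattern REFERENCE = Pattern.compile("\\$(\\w+)");

    /** Removes the definition lines and substitutes every known $name. */
    static String expand(String source) {
        Map<String, String> variables = new HashMap<>();
        StringBuilder out = new StringBuilder();
        for (String line : source.split("\n")) {
            Matcher def = DEFINITION.matcher(line);
            if (def.matches()) {
                variables.put(def.group(1), def.group(2));
                continue; // definitions do not appear in the output
            }
            Matcher ref = REFERENCE.matcher(line);
            StringBuffer expanded = new StringBuffer();
            while (ref.find()) {
                String value = variables.get(ref.group(1));
                // Unknown names are left untouched.
                String replacement = (value != null) ? value : ref.group();
                ref.appendReplacement(expanded, Matcher.quoteReplacement(replacement));
            }
            ref.appendTail(expanded);
            out.append(expanded).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String css = "var border: 1px solid #CCCCCC;\n"
                + "var listItem: .navbar ul li;\n"
                + ".navbar{\nborder: $border;\n}\n"
                + "$listItem a:hover {\ntext-decoration: underline;\n}";
        System.out.print(expand(css));
    }
}
```

The same substitution covers both value variables and path variables, because a selector is just another piece of text to replace.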

So, anyone with the power to move the W3C board, go on (and let me know of the results)!

Website Performance tuning with Firebug and YSlow

Today I discovered a cool plugin for Firefox: YSlow.

In combination with Firebug, it allows you to quickly get a report about performance issues with your site, like too many css or javascript files, missing cache headers and much more.

I got a score of 66 for koopjeszoeker.be! Not much to improve there, besides switching to a CDN, but I don’t think this is something that will happen very soon (maybe with Amazon S3?).

New server almost complete

I bought (together with my brother) a new server. The old one is definitely ready for retirement: 120.000 visits, 1.600.000 pages and 50.000.000 hits in a month for pets.be (not counting frequent Google crawls, integration with SMS services and Nieuwsblad.be) was a bit too much for 1GB RAM on a hyperthreaded processor. That machine also runs some other websites and now my koopjeszoeker.be site, which definitely needs more memory and faster disks.

The investment wasn’t small, but should be worth it: 2 servers, each with 2 quad-core CPUs and 4GB RAM, all in one unit. I ordered the server on a Tuesday morning and could pick it up the same evening. 3 weeks without free time later, the server is ready to be shipped from under my bed (the noise!) to the data center. Ubuntu, Varnish, Apache 2, Tomcat, MySQL, Subversion, CVS, Firehol, … all is installed and (a little bit) tested.

Those dreaded “server busy” messages should be gone soon and koopjeszoeker.be will be ready to go out of beta! (Yay!)

Varnish

My Squid book has arrived, but it is a bit disappointing: only one chapter about reverse proxies! Frank told me to have a look at Varnish, so that’s what I’ll do!

Apparently, Varnish isn’t written in a 1975 programming style and should be much faster.

Ubuntu or CentOS or …

So, if one day I have my new dual quad-core server, what do I install on it? Fedora made maintenance on my current server a bit hard because I had to go through long steps to upgrade from one Fedora Core release to the next every 6 months (and sometimes a trip to Brussels to press the reset button when I messed up).

Ubuntu seems easy to install and has long support for the 6.06 version (till 2011).

On the other hand, CentOS seems reasonable too, since I know of some bigger companies who use it in production. I personally don’t know any companies running Ubuntu (I’m sure there are).

Does anybody have any experience with the Ubuntu server version? I already installed it on an old computer at home, which worked ok, but what about multi-core processors?

Firefox add-ons

In the series “which Firefox add-ons do you need as a web developer”, here’s my list:
- Firebug
- Web developer toolbar
- Download statusbar
