July 28, 2004

Bookmarking is still a vague concept

I accidentally bumped into a W3C XForms mailing list posting by yours truly, sent back in January 2002 and strongly advocating the role of the HTTP GET method. I have no memory of posting anything like that, but I still agree with many of the points.

The message isn't just about GET; it also talks about the related ability to bookmark. Bookmarking is considered an essential tool for web browsing. But what is it really? Is it just storing the URL of the current frame (as most browsers do)? Or, as IE does for framesets, is it storing the frameset and the URLs of the pages within? In Mozilla and Firefox, you can also bookmark a group of tabs - essentially storing a few URLs and their order in the tabs.

But it could be so much more: it could be about storing the position in-page, it could be about storing values of form fields, state of in-page script and external applets, cursor focus... Whatever. The bookmarking concept hasn't really evolved much since the first browsers. Meanwhile web applications have become massively complex, and any navigation help is even more necessary than it was before. If I think about the Finnish railway company's Internet ticket sales system, I certainly could use a bookmark pointing to a readily filled booking form with my basic information, usual route and so on. That's the kind of "state freeze" I'd love to see there.

As a web developer, I can easily come up with dozens of good reasons why implementing a broader concept of bookmarking is very impractical in the general case. Still, if it were possible for a web page to tell (via standardized markup) which of its elements are stable or relevant enough to have their contents bookmarked, considerably better implementations would be possible. And of course it would always be possible to fall back to a simpler bookmarking mechanism, i.e. just the URL of the current frame.

Those who think browser evolution is already at its peak couldn't be more wrong. Even fundamental concepts like bookmarks need a lot of innovation and dedication. It's great to have several serious browser projects going on again.

Posted by Jouni Heikniemi at 10:11 AM | Comments (1) | Web

July 26, 2004

Firefox toolbar adding custom local attributes for bugs

Yeah, the Bugzilla team has been crazy about this already, but I have to mention it too: Vladimir Vukicevic is developing a very promising tool called Bugwrangler. It's a system that allows you to create local metadata (custom keywords, personal priorities, freetext comments) on bugs and then use that metadata to sort bugs and whatever. It's not yet testable, but Vladimir's blog entry gives an idea.

Projects like this make developing Bugzilla worth it. People really use it, and they really work on making it more usable. And yeah, there certainly is a lot to be worked on. ;-)

Posted by Jouni Heikniemi at 09:55 PM | Comments (1) | Bugzilla

July 25, 2004

Tight syntax from a readability perspective

When reviewing the code for a massive new feature for Bugzilla, the question of the proper way to write logic came up again. Do we use the ternary conditional operator ?: ? Do we use explicit boolean constants true and false instead of relying on expression results? Let me demonstrate the issues encountered with some equivalent snippets of code. Assume this is C#, but C++/Java/Perl etc. wouldn't differ much.

// Style 1: The most verbose

public bool isEvenInteger(int num) {
  if (num % 2 == 0)
    return true;
  else
    return false;
}

// Style 2: Use ?: instead of the if conditional

public bool isEvenInteger(int num) {
  return (num % 2 == 0) ? true : false;
}

// Style 3: Discard ?:

public bool isEvenInteger(int num) {
  return (num % 2 == 0);
  // C++ posse would say just "!(num % 2)"
}

So the question: Which one of these is the most readable?

For the beginning programmer, style 1 is the tempting answer. After all, in that example, the only non-English things are the % (modulo) operator and the double = signs - but once you know them, it's pretty clear-cut. On the other hand, style 2 is much more compact. It uses the ternary conditional operator, where a ? b : c is roughly equivalent to if (a) b; else c;.

Since the a part - the condition of the ?: operator - is always a boolean expression, the ?: operator is useless when all you want is to return that boolean value. In fact, ? true : false can always be removed, and ? false : true can always be replaced with a not operator. Because of this, most programmers with some experience pick style 3, which is terse and compact.
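To make that equivalence concrete, here's a minimal sketch in C#. The class and method names are made up for illustration; the point is only that the redundant ?: forms and their shortened counterparts always agree.

```csharp
using System;

public static class BooleanStyles
{
    // Redundant: the condition is already a bool.
    public static bool IsEvenVerbose(int num)
    {
        return (num % 2 == 0) ? true : false;
    }

    // Same result with "? true : false" simply removed.
    public static bool IsEven(int num)
    {
        return num % 2 == 0;
    }

    // Redundant inverse: "? false : true" only negates the condition.
    public static bool IsOddVerbose(int num)
    {
        return (num % 2 == 0) ? false : true;
    }

    // Same result using the not operator.
    public static bool IsOdd(int num)
    {
        return !(num % 2 == 0);
    }

    public static void Main()
    {
        // Compare both pairs over a small range, negatives included.
        for (int i = -3; i <= 3; ++i)
        {
            if (IsEvenVerbose(i) != IsEven(i) || IsOddVerbose(i) != IsOdd(i))
                throw new Exception("Styles disagree for " + i);
        }
        Console.WriteLine("All styles agree.");
    }
}
```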

And that marks the kickoff on the discussion of whether code is too tightly-packed (for readability) after all unnecessary parts are discarded. For the C++ form briefly noted in the example, I think it is. For the uncommented version of style 3, I'm inclined to say no. Here's some argumentation on the issue.

Tightness is readability, too (in code)

People often fear ?: because of its unnatural syntax. To some extent, this is also caused by misuse of the operator: for example, building switch trees with it almost always results in unreadable code. Also, ?: is a prime candidate for obfuscating your logic. But that's not to say it couldn't be useful: terse syntax is not necessarily an enemy of readability. Take a look at the following typical code examples:

int age = AGE_NOT_GIVEN;
if (userinput_age > 0) 
  age = userinput_age;

// vs.

int age;
if (userinput_age > 0)
  age = userinput_age;
else
  age = AGE_NOT_GIVEN;

// vs.

int age = (userinput_age > 0) ? userinput_age : AGE_NOT_GIVEN;

Upon closer inspection, any programmer easily understands any of these structures. The first of them is probably the worst, as it forces the reader to both evaluate the first (default value) assignment, then think about the if condition and the other assignment inside the conditional block. The second one makes it much more clear that age is assigned either of the two values, so it beats the first one.

But both of the first two styles have one drawback: At a glance, it's not easy to tell that age is assigned a value. The if blocks could contain whatever code, so you have to read through them to see that age is actually getting assigned. With the third form, the first 10 characters tell you age is getting assigned: thereafter it's just a question of which value it gets.

When looking at the code from a more general perspective, you're not interested in the details; you're interested in the generic flow of things ("first age gets assigned, then WibbleMyToes(age) is called, after which the return value is saved"). The more pages you have to go through, the more time it takes. So, using compact syntax for logical trivialities avoids stealing focus from more important issues. And yes, this means there is a consequent rule: Do not try to pack logic that is critical to understanding the code flow. I believe most of the bad examples of using ?: stem from breaking that rule.

That said, it's also easy to see why you shouldn't write ? true : false: It is just another way to emphasize trivialities, just like unnecessary commenting in the spirit of i++; // Increment i by one.

The conclusion

Full circle is easy to achieve here: Beginning coders mostly (have to) focus on the code at the statement level. Advanced developers have already developed the skill to read code one page at a time. Let's return to the first set of examples. While the first style is probably the most readable form for beginners, it degrades the code browsability in the long run. The last one may take some time for beginners to read, but it is precise, to the point and exactly as unnoticeable as it should be.

There will always be somebody whining about using language features such as the ?: operator. That's understandable. People are bound to complain, because the tight syntactical structures are something they can point at. If you expand all the code, the big picture may still be unreachable for them (because of the length and depth in the code), but they're more likely to blame themselves as they can't spot a single culprit for their lack of understanding. It takes quite a lot of programming experience to come up with opinions like "You're commenting too much here" or "You should tighten this syntax".

Of course, no rule is a silver bullet. If your conditions get really long or if they're particularly critical for the software's functionality, use an if block - at least the structure of an if statement leaves you much more room for useful comments.
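As a rough illustration of that trade-off, here's a sketch that extends the earlier age example with a second rule. The constants, the confirmed flag and the method names are all hypothetical. Both methods compute the same thing, but only the if version has room to explain each rule.

```csharp
using System;

public static class AgeInput
{
    public const int AgeNotGiven = -1;      // hypothetical sentinel value
    public const int MaxPlausibleAge = 130; // hypothetical upper bound

    // Packed form: correct, but the nested ?: hides two separate rules.
    public static int ParsePacked(int input, bool confirmed)
    {
        return (input > 0)
            ? ((input <= MaxPlausibleAge || confirmed) ? input : AgeNotGiven)
            : AgeNotGiven;
    }

    // Expanded form: each rule gets its own statement and its own comment.
    public static int ParseExpanded(int input, bool confirmed)
    {
        // Zero and negative values mean the field was left empty.
        if (input <= 0)
            return AgeNotGiven;

        // Implausibly high values are accepted only after explicit confirmation.
        if (input > MaxPlausibleAge && !confirmed)
            return AgeNotGiven;

        return input;
    }

    public static void Main()
    {
        int[] samples = { -5, 0, 25, 130, 131, 200 };
        foreach (int s in samples)
        {
            // Both forms must agree whether or not the user confirmed the value.
            if (ParsePacked(s, false) != ParseExpanded(s, false) ||
                ParsePacked(s, true) != ParseExpanded(s, true))
                throw new Exception("Forms disagree for " + s);
        }
        Console.WriteLine("Packed and expanded forms agree.");
    }
}
```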

Also, everything said above shouldn't fool you into thinking there is a single correct way to do things. There isn't. These are all just factors in a bigger game of selecting a coding style and being consistent with it. That's one of the more interesting challenges in developing open source software - with a developer group of insanely varying backgrounds and competence.

Posted by Jouni Heikniemi at 10:43 AM | Comments (1) | Misc. programming

July 24, 2004

Perl's map and grep on C# 2.0

My article Implementing Perl-style list operations using C# 2.0 is now public on CodeProject. Go read if you want new tools for your array/list toolbox.
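For a taste of the idea, here's a minimal sketch of what map and grep could look like with C# 2.0 generics, iterators and anonymous delegates. This is not the code from the article; the ListOps class and its method signatures are my assumptions for illustration, built on the framework's own Converter and Predicate delegate types.

```csharp
using System;
using System.Collections.Generic;

public static class ListOps
{
    // Perl's map: apply a conversion to every element, lazily.
    public static IEnumerable<TOut> Map<TIn, TOut>(
        IEnumerable<TIn> source, Converter<TIn, TOut> convert)
    {
        foreach (TIn item in source)
            yield return convert(item);
    }

    // Perl's grep: keep only the elements that pass the test, lazily.
    public static IEnumerable<T> Grep<T>(
        IEnumerable<T> source, Predicate<T> test)
    {
        foreach (T item in source)
            if (test(item))
                yield return item;
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6 };

        // Roughly: map { "#$_" } grep { $_ % 2 == 0 } @numbers
        IEnumerable<int> evens =
            Grep<int>(numbers, delegate(int n) { return n % 2 == 0; });
        IEnumerable<string> labels =
            Map<int, string>(evens, delegate(int n) { return "#" + n; });

        foreach (string label in labels)
            Console.WriteLine(label);
    }
}
```

Both methods are iterators, so nothing is evaluated until the result is actually enumerated.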

Posted by Jouni Heikniemi at 10:41 AM | Comments (5) | .net

July 23, 2004

Tools for beginners, pros and gurus

Microsoft has announced the product line overview of Visual Studio 2005. Since professional developers are already used to the full VS experience, the most interesting part is the Express product line, aimed at "beginning programmers and non-professional developers". Although the licensing conditions and final pricing remain to be published, it's rumoured that the price tag would be in the $49 - $99 range, or approximately 40 - 85 euros.

I've been testing the Visual C# 2005 Express Edition Beta, and I must say this: If the price ends up at the lower end of that rumoured range, it's going to be a hit. And it's going to hit the competition, Borland mostly. The VS line has gathered such a community behind it (take a look at CodeProject or GotDotNet just for examples) that it's going to be increasingly hard for Delphi to compete. I'm not even mentioning C#Builder here -- last I tried it, I quickly fled back to my text editor (not even VS at the time).

Even though Microsoft talks about beginners, I don't think most people are going to run out of features on VC#2005 Express even if they are quite able programmers indeed. I mean, these feature tables are just an illusion: for the most part, they don't contain rows that have "Yes Yes Yes" - elementary features such as syntax highlighting or IntelliSense are taken for granted. Now, if we compare VC#2005EE to, say, Turbo Pascal 5.5 of the early 90s, VC# is packed with features nobody ever even dreamed about back then. Yet, people created massively complex software with those tools. For most everyday programming tasks, we crossed the border where the programming IDE stopped being a hindrance quite a long time ago - it's now a question of the developer's ability to develop and handle immense abstract structures.

By no means am I trying to say there's no longer a need for tool development. VS2005 is much better than 2003, and there's still much room for improvement. But still, looking at feature charts makes you unnecessarily greedy. Those tools don't usually make you a better programmer, which is - in the end - The Thing required to successfully create and maintain pieces of non-trivial software, be it for commercial or non-commercial purposes.

The next logical step here will be providing the Express tools for free. I'm looking forward to it. But even now, I'd be ready to toss 50 of my own euros to get the dev environment VC#2005EE provides.

Posted by Jouni Heikniemi at 09:55 AM | Comments (7) | .net

July 21, 2004

Perf testing .net framework 2: generics

After installing .net framework 2.0 beta 1 a while ago, I've been wanting to perf test C# generics to see if the speed increase over untyped containers is noticeable. I'm surprised at how little effect they had in the tests, but on the positive side, the Whidbey framework seems considerably faster than 1.1 anyway.

Testing part 1 was done by compiling the following on both 2.0 and 1.1:

  class MyClass : IComparable {

    public readonly int x;
    public MyClass(int x) { this.x = x; }
    public int CompareTo(object o) {
      return this.x.CompareTo(((MyClass)o).x);
    }
  }

  static void Main(string[] args) {

    ArrayList a = new ArrayList();
    Random r = new Random();
    for (int i = 0; i < 1000000; ++i)
      a.Add(new MyClass(r.Next()));
    a.Sort();
  }

The code simply creates custom objects and sorts them. On framework 1.1, running the program took about 4.3 seconds (average of 10 repetitions). On framework 2.0, exactly the same source produced an executable with a running time of 2.7 seconds - that's a 37% improvement just by switching the framework version!

At this time, I was expecting quite a lot from generics. So I changed the code a bit:

  class MyClass : IComparable<MyClass> {

    public readonly int x;
    public MyClass(int x) { this.x = x; }
    public int CompareTo(MyClass o) {
      return this.x.CompareTo(o.x);
    }
    public bool Equals(MyClass o) { return this.x == o.x; }
  }

  static void Main(string[] args) {

    Random r = new Random();
    List<MyClass> l = new List<MyClass>();
    for (int i = 0; i < 1000000; ++i)
      l.Add(new MyClass(r.Next()));
    l.Sort();
  }

The IComparable now uses a generic typed version, and I've replaced untyped ArrayList with the generic List type. And the runtime? 2.3 seconds. That's about 15% off from the 2.0 result - I wouldn't have been surprised by even more drastic figures. I still wanted to try reading the list through, so I added the following:

    long sum = 0;
    foreach (MyClass mc in a) { sum += mc.x; }

.net framework 1.1 used about 4.6 seconds; 2.0 did it in 2.9 secs and the generic version clocked 2.5 seconds. So, reading the array through in a foreach loop didn't really make a difference - the relative speed differences were equal.

One shouldn't be too disappointed with the performance of generics, though: The speed increase from 1.1 to the generics version is a whopping 46% - the fact that even untyped containers got quite a speedup doesn't really make generics worse. Couple that with the better syntax and less risk of nasty runtime errors, and I think we'll find generics quite useful indeed.
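If you want to repeat the comparison yourself, here's a rough sketch of a harness that times both containers inside one process. It is not how the numbers above were produced (those are whole-program running times); the Item class, the SortBenchmark helper and the fixed seed are my assumptions, and Stopwatch requires framework 2.0 or later.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

// Implements both interfaces so each container uses its natural comparison path.
public class Item : IComparable, IComparable<Item>
{
    public readonly int X;
    public Item(int x) { X = x; }

    // Untyped comparison: needs a cast on every call.
    public int CompareTo(object o) { return X.CompareTo(((Item)o).X); }

    // Typed comparison: no cast needed.
    public int CompareTo(Item o) { return X.CompareTo(o.X); }
}

public static class SortBenchmark
{
    // Fills and sorts an untyped ArrayList; returns elapsed milliseconds.
    public static long TimeArrayList(int count, int seed)
    {
        Random r = new Random(seed);
        Stopwatch sw = Stopwatch.StartNew();
        ArrayList a = new ArrayList();
        for (int i = 0; i < count; ++i)
            a.Add(new Item(r.Next()));
        a.Sort();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Fills and sorts a generic List<Item>; returns elapsed milliseconds.
    public static long TimeGenericList(int count, int seed)
    {
        Random r = new Random(seed);
        Stopwatch sw = Stopwatch.StartNew();
        List<Item> l = new List<Item>();
        for (int i = 0; i < count; ++i)
            l.Add(new Item(r.Next()));
        l.Sort();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int Count = 1000000;
        const int Seed = 42; // same seed, so both runs sort identical data

        Console.WriteLine("ArrayList:  {0} ms", TimeArrayList(Count, Seed));
        Console.WriteLine("List<Item>: {0} ms", TimeGenericList(Count, Seed));
    }
}
```

A single run like this is noisy, so averaging over several repetitions, as done for the figures above, is still a good idea.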

Posted by Jouni Heikniemi at 10:04 AM | .net

July 20, 2004

MSDN Magazine 8/04 is out!

The articles for MSDN Magazine 8/2004 are out. Nothing particularly dazzling at this time, unless you're interested in SQL Server Reporting Services, Genetic Algorithms, Sharepoint, ADO.net, ASP.net and Windows Forms internals and, uh...

I guess I just summarized my main problem with MSDN Magazine: the breadth and depth of the articles in every freakin' issue is baffling. Although it's nice to have all that information sitting on your bookshelf even if you don't actively read all of it, month by month it's becoming harder - not to mention more time-consuming - to digest even a fraction of the contents. But I guess this is the way things are now; or, as Joel on Software puts it: "No developer with a day job has time to keep up with all the new development tools coming out of Redmond, if only because there are too many dang employees at Microsoft making development tools!"

Posted by Jouni Heikniemi at 09:23 PM | .net

... and the warm welcomes ...

So what's going on here? It's Jouni's blog about IT stuff, mostly coding-related. The history of this content stream actually dates back pretty far; I've published my writings more or less regularly for several years now. Since my other blog (Lakiblogi, in Finnish only) became focused on law issues only, I was left without an easy publishing channel for this sort of thing. The apparently growing role of IT and coding in my life is another reason for doing this now.

A short introduction is in order here: I'm Jouni Heikniemi from Finland, 25 years old at the moment of writing, and working for 1.5 more weeks as the New Media Manager for MikroBitti Magazine. Starting from the 1st of August, I'll be working at Blue Meteorite Ltd. as a software architect. At heart I'm a .net man, but we all have dark sides in our character: I'm also one of the core developers of Bugzilla, the open source bug tracking solution (and damn, it's written in Perl!). Outside of the IT world, I'm a law and CS student at the University of Helsinki.

I'm not the guy who lives his life on the blog, so you're not supposed to expect a steady flow of postings about my vacations, relatives and life in general. I write when I have something to say, and usually for the purpose of conveying some tidbit of information. Oh, and why English instead of Finnish? Well, why not - everybody in the IT field will be sufficiently adept in English anyway, and this is one way to make the archives useful for a wider audience. Besides, it's good practice.

Without further ado, welcome!

Posted by Jouni Heikniemi at 08:56 AM | General