Code quality, part I: Why ArrayLists are a bad interface

I'll start my planned series of code-quality posts with a .NET-specific issue. I was planning on a more generic rant as a starter, but I've decided to try to blog things as they cross my mind; otherwise my list of topics will just keep growing (it's about three pages in 10-point Arial now) and I'll never post anything. So here we go!
The .NET System.Collections.ArrayList class represents a resizable collection of untyped objects (fwiw, very similar to Java's Vector). In contrast, plain .NET arrays (for example, int[] in C#) are typed but fixed in size. Now, as you're writing your application logic methods and returning lists of things, using an ArrayList is certainly a tempting alternative. Often you need an ArrayList just to construct a list-like result (e.g. when reading from a firehose DB cursor, you couldn't preallocate an array of the proper size anyway). So, since converting to a typed array is somewhat cumbersome (something like (int[])myArrayList.ToArray(typeof(int))), why should I bother? Why not just return the ArrayList?
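Here's a minimal sketch of that build-then-convert pattern. The names EmployeeQueries and ReadEmployeeIds are made up for illustration, and the IEnumerable parameter merely stands in for a forward-only database cursor; only ArrayList and its ToArray(Type) overload come from the framework.

```csharp
using System.Collections;

public static class EmployeeQueries
{
    // 'cursor' is a hypothetical stand-in for a forward-only data source
    // whose length is not known up front.
    public static int[] ReadEmployeeIds(IEnumerable cursor)
    {
        // Use the ArrayList internally, where its resizing is actually useful.
        ArrayList buffer = new ArrayList();
        foreach (object row in cursor)
        {
            // Unboxing here fails close to the source if a row is not an int.
            buffer.Add((int)row);
        }

        // Convert once at the boundary so callers see a typed array.
        return (int[])buffer.ToArray(typeof(int));
    }
}
```

The cast on the last line is the "cumbersome" part, but it is written once inside the method instead of being repeated (and possibly guessed wrong) at every call site.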
Well, the obvious reason is that "untyped containers suck". At some point, a user of the method is likely to cast the objects in the container to the wrong type (Were those values ints or longs? Did that method return Persons or Employees?), causing an exception at run time. Good use of comments, particularly XML ones, will help, but won't resolve the issue. And when you're returning objects, doing the actual conversion is easy. Thus, the rule is simple: Don't return ArrayLists. Return typed arrays. It's extremely rare for the caller to need to mutate the returned collection, and when that need arises, they can just use new ArrayList(myArray) to copy the array's contents into a newly created list of their own.
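To make the "ints or longs?" problem concrete, here's a small sketch. SalaryData and its methods are hypothetical names invented for this example; the behavior shown (unboxing a boxed long as an int throws InvalidCastException) is standard CLR behavior.

```csharp
using System;
using System.Collections;

public static class SalaryData
{
    // Untyped: nothing in the signature says the elements are longs.
    public static ArrayList GetSalariesUntyped()
    {
        ArrayList list = new ArrayList();
        list.Add(52000L);
        return list;
    }

    // Typed: the element type is part of the signature.
    public static long[] GetSalaries()
    {
        return (long[])GetSalariesUntyped().ToArray(typeof(long));
    }

    // Returns true if a caller's wrong guess about the element type fails at run time.
    public static bool WrongCastThrows()
    {
        try
        {
            // Compiles fine, but a boxed long cannot be unboxed as an int.
            int salary = (int)GetSalariesUntyped()[0];
            Console.WriteLine(salary);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

With the typed version, the same mistake (int salary = SalaryData.GetSalaries()[0];) is rejected by the compiler instead. And a caller who really wants a mutable copy can still write new ArrayList(SalaryData.GetSalaries()).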
What about when you're taking a list of something as a parameter? Again, don't use ArrayLists. Actually, I can think of only one situation where you'd want to: when you're passing an ArrayList that the method is meant to modify. Most of the time you can safely replace in-place modification by taking a read-only parameter and returning a new array instance. When you can't, you should take an IList instead of an ArrayList. If you're ready to abandon strong typing, then why not reap the benefit of allowing more flexible input? An IList parameter will happily take not just an ArrayList, but also other containers that have list semantics.
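A rough sketch of the two options, with invented names (EmployeeFilters, AppendDefaultIds, WithDefaultIds) and arbitrary "default" IDs of 1 and 2 chosen purely for illustration:

```csharp
using System.Collections;

public static class EmployeeFilters
{
    // Mutating variant: appends to the caller's list.
    // Any resizable IList implementation is accepted, not just ArrayList.
    public static void AppendDefaultIds(IList target)
    {
        target.Add(1);
        target.Add(2);
    }

    // Non-mutating variant: read-only input, new typed array as the result.
    public static int[] WithDefaultIds(int[] ids)
    {
        int[] result = new int[ids.Length + 2];
        ids.CopyTo(result, 0);
        result[ids.Length] = 1;
        result[ids.Length + 1] = 2;
        return result;
    }
}
```

One caveat worth knowing: arrays implement IList too, but they are fixed-size, so passing an int[] to AppendDefaultIds compiles and then throws NotSupportedException on Add. That's one more reason to prefer the non-mutating variant when you can.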
For read-only (usually foreach-only) lists of items, such as in a typical public decimal GetSalarySumByEmployeeIDs(ArrayList employeeIDs), typed arrays are again one of the best choices. Replace ArrayList with int[], and you just can't pass Employee objects by accident. However, if all your existing code uses ArrayLists, forcing every call site to start converting its ArrayLists into arrays may prove to be too much of a strain. In this case, use IEnumerable. It's exactly as type-(un)safe as ArrayList, but allows passing both arrays and ArrayLists (and a host of other containers).
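Here's what the IEnumerable version of that method might look like. The Payroll class and the GetSalary lookup (which just returns a made-up number) are assumptions for the sake of a runnable example; a real implementation would hit a data store.

```csharp
using System.Collections;

public static class Payroll
{
    // Hypothetical lookup with fabricated values, only here to make the sketch runnable.
    private static decimal GetSalary(int employeeId)
    {
        return 1000m * employeeId;
    }

    public static decimal GetSalarySumByEmployeeIDs(IEnumerable employeeIDs)
    {
        decimal sum = 0m;
        // foreach casts each element to int; a non-int element
        // still fails only at run time, just like with ArrayList.
        foreach (int id in employeeIDs)
        {
            sum += GetSalary(id);
        }
        return sum;
    }
}
```

Existing callers can keep passing their ArrayLists unchanged, while new code can pass an int[] directly.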
To sum it up: ArrayList is useful as an internal structure, but sucks as an interface or public method signature element. Avoid using it publicly. Whenever you can, replace ArrayLists with typed arrays. If you can't, replace them with an interface that provides sufficient functionality. If you're on Whidbey (.NET 2.0), replace read-only items (both return values and in-parameters) with IEnumerable<T> and modifiable parameters possibly with IList<T>.
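For the generics case, a sketch along these lines shows the idea. TypedPayroll and both method names are hypothetical; IEnumerable<T>, IList<T> and List<T> are the real System.Collections.Generic types.

```csharp
using System.Collections.Generic;

public static class TypedPayroll
{
    // Read-only input: any sequence of ints (int[], List<int>, ...) is accepted,
    // and the element type is checked at compile time.
    public static int CountValidIds(IEnumerable<int> employeeIDs)
    {
        int count = 0;
        foreach (int id in employeeIDs)
        {
            if (id > 0) count++;
        }
        return count;
    }

    // Modifiable input: the signature announces that the method may change the list.
    public static void AddDefaultId(IList<int> employeeIDs)
    {
        employeeIDs.Add(1);
    }
}
```

Here you get the flexibility of the interface-based approach without giving up typing: passing a list of Employee objects, or an untyped ArrayList, to either method simply doesn't compile.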

December 29, 2004 · Jouni Heikniemi
Posted in: .NET