Creating new terminology
Many programmers seem to be afraid of creating new terminology and concepts. This certainly has some positive consequences as well – the world has enough confusion already :-) But when wielded properly, the possibility of introducing new concepts is a powerful weapon of simplification. In this article, I'll explore the concept a bit further.
The example
You're creating an application with some built-in user management features. You want different accounts to have different permission sets. It's becoming obvious that you commonly check for a certain pattern of permissions, namely
At that point, it feels pretty natural to name the phenomenon of having group 0 permissions. If you have them, you're an administrator (or superuser or root or whatever suits you). The condition is now hidden under a common name. You can create something called
This example feels natural to most of us. I believe this is because the notion of an administrator is common enough to strike a chord even without prior knowledge of the application's domain. The true challenge lies in creating concepts that the reader cannot know or understand beforehand.
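As a rough illustration of hiding the condition behind a name, here is a minimal Python sketch. The `Account` type, the `group` field, `ADMIN_GROUP` and `can_delete_users` are hypothetical names chosen for this example; the only thing taken from the text is that "group 0 permissions" means administrator.

```python
from dataclasses import dataclass

# Assumption: group 0 is the group that carries administrative permissions.
ADMIN_GROUP = 0


@dataclass
class Account:
    name: str
    group: int

    @property
    def is_administrator(self) -> bool:
        """True when the account has group 0 permissions."""
        return self.group == ADMIN_GROUP


def can_delete_users(account: Account) -> bool:
    # The call site reads in domain terms instead of repeating the raw check.
    return account.is_administrator
```

If the rule for what makes an administrator ever changes, only the property body needs to follow; the callers keep using the name.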
The practice
Most applications operate in a complex domain where many objects are typically identified by a set of conditions, not just their object identity ("Customer #1234"). Much of this can be easily expressed with proper prefixes. For example, a customer's "LastOrder" is often defined by a lookup of orders with a certain sort criterion, a "MailingAddress" is a lookup in the Addresses collection with the requirement of "Address.CanSendMail = true" (or whatever) and so on.
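The two lookups could be sketched like this in Python. Everything here is an assumption made for illustration: the `Order`, `Address` and `Customer` types, their fields, and the choice to sort orders by creation time and to take the first mailable address.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Order:
    number: int
    created: datetime


@dataclass
class Address:
    street: str
    can_send_mail: bool


@dataclass
class Customer:
    orders: List[Order] = field(default_factory=list)
    addresses: List[Address] = field(default_factory=list)

    @property
    def last_order(self) -> Optional[Order]:
        # "Last" is taken to mean "most recently created".
        return max(self.orders, key=lambda o: o.created, default=None)

    @property
    def mailing_address(self) -> Optional[Address]:
        # First address that accepts mail, or None if there is none.
        return next((a for a in self.addresses if a.can_send_mail), None)
```

Both names hide a condition (a sort order, a filter) behind a term the reader already knows from everyday language.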
It helps a lot if you can use terminology that maps logically to natural language. The previous examples do this nicely. However, it gets trickier when you have to introduce terms that are ambiguous or unknown by nature. Terms such as "primary", "preferred", "valid" and "external" are often useful but are relative by nature. For example, it would be valid to ask "external to what?" or "valid by which rules?" On the other hand, words like "last" are commonly interpreted pretty uniformly – in this case, the vast majority of people would consider "last" to mean "the most recently created".
Still, creating new terminology based on relative attributes is often necessary – it just requires some special attention. For example, if you're creating a web application that employs complex URLs without fixed semantics (i.e. a request to a URL can return various results based on the application state), you might want to create the notion of a "stabilized URI", meaning an address that can be used to return to this page even without the state information (such as cookies). Nobody has ever heard of a "stabilized URI", and understanding the word "stable" in this context pretty much requires understanding its negation, i.e. the dynamic nature of the application's normal URI composition.
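One way such a concept might surface in code is a single, documented function that carries the term. The sketch below is an assumption, not a description of any real application: it supposes that the state normally kept in cookies can be expressed as key/value pairs and appended to the query string. The function name and parameters are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def stabilized_uri(uri: str, state: dict) -> str:
    """Return a "stabilized URI" for the given page.

    A stabilized URI is an address that leads back to the same page
    even when no state information (such as cookies) is available.
    Assumption: the relevant state fits into query parameters.
    """
    parts = urlsplit(uri)
    # Keep the existing query parameters and append the state in a fixed order.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(sorted(state.items()))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Called as `stabilized_uri("http://example.com/report?page=2", {"view": "summary", "lang": "en"})`, this yields `http://example.com/report?page=2&lang=en&view=summary`. The docstring doubles as the definition of the new term, so a reader meets the explanation at the same place as the name.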
The rules
Looking at the previous example, you need to weigh in a few factors before making the call. Here's a stab at establishing some common criteria for picking the terminology.
First, if you didn't call it a "stabilized URI", what would you call it? How would that term work in natural language ("A URI that can be used without previously set cookies")? How about APIs (GetUriThatWorksWithoutCookies)? Would your planned replacement be considerably easier?
Second, would your new concept conflict with existing natural terminology used in the field? For example, calling our stable URI an "absolute URI" would be a bad idea due to the commonly used distinction between absolute and relative URIs – some words are already "reserved" or at least excessively burdened by previous associations.
Third, does the new terminology scale well? Concepts you create often turn out to be long-lived, perhaps even more so than the system (as they tend to get used in the specification for the next system). Does the term have a sufficiently stable and universal definition so that it doesn't become misleading after a few rounds of refactoring? Of course, unless you're a psychic, you'll have to make an educated guess here.
Fourth, do you have the means to sufficiently document and broadcast the meaning of your new terminology? If you can significantly reduce ambiguity and complexity by creating the new term, you'll probably want to do it. But if you can't (or won't) document the new concept you've created, you should abandon the idea. Even a clumsy vocabulary beats an unclear one. Also, you will have to be able to make the use of the new term relatively uniform and ubiquitous – if you can't, it's often better not to mess things up by adding new concepts alongside the old ones.
Summary
It's not inherently bad to require the reader to learn some terminology before understanding the code. We do this all the time. Think of words like "assembly" in .NET, "expression" in mathematics, "bag" in C++/Java and so on. Still, you don't want to raise the language barrier artificially, so you'll have to strike a balance. Totally abstaining from new terminology often turns your source code into a quagmire of complex expressions, thus removing the human advantage of remembering and combining concepts instead of their building blocks (often called "abstraction").
I feel many programmers do not make these decisions consciously. The gut reaction often involves initially opting for not-naming, but with growing pains later on, they then create an ad hoc name for some specific context. At this point, propagation often fails and mixed terminology throughout the project is born. I hope this article has been able to provide some tools for thinking things through at the point where the changes are still relatively straightforward to make.
December 13, 2006
Jouni Heikniemi