TechDays Finland 2012: my talk on unit testing fundamentals

TechDays Finland 2012 is going on at full speed. I was lucky to be talking in the first slot right after the keynote, so that the rest of the event is way less stressful for me :-)

Update 2012-03-29: The video link was added.

The venue: Helsinki Fair Centre

As with a few previous TechDays events, the venue is Helsinki Fair Centre. For the past years, TechDays has occupied the conference wing. This year, it spans quite a bit of the actual fair halls as well. Hugely more space – no more irritatingly crowded corridors.

Also, bigger rooms for sessions! Most sessions have fit into their rooms, and a couple have been handled by adding an overflow room with video/audio streaming. This thing is getting quite professional, indeed!

With the first day soon behind us, a second one still lies ahead. Quite a few presentations going on – there are a dozen simultaneous tracks! And with 2800+ peeps aboard, it’s definitely the biggest Microsoft pro event in Finland so far.

My talk: Unit testing

The title for my talk was “Picking up unit testing” (a liberal translation; I did give the talk in Finnish). Instead of going through the basics of unit testing techniques (setups, assertions, fixtures, …), I took a different road: I talked about the politics, the goals and the various logical approaches to unit testing. To the right: The beginnings of the crowd as I saw it.

Here’s a shortened list of the items I discussed:

  • How to talk your boss into unit testing?
  • What to expect from unit testing?
  • How much additional time do I need to spend – and when?
  • What are the horrible dependencies and why do I need to tackle them?
  • What kind of project/module should I start unit testing with?
  • Should I go with easiest-first, toughest-first or maybe just test when fixing bugs?
  • Black box vs. white box
  • The pitfalls of code coverage

The slides are here, and a video recording is available on YouTube. Also, I referenced two of my blog posts on code review – a process I heartily recommend also for unit test code:

http://bit.ly/katselmointi1
http://bit.ly/katselmointi2

Thanks to everyone who attended – and enjoy your TechDays!

March 8, 2012 · Jouni Heikniemi · 6 Comments
Posted in: Misc. programming

My talk on ASP.NET and modern web development

On 18th January I was speaking in an HTML5 seminar arranged by Microsoft Finland (agenda in Finnish). Although my presentation was in Finnish, you can find a short link-annotated recap of the presentation – and links to the material – here.

The material (partly in Finnish)

I also demoed WebSockets and SignalR using Microsoft’s demos from the //build conference. Paul Batum has a detailed blog post on those demos. The demos themselves are available on GitHub.

The presentation was recorded and will be available on YouTube. I’ll update this blog post with the link once the video is out.

A short recap of the presentation

I started by reminding the audience that ASP.NET is no longer equal to Web Forms, and the pain points regarding web forms are largely no more. After that, I went on to discuss the fact that HTML5 and CSS3 aren’t really server-side features, and no server-side feature can greatly affect the effort of building HTML5 apps.

With that said, it was time for the first demos. I pulled up the Demo #1 (Html5DemoApp2.sln), containing a customer info form. I pointed out that MVC3, using client-side validation, already uses HTML5 data attributes to convey the validation criteria and messages. To enhance the HTML5 experience, I pulled HTML5 Editor templates from NuGet, enhancing the form with HTML5 input types. This was demoed with Firefox validating the email field properly, as well as Opera rendering the birthday field as a datepicker.

Moving on, I started talking about passing data from the server to JavaScript – a topic generally considered surprisingly difficult. I first showed Demo #2 (Html5DemoApp3.sln) with a simple ASP.NET action method returning a JSON representation of three customers, showing the JavaScript needed to render this data to the page. Then, I took a quick dive into OData, and showed how to develop a customers-returning API.

The last segment of the session was all about fast communication between the client and the server. I showed a few slides presenting polling, long polling and finally, WebSockets. I then went on – with some mandatory demo effects! – to show how to build a simple chat application using IIS 8 and Visual Studio 11. Further on, I discussed how the abstraction level of websocket communications can be raised by leveraging the SignalR library.

For the conclusion, I reminded the audience that Microsoft developers can no longer afford to live in their private closed confines. Microsoft has opened up by supporting open source endeavors and open protocols, and developers should follow suit. Building HTML5 applications with .NET is in fact very easy – as far as the server platform can help you, the Redmondian stack pretty much does it.

January 20, 2012 · Jouni Heikniemi · No Comments
Posted in: Web

Valio.fi deep dive #8: Resources and ORM

Now we’re standing at the edge of the code pool. Let’s dive in! Good background reading: Deep dive #4 on ORM choice, deep dive #7 on database schema.

The resource model

As I described in my previous post, we ended up storing all different types of resources in one table. However, we model each of them as a separate class (a subclass of Resource). Even though resources themselves don’t have too many fields to separate them from one another, we have lots of code specific for each resource type. Thus, strong typing of resources improves intellisense experience and type safety. Mapping-wise, we use the Type column as the discriminator.

When the project was born, NHibernate 3 wasn’t out yet, so we used NH 2.x + FluentNHibernate – thus, the following mapping examples are in the fluent syntax instead of the canonical XML.

Before we look at that, let’s recap what we’re trying to achieve:

We have a “data” column of type xml. That XML contains two distinct elements of data (simplified but sufficient truth): First, the widgets structure, which is really common for all types of resources, and second, the fields specific to each resource type.

This duality causes us some headache. When mapping an XML column in NHibernate, you have to map it as a custom type. A user type mapper does not support splitting a table column into several properties. It does allow us to serialize and deserialize a complex object, though.

Getting the containment right

Since our data column contained structure specific to a resource type, we had to map the data xml column in Resource’s subclasses. Enabling the widget data (common to all Resources) to be accessed on the Resource base class level only required a little class design.

Our Resource class contains a property of type ResourceData which contains an IList<WidgetConfiguration>. ResourceData is in fact abstract: each of the Resource’s subtypes also defines a ResourceData derivative that matches its own set of custom fields. For example, a RecipePage.CustomData class adds a RecipeId field.
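The containment described above could be sketched roughly like this. It is a simplified reconstruction rather than the actual Valio.fi code: the members of WidgetConfiguration and the Url property are assumptions made for illustration, and the IEquatable plumbing required by the mapper is left out.

```csharp
using System;
using System.Collections.Generic;

// Placeholder for a widget's settings; these two members are assumptions.
public class WidgetConfiguration
{
    public string Zone { get; set; }
    public string WidgetType { get; set; }
}

// Data shared by every resource type, including the widget list.
public abstract class ResourceData
{
    protected ResourceData()
    {
        WidgetConfigurations = new List<WidgetConfiguration>();
    }

    public string RedirectUrl { get; set; }
    public IList<WidgetConfiguration> WidgetConfigurations { get; set; }
}

// Members are virtual so that NHibernate can create proxies for the class.
public class Resource
{
    public virtual string Url { get; set; }
    public virtual ResourceData CustomResourceData { get; set; }
}

public class RecipePage : Resource
{
    // The ResourceData derivative is nested in the resource type it belongs to.
    public class CustomData : ResourceData
    {
        public Guid RecipeId { get; set; }
    }

    public RecipePage()
    {
        CustomResourceData = new CustomData();
    }
}
```

Code that only needs the widgets can work with any Resource through CustomResourceData.WidgetConfigurations, without knowing the concrete resource type.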

At the other end, the subclasses of Resource introduce fields driven by their custom data. For example, RecipePage exposes:

public virtual Guid RecipeId { 
  get { return ((CustomData)CustomResourceData).RecipeId; } 
  set { ((CustomData)CustomResourceData).RecipeId = value; } 
}

The typecast is somewhat ugly and expects that all Resource.CustomResourceData instances must be in sync with the equivalent Resource subtypes (a RecipePage always expects a RecipePage.CustomData). Of course, this isn’t much of an issue in practice.

We could have added some strong typing by making Resource a public class Resource<TCustomData> where TCustomData : ResourceData, and then have a “public virtual TCustomData CustomResourceData” on it. That way, a RecipePage would have inherited from Resource<RecipePage.CustomData> and thus gained a strongly-typed handle on its custom data.

The main reason we didn’t do this was because it would have required us to introduce an IResource or a ResourceBase for all those scenarios where we didn’t care about the resource type – the vast majority of all resource-based code. In retrospect, this would probably be a good idea. It was just too risky to implement at that stage of the project. Still, it’s no more than a small blemish – the typecasts really don’t affect your everyday programming at all, and modifying the existing resource types is pretty uncommon.
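For comparison, here is a minimal sketch of the generic alternative we did not build. The IResource interface and its members are hypothetical; they only illustrate what the type-agnostic code would have needed.

```csharp
using System;

public abstract class ResourceData { }

// Hypothetical non-generic view for code that does not care about the resource type.
public interface IResource
{
    string Url { get; }
    ResourceData CustomResourceData { get; }
}

public abstract class Resource<TCustomData> : IResource
    where TCustomData : ResourceData, new()
{
    protected Resource()
    {
        CustomResourceData = new TCustomData();
    }

    public virtual string Url { get; set; }
    public virtual TCustomData CustomResourceData { get; set; }

    // Explicit implementation exposes the same object through the untyped interface.
    ResourceData IResource.CustomResourceData
    {
        get { return CustomResourceData; }
    }
}

public class RecipePage : Resource<RecipePage.CustomData>
{
    public class CustomData : ResourceData
    {
        public Guid RecipeId { get; set; }
    }

    // The typecast from the current implementation is no longer needed.
    public virtual Guid RecipeId
    {
        get { return CustomResourceData.RecipeId; }
        set { CustomResourceData.RecipeId = value; }
    }
}
```

The price is visible in the sketch: every piece of code that handles resources in general would have to be written against IResource instead of Resource.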

Subclass mapping and other fun

To load the resources this way, we then had to implement quite a few NHibernate mappings. We had a mapping for each of the resource type classes:

public class ProductPageMap : ResourceSubclassMap<ProductPage, ProductPage.CustomData>
{
    public ProductPageMap() : base("Product") {}
}

Only a few parts change from one mapping to the next: the mapping class name (not interpreted, just a convention), the resource class type argument, the custom data type argument and the discriminator value passed to the base constructor. To support such a (relatively) terse declaration, we had implemented the ResourceSubclassMap helper:

public class ResourceSubclassMap<TResource, TCustomDataType> : SubclassMap<TResource>
	where TResource : Resource
	where TCustomDataType : ResourceData, IEquatable<TCustomDataType>, new()
{
	public ResourceSubclassMap(string discriminatorValue)
	{
		Extends(typeof(Resource));
		DiscriminatorValue(discriminatorValue);
		Map(r => r.CustomResourceData).Column("data").CustomType(typeof(ResourceDataMapper<TCustomDataType>));
	}
}

To sum it up, wiring up mappings of this kind was actually very simple. Also, due to the nice extensibility of FluentNHibernate's mapping syntax, we could even easily throw in some calculated fields. For example, since we commonly accessed the RecipeId property of the RecipePage class (a member stored in the data xml for that resource type), we declared a public Guid RecipeId and specified a mapping like this:

public class RecipePageMap : ResourceSubclassMap<RecipePage, RecipePage.CustomData>
{
	public RecipePageMap() : base("Recipe")
	{
		Map(ap => ap.RecipeId)
                  .Formula("cast(data.query('/data/recipeId/node()') as nvarchar(max))")
                  .ReadOnly();
	}
}

Why bother, you ask? Since our RecipePage objects have a strongly typed notion of RecipePage.CustomData, why not just access that?

This is one of the places where we can actually deliver some performance improvements by tweaking the query model. Defining the formula of the XML access – the data.query syntax is XQuery that gets executed inside SQL Server – enables us to run NHibernate HQL queries on that particular XML data fragment. A condition of “RecipeId = foo” gets translated into SQL, and SQL Server does a good job of optimizing the XML field queries.

Without this trick, we would have to load all the RecipePages into memory and filter the list there; of course, for most scenarios they’re all cached anyway. Still, using the formula enabled us to skip caching on scenarios that would have otherwise been far too slow and still required real-time data with no cache impact surprises.

ResourceDataMapper<T>, then?

And finally, it all winds down to ResourceDataMapper<T>, referenced in the ResourceSubclassMap constructor. How does that construct the data objects? Well, that’s easy, because we simplified quite a few things. ResourceDataMapper is just an NHibernate IUserType implementation, essentially meaning that it has a converter from the database format (in this case, xml) to the complex user type (in this case, the CustomData type). Skipping some boilerplate, here’s the meat:

public class ResourceDataMapper<T> : XDocumentCompositeField<T>
	where T : ResourceData, IEquatable<T>, new()
{
	public override T XmlToComposite(XDocument data)
	{
		var result = new T {
			RedirectUrl = data.XPathSelectElement("/data/redirectUrl").NullSafeSelect(e=>e.Value),
			AvailableWidgetZones = WidgetConfiguration.GetZones(data.XPathSelectElement("/data/zones")),
			WidgetConfigurations = WidgetConfiguration.ParseWidgetConfigurations(data.XPathSelectElement("/data/zones"))
		};
		result.ParseXml(data.XPathSelectElement("data"));
		return result;
	}

The most important things here are:

  1. ResourceDataMapper takes a type argument – the custom data type – and requires that it’s constructible (the new() constraint) and that it inherits from ResourceData. It also requires IEquatable<T>, but that’s just to satisfy the C# compiler’s insistence on making sure we’re following the IUserType contract.
  2. Those assumptions are then relied upon: a new T – for example, a RecipePage.CustomData – is constructed. Next, its common fields, i.e. the ones in every ResourceData object such as the widget list and the redirect url, get populated. Finally, an abstract method called ParseXml is called.
  3. And no surprise here, each of the CustomData implementations then has a ParseXml implementation of its own. An example follows.

internal override void ParseXml(System.Xml.Linq.XElement xmlSource)
{
	var configuredRecipeId = xmlSource.Descendants("recipeId").FirstOrDefault().NullSafeSelect(n => n.Value.TryParseGuid());

	if (configuredRecipeId.HasValue)
	{
		RecipeId = configuredRecipeId.Value;
	}
}

We also have all of this backwards, i.e. each CustomData has a SerializeToXml() method that allows us to save the changes in the objects. A ResourceDataMapper then calls this method and NHibernate gets its precious XML to be stored in the database. Not particularly complicated once you have it written up!
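To illustrate the round trip, here is a self-contained sketch of a custom data class that can both parse and serialize its recipe reference. The class name RecipeCustomData and the exact shape of the XML are assumptions; the real implementation derives from ResourceData and uses our own helper extensions instead of Guid.TryParse.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public class RecipeCustomData
{
    public Guid? RecipeId { get; set; }

    // Reads the recipe reference; leaves it empty if the element is missing or malformed.
    public void ParseXml(XElement xmlSource)
    {
        var element = xmlSource.Descendants("recipeId").FirstOrDefault();
        Guid parsed;
        RecipeId = element != null && Guid.TryParse(element.Value, out parsed)
            ? parsed
            : (Guid?)null;
    }

    // Writes the same structure back so that ParseXml can read it again.
    public XElement SerializeToXml()
    {
        var root = new XElement("data");
        if (RecipeId.HasValue)
        {
            root.Add(new XElement("recipeId", RecipeId.Value.ToString()));
        }
        return root;
    }
}
```

Keeping the two methods next to each other in the same class makes it easy to verify that whatever SerializeToXml writes is something ParseXml understands.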

Widgets come next

I already made a passing mention of widgets getting parsed from the XML above. Yep, they do get pulled out by the ResourceDataMapper. But after that, they are brought to life and rendered. In the next episode, I’ll cover the widgetry. It’ll be just another example of the same paradigm we’re using here, but with a bit more actual logic (rendering, editors and whatnot). Until next time!

January 10, 2012 · Jouni Heikniemi · One Comment
Posted in: .NET, Web

Upcoming: SANKO event on ADM code modeling & generation

The Finnish .NET User Group SANKO hasn’t been particularly active lately, much due to the busy schedules of potential speakers. But we’re back on a roll: on December 14th at 15:00-17:00 Finnish time, we’ll be having a session on an application modeling methodology called ADM.

ADM is the brainchild of Finnish senior developer Kalle Launiala, who currently works as the CTO of Citrus Oy. He has been blogging about ADM extensively at abstractiondev.wordpress.com. In the SANKO session Kalle will take the stage and define ADM, explain his ideas, ADM’s use of Visual Studio’s T4 template mechanism and the impact on software development. As usual, there will be time allocated for discussion.

The event takes place in Microsoft Finland’s auditorium in Espoo. The presentations and the discussion will be in Finnish. If you’re interested, register here with the invitation code “E023B1”. Welcome!

Oh, and of course you can also follow SANKO on Facebook or join our LinkedIn group.

November 24, 2011 · Jouni Heikniemi · No Comments
Posted in: Misc. programming

Valio.fi deep dive #7: The resource data storage model

After the previous post on how our CMS works on a concept level, it’s time to explain the technical details. Note that remembering the previous post’s concepts is pretty much mandatory for understanding this discussion. Oh, and this gets quite detailed; unless you’re designing a content management platform, you probably don’t care enough to read it all. That’s ok. :-)

The resource table

[Image: the Resource table]

So, this is our Resource table in its entirety (at one stage of the development, but recent enough for this discussion). Many of the fields are not really interesting. In particular, there are a bunch of fields concerning ratings, view counts, likes and comments, which are cached (technically trigger-updated) summaries of other tables storing the actual user-level interaction. Others are simple mechanics like creation/modification data and publication control.

The interesting bits are:

  • Url is what maps a given incoming request to a resource. A typical example is “/reseptit/appelsiinipasha”, a reference to a particular recipe.
  • Type is a string identifying the resource type, e.g. “Recipe” or “Article”.
  • Layout is a string that describes a specific layout version to use. Typically null, but for campaign articles and other special content, we could switch to an alternate set of MVC views with this.
  • Data is an XML field that contains quite a lot of information. We’ll get back to this later.

This is the key structure for managing resources. There’s very little else. One more table deserves to be mentioned at this stage: ResourceLink. A resource link specifies (unsurprisingly) a link between two resources. The exact semantics of that link vary, as represented by the html-ish rel column.

Typical rels are “draft” (points from a published resource to its draft), “tag” (points from a resource to its tags – yes, tags are resources too) and “seeAlso” (a manually created link between two related resources).
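As a rough illustration, resource links can be thought of as simple (from, to, rel) triples. The property names below are assumptions, not the actual column names, and the lookup works on an in-memory list instead of the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ResourceLink
{
    public Guid FromResourceId { get; set; }
    public Guid ToResourceId { get; set; }
    public string Rel { get; set; }   // e.g. "draft", "tag" or "seeAlso"
}

public static class ResourceLinkQueries
{
    // Returns the ids of all resources that 'from' points to with the given rel.
    public static IEnumerable<Guid> Targets(IEnumerable<ResourceLink> links, Guid from, string rel)
    {
        return links
            .Where(l => l.FromResourceId == from && l.Rel == rel)
            .Select(l => l.ToResourceId);
    }
}
```

With this shape, finding the draft of a published page and listing its tags are the same query with a different rel value.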

Modeling the resource types

The resource types – Recipe, Article, Product, Home, … – are a key part of the equation, because the resource type specifies the controller used for page execution. It also specifies the view, although we can make exceptions using the Layout property discussed above.

Storage-wise, how do these resource types differ? Actually, the differences are rather small. They all have very different controllers and views, but the actual data structures are remarkably similar. For example, a product resource has a reference to an actual Product business object (stored within a Product table and a few dozen subtables), but that’s only one guid. Some resource types have a handful of custom fields, but most only have one or two.

We originally considered a table per subclass strategy, but that would have yielded ridiculously many tables for a very simple purpose. Thus we decided to go for a single table for all resources, with the Type column as the discriminator. However, the option of dumping all the subclass-specific columns as nullables on the Resource would have yielded a very ugly table (consider what we have now + 50-something fields more).

XML to the rescue

Enter the Data xml column. “Ugh”, you say, “you guys actually discarded referential integrity and hid your foreign keys into an XML blob?”. Yeah! Mind you, we actually even considered a NoSQL implementation, so a relational database with only partial referential integrity was by no means an extreme option.

Let’s compare the arguments for and against the XML column.

Pro-XML:

  • The relational representation would have involved really many columns, most of them null on any given row. We didn’t want that mess.
  • We would have been changing the table’s schema all the time, as finding the final set of variables for all types took quite a while.
  • XML enables resource type templates to be stored (mostly) as a single XML document; contrast with the verbiage needed to declaratively describe and instantiate a database row with close to 100 fields.
  • Since some of the resource types required multi-value properties, we couldn’t have gone cleanly with one table anyway, or alternatively we would have ended up with non-relational encodings (e.g. int arrays in nvarchars like “1,2,3”)
  • We still needed a clean way to store the widgets (see below).

Con-XML (more like pro-tables):

  • Lack of referential integrity. What to do when a user hits a recipe page which refers to a recipe object that was deleted? True, we need to deal with that, but such scenario is still the result of a bug; we do have code to clean up refs. Also, realize that cascades aren’t a perfect solution either. If product deletion just set the reference to null, we’d have the same problem as above. If it removed the resource altogether, we’d potentially wipe lots of valuable information and content.
  • Performance. Granted, while MSSQL provides decent XQuery tools, XML column performance is suboptimal. However, it’s not as bad as some think, and it can certainly be mitigated by caching.
  • OR mapper support. Yeah, that’s a problem, but writing it ourselves was less work than what one might imagine.

 

Widgets, the last straw

Ultimately, much of the XML decision finally hinged on the question of where to store widgets. A short recap: A typical page has perhaps a dozen widgets organized in a few zones. The widgets are selected from something like 30 widget types, each of which has a specific set of parameters which must be stored per widget instance.

Now, the widget question is just like the resource one, but on a smaller scale. In order to construct a relational model for the widgets, we would be creating a dazzling amount of tables. For example, if a Product Highlight widget had a specified fixed product to highlight, a pure relational implementation would have at least the following tables:

  • Widget (with at least resource id, zone name and a position index, plus a type reference)
  • ProductHighlightWidget (with the same identity as the Widget, plus columns for single-value configuration properties)
  • ProductHighlightWidget_Product (with a ProductHighlightWidget reference, a Product reference and an index position)

Granted, a deleting cascade would work great here except for collapsing the index positions, but even that we could easily handle with a trigger.

But I’m accepting some compromises already: I don’t have a WidgetType table, and my zone name is a string. Relationally speaking, a Resource should actually have a reference to a ResourceLayout (which technically defines available zones), which should then be linked onward to ResourceType (which defines the layout set available). Oh, and we’d still need a ResourceLayout_Zone to which Widget would link, but since the zone set is actually defined in cshtml files, who would be responsible for updating the RL_Z table?

The previous mental experiment reveals some ways in which many applications could benefit from the use of NoSQL solutions. We only touched on one property, and those widget types contain quite a few of them.

As it was obvious that we needed XML to store the widget setup, it became quite tempting to use the same XML blob for resource subclass data as well.

After the widgets discussion, there is one more thing I want to highlight as a benefit of XML. Since most of the page’s content is defined in that one blob, certain scenarios such as version control become trivial. For example, publishing and reverting drafts mostly involves throwing XML blobs around. Compare this to the effort it would take to update all the references properly.

Finally, you may want to ask “Why XML instead of, say, JSON?”. XML admittedly produces relatively bulky documents. However, it has one excellent characteristic: we can easily query it with SQL Server, and that makes a huge difference in many scenarios. Implementing the equivalent performance with JSON would have required cache fields, reliable updating of which would in turn require triggers, but since parsing JSON with T-SQL is a pain (to say the least), it would have drawn SQLCLR in as well. Thus, XML was actually simple this time.

Now show me the XML!

The actual size of the XML varies heavily by resource. On one of my dev databases, the shortest data document is 7 bytes; the longest is 28 kilobytes. But here’s a fairly short one:

[Image: the XML data of a recipe page]

This is for a recipe page, where the only real property is the recipe reference. The page template also specifies a zone called “Additional”, but it has no widgets specified – thus, a very short XML document.
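Here is a sketch of how such a short document could be read with LINQ to XML. The document below is hypothetical: the recipeId and zones elements are ones our code refers to, but the zone element, its name attribute and the GUID value are assumptions made up for the example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class RecipeDataSample
{
    // Hypothetical data document: one recipe reference and one empty zone.
    public const string Xml =
        "<data>" +
          "<recipeId>6f9619ff-8b86-d011-b42d-00c04fc964ff</recipeId>" +
          "<zones><zone name=\"Additional\" /></zones>" +
        "</data>";

    public static Guid ReadRecipeId(XDocument doc)
    {
        return Guid.Parse(doc.XPathSelectElement("/data/recipeId").Value);
    }

    // Counts the child elements (widgets) of the zone with the given name.
    public static int CountWidgets(XDocument doc, string zoneName)
    {
        return doc.XPathSelectElements("/data/zones/zone")
            .Where(z => (string)z.Attribute("name") == zoneName)
            .Elements()
            .Count();
    }
}
```

For the sample document, XDocument.Parse(RecipeDataSample.Xml) yields the recipe reference through ReadRecipeId and a widget count of zero for the “Additional” zone.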

Here’s a snippet of a significantly longer one.

[Image: a snippet of the XML data of an article page]

This is from an article page. As you can see, the Article resource type has a subtype field declaring this instance as a “product article”, i.e. one that describes a single Valio product or a family thereof. Since an article does not derive its content from a linked business entity, its content is mostly located in widgets. Therefore, the <zones> element tends to be fairly long with a hefty stack of various widgets. In this example, you can see an image carousel, a text element and a brand banner (with a specific layout set – yeah, we have those for widgets too).

After the data comes the code

When I initially set out to write the description of our CMS features, I was thinking of a single long post. I have now rambled on for three medium-sized ones, and I still haven’t shown you a line of code. But, I promise that will change in the next episode: I will wrap up the CMS story by discussing our NHibernate implementation of the above data structures. And I’ll go through the controller/widget rendering framework as well.

Meanwhile, if there are some questions you’d like to have answered, please feel free to post comments.

November 19, 2011 · Jouni Heikniemi · No Comments
Posted in: Web

Valio.fi deep dive #6: Features of our custom CMS

In the last post, I touched on the choice of using a CMS product or writing your platform yourself. We picked the custom platform approach, and this time I’ll tell you what that led to.

What’s in a Content Management System?

Wikipedia defines CMS in a very clumsy and overgeneric way. Let’s not go there. Given my last post’s definitions on applications and sites, I’ll just list a few key tenets of a modern CMS:

  • Administrators must be able to produce and publish content.
  • The content will consist of text and images, which will be mixed relatively freely.
  • The administrators must be able to define a page structure (and the URIs) for the content.
  • Typically, the administrators must be able to maintain a navigation hierarchy and cross-linkage (tagging, “similar content” highlighting etc.) between the pages.
  • The system must be ready to accept user feedback (comments, likes, whatever).

These are typical features for complex blog engines. Full-blown CMSes often add features like workflows, extensibility frameworks, versioning and whatever.

Dissecting Valio’s content

For Valio, we had two content sources: First, the actual database content derived from Valio systems (recipes, product information) and second, content produced and entered on the site level. Most pages rely mostly on either of the sources, but almost all contain pieces of the other type.

Case 1: The Recipe page

Let’s look into a typical recipe page with some annotations first:

[Image: an annotated recipe page]

There are four distinct regions marked in the image.

First, we have elements from the page template, the actual HTML. This contains everything that is not in a box in the image: There are fragments of template HTML here and there. For us, the template is technically a set of recursively contained MVC views and partial views. They take some structural metadata as a model and thus render the correct elements (including breadcrumb paths, possible highlight elements and so on).

Second, there is the recipe itself. In the Valio case, this is a typical example of “application data” – the recipes are imported from an internal LOB system designed specifically for recipe maintenance, and the actual recipe data has minimal editing tools on the site; the exception is the users’ ability to create and edit their own recipes, but that’s a slightly different scenario – and at any rate, it is a way to modify the business data, not maintain the site content.

Third, there are the recipe links on the right side of the page. These links are definitely a spot for maintenance: content editors are free to customize whatever is shown with each recipe. However, due to the volume of the data, hand-crafted maintenance cannot be the only option. Thus, we generate appropriate links automatically from business data whenever there are no more specific requirements. This is clearly an example of a CMS-style function, although with some business understanding thrown in.

Fourth, there is the commenting feature. This is standard user generated content, and most definitely a CMS-like element.

All in all, recipes are a very app-like thing from a content perspective, although with some CMS elements added. Consider this in the context of your own development work: How would you add features like this (commenting support, link generation, ability to customize the highlights)?

Before looking at our approach, let’s slice down a totally different kind of page.

Case 2: A product article

Articles represent the opposite extreme from recipes: They are not apps in the sense that their content is entered mostly on the site.

Now, look at the article page image – a typical, albeit a very short, example of its kind. You’ll notice a couple of things: Basically, the page has a template (the unboxed parts). Also, it has two boxes, columns – or let’s just call them zones.

Why zones? Well, if you look closer at the article page, you’ll notice that the zones are split by dashed lines. Those dashed lines represent individual fragments of content. We call them widgets.

Now, this sounds an awful lot like a CMS – perhaps even SharePoint with its Web Parts and Web Part Zones. In fact, our CMS functionality is very similar. We have a couple of dozen different widgets, and you can drop and rearrange them into zones. Widgets can also be parameterized – some only trivially, others extensively.

Let’s quickly run over the widgets on the article to the right. On the main content zone, the first one is a picture widget. You can simply attach an image or several to it, and create either a static image or a gallery. The introduction text is just a text widget (very much a DHTML editor on the administrative end), but set to use a layout that renders the default text with the intro font.

Below that, there’s a short run of text – it’s another text widget, this time with another layout. And further down, there’s yet another widget: a recipe highlight. This one shows a specific, predefined recipe with a largish image and the key ingredients. The layout for the recipe highlight has been defined for the site, but the data is pure business data, not designed or edited for the site.

On the right-hand side, there’s a Link widget (the link to the “Piimät” category), an Article Highlight widget set to show three highlights – some of them may be editor-customized, while the rest are automatically filled by metadata-driven searches. Then there’s another recipe highlight, but with a very different layout from the one in the main zone. And finally, there’s a three-item Product Highlight widget.

The building blocks finally take shape

After explaining this all, let’s look at the big picture. Our key concept is a resource – you might more easily grasp it as a page. Each resource has a URI and some content.

Ok, but how is that content laid out? Each resource also has a type, and we have a few dozen of them. The most understandable ones are those like Recipe, ThemeArticle, Product and FrontPage. Each of these types defines a template, which consists of two things: an HTML template that defines the raw markup plus the available zones, and a default set of content (prepopulated widgets in zones etc.). In addition to the template, the resource type also defines the code needed to execute a page – in practice, a controller.

A resource template contains a set of widgets for a typical layout scenario, but often writers will create additional widgets to flesh out the article: they’ll want to include sidebar elements, perhaps use product galleries, embed video or whatever.

App-like resources such as recipes are different. First of all, these resources are typically born when an integration task creates them. Suppose Valio devises a new recipe for a meat stew. As they enter it into their recipe database (an operative system beyond our control) and the publication time passes, the Recipe resource is automatically spawned. The resource is populated with the reference to the business entity (The New Stew), and product data such as ingredients and preparation instructions are properly shown.

But that’s not the end of the story. Even with these relatively self-sufficient app-like pages, the page still has widgets. Although the content editor only has limited influence in the app-driven part of the page, the right columns in particular are open to customization. The templates define these widgets as auto-populating: typically, “find three to five items that match current resource’s metadata”. But using the admin view, the content manager can define a custom search (ignoring the local metadata) or even specify the actual search results themselves. If the admin only wants to specify one thing to highlight, the rest can still be populated through automation.
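The fill rule can be sketched roughly as follows. This is a simplified model, not the production search: the `Item` type, the tag-overlap ranking and all parameter names are assumptions for illustration. Editor-specified items go first, an optional custom search replaces the page's own metadata, and automation fills whatever slots remain.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional, Sequence


@dataclass(frozen=True)
class Item:
    title: str
    tags: FrozenSet[str]


def fill_highlights(candidates: Sequence[Item],
                    page_tags: Iterable[str],
                    slots: int = 3,
                    pinned: Sequence[Item] = (),
                    custom_tags: Optional[Iterable[str]] = None) -> List[Item]:
    # A custom search, when given, ignores the page's own metadata.
    search_tags = frozenset(custom_tags if custom_tags is not None else page_tags)
    # Items picked by the editor always come first.
    chosen = list(pinned)[:slots]
    # Automation fills the rest: anything sharing at least one tag qualifies.
    matches = [c for c in candidates
               if c not in chosen and c.tags & search_tags]
    # Most shared tags first; the sort is stable, so ties keep catalog order.
    matches.sort(key=lambda c: len(c.tags & search_tags), reverse=True)
    return chosen + matches[:slots - len(chosen)]


catalog = [
    Item("Buttermilk pancakes", frozenset({"buttermilk", "dessert"})),
    Item("Meat stew", frozenset({"meat"})),
    Item("Buttermilk bread", frozenset({"buttermilk", "baking"})),
    Item("Berry smoothie", frozenset({"dessert", "berries"})),
]
page_tags = {"buttermilk", "dessert"}

automatic = fill_highlights(catalog, page_tags)
one_pinned = fill_highlights(catalog, page_tags, pinned=[catalog[1]])
custom = fill_highlights(catalog, page_tags, custom_tags={"meat"})
print([i.title for i in one_pinned])
```

With one pinned item and three slots, the editor's pick leads and the two best metadata matches follow, which is the "specify one thing, automate the rest" case described above.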

Auxiliary features

The previously described elements take care of the main content production and editing workflows. Resources enable us to semi-seamlessly arrange together content from varying sources. But there is more to all these resources, and even this list isn’t exhaustive.

At one end, we have user participation and user-driven content production. For the sake of simplicity, let’s split this into two. First, there is the custom recipe editing, which is a huge topic in itself. In a nutshell, the recipe editor creates the business entities (recipes, ingredients etc.), whips up a new Recipe-type resource and links all these things together for display. The second, more approachable part is everything else: the ability to comment, like and vote on things. We record all this data – as well as popularity information such as view counts – per resource, allow moderation and content blocking on a resource level and so on.

Another additional feature provided by resources is the preview toolset. Each of the resources has a copy of itself created. Only content managers can see it, and it’s called the draft. In fact, you can’t even edit the published resources – you’ll always edit the draft. And then we have two actions to help further: Publish, which replaces the published version with a copy of the draft, and Revert, which replaces the draft with a copy of the public version.
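As a rough sketch of the draft mechanics (the class and method names are assumptions made up for this example, and the real thing is backed by a database rather than in-memory copies):

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class Content:
    title: str
    widgets: List[str] = field(default_factory=list)


class VersionedResource:
    """Holds a published version plus a draft that only content managers see."""

    def __init__(self, uri: str, content: Content) -> None:
        self.uri = uri
        self.published = content
        self.draft = copy.deepcopy(content)    # every resource gets a draft copy

    def publish(self) -> None:
        # Replace the published version with a copy of the draft.
        self.published = copy.deepcopy(self.draft)

    def revert(self) -> None:
        # Replace the draft with a copy of the published version.
        self.draft = copy.deepcopy(self.published)

    def view(self, is_content_manager: bool = False) -> Content:
        # Regular visitors never see the draft.
        return self.draft if is_content_manager else self.published


stew = VersionedResource("/recipes/new-stew", Content("New Stew"))  # hypothetical URI
stew.draft.title = "The New Stew"             # edits always target the draft
seen_before_publish = stew.view().title       # visitors still get the old title
stew.publish()
seen_after_publish = stew.view().title
stew.draft.title = "Oops"
stew.revert()                                 # throw the unwanted edit away
print(seen_before_publish, "->", seen_after_publish, "/ draft:", stew.draft.title)
```

Copying in both directions keeps the two versions fully independent, so an edit can never leak to the public site without an explicit Publish.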

Conclusion

As I write it, the features listed sound simple. In fact, they are, and they make up a reasonably usable CMS. That doesn’t, however, indicate triviality of implementation: there were quite a few difficult compromises made during the design process. From a purely technical standpoint, the system is very incomplete in terms of features and tools. From a practical usability and efficiency standpoint, I think we hit a pretty good medium: we catered to the key business needs without bloating the codebase (or the budget).

In a further post, I will finally cover the topic that originally drove me to write about the whole CMS design issue: the database implementation of resources. Yes @tparvi, I heard you, I just wanted to make sure I can focus on the technical specifics once we get there :-) Up next: Inheritance hierarchies, custom XML columns with NHibernate, and semi-reflected widget construction. Oh yes.

November 5, 2011 · Jouni Heikniemi · One Comment
Posted in: Web

What’s new in .NET Framework 4.5? [poster]

.NET Framework 4.5 had its CTP released at Build, and RTM is coming next year. The key improvement areas are asynchronous programming, performance and support for Windows 8/WinRT – but worry not, it’s not all about those new thingies.

Instead of just listing it all out, here’s a poster you can hang on your wall and explore. The ideal print size is a landscape A3. If you want it all in writing, follow the links at the end of this post. Click on the image for a larger version.

[UPDATE 2011-11-16: I have changed the poster to include changes in F# 3.0.]

[UPDATE 2012-03-07: The poster has been updated for .NET 4.5 Beta release. Also, the poster is being delivered to TechDays Finland 2012 participants – the new updated version is equal to the one available in print.]

More information

Check out these links:

 

If you prefer to have the poster in Finnish, we have published it on the ITpro.fi Software development expert group site.

Any feedback is naturally welcome, and I’ll make a reasonable effort to fix any errors. Enjoy!

October 29, 2011 · Jouni Heikniemi · 31 Comments
Posted in: .NET

A lesson in problem solving: Never assume the report is correct

A while ago, a colleague of mine reported that our OData services were functioning improperly. I fell for it and started looking for the issue. I never should have. Not that soon.

“The OData services don’t seem to expand entities properly when JSON formatting is used.”

I was like “Huh?”. We had a bunch of OData endpoints powered by Windows Communication Foundation Data Services, and everything had worked fine. Recently, the team using the interfaces switched from the default Atom serialization to JSON in order to cut down data transfer and thus improve performance. And now they’re telling me that entity expansion, a feature very native to OData itself, is dependent on the transportation format of the data. Really?

The alarm bells should have been ringing, but they were silent. I went on and found nothing by Googling. Having spent a whole 15 minutes wondering about this, I then went on to try it myself. Since WCF only provides JSON output through content negotiation, I had to forge an HTTP header to do this. So I went on typing:

PS D:\> wget -O- "--header=Accept:application/json" "http://....svc/Products/?$expand=Packages"

And to my surprise, the resulting JSON feed shape really did not contain the expanded entities. Could it be that .NET had a bug this trivial? Baffled, I was staring at my command line when it suddenly hit me.

Can you, dear reader, spot the error?

 

 

 

The problem is that the expand parameter doesn’t get properly sent. See, I’m crafting the request in PowerShell, and to the shell, $expand inside a double-quoted string looks like a variable reference. It then gets replaced with the value of the variable (undefined, so an empty string), resulting in a request to “http://…svc/Products/?=Packages”. No wonder WCFDS isn’t expanding the entities! Of course, we don’t see this with Atom, since we typically do Atom requests from a browser, which doesn’t have this notion of variable expansion.
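The effect is easy to reproduce without a shell. The sketch below is a small Python model of double-quoted interpolation, not PowerShell itself: the `interpolate` helper is an assumption written for this example, and the host name is a placeholder since the real URL is elided above. It shows what the server receives once `$expand` has been swallowed.

```python
import re
from urllib.parse import parse_qs, urlsplit


def interpolate(text: str, variables: dict) -> str:
    # Model of double-quoted shell strings: $name becomes the variable's
    # value, or an empty string when the variable is not defined.
    return re.sub(r"\$([A-Za-z_][A-Za-z0-9_]*)",
                  lambda m: variables.get(m.group(1), ""),
                  text)


def query_options(url: str) -> dict:
    # Parse the query string the way a server would see it.
    return parse_qs(urlsplit(url).query, keep_blank_values=True)


# Placeholder host; only the path and query mirror the command above.
typed = "http://example.invalid/Service.svc/Products/?$expand=Packages"
sent = interpolate(typed, variables={})       # no variable named "expand" exists

print(sent)
print("$expand" in query_options(typed))      # what I meant to ask for
print("$expand" in query_options(sent))       # what actually went out
```

In both PowerShell and bash, wrapping the URL in single quotes instead of double quotes keeps `$expand` literal and avoids the whole problem.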

So I ran up to my colleague to verify he wasn’t falling victim to the same misconception. He was issuing the request from a bash shell in Mac OS X, but variable interpolation rules for bash are roughly the same as in PowerShell, so he was seeing the same issue. So everything actually worked exactly as it should; we were just asking for the wrong thing.

If I had tried removing the --header part from the request, I would instantly have spotted that the expansion didn’t work with Atom either, but I didn’t. Why? Because I was paying too much attention to the problem report’s JSON part, thinking the expansion for Atom works automatically, and neglecting to check the connection between the two. Next time, I’ll be more analytical.

October 23, 2011 · Jouni Heikniemi · No Comments
Posted in: General

Valio.fi deep dive #5: Content Management Systems as platforms

After a brief hiatus, it’s time to look at the Valio case again. This time, I’ll explain a few decisions behind our content management model. I’m sure some of this sounds familiar to almost every web site developer, although most probably won’t dive as deep as we did.

The setup

So you’re developing an ASP.NET MVC web application. You whip up Visual Studio and crank out a few controllers, views and whatnot. Ka-zoom, you have the working skeleton version available in a day or so. At this point, what you’re actually doing is that you’re exposing your key business objects – typically the database – through a server and a browser.

A few days or weeks later you ship the app, and immediately a business user tells you that “Actually, that segment of text is wrong. The correct one is…”. You quickly update a view. Two weeks and thirty updates later, you grow tired. Not only do they want to change the text, but they also want to swap in new images. And one day, you’ll get a mail asking you for a chance to slip in an additional information page. At a specified URI, of course. “Can I have it added to the navigation, too?”

At this point, you’re likely to have a few content delivery related customizations across your codebase. Your route table has a handful of entries specifically for campaign and info pages. Your version control history for the Views directory shows an increasingly high percentage of checkins related to one-off customization requests.

There will be a day when you find yourself asking: Should I just have started with a content management system (CMS), using extensibility hooks to implement my business functionality?

The problem

All web sites require some kind of CMS functionality: the ability to edit typical fragments of web content. Because of this, almost all web sites are based on a CMS. And there are so many of them; see the Wikipedia list.

At the other end of the spectrum, most web applications have very limited needs for content editing. Almost all have some requirements in this regard: even the most closed warehouse management tool often has a small area reserved for administrative announcements.

Most web projects fall between these two extremes. A typical e-commerce application is mostly an application, but it definitely needs content input: product descriptions, images, special offers etc. all need to be designed, tested and typed in on the go, without needing a developer’s input. A complex and diverse collection of content (such as Valio) is by definition a site and needs to be editable, but it also has a huge load of functionality – some of which may be considerably easier to implement using plain programming tools, not a CMS framework.

Picking our sides

When planning the technology strategy for the Valio project, we knew we were in trouble no matter what we picked.

There were requirements for considerable application-like functionalities including user-produced complex content (recipes), APIs for third parties, and demanding search rules. Simultaneously, we knew we would have to entertain a diverse group of content producers, from experienced web editors to amateur moderators.

If we chose a CMS…

  • We would have at least a stub implementation for most of our CMS functionalities.
  • We would implement all our logic using the CMS’s extensibility API. This might range from being moderately acceptable to extremely painful, but we probably wouldn’t know before we tried.
  • We would have an authentication / user profile system available out of the box. However, most profile systems are not designed to be very well extendable. In particular, most don’t support complex integration to external directories.

If we wrote a custom application…

  • We would have a relatively straightforward task implementing all the custom things. Hard things are still hard, but we’d be able to estimate them with reasonable confidence.
  • We would have to write all the CMS tools by hand. Given our reasonably complex set of requirements (page previews, limited versioning, web part-like composability, customizable URIs, page templates), we knew this was going to be quite a few lines of code.
  • We could do whatever we want with the authentication. Of course, that meant doing it all by hand, from square one.

 

As you probably know by now, we picked the custom application route. The CMS side had its perks, but we decided against it for the following reasons:

  • The client had a fairly particular vision of the content management model. While they didn’t have stated requirements for the administration UI, we knew there were quite a few practical requirements, particularly regarding bulk moderations, that were not typically sufficiently implemented in CMS packages.
  • While we had lots of CMS experience, we also had the requirement of using Microsoft technology, and an inner desire to use MVC to enable very fine-grained HTML output management. We also preferred an open source approach to ensure we could deliver everything we wanted. That left us in a situation where none of us had experience with a CMS that would match the requirements. Since evaluating CMS extensibility is very time-consuming, we didn’t want the additional risk of eating our precious days.
  • On the other hand, having lots of CMS experience (including developing two of them) gave us a head start on designing the necessary infrastructure. Thus, we felt less intimidated by the challenge of creating our own tooling.

At the crossroads?

If you’re facing the same choice, I wouldn’t blindly recommend following us. We have been met with plenty of criticism and surprised faces when telling this story. Many people consider custom app writing a symptom of the NIH syndrome, particularly in a field as well established as CMSes. Also, it is a non-trivial exercise even if you have a good idea of what you’re doing.

The key lesson here is to play to your strengths, and choose by the project type. If you have a team with experience in a particular CMS and you have complex content management requirements, that particular CMS is likely to be a good idea. Then again, if all your users need is a single bulletin board for announcements, taking on a CMS framework is probably a hugely unnecessary piece of extra baggage.

However, one important thing is schedule predictability. If custom code and a CMS seem equally strong, consider any pre-baked system a risk in terms of change management: you quite likely cannot predict all its design constraints beforehand.

For example, the requirements for Like button throttling in the Valio case were discovered reasonably late in the project, as was the moderation workflow for users’ recipes. Most CMSes don’t offer smooth customizability in scenarios like this, and thus a small requirement can suddenly result in a larger refactoring, perhaps writing a module that replaces a part of the CMS itself. You would also be excessively optimistic in thinking that relatively obscure elements such as community content workflows would – or even could – be defined before the project.

The diagram below illustrates some of the design aspects and their weight in a totally customized scenario as well as a CMS-based one.

image

Not all CMS platforms are equal. The right end of the axis represents the versatile but hard-to-extend CMS solutions like SharePoint. The middle section of the chart represents CMS stacks that are more like toolkits for do-it-yourself site development; there are plenty of open source CMSes that are unfinished enough to be called such.

The conclusion

We picked what we picked because of 1) the requirements we had and 2) the people we were. Neither is irrelevant, and this is a key takeaway: A different team might pick an entirely different solution for the same requirements, and they might quite well be right.

You won’t have an easy time deciding on this: it’s a complex architectural choice, and estimating the actual impact of either option is reasonably hard. In the next post, I’ll discuss the key technical choices of our CMS implementation (on class/table level) to give you an idea of what sort of challenges you might be facing and help you in the process of gauging your cliff.

October 20, 2011 · Jouni Heikniemi · 2 Comments
Posted in: Web

Looking back at TechDays Finland 2011

Pretty close to half a year ago, Microsoft held the largest annual developer + IT Pro event in Finland, TechDays 2011. In six more months, it’s happening again.

As I was considering the various topics I might talk about, it always took me back to thinking about the years I’ve been talking there. What did people like? What kind of topic would interest people? How can I be better? My talks generally seem to fill a room of 100-200 people, but what do I have to say that’s worth 100-200 hours of Finnish developers’ time?

The TechDays speakers have had almost nonexistent visibility into the feedback gathered from the event. Well, that is, until now. Allow me to present the TechDays 2011 Feedback Infographic: (click on the image for additional resolution)

 

Td2011-Feedback

Challenges for TechDays 2012 speakers

There are plenty of conclusions one can draw from the data. The one that struck me the most is that people want highly practical information, but not without the theory. Case presentations beat pure theory hands-down, but the most popular sessions were the ones with high-energy presenters, a sound theoretical basis and a continuous pummeling of practical demos.

As I’m left pondering this, I want to offer three TD2012 challenges for fellow Finnish speakers:

  • Sessions in Finnish scored 0.27 (on a scale of 1..5, that’s a lot) lower than sessions in English. Many attendees seem to gravitate towards the heavily rehearsed tracks coming from foreign travelling speakers. This must stop. Even though I think many of the travellers are absolutely great, Finnish technology professionals must be able to be more relevant and interesting*.
  • Developer sessions scored a 3.61, which isn’t particularly bad, but it’s not good either. Many developer sessions are far too much based on the equivalent PDC/TechEd/Build sessions, perhaps even slides. Let me repeat: people want practical information and your experience with the theory. Let’s do better this year.
  • Read Scott Berkun’s Confessions of a Public Speaker and either Presentation Zen or Slideology. Reduce the amount of verbiage in your decks and talks, and replace it with raw energy.

*) While berating the state of Finnish speaking, I must tip my hat to Sami Laiho, a local Windows MVP. As stated in the infographic, the man did five different presentations, scored a 4.43 average (which, had Sami’s presentations been a single track, would have made it the most popular track of all TechDays), and pulled off the most popular presentation with a stunning 4.58 score. Oh, and the 4.58 was done in the first slot of day 2, which speakers typically shun as the previous night’s attendee party is considered to hamper the contact with the audience. Blah blah.

Data disclaimer

A few words on the data used:

This post is based on a data dump of TechDays 2011 feedback I have received from Microsoft. No personally identifiable information was ever transmitted to me. The infographic has been cleared for publication by Microsoft, but such publication probably indicates no endorsement or broader approval. I do not have permission to redistribute the raw data, so questions for further info may or may not get addressed.

This blog post contains conclusions and opinions, which are naturally mine. On the other hand, the infographic is based on objective, raw event data with the following exceptions:

  • Some questions have been grouped together for better illustration (namely, the food and venue/organization ones).
  • I have manually divided the presentations into “type” categories (theory, practice, case, lecture series). This grouping is non-official.
  • Grouping presenters was done by me. Deeming people “professional speakers” or “nobodies” involved subjective analysis.
  • The textual description and the visualization are mine. The original data sheet nominates no kings, nor were pizza slices actually available.

Thanks for listening – and let’s hear your thoughts on TechDays 2011 and its feedback :-)

October 10, 2011 · Jouni Heikniemi · 5 Comments
Posted in: General