VSPaste is great – but vile
VSPaste is a Live Writer plugin I have been using to help in pasting code segments from Visual Studio to my blog. It’s been working great – until I noticed it’s polluting my blog with hidden links to itself.
VSPaste allows me to copy code in Visual Studio and just hit the paste button. And hey presto, code appears on my blog, with syntax highlighting and all. I was happy all along, until I noticed that each of the pasted segments actually has a hidden link to the plugin’s home page.
When I paste code like this:
static void Main(string[] args)
{
    int foo = 5;
}
I actually get a fragment of HTML like this:
<pre class="code"><span style="color: blue">static void </span> Main(<span style="color: blue">string</span>[] args)
{ <span style="color: blue">int </span>foo = 5;
}</pre>
<a href="http://11011.net/software/vspaste"></a>
The hidden link with no text doesn’t appear every time, but often it does.
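If you want to clean such fragments before publishing, an empty anchor is easy enough to spot. Below is a rough sketch, assuming the pasted HTML is available as a string; the class and method names are made up for illustration, and a regular expression is only a crude stand-in for a real HTML parser. Note that it also removes empty named anchors, which may or may not be what you want.

```csharp
using System;
using System.Text.RegularExpressions;

static class EmptyLinkStripper
{
    // Matches an <a ...> element whose content is empty or whitespace only.
    // The \b keeps tags such as <abbr> from matching.
    private static readonly Regex EmptyAnchor =
        new Regex(@"<a\b[^>]*>\s*</a>", RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the HTML with all empty anchors removed.
    public static string StripEmptyLinks(string html)
    {
        return EmptyAnchor.Replace(html, string.Empty);
    }

    static void Main()
    {
        string html = "<pre class=\"code\">int foo = 5;</pre>" +
                      "<a href=\"http://11011.net/software/vspaste\"></a>";
        Console.WriteLine(StripEmptyLinks(html));
    }
}
```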
Why and why not?
Why: Google Juice, of course. Having lots of links pointing at the site is a good way to improve your ranking, although perhaps not in practice: search engines have become quite a bit more intelligent in weeding out fake links, so the actual effect is somewhat more dubious.
Why not: First off, I don’t, as a matter of principle, want to have uninvited links in my blog content. Second, what would happen if search engines actually tracked the empty link and then, for whatever reason, deemed the target page spam, illegal or otherwise undesirable? Such measures also tend to negatively affect the rank of pages linking to it, possibly even removing them from search results entirely.
This sort of risk could potentially wipe out all the blogs using VSPaste from the search engine results – a massive loss to those who make their living out of blogging. Although unlikely to happen in practice, it shows the problem in a harsh light. Actually, it is exactly this sort of scenario that has led to the practice of having rel="nofollow" on links in blog comments.
Plug-in authors should be careful not to mess with the content they help in producing.
PS. In order to work against the effect of all the hidden links, this article doesn’t contain a link to the VSPaste plugin.
January 16, 2010
Posted in: General
No Comments
OpenOffice.org and Microsoft Office: A serious threat for the empire?
The blog world is abuzz about a Microsoft job posting. The US subsidiary looking for a “Linux and Open Office Compete Lead” – and a team of 13 people – seems to signal a meaty victory for the OS crowd, as it implies Microsoft is taking OpenOffice seriously. Or does it?
As I pointed out in the comments of one of the longer posts on the subject, a dozen people really isn’t that much when you consider the fact that the Office business grinds money at $15 billion a year. But I do agree that it’s a change: It’s a public admission that customers actually have valid alternatives and that the situation warrants discussion, also from Microsoft’s end.
Fair enough. Were OOo and Linux powered by companies, Microsoft would probably buy them out of the market. The beauty of Open Source is that it cannot be bought away or controlled. It cannot be stopped by simply turning a few shareholders rich. Instead, market superpowers must fight the OS threat by investing their capital more constructively: making their own solutions better and learning to justify their cost structure. All this advances the state of things by far more than behind-the-scenes stock trading.
But does this imply a serious threat to Microsoft Office? I don’t think so. OOo will steal market share, that’s for sure. But my personal opinion is that it’s not competitive with Microsoft’s offering yet, and its true long-term TCO still remains to be measured. Still, without a grain of doubt, it’s extremely important to have competition.
As for Linux on the server side, it’s a whole lot different. The openness of the Linux Server paradigm has already catalyzed a change in Windows Server and driven Windows Azure towards a more platform-agnostic model of thought. With the amount of serious large-scale backers Linux has, it is no longer bound to the typical limitations of an OS project. I expect the starting decade to be great.
So what does this Compete Lead hiring mean? I’d say it means that Microsoft is slowly getting rid of its arrogance. Focusing a bunch of people to actually think about the competitive losses and improve on what they do is exactly what a responsible business should do. Is it a win for the OS movement? In a recognition sense, maybe. Businesswise, I think those 13 people aren’t going to make the Linux and OpenOffice.org march any easier.
As for the future, I’m hoping that we’ll see similar hires for Google’s cloud offering in both the infrastructure and app suite segments. Meanwhile, happy new year!
January 5, 2010
Tags: competition, open source Posted in: General
No Comments
What is an "open" API anyway? (case YouTube / TotLol)
TotLol is a membership-based site that aggregates YouTube content for kids. What’s interesting is its background story and how it went from being ad-based to almost non-existent to membership-based.
The author’s version of the story is interesting. Harshly compressed: He claims to have created a service that was one of the first on YouTube APIs. Then Google gradually and suspiciously changed the Terms of Use to cut out the business from TotLol. The author claims this is because Google wanted to steal his idea.
So what’s going on?
Is the story true or false? Tinfoil hats on? Hard to tell.
However, the story does carry an important message: An API may be technically solid, but business conditions can still wreck a perfectly good app. As long as the terms of use remain as vague as they often are, broadly co-operative offerings such as content aggregation are a risk.
Some have considered this a story of Google’s evilness. I wouldn’t go that far. But it’s certainly a reminder: A great service company may be an abysmal platform company. The mental model for providing stable platforms for building business is vastly different from providing hip and cool services for users.
A platform is a turtle, services can be rabbits
Somehow, in my head all of this adds up to the general discussion on speed of change and business agility. For example, Microsoft is very much stuck on supporting IE 6 on Windows XP, even up to the forthcoming years when it will be even more massively outdated for the web. That blows, but it’s one part of a strategy that has helped business thrive on the Microsoft platform.
Licensing terms, pricing and product structures have changed, but slowly enough to keep most clients on board. Upgrades are offered and sometimes even required, but in spite of that, the Microsoft platform keeps rolling on. It does so for equipment manufacturers, software companies, training consultants and everybody else. While the Redmond-based economy certainly has its flaws, it’s quite an achievement to actually have that sort of a critical mass – and to have had it for so many years.
Let it be said out loud: Assuming Google actually did all that maliciously, Microsoft could have done the same, particularly in the past years. I’m not discussing the relative evilness of these two companies. There is a marked difference in the service/platform orientation though, and I expect it to play more and more of a role as all the cloud hoopla really hits the mainstream.
December 30, 2009
Tags: Google, YouTube Posted in: Web
No Comments
ReaderWriterLockSlim performance
A while ago I blogged about the performance of various thread synchronization primitives. Due to the insufficient accuracy of my memory cells, I left ReaderWriterLockSlim out of the comparison. Let that be fixed here and now.
The comparison method is still the same, and I have amended the previous post with the results of the Slim version. To summarize:
The Slim version performs significantly better, at approximately 34% of the time it takes for the older version ReaderWriterLock. Below is a duplication of the table containing the relative execution times from my other post. So, the ReaderWriterLockSlim beats its “full” sibling hands down, but is still considerably slower than using Interlocked, and loses somewhat to the Monitor.
| Method | Execution time |
| Non-locking | 1 |
| lock statement / Monitor | 18 |
| ReaderWriterLock | 93 |
| ReaderWriterLockSlim | 32 |
| Interlocked | 8 |
Also: The Slim version exhibits the same characteristics re lock type as the ReaderWriterLock: Acquiring a Reader lock takes the same time as acquiring a Writer one. Acquiring an Upgradeable reader lock is also equally fast, but upgrading takes roughly the same time as acquiring a full lock, putting a Read+Upgrade cycle at approximately 65 in the table above.
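For reference, here is a minimal sketch of the three lock types discussed above as they might appear in application code. The Counter class and its members are made up for illustration and are not the code used for the measurements; the ReaderWriterLockSlim calls themselves are the standard Enter/Exit pairs.

```csharp
using System;
using System.Threading;

// Hypothetical example class, not the benchmark code.
class Counter
{
    private readonly ReaderWriterLockSlim rwLock = new ReaderWriterLockSlim();
    private int value;

    // Reader lock: many threads may hold it at the same time.
    public int Read()
    {
        rwLock.EnterReadLock();
        try { return value; }
        finally { rwLock.ExitReadLock(); }
    }

    // Writer lock: exclusive access.
    public void Increment()
    {
        rwLock.EnterWriteLock();
        try { value++; }
        finally { rwLock.ExitWriteLock(); }
    }

    // Upgradeable reader lock: read first, upgrade only when a write is needed.
    public bool IncrementIfBelow(int limit)
    {
        rwLock.EnterUpgradeableReadLock();
        try
        {
            if (value >= limit) return false;
            rwLock.EnterWriteLock();   // this is the "upgrade" step
            try { value++; }
            finally { rwLock.ExitWriteLock(); }
            return true;
        }
        finally { rwLock.ExitUpgradeableReadLock(); }
    }

    static void Main()
    {
        var counter = new Counter();
        counter.Increment();
        Console.WriteLine(counter.IncrementIfBelow(2));
        Console.WriteLine(counter.IncrementIfBelow(2));
        Console.WriteLine(counter.Read());
    }
}
```

The upgradeable path pays for two acquisitions when it actually writes, which is consistent with the Read+Upgrade cost estimated above.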
I strongly urge everyone to read the performance notes in the previous post before making conclusions based on these numbers. The fact that the ReaderWriterLock is slower than a lock statement doesn’t mean you should use lock statements in your real-world apps. For example, the benefit of allowing multiple simultaneous readers might well offset the slight impact of acquiring the lock.
December 29, 2009
Posted in: General
One Comment
UTF-8 preamble is a problem when you concatenate files
You’re just changing a couple of words in an XML file with Notepad. Your data modifications are guaranteed to be valid by schema. That couldn’t possibly break anything, could it?
<insert the ugly buzzer sound>
It quite likely couldn’t, unless you were editing an XML file that happened to be using UTF-8. Because while Notepad certainly looks like a very innocent, raw data text editor, it really isn’t when it comes down to UTF-8 encoding.
Files encoded in UTF-8 can contain a Byte-order mark (BOM), also known as a preamble or a signature. It consists of the bytes 0xEF, 0xBB and 0xBF right at the start of the file, and identifies the encoding of the text file. If you ever see “ï»¿”, it’s the usual visual interpretation of an unparsed BOM, although other character sets can lead to other kinds of misrepresentations.
Why is this a problem?
Normally, it’s not. Most modern UTF-aware consumers (XML parsers, text editors etc.) understand the BOM just fine, although some problems exist particularly in Unix environments. But if files get concatenated together as binary, the BOM gets embedded in the middle of the file – turning into just normal data.
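To make the concatenation problem concrete, here is a small sketch that glues byte arrays together the way a binary copy would and then looks for a BOM that ended up in the middle. The class and helper names are invented for this example; only Encoding.UTF8.GetPreamble() is an actual Framework call, and it returns the three BOM bytes.

```csharp
using System;
using System.Linq;
using System.Text;

static class BomDemo
{
    // The three-byte UTF-8 preamble: 0xEF, 0xBB, 0xBF.
    static readonly byte[] Utf8Bom = Encoding.UTF8.GetPreamble();

    // Hypothetical helper: true when the data starts with the UTF-8 BOM.
    public static bool StartsWithUtf8Bom(byte[] data)
    {
        return data.Length >= 3 &&
               data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;
    }

    // Hypothetical helper: offset of the first BOM found after position 0,
    // or -1 when there is none.
    public static int FindEmbeddedBom(byte[] data)
    {
        for (int i = 1; i + 2 < data.Length; i++)
        {
            if (data[i] == 0xEF && data[i + 1] == 0xBB && data[i + 2] == 0xBF)
                return i;
        }
        return -1;
    }

    static void Main()
    {
        byte[] header = Encoding.ASCII.GetBytes("<root>");
        // A fragment saved by an editor that adds the preamble.
        byte[] fragment = Utf8Bom.Concat(Encoding.ASCII.GetBytes("<item/>")).ToArray();
        byte[] footer = Encoding.ASCII.GetBytes("</root>");

        // Binary concatenation: the BOM is now ordinary data inside the document.
        byte[] combined = header.Concat(fragment).Concat(footer).ToArray();
        Console.WriteLine(FindEmbeddedBom(combined)); // prints 6
    }
}
```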
So, we had a strange application that somebody had written a long time ago. It created XML files by concatenating together various strings and XML files. The files were pushed into the ASP.NET Response stream by simple Response.Writes and Response.WriteFiles.
At this point, you probably guessed the rest. Somebody went ahead and edited one of the XML files (changing those classic “just two words”) that got added through Response.WriteFile, which is a binary operation… And boom, you have invalid data in your XML file. In this case, the file had always before been edited in a text editor that didn’t add the preamble, but Notepad did.
Removing the BOM
It’s really as trivial as just removing the first three bytes of the file, but unless you happen to have tools for that at your disposal, paste the stuff into an editor that does not add the BOM. Alternatively, use a more sophisticated editor that allows you to choose if you want a preamble or not.
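If no suitable tool is at hand, removing the first three bytes takes only a few lines. The following is a sketch of that idea, not a polished utility: the class name and the command-line handling are invented, and it rewrites the file in place, so try it on a copy first.

```csharp
using System;
using System.IO;

static class BomStripper
{
    // Hypothetical helper: returns the data without a leading UTF-8 BOM.
    // Data without a BOM is returned unchanged.
    public static byte[] StripUtf8Bom(byte[] data)
    {
        bool hasBom = data.Length >= 3 &&
                      data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;
        if (!hasBom) return data;

        byte[] result = new byte[data.Length - 3];
        Array.Copy(data, 3, result, 0, result.Length);
        return result;
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("Usage: BomStripper <file>");
            return;
        }

        byte[] original = File.ReadAllBytes(args[0]);
        byte[] stripped = StripUtf8Bom(original);

        // Only touch the file when something was actually removed.
        if (stripped.Length != original.Length)
            File.WriteAllBytes(args[0], stripped);
    }
}
```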
For example, in Visual Studio, you can just choose File > Save As, then drop down the Save button and choose “Save with Encoding”. After that, you’ll have a dialog with lots of options, including “Unicode (UTF-8 without signature)” as well as a “Unicode (UTF-8 with signature)” one.
If you ever need to do this in your own code, the .NET UTF8Encoding class has a constructor that lets you choose whether or not to emit the BOM, and the resulting encoding can be passed to a StreamWriter. A StreamWriter created without an explicit encoding writes UTF-8 without a BOM, so BOMs get removed by just reading data in and then writing it back out with the defaults. Note that passing Encoding.UTF8 explicitly does write the BOM.
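A minimal sketch of controlling the preamble from code: the choice is made through the boolean argument of the UTF8Encoding constructor, and that encoding is then handed to a StreamWriter. The class and method names are hypothetical, and a MemoryStream stands in for a real file so the effect can be seen from the byte count.

```csharp
using System;
using System.IO;
using System.Text;

static class NoBomWriter
{
    // Hypothetical helper: encodes text as UTF-8, with or without the preamble.
    public static byte[] Encode(string text, bool emitBom)
    {
        using (var buffer = new MemoryStream())
        {
            // UTF8Encoding(bool) decides whether the writer emits the BOM.
            using (var writer = new StreamWriter(buffer, new UTF8Encoding(emitBom)))
            {
                writer.Write(text);
            }
            // ToArray still works after the stream has been closed.
            return buffer.ToArray();
        }
    }

    static void Main()
    {
        Console.WriteLine(Encode("<a/>", false).Length); // 4 bytes, no preamble
        Console.WriteLine(Encode("<a/>", true).Length);  // 7 bytes, preamble included
    }
}
```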
December 21, 2009
Tags: charset Posted in: .NET, Misc. programming
4 Comments
Visual Studio 2010 delayed – but it’s not a bad thing
Yesterday, Soma announced that Visual Studio 2010 would be delayed for a few weeks (apparently from the original 22nd March 2010 launch date). But really, we should all be happy – we’re getting a February Release Candidate in return.
If you’ve followed the VS2010 lifecycle, you can probably guess the reason: performance. If you’re interested and are not already following, start reading Brian Harry’s blog, where the latest advances on the performance front have been pretty openly discussed.
This time, it’s a delay I definitely find appropriate. While VS2010 isn’t bad at all, its performance issues are a great risk. A Visual Studio release hasn’t been deemed a failure in ages, but with the problems beta 1 faced, VS2010 was bound to be the first in a long time. Addition of a Release Candidate and a small delay will hopefully help ensure that the course has been corrected.
Surprisingly, the public feedback to the delay seems pretty positive. The quality problems with the early betas actually seem to have woken people up to the fact that fixing the problems takes time. And for once, the feedback loop between the Developer Division and people having performance problems seems to be intact, including private test builds and all. Looks promising to me.
December 18, 2009
Tags: Visual Studio Posted in: .NET
No Comments
Performance overhead of thread synchronization
One of the main problems with multithreaded application development is handling the synchronization of data. The failure to do so can result in data corruption. On the other hand, over-synchronizing causes loss of performance. But how slow is synchronization, really?
Edit: As discussed in the comments section, ReaderWriterLockSlim was accidentally left out of the comparison. See a separate follow-up for that.
Parallelization is very tricky, and proper optimization always requires measuring. But even with the complexities of reality, it’s valuable for a developer to understand the rough performance implications of various solutions. To this purpose, I crafted a test scenario to try a few methods out.
The test
Essentially, the idea was to do a very simple operation without synchronization and then apply synchronization primitives around it. The whole simulation is run in just a single thread. Its purpose is not to measure the performance of multiple threads, but to gauge the impact of synchronization.
This is important. For very many applications, synchronization protects against situations that occur very rarely – i.e. threads might not be hitting the same piece of data commonly anyway. In that sense, this test scenario isn’t very much different from many real-world applications.
On the other hand, some pieces of data may be hit very frequently, and queueing for access to it is common. Thus, some real-world applications may see more performance benefit by minimizing the time a resource is kept locked rather than choosing the best synchronization primitive.
With all that said, here’s the key segment of the code:
double sum = 0;
for (int loops = 0; loops < LoopCount; ++loops)
{
    int j = 0;
    DateTime start = DateTime.Now;
    for (int i = 0; i < LoopLength; ++i)
    {
        j++;
    }
    DateTime end = DateTime.Now;
    sum += (end - start).TotalMilliseconds;
    GC.Collect();
}
Console.WriteLine("Average execution time: {0:0} ms", sum / LoopCount);
The variables LoopCount and LoopLength are set to 5 and ten million, respectively. So, the variable j is incremented one-by-one to 10000000 five times, and the average of these runs is used. Some testing has shown this suite to produce reliable enough results.
Various approaches
The basic code above does no locking, and is therefore the fastest competitor – no surprises there. The other tested alternatives are the C# lock statement (equal to using the System.Threading.Monitor class) and the System.Threading.ReaderWriterLock class. Also, since this case happens to handle a very simple operation (incrementing an integer), I also tested System.Threading.Interlocked.
Each test was done by wrapping the “j++” statement in some code, or in case of the Interlocked scenario, replacing it.
object lockObject = new object();
// Loop structure cut out
lock (lockObject)
{
    j++;
}
The C# lock statement, above, is the simplest and the most generic of the locking approaches. Behind the scenes, it translates to a Monitor.Enter/Monitor.Exit pair in which the Exit call sits in a finally block, so this isn’t language-specific.
If you need more complex locking, get yourself a ReaderWriterLock. It will allow you to handle a scenario where multiple simultaneous readers are OK, but only when nobody is writing – and that writing is only allowed from a single thread at a time.
ReaderWriterLock rwLock = new ReaderWriterLock();
…
rwLock.AcquireWriterLock(TimeSpan.Zero);
j++;
rwLock.ReleaseWriterLock();
In the special case of incrementing (or comparing, or value-setting), you can also use the Interlocked class:
System.Threading.Interlocked.Increment(ref j);
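For anyone who wants to repeat the comparison, the fragments above can be assembled into one program roughly like this. It is a sketch rather than the exact harness behind the numbers in this post: it times with Stopwatch instead of DateTime.Now (my substitution), runs each variant only once, and the class and method names are made up. An optimizing JIT may also shrink the non-locking loop, so treat its output as indicative only.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class SyncBenchmark
{
    const int LoopLength = 10000000;

    public static int NoLock()
    {
        int j = 0;
        for (int i = 0; i < LoopLength; ++i) { j++; }
        return j;
    }

    public static int WithMonitor()
    {
        object lockObject = new object();
        int j = 0;
        for (int i = 0; i < LoopLength; ++i)
        {
            lock (lockObject) { j++; }
        }
        return j;
    }

    public static int WithReaderWriterLock()
    {
        ReaderWriterLock rwLock = new ReaderWriterLock();
        int j = 0;
        for (int i = 0; i < LoopLength; ++i)
        {
            rwLock.AcquireWriterLock(TimeSpan.Zero);
            j++;
            rwLock.ReleaseWriterLock();
        }
        return j;
    }

    public static int WithInterlocked()
    {
        int j = 0;
        for (int i = 0; i < LoopLength; ++i)
        {
            Interlocked.Increment(ref j);
        }
        return j;
    }

    // Times a single run and checks that the loop really did its work.
    static double TimeIt(Func<int> run)
    {
        Stopwatch watch = Stopwatch.StartNew();
        int result = run();
        watch.Stop();
        if (result != LoopLength)
            throw new InvalidOperationException("Unexpected loop result");
        return watch.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        string[] names = { "Non-locking", "lock / Monitor", "ReaderWriterLock", "Interlocked" };
        Func<int>[] variants = { NoLock, WithMonitor, WithReaderWriterLock, WithInterlocked };

        for (int n = 0; n < variants.Length; n++)
        {
            Console.WriteLine("{0}: {1:0} ms", names[n], TimeIt(variants[n]));
        }
    }
}
```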
How fast are they?
Fixing the execution time of the non-locking implementation as a reference point (execution time = 1), I got the following table of relative execution times (smaller is faster):
| Method | Execution time |
| Non-locking | 1 |
| lock statement / Monitor | 18 |
| ReaderWriterLock | 93 * |
| Interlocked | 8 |
*) Acquiring and releasing a reader lock performs pretty much the same as a writer lock.
The results are pretty clear: Don’t lock if you don’t have to. Firing up a monitor around your increment operator will slow your app to almost 1/20th of the speed. Of course, the relative locking overhead will shrink as your locked operation becomes heavier, so most practical scenarios won’t see such dramatic differences between different models.
That said, don’t even think about skipping on thread safety if your application actually has a multi-threading scenario. Any data corruption issues you may face are extremely harmful and notoriously complex to debug. But in a case when you can choose between various approaches to thread synchronization, choosing a speedier method instead of a slow one may give you quite nice benefits. In particular, it’s important to know when to choose Interlocked operations over a full-blown monitor.
These tests were performed on .NET 3.5 SP1. I will look back into this matter once .NET 4.0 leaves beta stage – its new synchronization primitives are worth another round of testing.
December 17, 2009
Tags: concurrency, performance Posted in: .NET
3 Comments
Opalis acquisition adds more automation to the System Center family
Microsoft is boosting its System Center family of products by acquiring a Canadian IT automation company called Opalis Software. What does this mean?
Opalis does IT automation. Taking a look at the material on the Opalis site, the key components seem to be:
- A workflow platform that enables easier automation of IT tasks (responding to problem tickets, automating VM deployment etc.)
- A set of integration tools, allowing smoother co-operation between system management products of different vendors
- Loads of ready-made, packaged workflows for handling many common scenarios out-of-the-box
That much was said directly. Next up, some speculation:
- The long-overdue System Center Service Manager product is probably getting a load of new stuff from here, but not in its first release (due H1/2010). The interesting part is that the SCSM features already claim “A workflow engine for automating all or portions of IT processes and for integrating System Center solutions” – something pretty heavily duplicated in the Opalis solution.
- The new technology from Opalis might get rolled into SCSM v2 (hypothetical, since nothing has been announced yet), but unless the workflow engines happened to be compatible, it might not be an easy ride. At any rate, the automation and integration story might be a strong candidate for the lead feature for the next version of SCSM.
- The oncoming additions of v2 may cause organizations to delay Service Manager installations in fear of breaking changes later on. Such a reaction wouldn’t be surprising, given the generally slow maturation rate of System Center products and Microsoft’s recent adventures in rewriting the Windows Workflow Foundation just two versions after its original launch.
- More and more automation will appear within other System Center products, particularly Operations Manager and Virtual Machine Manager.
- Given that Microsoft has pledged to extend the on-premises manageability to the Cloud, I would expect Opalis’s heavy use of the C word to play a part in the acquisition. When and how such features will be released within System Center remains a mystery.
All in all, this acquisition seems to play well into Microsoft’s System Center strategy. The fact that SCVMM can now manage VMware environments as well as Hyper-V ones signals a change in the direction of openness. Microsoft has an interest in pushing its system management products for their own sake, instead of only using the management tools as leverage to push more Windows Servers in. The Opalis portfolio would seem to give them a nice running start in many common integration scenarios.
Read more in the announcement on the System Center team blog and the acquisition FAQ.
December 11, 2009
Tags: Opalis, System Center Posted in: Windows IT
No Comments
Security problems with downloaded .NET assemblies
Have two copies of the same file with exactly the same content on a bit-to-bit level, yet one works and the other one fails with a security error? Yeah, that could happen.
In this post, I will discuss two features of Windows that may not be familiar to you. First, files in the NTFS file system can have hidden content. Second, Windows uses exactly that feature to remember things you didn’t expect it to.
First stop: NTFS Alternate Data Streams (ADS)
While you’re used to referring to the contents of the file by their path, that’s not the whole truth. The path really refers to the default stream of the file. Additional streams can be accessed by appending a colon and a stream name: C:\foo.txt:bar.
What’s in a stream? Whatever you put there. However, that’s tricky, as most tools don’t really support alternate streams. The cmd.exe’s redirection operators do, though:
D:\temp>echo "This is hidden content" >foo.txt:bar
Now we actually do have a file, but its size is set to 0 – because dir only shows the size of the default stream.
Directory of D:\temp
10.12.2009  19:16                 0 foo.txt
Open it in notepad, use the type command on it, whatever – it’s empty. But look at it through the redirection operator and the more command:
D:\temp>more <foo.txt:bar
"This is hidden content"
If you really need to discover these streams, get your hands on the SysInternals streams tool, which prints out the embedded streams just nicely (and can also be used to delete them, if you want).
An ADS application: The Attachment Manager
Download a file from the Internet and ponder how Windows can know it came from the net. Yep, you bet: Alternate Streams. Since Windows XP SP 2, files downloaded from different security zones have been flagged as such. This flag is stored in an alternate stream called Zone.Identifier.
D:\temp>streams test.exe

Streams v1.56 - Enumerate alternate NTFS data streams
Copyright (C) 1999-2007 Mark Russinovich
Sysinternals - www.sysinternals.com

D:\temp\test.exe:
   :Zone.Identifier:$DATA 26

D:\temp>more <test.exe:Zone.Identifier
[ZoneTransfer]
ZoneId=3
The ZoneId of 3 indicates Internet zone.
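The stream content is plain INI-style text, so interpreting it is straightforward. Here is a small sketch that parses the text shown above and maps the number to a zone name. The class and method names are invented; the zone numbers follow the standard Windows URL security zones (0 to 4). The code deliberately works on a string and leaves reading the alternate stream itself out of the picture.

```csharp
using System;

static class ZoneIdentifier
{
    // Hypothetical helper: returns the ZoneId found in the stream text,
    // or -1 when it is missing or malformed.
    public static int ParseZoneId(string streamText)
    {
        foreach (string rawLine in streamText.Split('\n'))
        {
            string line = rawLine.Trim(); // also drops a trailing '\r'
            if (line.StartsWith("ZoneId=", StringComparison.OrdinalIgnoreCase))
            {
                int id;
                if (int.TryParse(line.Substring("ZoneId=".Length), out id))
                    return id;
            }
        }
        return -1;
    }

    // Maps the numeric zone to its usual name.
    public static string DescribeZone(int zoneId)
    {
        switch (zoneId)
        {
            case 0: return "Local machine";
            case 1: return "Local intranet";
            case 2: return "Trusted sites";
            case 3: return "Internet";
            case 4: return "Restricted sites";
            default: return "Unknown";
        }
    }

    static void Main()
    {
        string content = "[ZoneTransfer]\r\nZoneId=3\r\n";
        Console.WriteLine(DescribeZone(ParseZoneId(content))); // prints Internet
    }
}
```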
And yeah, the functionality depends on the client you use. The file, as hosted on the internet, does not have an innate notion of “a zone” – it’s tagged when the file is saved after downloading.
Internet Explorer does the tagging, as does Firefox 3. Other browsers won’t, some email clients might. Therefore, you could well end up with files whose default streams are bitwise equal but which operate on different permission sets.
The most visible effect of the zone tag is the unverified publisher dialog (“The publisher could not be verified. Are you sure you want to run this software?”) when running an exe.
By now, you probably want to get rid of that “came from the internet” tag. There are three basic approaches to this:
- Windows even has UI support for this. Open the properties for a file that has the zone identifier applied, and you’ll see an “Unblock” button. Click that, and the Zone identifier is history. This, however, isn’t exactly pleasant for lots of files.
- Use a utility. The already mentioned streams.exe works fine, but there are specific apps like ZoneStripper too.
- Copy the files over to a FAT file system which doesn’t support NTFS ADS and then back again; a USB drive is usually the best option.
So what’s this got to do with .NET?
It really has nothing to do with .NET per se: the Attachment Manager is designed to protect Windows users with all types of files, not just .NET files. But there are two corollaries that do affect .NET applications in specific.
First, Visual Studio dislikes project files with partial trust. If you’ve ever received an error dialog that starts with “The project location is not fully trusted by the .NET runtime”, you’ve seen this. Unblock the solution, project and source files using any of the previous methods.
Second, Code Access Security may limit what your code can do. If you suddenly find your code unable to write into files or registry and instead get SecurityExceptions, your code might be running with more limited permissions. Note that this can also bite you indirectly: Deploy an Internet-zoned DLL into an otherwise working application, and you may encounter some truly surprising error messages.
Further reading and references
- KB105763: How to use NTFS Alternate Data Streams
- KB883260: Description of how the Attachment Manager works in Windows XP Service Pack 2
- Some really good tips on configuring Attachment Manager
Also, thanks fly out to my colleague LenardG for debugging a related issue a few weeks ago.
December 10, 2009
Tags: IE, NTFS, security Posted in: Windows IT
2 Comments
ASP.NET MVC 2 Beta and Visual Studio 2010
If you’re eager to try out new things, you probably already run Visual Studio 2010 beta 2. If you do, it’s good to realize that trying out ASP.NET MVC 2 beta on the same machine isn’t really supported – not even on Visual Studio 2008. But as usual, it works just fine if you hack a bit.
The problem is that VS2010 Beta 2 ships with ASP.NET MVC 2 Preview 2. The IDE tooling within VS2010 is engineered to work with that version. ASP.NET MVC 2 Beta installer downright refuses to work on a computer with VS2010, because it senses the conflict.
However, as Phil Haack points out, you can fix this by uninstalling and installing things in the correct sequence. Namely,
- Uninstall the item called “Microsoft ASP.NET MVC 2 – Visual Studio 2008 Tools”
- Uninstall the item called “Microsoft ASP.NET MVC 2”
- Run the MVC 2 Beta installer.
With these instructions, everything seems to work fine on the 2008 side. I have also noted no problems with VS 2010, although some might occur due to the mismatches between the IDE’s expectations and the runtime version.
If you don’t care about the Visual Studio experience, just get your hands on the ASP.NET MVC 2 Beta DLLs and reference them from your project. Again, the design experience may not be perfect, but it’s a no-hassle way to try out the new beta – and works both on 2010 and 2008.
December 9, 2009
Tags: ASP.NET, ASP.NET MVC, Visual Studio Posted in: .NET
No Comments
