Most people have a gut feeling about when to use StringBuilder for concatenation and when to just add strings together with the + operator. But what are the exact situations in which each approach is better? When the question gets asked, people often hand out overly simple rules such as "5 catenations". Is that really correct for the vast majority of cases? Of course, being the doubter that I am, I decided to test it and resolve the question once and for all.
The basic setting is this: StringBuilder.Append is faster than String + String. However, new StringBuilder() takes time. Now the question is: how many Append calls are required before the speed benefit exceeds the construction cost of the StringBuilder? Ideally, the answer would be just one magic number. Unfortunately, in practice it isn't.
Here are the simplified conclusions. They shouldn't be taken literally, because situations vary and there's a code readability issue as well (most people read String + String more easily than sb.Appends). Regardless, for most cases these rules do provide the correct answer from a performance perspective.
I don't expect you to believe me any more than any other information source on the net. But to back up my claims a bit, I'll discuss the background of these results next.
String objects in .net are immutable. Once the string has been created, the value can't be changed. When you type s = s + "foo";, you actually discard the old s and create a new string object containing the result of the concatenation. When repeated several times, you end up constructing many temporary string objects.
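The effect is easy to observe. The following is a minimal sketch, not part of the original tests; the class and method names are made up for illustration. It keeps a second reference to the old string and shows that the old object survives untouched while s moves on to a new one.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Returns true if the concatenation produced a different object than the original.
    // (Assumes a non-empty suffix; appending "" may hand back the original reference.)
    public static bool AppendCreatesNewObject(string original, string suffix)
    {
        string result = original + suffix; // allocates a new string, 'original' is untouched
        return !object.ReferenceEquals(original, result);
    }

    public static void Main()
    {
        string s = "bar";
        string old = s;          // second reference to the same string object
        s = s + "foo";           // s now refers to a brand-new string
        Console.WriteLine(old);  // bar
        Console.WriteLine(s);    // barfoo
        Console.WriteLine(object.ReferenceEquals(old, s)); // False
    }
}
```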
StringBuilder, on the other hand, represents a mutable string. The class itself contains quite a few methods to change the contents of the string. This includes appending new strings to the end - the most common operation by far. Internally, StringBuilder reserves a buffer of memory which is (usually) only partially used at first. Concatenations that fit into the buffer are just pasted in and the string length is changed. If the resulting string wouldn't fit into the buffer, a new buffer is allocated and the old contents are moved in. In neither case does a new string object need to be created.
The sore points of StringBuilder are the construction cost (which makes the "magic number" practically always at least 3) and the cost of allocating a new buffer when the resultant string would exceed the current buffer size. The latter one explains why the preknowledge (or a good estimation) of the resultant string size helps so much: StringBuilder can just allocate a sufficient buffer once.
Testing this is actually pretty simple. Choose a string operation, implement it both ways and repeat it for sufficiently many iterations while measuring the execution time. There are basically two factors involved: the length of the strings being handled and the number of concatenations. Few real-world scenarios use a fixed number of concatenations with fixed-length strings, so a very realistic test case would do real-world concatenations. However, constructing such a scenario isn't easy, as it tends to add non-string operations into the loops, thus messing up the timing.
Pure string concatenation loops are very rare in any case, so even if you're able to speed up your string operations by 50%, it's very unlikely your software will speed up that much. The point here is this: if you want the absolute best performance, measure it yourself - in your real-world scenario. However, a fair amount of testing on some of my applications has convinced me that the simple rules outlined above actually do hold up even with fairly varying material.
So, my test was essentially a loop of string concatenations with each iteration appending another string of predetermined length and content to a temp variable. I mostly varied the number of concatenations (iterations of the loop) to find out the cutoff point, but I also played with the string length. All tests were repeated 10 million times by an outer loop to provide better sampling. Everything was run on my AMD Athlon 2800+ with 1 GB of Memory, XP Pro and .net Framework 1.1.
The following source snippet shows the basic versions of the testing loops:
    // String version
    string s2 = new String('x', Int32.Parse(args[0]));  // the piece appended on every iteration
    int loops = Int32.Parse(args[1]);                    // number of concatenations per run
    for (int j = 0; j < 10000000; j++) {
        string s = "";
        for (int i = loops; i > 0; --i)
            s += s2;
    }

    // StringBuilder version
    string s2 = new String('x', Int32.Parse(args[0]));  // the piece appended on every iteration
    int loops = Int32.Parse(args[1]);                    // number of concatenations per run
    for (int j = 0; j < 10000000; j++) {
        StringBuilder sb = new StringBuilder();
        for (int i = loops; i > 0; --i)
            sb.Append(s2);
        sb.ToString();
    }
The extra ToString call at the end of the StringBuilder version is there to level the field between the approaches: the end result of the first one is a String, so the second one should produce a String as well. Leaving that ToString out had a marginal effect on the results: while it did make an 8% difference with a single concatenation, the effect quickly died out as the number of operations increased.
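If you want to repeat the measurement yourself, something along these lines should work on a current runtime. This is a sketch under assumptions, not the harness used for the figures in this article: it relies on System.Diagnostics.Stopwatch (which .net Framework 1.1 does not have), the class and method names are invented, and the repeat count is lowered from 10 million to keep the run short.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ConcatTiming
{
    // Plain string concatenation, as in the "String version" loop.
    public static string WithString(string piece, int count)
    {
        string s = "";
        for (int i = 0; i < count; i++)
            s += piece;
        return s;
    }

    // Default-constructed StringBuilder, including the final ToString call.
    public static string WithBuilder(string piece, int count)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++)
            sb.Append(piece);
        return sb.ToString();
    }

    // Runs 'body' 'repeats' times and returns the elapsed milliseconds.
    public static long Time(int repeats, Action body)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int j = 0; j < repeats; j++)
            body();
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        string piece = new string('x', 10);
        const int repeats = 200000; // assumption: far fewer than the article's 10 million
        for (int count = 1; count <= 15; count++)
        {
            int n = count; // fixed copy for the lambdas to capture
            long plain = Time(repeats, () => WithString(piece, n));
            long builder = Time(repeats, () => WithBuilder(piece, n));
            Console.WriteLine("{0,2} concatenations: String {1} ms, StringBuilder {2} ms", n, plain, builder);
        }
    }
}
```

Absolute numbers from a run like this will not match the ones discussed here; hardware, runtime version and JIT behaviour all differ, so only the relative shape of the curves is worth comparing.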
I started with 10-character strings, running from 1-50 concatenations (each repeated 10 million times as outlined above). The result is the chart below, displaying the relative execution times against the number of iterations (1-15). Absolute execution times aren't shown since they're hardly relevant.
The blue line is the performance of the pure String approach. It looks linear at first sight, but it isn't. If the String approach had to allocate space for X chars (where X is the length of the string being added, 10 here) per loop iteration, the time requirement would grow in a linear way. However, the amount of memory needed - and also, the amount of existing data being copied to the newly constructed string object - increases with every iteration. For Nth iteration, the String version allocates space for N*X chars. Thus, every iteration is slower than the previous one, and the String time curve steepens quickly as N grows.
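The growth can be written out explicitly. Assuming, as described above, that the k-th concatenation builds a fresh string of k·X chars, the total number of chars allocated and copied over N concatenations is:

```latex
\sum_{k=1}^{N} kX \;=\; X\,\frac{N(N+1)}{2}
```

With X = 10 and N = 50 that comes to 10 · 50 · 51 / 2 = 12,750 chars handled to produce a 500-char result, which is why the curve bends upwards instead of staying linear.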
The red line is StringBuilder at its basic settings. If you add a trendline, SB actually performs fairly linearly with increasing N*X. The bumps in the line are caused by the buffer allocations. Now, knowing how StringBuilder works in .net helps here: The default buffer size is 16 chars, and it's doubled each time it overflows. Remembering that X is 10 here, it's no big surprise that the bumps appear at 2 (after 16 chars), 4 (32), 7 (64) and 13 (128) iterations.
As you can see, StringBuilder first dips below String at six concatenations. The reallocation bump at 7 concatenations makes it slower than plain strings once more, but after that the results are clear: even the considerable bump at 13 catenations stays well below the blue line. The exact figures aren't important, since the bump locations depend on the number of chars gathered so far. With most normal strings, the cutoff point is somewhere between 4 and 8.
The green line represents a StringBuilder initialized to the size of the final string (using the StringBuilder constructor that takes an int). As you can see, this is the fastest approach by a very clear margin. And the cutoff is at three catenations! The obvious drawback here is that you have to know the buffer size beforehand, which you usually can't. For the cases where you do know it (such as this simple fixed-length scenario), it's blazingly fast. At 50 catenations with 10-char strings, it's 550% faster than pure String-based catenations and 35% faster than an uninitialized StringBuilder. The differences tend to grow as the size of the data increases.
The good thing is this: even a rough estimation of the resulting string size helps. If you overestimate the string size, you're allocating extra memory, but you're avoiding mid-loop buffer expansions. The extra memory allocation will slow you down at some point, but the effect may be negligible. If you underestimate the string size, you're going to have a buffer operation at some point. However, it's very likely you've still skipped early reallocations.
For example, if you're generating a 150 char string in 10 char increments (but you don't know these characteristics beforehand), initializing the StringBuilder with default values causes four buffer reallocations (16 -> 32, 32 -> 64, 64 -> 128, 128 -> 256). While initialization to 150 (or any larger value) would avoid the allocations altogether, even an initialization to a rough estimate such as 100 will help: you'll have only one realloc happening.
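Both the bump positions in the chart discussion and the reallocation counts in this example can be reproduced with a small model of the buffer. This is a sketch of the growth rule only, not of the real StringBuilder: it assumes a new capacity of max(2 × old capacity, required length), where the doubling comes from the description above and the "required length" fallback is my assumption. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class BufferGrowthModel
{
    // Returns the 1-based append numbers at which the modelled buffer has to grow.
    public static List<int> GrowthPoints(int initialCapacity, int pieceLength, int appends)
    {
        List<int> points = new List<int>();
        int capacity = initialCapacity;
        int length = 0;
        for (int i = 1; i <= appends; i++)
        {
            length += pieceLength;
            if (length > capacity)
            {
                // Assumed rule: double, or jump to the needed size if doubling is not enough.
                capacity = Math.Max(capacity * 2, length);
                points.Add(i);
            }
        }
        return points;
    }

    public static void Main()
    {
        // 10-char pieces into a default 16-char buffer
        Console.WriteLine(string.Join(", ", GrowthPoints(16, 10, 15)));  // 2, 4, 7, 13
        // Building 150 chars with different initial capacities
        Console.WriteLine(GrowthPoints(16, 10, 15).Count);   // 4
        Console.WriteLine(GrowthPoints(100, 10, 15).Count);  // 1
        Console.WriteLine(GrowthPoints(150, 10, 15).Count);  // 0
    }
}
```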
The moral of the story: Estimate whenever you reasonably can. Even a bad estimation will usually provide 10-20% benefit over a StringBuilder constructed with the default values. However, if your strings are very long, you'll want to read the following chapter first.
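In code, estimating means nothing more than passing a guess to the int-taking constructor. The sketch below is only an illustration: the ToCsv helper and the "about 4 chars per value" guess are assumptions of mine, not something measured in this article. The capacity is a hint, not a limit, so a wrong guess still produces a correct string.

```csharp
using System;
using System.Text;

public static class EstimatedBuilder
{
    // Joins the values with commas, preallocating a roughly estimated buffer.
    public static string ToCsv(int[] values)
    {
        // Rough guess: about 4 chars per value (a few digits plus the comma).
        StringBuilder sb = new StringBuilder(values.Length * 4);
        for (int i = 0; i < values.Length; i++)
        {
            if (i > 0)
                sb.Append(',');
            sb.Append(values[i]);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ToCsv(new int[] { 1, 22, 333 })); // 1,22,333
    }
}
```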
How about string lengths? Varying the string component length (X above) with a default StringBuilder has actually pretty little effect. For fairly short strings, the cutoff point is usually a bit lower, but this is largely caused by the fact that more short strings fit into the default StringBuilder buffer of 16 chars. However, the absolute gain here is usually irrelevant since the concatenations on short strings are very fast regardless of the method used.
The pure String-based concatenation slows down as the number of chars in the string grows. The worst scenario is many additions of short strings at the end of a long string. For example, when 2 chars get added at the end of a 500-char string, 99.6% of the memory allocated (500 of 502 chars) is for the old part of the string. Duh!
For StringBuilders, later buffer reallocs are slower, of course. More memory needs to be allocated and more old content needs to be moved around. So, the longer your strings become, the more you'll gain by estimating. For 50 catenations of 50-char strings, a perfect estimation gets you a 50% speed benefit over a StringBuilder with default settings!
However, there's a catch. As the memory allocations grow, the accuracy of your estimation plays a bigger and bigger role. Suppose we have the previously discussed 50x50 char string, resulting in a final size of 2500 chars. The following table lists the execution times with different estimations. Times are relative to the default settings, so that the default is indicated by 100%; smaller figures mean faster execution (less time).
| Initial buffer size | Time |
|---|---|
| 16 (default) | 100 % |
| 50 | 97 % |
| 2000 | 88 % |
| 2499 | 104 % |
| 2500 | 49 % |
| 3000 | 53 % |
| 4000 | 62 % |
| 5000 | 103 % |
| 10000 | 268 % |
As you can see, if you can guess the final size of the resultant string, you're very fast - only 49% of the default execution time. However, make the buffer one char too small (2499 in this example), and you've just ruined your performance. Adding the last element doubles the buffer to 4998 chars, which has quite a lot of overhead in it. In the other direction, even a 60% overalloc at 4000 chars is pretty fast (only 62% of the original execution time). Unfortunately that costs memory, and with strings at the sizes of several megabytes, you probably can't afford that luxury.
On the other hand, you also saw that even a slight underallocation wastes RAM eventually. Nor is the default approach perfect: always doubling the buffer tends to allocate extra space, too. So, a slight overallocation might be both the fastest and the most memory-sparing approach unless you can do a perfect estimate.
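To put numbers on the memory side, the final buffer size for each guess can be modelled. As before, this is a sketch under an assumed growth rule of max(2 × old capacity, required length) and not a measurement of the real class; the names are hypothetical.

```csharp
using System;

public static class CapacityWaste
{
    // Modelled buffer size after appending 'appends' pieces of 'pieceLength' chars.
    public static int FinalCapacity(int initialCapacity, int pieceLength, int appends)
    {
        int capacity = initialCapacity;
        for (int i = 1; i <= appends; i++)
        {
            int length = i * pieceLength;
            if (length > capacity)
                capacity = Math.Max(capacity * 2, length); // assumed growth rule
        }
        return capacity;
    }

    public static void Main()
    {
        int[] guesses = { 16, 50, 2000, 2499, 2500, 3000, 4000 };
        foreach (int guess in guesses)
        {
            int final = FinalCapacity(guess, 50, 50);
            Console.WriteLine("initial {0,5}: final {1,5}, unused {2,5}", guess, final, final - 2500);
        }
    }
}
```

Under this model a guess of 2499 ends up holding 4998 chars, nearly as much as asking for 5000 outright, while a guess of 3000 leaves only 500 chars unused.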
Guessing is hard, but luckily the consequences of a bad guess aren't usually catastrophic. If you can avoid massive overallocation, you're not likely to do much worse than the default settings. In any case, the execution time without StringBuilder is 712% on the scale above; it's pretty unlikely you could do worse than that. :-)
StringBuilder performance is a tricky thing. In the last chapter you saw that a StringBuilder with perfect size estimation can be about 15 times faster than normal string concatenation (49% against 712% on the scale above). But earlier in the article you also saw that even the default StringBuilder beats normal string catenation by a clear margin once the cutoff point of 4-8 concatenations is passed.
Except for the most critical string handling loops, optimizing the process to the point of making perfect estimations isn't usually worth it. For reasons of code clarity you might even want to avoid using StringBuilder when the number of concatenations is only slightly over the cutoff point and you're working with an operation that's not critical to the millisecond level. For example, constructing a ten-part SQL statement is likely to be faster with StringBuilder, but the speed difference is negligible when compared to the execution time of that statement. Then again, once you become familiar with the StringBuilder class, you'll be reading sb.Appends just like you read plus signs.
Posted by Jouni Heikniemi at August 22, 2004 02:30 PM
Obviously, all measurements are heavily dependent on what you measure. I'd expect the concrete numbers to change between different implementations of Java, and I would expect the trends to stay the same.
My rule of thumb is to use StringBuffer when building strings from a variable number of components (essentially, when building a string in a loop). Otherwise, I tend to use String.
Posted by: Antti-Juhani Kaijanaho at August 22, 2004 05:28 PM
Or was that Java? You don't mention the language, and that does look a little strange to be Java, but sufficiently similar to have fooled me.
Posted by: An at August 22, 2004 05:34 PM
Heh. It's C# - the post was in .net category and the post does mention .net Framework, but perhaps that's not clear enough. I added ".net" to the post title as well.
It would be interesting to see similar benchmarks run on Java. Even though numbers are bound to be different, I believe the same principles apply to both worlds.
Posted by: Jouni Heikniemi at August 22, 2004 09:22 PM
Well, in principle dotnet can run Java :) And my point about multiple implementations holds for C# too, since there is at least Mono.
In fact, it was the dotnet references that made me suspect my initial assumption.
(BTW, at least to me saying "dotnet" is clearer than writing it with a real period:)
Posted by: Antti-Juhani Kaijanaho at August 22, 2004 11:20 PM
Jouni Heikniemi,
Nicely explained in plain english...
Thanks.
Posted by: ghenz at June 14, 2005 08:20 PM
From what I've read, Java's string concatenation operator '+' internally uses StringBuffer for expressions that concatenate many strings. So the only useful case for StringBuffer would be when you concatenate strings in a loop, or when you can't do it in one expression (because for some weird reason you need to do other operations in between concatenations ...)
I was wondering whether that is the case with C# as well. It seems a rather obvious optimization, but I couldn't find any mention of it
(in essence, would:
string str = "dsadsa" + "dsadsad" + someString +
"dadasdsad" + someOtherString + "dsaddsasad" + "dsadsadas" + "sadjhsadsadhsa" + someEndingString + "ehh, i got tired";
be done internally by StringBuffer in c# like in java ?
please tell me difference between String(capital S) and string(small s) in C#
Posted by: Atul Yadav at December 6, 2005 03:30 PM
No functional difference. The lowercase one (string) is a language-specific alias for String (which is a class name from the Base Class Library).
Posted by: Jouni at December 6, 2005 09:02 PM
hi
Posted by: sachin at May 9, 2006 12:09 PM
Very good article
Posted by: kartar at May 19, 2006 11:10 AM
Nice Explanation
Posted by: Rajiv at May 25, 2006 09:53 AM
very nice!
I'm curious when a string is created. Is it created when you put "" around it or only in the end.
Does "1" make 2 strings and then combine them into 1? So are these equivalent?
1.
string a = "abcdef" + "ghijkl";
2.
string a = "abcdef";
a += "ghijkl";
How do those compare with:
1.
StringBuilder str = new StringBuilder();
str.Append("abcdef" + "ghijkl");
2.
StringBuilder str = new StringBuilder();
str.Append("abcdef");
str.Append("ghijkl");
Craig, I found that the best way to get such answers is to compile the examples and look at the resulting bytecode with Reflector ...
... essentially, when you do an assignment such as:
String a = "sdas" + "dsada" + myInt.ToString() + "dsadsad"
it gets efficient treatment - there is an internal function used to append the strings together
(it doesn't construct as many intermediate objects as:
a += "sdas"; a += "dsads"; a += myInt.ToString(); ...)
... so unless you concatenate in a loop, or have a screenful of concatenations frequently separated by other processing (or overuse += where + and a line break would do just as fine :) you can give StringBuilder a rest :P
Posted by: Niktu at June 29, 2006 12:27 PM
Very good article. Thank you!
I've always had the gut feeling that StringBuilder would be more efficient in many cases, but didn't have the data to back it up and convince my colleagues to use it.
What about the performance of AppendFormat?
Which is faster:
-- 3 Appends --
sb.Append("fixed string 1");
sb.Append(stringVariable);
sb.Append("fixed string 2");
-- or AppendFormat --
sb.AppendFormat("fixed string 1{0}fixed string 2", stringVariable);
Erik
Posted by: Erik Molekamp at August 29, 2006 07:46 PM
Very nice & clearly explained article!
Posted by: Tomato at September 20, 2006 05:05 PM
Hi, as you seem to be quite experienced in these kinds of problems, I wonder if the principles taught regarding memory allocation are applicable in Java when considering String vs. StringBuffer (our servers still use 1.4, so StringBuilder isn't available yet).
Posted by: JB at October 26, 2006 07:46 PM
Very good and helpful article. I want to thank you for it.
Posted by: Manish at December 20, 2006 02:21 PM
Thank you for this benchmark, it will help me finish one of my projects more quickly
Posted by: Harry at January 31, 2007 04:39 PM
Is this relevant to Java when comparing the StringBuffer and StringBuilder classes?
Posted by: keith holdaway at February 16, 2007 07:59 PM
This is a very, very excellent comparison between String and StringBuilder
Posted by: Nirbhay Kumar Singh at March 5, 2007 10:43 AM
Thanks so much for this--I had an app that was doing a large number of concatenations. In this case I knew what the final string length would be. I preset the starting capacity on all the stringbuilders used in the app--it actually visibly improved speed!
Posted by: Monica at March 14, 2007 03:27 PM
sir,
How to concatenate different values of buttons i a single textbox in C#.net
Agreed that there are certain situations where the StringBuilder is useful, but in general I think it is not necessary and probably overused by most developers.
Check out my reasoning here:
http://codeslammer.wordpress.com/2007/07/07/do-not-use-the-stringbuilder/
Nobody should be using single concats over a bounded list of strings. Consider that most concatenations are bounded and in those cases StringBuilder cannot ever perform better than String.Concat(). The concat code above is flawed in thinking that one would use single concats over and over. Compare the StringBuilder code with this version of concat:
for (int j = 0; j 0;)
sargs[i] = s2;
String.Concat(sargs);
}
This concat runs almost twice as fast as StringBuilder at any loop count. And this is not just true of loops, as long as the counts are bounded, Concat is always faster.
There are definitely times when you should use StringBuilder, but general concatenation of strings is not one of them - no matter the number of strings being concatenated.
Entry form garbled the pasted code, second try:
for (int j = 0; j 0;)
sargs[i] = s2;
String.Concat(sargs);
}
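Both pasted attempts above lost everything between the < and > characters to the comment form. A plausible reading of the idea, offered only as a sketch: the array name sargs and the countdown loop come from the surviving fragments, while the outer repeat count (assumed to mirror the article's loop, lowered here) and the helper name Repeat are my assumptions.

```csharp
using System;

public static class ConcatArray
{
    // Fills an array with 'loops' copies of the piece and joins them in one String.Concat call.
    public static string Repeat(string piece, int loops)
    {
        string[] sargs = new string[loops];
        for (int i = loops; i-- > 0;)
            sargs[i] = piece;
        return String.Concat(sargs);
    }

    public static void Main()
    {
        string s2 = new String('x', 10);
        // Assumed outer loop, shortened from the article's 10 million repetitions.
        for (int j = 0; j < 100000; j++)
            Repeat(s2, 15);
        Console.WriteLine(Repeat("ab", 3)); // ababab
    }
}
```

String.Concat can size the result once because it sees all the pieces up front, which is the bounded-list case the comment is arguing about.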
I've taken the liberty to do some additional testing into the memory usage of various methods.
Might be useful: http://blog.cumps.be/string-concatenation-vs-memory-allocation/
Posted by: David Cumps at September 16, 2007 08:49 PM
Nice article and a useful one.
Posted by: Nachi at January 4, 2008 08:33 AM
good article.
but clear differences are not given.
point #1: If the new resulting string wouldn't fit into the buffer, a new buffer is allocated and the old contents are moved in. In no case new objects need to be created.
is not differentiating stringbuilder with string cause in case of string class also this thing happens.
so clear differentiation must be given
how Strings useful than the StringBuffer
Posted by: subhash at April 16, 2008 05:35 PM