10 Tips for Writing High-Performance Web Applications

Before becoming a workaholic, I used to do a lot of rock climbing. Prior to any big climb, I’d review the route in the guidebook and read the recommendations made by people who had visited the site before. But, no matter how good the guidebook, you need actual rock climbing experience before attempting a particularly challenging climb. Similarly, you can only learn how to write high-performance Web applications when you’re faced with either fixing performance problems or running a high-throughput site.
My personal experience comes from having been an infrastructure Program Manager on the ASP.NET team at Microsoft, running and managing www.asp.net, and helping architect Community Server, which is the next version of several well-known ASP.NET applications (ASP.NET Forums, .Text, and nGallery combined into one platform). I’m sure that some of the tips that have helped me will help you as well.
You should think about the separation of your application into logical tiers. You might have heard of the term 3-tier (or n-tier) physical architecture. These are usually prescribed architecture patterns that physically divide functionality across processes and/or hardware. As the system needs to scale, more hardware can easily be added. There is, however, a performance hit associated with process and machine hopping, thus it should be avoided. So, whenever possible, run the ASP.NET pages and their associated components together in the same application.
Because of the separation of code and the boundaries between tiers, using Web services or remoting will decrease performance by 20 percent or more.
The data tier is a bit of a different beast since it is usually better to have dedicated hardware for your database. However, the cost of process hopping to the database is still high, thus performance on the data tier is the first place to look when optimizing your code.
Before diving in to fix performance problems in your applications, make sure you profile your applications to see exactly where the problems lie. Key performance counters (such as the one that indicates the percentage of time spent performing garbage collections) are also very useful for finding out where applications are spending the majority of their time. Yet the places where time is spent are often quite unintuitive.
There are two types of performance improvements described in this article: large optimizations, such as using the ASP.NET Cache, and tiny optimizations that repeat themselves. These tiny optimizations are sometimes the most interesting. You make a small change to code that gets called thousands and thousands of times. With a big optimization, you might see overall performance take a large jump. With a small one, you might shave a few milliseconds on a given request, but when compounded across the total requests per day, it can result in an enormous improvement.

Performance on the Data Tier

When it comes to performance-tuning an application, there is a single litmus test you can use to prioritize work: does the code access the database? If so, how often? Note that the same test could be applied for code that uses Web services or remoting, too, but I’m not covering those in this article.
If you have a database request required in a particular code path and you see other areas such as string manipulations that you want to optimize first, stop and perform your litmus test. Unless you have an egregious performance problem, your time would be better utilized trying to optimize the time spent in and connected to the database, the amount of data returned, and how often you make round-trips to and from the database.
With that general information established, let’s look at ten tips that can help your application perform better. I’ll begin with the changes that can make the biggest difference.

Tip 1—Return Multiple Resultsets

Review your database code to see if you have request paths that go to the database more than once. Each of those round-trips decreases the number of requests per second your application can serve. By returning multiple resultsets in a single database request, you can cut the total time spent communicating with the database. You’ll be making your system more scalable, too, as you’ll cut down on the work the database server is doing managing requests.
While you can return multiple resultsets using dynamic SQL, I prefer to use stored procedures. It’s arguable whether business logic should reside in a stored procedure, but I think that if logic in a stored procedure can constrain the data returned (reduce the size of the dataset, time spent on the network, and not having to filter the data in the logic tier), it’s a good thing.
Using a SqlCommand instance and its ExecuteReader method to populate strongly typed business classes, you can move the resultset pointer forward by calling NextResult. Figure 1 shows a sample conversation populating several ArrayLists with typed classes. Returning only the data you need from the database will additionally decrease memory allocations on your server.
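Figure 1 is not reproduced here, but a minimal sketch of the pattern might look like this (the stored procedure name, the Supplier and Product classes, and connectionString are all hypothetical; assumes System.Data, System.Data.SqlClient, and System.Collections are imported):

// Minimal sketch: two resultsets from a single stored procedure call.
// "getSuppliersAndProducts", Supplier, and Product are hypothetical names.
using (SqlConnection connection = new SqlConnection(connectionString))
using (SqlCommand command = new SqlCommand("getSuppliersAndProducts", connection))
{
    command.CommandType = CommandType.StoredProcedure;
    connection.Open();
    using (SqlDataReader reader = command.ExecuteReader())
    {
        ArrayList suppliers = new ArrayList();
        while (reader.Read())
            suppliers.Add(new Supplier(reader.GetInt32(0), reader.GetString(1)));

        reader.NextResult(); // move the pointer to the second resultset
        ArrayList products = new ArrayList();
        while (reader.Read())
            products.Add(new Product(reader.GetInt32(0), reader.GetString(1)));
    }
}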

Tip 2—Paged Data Access

The ASP.NET DataGrid exposes a wonderful capability: data paging support. When paging is enabled in the DataGrid, a fixed number of records is shown at a time, and a paging UI is rendered at the bottom of the DataGrid for navigating backwards and forwards through the records.
There’s one slight wrinkle. Paging with the DataGrid requires all of the data to be bound to the grid. For example, your data layer will need to return all of the data and then the DataGrid will filter all the displayed records based on the current page. If 100,000 records are returned when you’re paging through the DataGrid, 99,975 records would be discarded on each request (assuming a page size of 25). As the number of records grows, the performance of the application will suffer as more and more data must be sent on each request.
One good approach to writing better paging code is to use stored procedures. Figure 2 shows a sample stored procedure that pages through the Orders table in the Northwind database. In a nutshell, all you’re doing here is passing in the page index and the page size. The appropriate resultset is calculated and then returned.
In Community Server, we wrote a paging server control to do all the data paging. You’ll see that I am using the ideas discussed in Tip 1, returning two resultsets from one stored procedure: the total number of records and the requested data.
The total number of records returned can vary depending on the query being executed. For example, a WHERE clause can be used to constrain the data returned. The total number of records to be returned must be known in order to calculate the total pages to be displayed in the paging UI. For example, if there are 1,000,000 total records and a WHERE clause is used that filters this to 1,000 records, the paging logic needs to be aware of the total number of records to properly render the paging UI.

Tip 3—Connection Pooling

Setting up the TCP connection between your Web application and SQL Server can be an expensive operation. Developers at Microsoft have been able to take advantage of connection pooling for some time now, allowing them to reuse connections to the database. Rather than setting up a new TCP connection on each request, a new connection is set up only when one is not available in the connection pool. When the connection is closed, it is returned to the pool where it remains connected to the database, as opposed to completely tearing down that TCP connection.
Of course you need to watch out for leaking connections. Always close your connections when you’re finished with them. I repeat: no matter what anyone says about garbage collection within the Microsoft® .NET Framework, always call Close or Dispose explicitly on your connection when you are finished with it. Do not trust the common language runtime (CLR) to clean up and close your connection for you at a predetermined time. The CLR will eventually destroy the class and force the connection closed, but you have no guarantee when the garbage collection on the object will actually happen.
To use connection pooling optimally, there are a couple of rules to live by. First, open the connection, do the work, and then close the connection. It’s okay to open and close the connection multiple times on each request if you have to (optimally you apply Tip 1) rather than keeping the connection open and passing it around through different methods. Second, use the same connection string (and the same thread identity if you’re using integrated authentication). If you don’t use the same connection string, for example customizing the connection string based on the logged-in user, you won’t get the same optimization value provided by connection pooling. And if you use integrated authentication while impersonating a large set of users, your pooling will also be much less effective. The .NET CLR data performance counters can be very useful when attempting to track down any performance issues that are related to connection pooling.
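In code, the open-late, close-early rule falls out naturally from a using block; a minimal sketch (the procedure name and connectionString are placeholders):

// Open late, close early: Dispose() at the end of the using block
// closes the connection and returns it to the pool, even on exceptions.
using (SqlConnection connection = new SqlConnection(connectionString))
using (SqlCommand command = new SqlCommand("myStoredProc", connection)) // hypothetical proc
{
    command.CommandType = CommandType.StoredProcedure;
    connection.Open();
    command.ExecuteNonQuery();
}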
Whenever your application is connecting to a resource, such as a database, running in another process, you should optimize by focusing on the time spent connecting to the resource, the time spent sending or retrieving data, and the number of round-trips. Optimizing any kind of process hop in your application is the first place to start to achieve better performance.
The application tier contains the logic that connects to your data layer and transforms data into meaningful class instances and business processes. For example, in Community Server, this is where you populate a Forums or Threads collection, and apply business rules such as permissions; most importantly it is where the Caching logic is performed.

Tip 4—ASP.NET Cache API

One of the very first things you should do before writing a line of application code is architect the application tier to maximize and exploit the ASP.NET Cache feature.
If your components are running within an ASP.NET application, you simply need to include a reference to System.Web.dll in your application project. When you need access to the Cache, use the HttpRuntime.Cache property (the same object is also accessible through Page.Cache and HttpContext.Cache).
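A minimal read-through sketch against HttpRuntime.Cache (the "Forums" key and LoadForumsFromDatabase are placeholders; assumes System.Web.Caching is imported for Cache.NoSlidingExpiration):

// Check the cache first; load and insert only on a miss.
public static DataTable GetForums()
{
    DataTable forums = HttpRuntime.Cache["Forums"] as DataTable;
    if (forums == null)
    {
        forums = LoadForumsFromDatabase(); // placeholder for your data-tier call
        // Absolute five-minute expiration; ASP.NET may still evict earlier under memory pressure.
        HttpRuntime.Cache.Insert("Forums", forums, null,
            DateTime.Now.AddMinutes(5), Cache.NoSlidingExpiration);
    }
    return forums;
}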
There are several rules for caching data. First, if data can be used more than once it’s a good candidate for caching. Second, if data is general rather than specific to a given request or user, it’s a great candidate for the cache. If the data is user- or request-specific, but is long lived, it can still be cached, but may not be used as frequently. Third, an often overlooked rule is that sometimes you can cache too much. Generally on an x86 machine, you want to run a process with no higher than 800MB of private bytes in order to reduce the chance of an out-of-memory error. Therefore, caching should be bounded. In other words, you may be able to reuse a result of a computation, but if that computation takes 10 parameters, you might attempt to cache on 10 permutations, which will likely get you into trouble. One of the most common support calls for ASP.NET is out-of-memory errors caused by overcaching, especially of large datasets.

Common Performance Myths

One of the most common myths is that C# code is faster than Visual Basic code. There is a grain of truth in this, as it is possible to take several performance-hindering actions in Visual Basic that are not possible to accomplish in C#, such as not explicitly declaring types. But if good programming practices are followed, there is no reason why Visual Basic and C# code cannot execute with nearly identical performance. To put it more succinctly, similar code produces similar results.
Another myth is that codebehind is faster than inline, which is absolutely false. It doesn’t matter where your code for your ASP.NET application lives, whether in a codebehind file or inline with the ASP.NET page. Sometimes I prefer to use inline code as changes don’t incur the same update costs as codebehind. For example, with codebehind you have to update the entire codebehind DLL, which can be a scary proposition.
Myth number three is that components are faster than pages. This was true in Classic ASP when compiled COM servers were much faster than VBScript. With ASP.NET, however, both pages and components are classes. Whether your code is inline in a page, within a codebehind, or in a separate component makes little performance difference. Organizationally, it is better to group functionality logically this way, but again it makes no difference with regard to performance.
The final myth I want to dispel is that any functionality you want to share between two apps should be implemented as a Web service. Web services should be used to connect disparate systems or to provide remote access to system functionality or behaviors. They should not be used internally to connect two similar systems. While easy to use, there are much better alternatives.
The worst thing you can do is use Web services for communicating between ASP and ASP.NET applications running on the same server, which I’ve witnessed all too frequently.

Figure 3 ASP.NET Cache 
There are several great features of the Cache that you need to know about. The first is that the Cache implements a least-recently-used algorithm, allowing ASP.NET to force a Cache purge (automatically removing unused items from the Cache) if memory is running low. Second, the Cache supports expiration dependencies that can force invalidation. These include time, key, and file. Time is often used, but with ASP.NET 2.0 a new and more powerful invalidation type is being introduced: database cache invalidation. This refers to the automatic removal of entries in the cache when data in the database changes. For more information on database cache invalidation, see Dino Esposito’s Cutting Edge column in the July 2004 issue of MSDN® Magazine. For a look at the architecture of the cache, see Figure 3.

Tip 5—Per-Request Caching

Earlier in the article, I mentioned that small improvements to frequently traversed code paths can lead to big, overall performance gains. One of my absolute favorites of these is something I’ve termed per-request caching.
Whereas the Cache API is designed to cache data for a long period or until some condition is met, per-request caching simply means caching the data for the duration of the request. A particular code path is accessed frequently on each request but the data only needs to be fetched, applied, modified, or updated once. This sounds fairly theoretical, so let’s consider a concrete example.
In the Forums application of Community Server, each server control used on a page requires personalization data to determine which skin to use, the style sheet to use, as well as other personalization data. Some of this data can be cached for a long period of time, but some data, such as the skin to use for the controls, is fetched once on each request and reused multiple times during the execution of the request.
To accomplish per-request caching, use the ASP.NET HttpContext. An instance of HttpContext is created with every request and is accessible anywhere during that request from the HttpContext.Current property. The HttpContext class has a special Items collection property; objects and data added to this Items collection are cached only for the duration of the request. Just as you can use the Cache to store frequently accessed data, you can use HttpContext.Items to store data that you’ll use only on a per-request basis. The logic behind this is simple: data is added to the HttpContext.Items collection when it doesn’t exist, and on subsequent lookups the data found in HttpContext.Items is simply returned.
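In code, the pattern is only a few lines; here’s a sketch with a hypothetical GetCurrentSkin() standing in for the expensive personalization lookup:

// Fetch once per request, then serve every subsequent caller from HttpContext.Items.
public static string GetSkin()
{
    string skin = HttpContext.Current.Items["CurrentSkin"] as string;
    if (skin == null)
    {
        skin = GetCurrentSkin(); // expensive personalization lookup, done once
        HttpContext.Current.Items["CurrentSkin"] = skin; // lives only for this request
    }
    return skin;
}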

Tip 6—Background Processing

The path through your code should be as fast as possible, right? There may be times when you find yourself performing expensive tasks on each request or once every n requests. Sending out e-mails or parsing and validating incoming data are just a few examples.
When tearing apart ASP.NET Forums 1.0 and rebuilding what became Community Server, we found that the code path for adding a new post was pretty slow. Each time a post was added, the application first needed to ensure that there were no duplicate posts, then it had to parse the post using a “badword” filter, parse the post for emoticons, tokenize and index the post, add the post to the moderation queue when required, validate attachments, and finally, once posted, send e-mail notifications out to any subscribers. Clearly, that’s a lot of work.
It turns out that most of the time was spent in the indexing logic and sending e-mails. Indexing a post was a time-consuming operation, and it turned out that the built-in System.Web.Mail functionality would connect to an SMTP server and send the e-mails serially. As the number of subscribers to a particular post or topic area increased, it would take longer and longer to perform the AddPost function.
Indexing and sending e-mail didn’t need to happen on each request. Ideally, we wanted to batch this work together, indexing 25 posts at a time or sending all the e-mails every five minutes. We decided to use the same code I had used to prototype database cache invalidation for what eventually got baked into Visual Studio® 2005.
The Timer class, found in the System.Threading namespace, is a wonderfully useful, but less well-known class in the .NET Framework, at least for Web developers. Once created, the Timer will invoke the specified callback on a thread from the ThreadPool at a configurable interval. This means you can set up code to execute without an incoming request to your ASP.NET application, an ideal situation for background processing. You can do work such as indexing or sending e-mail in this background process too.
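A minimal sketch wired up in Global.asax, with a hypothetical SendQueuedEmails method doing the batched work:

// Keep a static reference so the Timer isn't garbage collected.
private static System.Threading.Timer emailTimer;

protected void Application_Start(object sender, EventArgs e)
{
    // Fire every five minutes on a ThreadPool thread, starting five minutes from now.
    emailTimer = new System.Threading.Timer(delegate(object state)
    {
        SendQueuedEmails(); // placeholder for the batched background work
    }, null, TimeSpan.FromMinutes(5), TimeSpan.FromMinutes(5));
}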
There are a couple of problems with this technique, though. If your application domain unloads, the timer instance will stop firing its events. In addition, since the CLR has a hard gate on the number of threads per process, you can get into a situation on a heavily loaded server where timers may not have threads to complete on and can be somewhat delayed. ASP.NET tries to minimize the chances of this happening by reserving a certain number of free threads in the process and only using a portion of the total threads for request processing. However, if you have lots of asynchronous work, this can be an issue.
There is not enough room to go into the code here, but you can download a digestible sample at www.rob-howard.net. Just grab the slides and demos from the Blackbelt TechEd 2004 presentation.
One simple method is to use the ThreadPool to queue the work, as shown below. With any background process, make sure you capture exceptions so that you know when something has gone wrong:

ThreadPool.QueueUserWorkItem(delegate { MyCode.Import(); });
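As written, though, an exception thrown by Import would go unobserved, because there is no request for it to surface in. A slightly fuller sketch (LogError is a hypothetical logging method):

ThreadPool.QueueUserWorkItem(delegate
{
    try
    {
        MyCode.Import();
    }
    catch (Exception ex)
    {
        // Never let an exception escape a pool thread; log it instead.
        LogError(ex); // hypothetical logger
    }
});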

Tip 7—Page Output Caching and Proxy Servers

ASP.NET is your presentation layer (or should be); it consists of pages, user controls, server controls (HttpHandlers and HttpModules), and the content that they generate. If you have an ASP.NET page that generates output, whether HTML, XML, images, or any other data, and you run this code on each request and it generates the same output, you have a great candidate for page output caching.
By simply adding this line to the top of your page
<%@ OutputCache Duration="60" VaryByParam="none" %>
you can effectively generate the output for this page once and reuse it multiple times for up to 60 seconds, at which point the page will re-execute and the output will once again be added to the ASP.NET Cache. This behavior can also be accomplished using some lower-level programmatic APIs. There are several configurable settings for output caching, such as the VaryByParam attribute just shown. VaryByParam happens to be required, and allows you to specify the HTTP GET or HTTP POST parameters used to vary the cache entries. For example, default.aspx?Report=1 or default.aspx?Report=2 could be output-cached by simply setting VaryByParam="Report". Additional parameters can be named by specifying a semicolon-separated list.
Many people don’t realize that when the Output Cache is used, the ASP.NET page also generates a set of HTTP headers that downstream caching servers, such as those used by Microsoft Internet Security and Acceleration Server or by Akamai, can understand. When HTTP Cache headers are set, the documents can be cached on these network resources, and client requests can be satisfied without having to go back to the origin server.
Using page output caching, then, does not make your application more efficient, but it can potentially reduce the load on your server as downstream caching technology caches documents. Of course, this can only be anonymous content; once it’s downstream, you won’t see the requests anymore and can’t perform authentication to prevent access to it.

Tip 8—Run IIS 6.0 (If Only for Kernel Caching)

If you’re not running IIS 6.0 (Windows Server 2003), you’re missing out on some great performance enhancements in the Microsoft Web server. In Tip 7, I talked about output caching. In IIS 5.0, a request comes through IIS and then to ASP.NET. When caching is involved, an HttpModule in ASP.NET receives the request, and returns the contents from the Cache.
If you’re using IIS 6.0, there is a nice little feature called kernel caching that doesn’t require any code changes to ASP.NET. When a request is output-cached by ASP.NET, the IIS kernel cache receives a copy of the cached data. When a request comes from the network driver, a kernel-level driver (no context switch to user mode) receives the request, and if cached, flushes the cached data to the response, and completes execution. This means that when you use kernel-mode caching with IIS and ASP.NET output caching, you’ll see unbelievable performance results. At one point during the Visual Studio 2005 development of ASP.NET, I was the program manager responsible for ASP.NET performance. The developers did the magic, but I saw all the reports on a daily basis.
The kernel mode caching results were always the most interesting. The common characteristic was network saturation by requests/responses and IIS running at about five percent CPU utilization.
It was amazing! There are certainly other reasons for using IIS 6.0, but kernel mode caching is an obvious one.

Tip 9—Use Gzip Compression

While not necessarily a server performance tip (since you might see CPU utilization go up), using gzip compression can decrease the number of bytes sent by your server. This gives the perception of faster pages and also cuts down on bandwidth usage. Depending on the data sent, how well it can be compressed, and whether the client browsers support it (IIS will only send gzip compressed content to clients that support gzip compression, such as Internet Explorer 6.0 and Firefox), your server can serve more requests per second. In fact, just about any time you can decrease the amount of data returned, you will increase requests per second.
The good news is that gzip compression is built into IIS 6.0 and is much better than the gzip compression used in IIS 5.0. Unfortunately, when attempting to turn on gzip compression in IIS 6.0, you may not be able to locate the setting on the properties dialog in IIS. The IIS team built awesome gzip capabilities into the server, but neglected to include an administrative UI for enabling it.
To enable gzip compression, you have to spelunk into the innards of the XML configuration settings of IIS 6.0 (which isn’t for the faint of heart). By the way, the credit goes to Scott Forsyth of OrcsWeb who helped me figure this out for the www.asp.net servers hosted by OrcsWeb.
Rather than include the procedure in this article, just read the article by Brad Wilson at IIS6 Compression. There’s also a Knowledge Base article on enabling compression for ASPX, available at Enable ASPX Compression in IIS. It should be noted, however, that dynamic compression and kernel caching are mutually exclusive on IIS 6.0 due to some implementation details.

Tip 10—Server Control View State

View state is a fancy name for ASP.NET storing some state data in a hidden input field inside the generated page. When the page is posted back to the server, the server can parse, validate, and apply this view state data back to the page’s tree of controls. View state is a very powerful capability since it allows state to be persisted with the client and it requires no cookies or server memory to save this state. Many ASP.NET server controls use view state to persist settings made during interactions with elements on the page, for example, saving the current page that is being displayed when paging through data.
There are a number of drawbacks to the use of view state, however. First of all, it increases the total payload of the page both when served and when requested. There is also an additional overhead incurred when serializing or deserializing view state data that is posted back to the server. Lastly, view state increases the memory allocations on the server.
Several server controls, the most well known of which is the DataGrid, tend to make excessive use of view state, even in cases where it is not needed. View state is enabled by default, but if you don’t need it, you can turn it off at the control or page level. Within a control, you simply set the EnableViewState property to false, or you can set it globally within the page using this setting:
<%@ Page EnableViewState="false" %>
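At the control level, it’s just a property on the control declaration, for example:
<asp:DataGrid id="MyGrid" runat="server" EnableViewState="false" />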
If you are not doing postbacks in a page or are always regenerating the controls on a page on each request, you should disable view state at the page level.

Conclusion

I’ve offered you some tips that I’ve found useful for writing high-performance ASP.NET applications. As I mentioned at the beginning of this article, this is more a preliminary guide than the last word on ASP.NET performance. (More information on improving the performance of ASP.NET apps can be found at Improving ASP.NET Performance.) Only through your own experience can you find the best way to solve your unique performance problems. However, during your journey, these tips should provide you with good guidance. In software development, there are very few absolutes; every application is unique.
It’s also worth taking a look at Mads Kristensen’s blog:

Server Side Viewstate

Background

ViewState is a feature provided by ASP.Net that stores information for server controls used on a web page. The primary advantage of this feature from a developer perspective is that it keeps the values of controls on a form. If the user forgot a single required field on a form, ASP.Net can put back the other values they had entered rather than forcing the developer to repopulate them. The information is encoded into a string and placed in a hidden form field called "__VIEWSTATE" that is sent to the client with the response (you can see this by using View Source in your browser when you view an .aspx page). ASP.Net by default also calculates a "hash" of the information and adds it to the ViewState, so on a postback it can validate that the data has not been corrupted. That is a basic description of what viewstate is and how it is implemented; this article will focus on where the viewstate information is stored.

The Problem

After converting our web site from ASP to ASP.Net, we began to have problems with viewstate exceptions. We would receive "The View State is invalid for this page and might be corrupted" about 50 times a day. We searched for known causes for this problem and none of them applied to our situation. After several network traces and calls to Microsoft Support, we determined that when network retransmissions occurred the client would sometimes have a corrupt version of the HTML page (we were never able to determine why TCP did not detect the corruption). If the corruption happened to occur inside the viewstate hidden field value, it would cause the exceptions (the recomputed hash on the server did not match the original hash). We also determined that several browsers (such as older WebTV clients) could not handle large hidden form fields. Our solution was to save the viewstate for the pages on the web server using the Session object instead of the HTML result sent to the client.

Benefits of Server Side ViewState

Avoid corruption problems
Avoid issues with browsers that limit the maximum size of hidden fields
Reduced client page size – since the viewstate information is stored on the server, it is not sent in the response to the client, which saves download time and bandwidth

Disadvantages of Server Side ViewState

Uses server-side memory for the in-process session object
Not as obvious to developers how much space is used by viewstate

The Solution

The idea is to override the default handling of viewstate provided by ASP.Net and store the viewstate information on the server instead. Originally I was not enthusiastic about the idea of rewriting some of the basic functionality provided by ASP.Net, but it was very easy to do and eliminated our problems.

Since the viewstate is specific to each user, we decided to store the information using the Session object. Our servers use in-process session state (stored in web server memory), so we wanted to limit the amount of viewstate we had to store for each user. We cannot simply store the last viewstate for each user, however, because a user can visit a page in Internet Explorer, hit the back button to a previous page and push a button (which would require the viewstate for that page). This requires us to support multiple versions of viewstate for each user. The number of page viewstate values to store is a tradeoff between memory and support for pressing the back button a number of times. The example solution stores the last 5 viewstate values for each user.

The viewstate values are stored in a circular array using the modulus operator (% in C#). Every time a user requests a new page on the site, the code increments a request number and uses that number as a key to the proper viewstate in the array on the server. The code uses the modulus of the request number and the size of the array to determine the index into the circular array. This starts storing viewstate for the first page (request number 1) in index 1 of the array (1 % 5 = 1), request 2 would use index 2, and so on. This continues until the request number hits 5, which wraps back around and uses index position 0. This is a simple way to use a fixed-size array efficiently. The code stores the request number in the "__VIEWSTATE" field that was previously used to store the actual viewstate information.

The code consults the web.config file to determine whether it should use server side or client side viewstate. (Changes to the web.config will drop the in-process Session object for each user, which will drop all current viewstate, so make changes to this during off-peak times.)
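A stripped-down sketch of that circular array logic (field and method names follow the description above, but the real viewStateSvrMgr class does more, such as the web.config check):

// Inside the viewStateSvrMgr class: a fixed-size circular buffer of viewstate strings.
private const int VIEW_STATE_NUM_PAGES = 5;
private string[] viewStates = new string[VIEW_STATE_NUM_PAGES];
private long requestNumber = 0;

// Store the viewstate and return the key written into the __VIEWSTATE hidden field.
public long SaveViewState(string state)
{
    requestNumber++;
    viewStates[(int)(requestNumber % VIEW_STATE_NUM_PAGES)] = state;
    return requestNumber;
}

// Returns null if the slot has already been overwritten by a newer request.
public string GetViewState(long key)
{
    if (requestNumber - key >= VIEW_STATE_NUM_PAGES)
        return null;
    return viewStates[(int)(key % VIEW_STATE_NUM_PAGES)];
}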

Instructions for Using

The first step in implementing the solution is to download the two C# files, below, that implement the server side functionality and add them to your ASP.Net project.

BasePage.cs

This file includes the class BasePage, which is used as a parent class for all pages on the site and can be used without modification. The use of a custom base class has many advantages for an ASP.Net site; in this case it is used to override the default viewstate handling provided by System.Web.UI.Page. The two methods that this class overrides are LoadPageStateFromPersistenceMedium(), which is called to reload the viewstate on a postback, and SavePageStateToPersistenceMedium(), which is called to save the viewstate before sending the results to the client. The overrides simply check if server side viewstate is enabled and call methods to do the work if it is (otherwise they just invoke the original Page class methods). The LoadViewState() method checks for the __VIEWSTATE field and attempts to convert it to a long (which would fail if the viewstate was actually the original encoded string). If it successfully retrieves the value, it uses another class, viewStateSvrMgr, to retrieve the viewstate from the session. The SaveViewState() method uses the viewStateSvrMgr class to save the viewstate to the session, and writes the hidden form field __VIEWSTATE to the client page with the request number key. If you are using the .Net 1.1 framework, you should change the name used for the hidden field, which is stored in the constant VIEW_STATE_FIELD_NAME in this file (anything other than __VIEWSTATE).
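In outline, the two overrides look something like this (a sketch based on the description above; the LosFormatter serialization and the error handling are simplified, and assume System.IO and System.Web.UI are imported):

public class BasePage : System.Web.UI.Page
{
    protected override object LoadPageStateFromPersistenceMedium()
    {
        viewStateSvrMgr mgr = viewStateSvrMgr.GetViewStateSvrMgr();
        if (!mgr.ServerSideEnabled)
            return base.LoadPageStateFromPersistenceMedium();

        // __VIEWSTATE now holds a request-number key, not the encoded state.
        long key = long.Parse(Request.Form["__VIEWSTATE"]);
        return new LosFormatter().Deserialize(mgr.GetViewState(key));
    }

    protected override void SavePageStateToPersistenceMedium(object viewState)
    {
        viewStateSvrMgr mgr = viewStateSvrMgr.GetViewStateSvrMgr();
        if (!mgr.ServerSideEnabled)
        {
            base.SavePageStateToPersistenceMedium(viewState);
            return;
        }

        StringWriter writer = new StringWriter();
        new LosFormatter().Serialize(writer, viewState);
        long key = mgr.SaveViewState(writer.ToString());
        RegisterHiddenField("__VIEWSTATE", key.ToString());
    }
}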

All web pages for the site must derive from the BasePage class. The default file WebForm1.aspx.cs in an ASP.Net project has the original class declaration "public class WebForm1 : System.Web.UI.Page"; this is changed instead to "public class WebForm1 : ServerSideViewstate.BasePage".


viewStateSvrMgr.cs

This provides the majority of the code required for server side viewstate. A constant at the top of the file, VIEW_STATE_NUM_PAGES, controls the number of viewstate values kept for each user. This can be adjusted as necessary to conserve memory, but if the user hits the back button more times than this array size and does a postback, the viewstate data will not be present, which will affect page behavior for controls and events. This file also uses a technique that stores an object in session rather than storing multiple independent session variables. The GetViewStateSvrMgr() method checks if an object of this class has already been created and stored in session or if it needs to do so. The class can then simply store local variables that are all kept in that single session object, which helps avoid polluting the Session with lots of independent keys that have no relation to each other. Pages that need the services of the class call GetViewStateSvrMgr() and get an instance of the class they can use to get and save viewstate information. The property ServerSideEnabled abstracts the check for whether server side viewstate is enabled (via the web.config setting). The methods SaveViewState() and GetViewState() manage the circular array, which is a member of the class and hence is stored in the object that is placed in the session object for this user.
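The session-backed singleton lookup it describes might be sketched like this (the "ViewStateSvrMgr" session key is an assumption):

// One object in session instead of many loose keys.
public static viewStateSvrMgr GetViewStateSvrMgr()
{
    viewStateSvrMgr manager =
        HttpContext.Current.Session["ViewStateSvrMgr"] as viewStateSvrMgr;
    if (manager == null)
    {
        manager = new viewStateSvrMgr();
        HttpContext.Current.Session["ViewStateSvrMgr"] = manager;
    }
    return manager;
}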

The last step is to add the following to the web.config file (place it just above the </configuration> entry):

<appSettings>
<add key="ServerSideViewState" value="true" />
</appSettings>


Testing

Once the files have been added to the project, the pages have been changed to derive from the BasePage class, and the changes to web.config have been made, the site will start to use the session to store viewstate. This can be validated by visiting one of the .aspx pages on the site and using the View Source option of the browser: search for the __VIEWSTATE field and notice that it is no longer a large encoded string, but a request number that is the key to the location of the viewstate for the page on the server.

Conclusion

This solution still supports the EnableViewState property for pages and controls, and the viewstate size can still be seen with tracing; this technique simply changes where the viewstate is stored. The original drivers for using Server Side ViewState were corruption problems and a lack of browser support for large hidden form fields. The described technique solves both of those problems, while also reducing the page size sent to the client.

ServerSideFiles.zip (2.44 kb)

NVARCHAR versus VARCHAR

SQL Server provides both datatypes to store character information. For the most part the two datatypes are identical in how you work with them within SQL Server or from an application. The difference is that nvarchar stores Unicode data, which is what you need for multilingual data in your database tables; other languages have an extended set of character codes that need to be saved, and this datatype allows for it. If your database will not be storing multilingual data you should use the varchar datatype instead, because nvarchar takes twice as much space as varchar to accommodate those extended character codes. For example, the five-character string 'hello' takes 5 bytes as varchar but 10 bytes as nvarchar.

HTML Editor

I’ve been doing a little research on the use of an HTML editor for development of an internal project using ASP.NET.

Aloha Editor (http://www.aloha-editor.org)

This HTML editor is the bee’s knees; take a look at the website and check it out, it’s fantastic.

alohaeditor-0.23.9.zip (11.30 mb)

FCKeditor (http://www.fckeditor.net/)

The editor is distributed under the GPL, LGPL and MPL open source licenses. This triple licensing model avoids incompatibility with other open source licenses, making it possible to integrate FCKeditor whenever you want.

FreeTextBox (http://freetextbox.com/default/)

FreeTextBox is the most-used HTML editor for ASP.NET. It is compatible with IE on the PC, and Mozilla and Firefox on all platforms. It is used in major Open Source projects such as Community Server and DotNetNuke as well as excellent packages like Smarter Mail.
Buy FreeTextBox
To purchase a FreeTextBox Professional license or Distribution License with Source code, please click on one of the "Purchase" buttons below. Currently, PayPal is our payment provider. Once you click a "buy" button you will be redirected to PayPal, where you can pay using your PayPal account or a credit card.
Professional License – $49.99
The Professional License allows FreeTextBox to be used in all applications you or your company creates on as many servers as you host, but the license does not include rights to re-distribute FreeTextBox 3.0 Professional features in applications you or your company sell. If you sell or distribute products with FreeTextBox 3.0 Professional features, you must purchase a Distribution license (see below). Upon purchase, you will receive a unique FreeTextBox.lic file which will unlock the Professional features of FreeTextBox 3.0. Professional License Agreement

TinyMCE (http://tinymce.moxiecode.com/index.php)

Open Source under LGPL

CuteEditor (http://cutesoft.net/ASP.NET+WYSIWYG+Editor/default/)

Licensing information: http://cutesoft.net/How+to+Buy/Licensing+Info/default/
CuteEditor can be purchased from http://cutesoft.net/ASP.NET+WYSIWYG+Editor/Purchase+CuteEditor+for+.NET/default/

RadEditor (http://www.telerik.com/products/aspnet-ajax/controls/editor/overview/)

RadEditor for ASP.NET AJAX is the leading WYSIWYG web editor. It can replace any TextBox with an intuitive Word®-like editor, which enables even non-technical users to visually manage HTML content. Also available for SharePoint 2007 (MOSS) and DotNetNuke.
Cost $999

In my research I could not find a BSD-licensed HTML editor.

The ones I have used before are
    RadEditor, very good, also very good support and backup.
    TinyMCE, very rich in features and easy to use.
    FCKeditor, a little harder to use, and now with extra licenses to allow for more general use
   

CSV file parser and writer in C# (Part 3)

This is the last part of an article series exploring reading and writing CSV files with C#/.NET.

Part 1 covered converting the contents of a DataTable into CSV format. Part 2 explained how to read a CSV file back into a DataTable.

Finally we will look at some test cases for use with NUnit, a very important tool that I'll describe in more detail in an upcoming article. There is quite a bit of code, but it is all very simple and easy to understand. Each method marked with [Test] will be called by NUnit and run one of the CSV related methods in turn. Assert statements will then check if the returned values are valid or not.

First, let's check the parser methods. There are several tests here that all test something different; they all pass a CSV formatted string to the CsvParser.Parse method and check the returned DataTable.

using System;
using System.Data;
using System.IO;
using NUnit.Framework;

namespace CsvParser
{
        [TestFixture]
        public class TestCsvParser
        {
                [Test]
                public void UnquotedLine()
                {
                        DataTable table = CsvParser.Parse("one,two,three");
                        Assert.IsNotNull(table);
                        Assert.AreEqual(3, table.Columns.Count);
                        Assert.AreEqual(1, table.Rows.Count);
                        Assert.AreEqual("one", table.Rows[0][0]);
                        Assert.AreEqual("two", table.Rows[0][1]);
                        Assert.AreEqual("three", table.Rows[0][2]);
                }

                [Test]
                public void QuotedLine()
                {
                        DataTable table = CsvParser.Parse("\"one\",\"two\",\"three\"");
                        Assert.IsNotNull(table);
                        Assert.AreEqual(3, table.Columns.Count);
                        Assert.AreEqual(1, table.Rows.Count);
                        Assert.AreEqual("one", table.Rows[0][0]);
                        Assert.AreEqual("two", table.Rows[0][1]);
                        Assert.AreEqual("three", table.Rows[0][2]);
                }

                [Test]
                public void TwoLines()
                {
                        DataTable table = CsvParser.Parse("one,two,three\nfour,five,six\n");
                        Assert.IsNotNull(table);
                        Assert.AreEqual(3, table.Columns.Count);
                        Assert.AreEqual(2, table.Rows.Count);
                        Assert.AreEqual("one", table.Rows[0][0]);
                        Assert.AreEqual("two", table.Rows[0][1]);
                        Assert.AreEqual("three", table.Rows[0][2]);
                        Assert.AreEqual("four", table.Rows[1][0]);
                        Assert.AreEqual("five", table.Rows[1][1]);
                        Assert.AreEqual("six", table.Rows[1][2]);
                }

                [Test]
                public void QuotedMultilineValue()
                {
                        DataTable table = CsvParser.Parse(
                                "\"one\n\"\"beer\"\"\",\"\"\"two\"\"\nbeers\",\"three\nbeers\"");
                        Assert.IsNotNull(table);
                        Assert.AreEqual(3, table.Columns.Count);
                        Assert.AreEqual(1, table.Rows.Count);
                        Assert.AreEqual("one\n\"beer\"", table.Rows[0][0]);
                        Assert.AreEqual("\"two\"\nbeers", table.Rows[0][1]);
                        Assert.AreEqual("three\nbeers", table.Rows[0][2]);
                }

                [Test]
                public void Headers()
                {
                        DataTable table = CsvParser.Parse(
                                "First,Last,Email\nAndreas,Knab,knabar@yahoo.com", true);
                        Assert.IsNotNull(table);
                        Assert.AreEqual(3, table.Columns.Count);
                        Assert.AreEqual(1, table.Rows.Count);
                        Assert.AreEqual("Andreas", table.Rows[0]["First"]);
                        Assert.AreEqual("Knab", table.Rows[0]["Last"]);
                        Assert.AreEqual("knabar@yahoo.com", table.Rows[0]["Email"]);
                }

                [Test]
                public void TrailingNewlines()
                {
                        DataTable table = CsvParser.Parse("test\n\n\n\n\n\n");
                        Assert.IsNotNull(table);
                        Assert.AreEqual(1, table.Columns.Count);
                        Assert.AreEqual(6, table.Rows.Count);
                        Assert.AreEqual("test", table.Rows[0][0]);
                }

                [Test]
                public void Newlines()
                {
                        DataTable table = CsvParser.Parse("test1\x0Atest2\x0Dtest3\x0D\x0Atest4");
                        Assert.IsNotNull(table);
                        Assert.AreEqual(1, table.Columns.Count);
                        Assert.AreEqual(4, table.Rows.Count);
                        Assert.AreEqual("test1", table.Rows[0][0]);
                        Assert.AreEqual("test2", table.Rows[1][0]);
                        Assert.AreEqual("test3", table.Rows[2][0]);
                        Assert.AreEqual("test4", table.Rows[3][0]);
                }
        }
}

Second, the opposite route – testing the CsvWriter. Since it takes quite a bit of effort to completely populate a DataTable, I'll cheat and create the DataTable using the CsvParser, which has been tested separately and is known to work.

using System;
using System.Data;
using System.IO;
using NUnit.Framework;
using CsvParser;

namespace CsvWriter
{
        [TestFixture]
        public class TestCsvWriter
        {
                [Test]
                public void PlainText()
                {
                        string s = "one,two,three\n";
                        DataTable table = CsvParser.Parse(s);
                        string t = CsvWriter.WriteToString(table, false, false);
                        Assert.AreEqual(s, t);
                }

                [Test]
                public void QuotedText()
                {
                        string s = "\"one\",\"two\",\"three\"\n";
                        DataTable table = CsvParser.Parse(s);
                        string t = CsvWriter.WriteToString(table, false, true);
                        Assert.AreEqual(s, t);
                }

                [Test]
                public void MultiLineText()
                {
                        string s = "\"one\nline\",\"two\nline\",\"three\nline\"\n";
                        DataTable table = CsvParser.Parse(s);
                        string t = CsvWriter.WriteToString(table, false, false);
                        Assert.AreEqual(s, t);
                }

                [Test]
                public void Headers()
                {
                        string s = "First,Last,Email\nAndreas,Knab,knabar@yahoo.com\n";
                        DataTable table = CsvParser.Parse(s, true);
                        string t = CsvWriter.WriteToString(table, true, false);
                        Assert.AreEqual(s, t);
                }
        }
}

And that's it! There are more special cases that could be tested, especially invalid input like non-CSV files, but technically everything can be interpreted as CSV, so the question would be what output to expect.

As I mentioned earlier, NUnit is worth a closer look, so check back for more.

CSV file parser and writer in C# (Part 2)

This is the second part of an article series exploring reading and writing CSV files with C#/.NET.

Part 1 covered converting the contents of a DataTable into CSV format; this part explains reading a CSV file back into a DataTable.

First some namespace imports and the namespace declaration for this project:

using System;
using System.Collections;
using System.Data;
using System.Text;
using System.IO;

namespace CsvParser
{

Like the CsvWriter class, all methods in the CsvParser class are static. There are four parser methods that return a DataTable from either a string or a text stream, expecting a header line in the CSV source or not. Only one method has a real body; the others just adjust the parameter signature. The implementation is simple: the method reads one "row" of the CSV source at a time and populates a row in the DataTable. The real meat is the private class CsvStream, explained below. There is one utility method that returns an unused column name for a DataTable, in case there are no headers in the CSV source or the headers are not unique.

        public class CsvParser
        {
                public static DataTable Parse(string data, bool headers)
                {
                        return Parse(new StringReader(data), headers);
                }
               
                public static DataTable Parse(string data)
                {
                        return Parse(new StringReader(data));
                }

                public static DataTable Parse(TextReader stream)
                {
                        return Parse(stream, false);
                }

                public static DataTable Parse(TextReader stream, bool headers)
                {
                        DataTable table = new DataTable();
                        CsvStream csv = new CsvStream(stream);
                        string[] row = csv.GetNextRow();
                        if (row == null)
                                return null;
                        if (headers)
                        {
                                foreach (string header in row)
                                {
                                        if (header != null && header.Length > 0 && !table.Columns.Contains(header))
                                                table.Columns.Add(header, typeof(string));
                                        else
                                                table.Columns.Add(GetNextColumnHeader(table), typeof(string));
                                }
                                row = csv.GetNextRow();
                        }
                        while (row != null)
                        {
                                while (row.Length > table.Columns.Count)
                                        table.Columns.Add(GetNextColumnHeader(table), typeof(string));
                                table.Rows.Add(row);
                                row = csv.GetNextRow();
                        }
                        return table;
                }

                private static string GetNextColumnHeader(DataTable table)
                {
                        int c = 1;
                        while (true)
                        {
                                string h = "Column" + c++;
                                if (!table.Columns.Contains(h))
                                        return h;
                        }
                }

The CsvStream class does the actual work: it reads the CSV source one character at a time and returns meaningful chunks of decoded data, namely data items and rows.

                private class CsvStream
                {
                        private TextReader stream;                 
                       
                        public CsvStream(TextReader s)
                        {
                                stream = s;
                        }

                        public string[] GetNextRow()
                        {
                                ArrayList row = new ArrayList();
                                while (true)
                                {
                                        string item = GetNextItem();
                                        if (item == null)
                                                return row.Count == 0 ? null : (string[])row.ToArray(typeof(string));
                                        row.Add(item);
                                }
                        }

                        private bool EOS = false;
                        private bool EOL = false;

                        private string GetNextItem()
                        {
                                if (EOL)
                                {
                                        // previous item was last in line, start new line
                                        EOL = false;
                                        return null;
                                }

                                bool quoted = false;
                                bool predata = true;
                                bool postdata = false;
                                StringBuilder item = new StringBuilder();
                               
                                while (true)
                                {
                                        char c = GetNextChar(true);
                                        if (EOS)
                                                return item.Length > 0 ? item.ToString() : null;

                                        if ((postdata || !quoted) && c == ',')
                                                // end of item, return
                                                return item.ToString();
                                       
                                        if ((predata || postdata || !quoted) && (c == '\x0A' || c == '\x0D'))
                                        {
                                                // we are at the end of the line, eat newline characters and exit
                                                EOL = true;
                                                if (c == '\x0D' && GetNextChar(false) == '\x0A')
                                                        // new line sequence is 0D0A
                                                        GetNextChar(true);
                                                return item.ToString();
                                        }

                                        if (predata && c == ' ')
                                                // whitespace preceding data, discard
                                                continue;

                                        if (predata && c == '"')
                                        {
                                                // quoted data is starting
                                                quoted = true;
                                                predata = false;
                                                continue;
                                        }

                                        if (predata)
                                        {
                                                // data is starting without quotes
                                                predata = false;
                                                item.Append(c);
                                                continue;
                                        }

                                        if (c == '"' && quoted)
                                        {
                                                if (GetNextChar(false) == '"')
                                                        // double quotes within quoted string means add a quote       
                                                        item.Append(GetNextChar(true));
                                                else
                                                        // end-quote reached
                                                        postdata = true;
                                                continue;
                                        }

                                        // all cases covered, character must be data
                                        item.Append(c);
                                }
                        }

                        private char[] buffer = new char[4096];
                        private int pos = 0;
                        private int length = 0;

                        private char GetNextChar(bool eat)
                        {
                                if (pos >= length)
                                {
                                        length = stream.ReadBlock(buffer, 0, buffer.Length);
                                        if (length == 0)
                                        {
                                                EOS = true;
                                                return '\0';
                                        }
                                        pos = 0;
                                }
                                if (eat)
                                        return buffer[pos++];
                                else
                                        return buffer[pos];
                        }
                }
        }
}

And that's about it. In an upcoming part of this article I'll share some NUnit test cases for these classes.

CSV file parser and writer in C# (Part 1)

An issue that comes up quite frequently is how to read and write comma-separated value (CSV) files in C#. Surprisingly, the .NET libraries have no built-in support for this, and the usual solution of using an OleDb connection to the CSV file with Microsoft Excel's database driver is convoluted and not cross-platform.

CSV files have a very simple structure (source):

  • Each record is one line (with exceptions)
  • Fields are separated with commas
  • Leading and trailing space-characters adjacent to comma field separators are ignored
  • Fields with embedded commas must be delimited with double-quote characters
  • Fields that contain double quote characters must be surrounded by double-quotes, and the embedded double-quotes must each be represented by a pair of consecutive double quotes (see the example after this list)
  • A field that contains embedded line-breaks must be surrounded by double-quotes
  • Fields with leading or trailing spaces must be delimited with double-quote characters
  • Fields may always be delimited with double quotes
  • The first record in a CSV file may be a header record containing column (field) names
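For example, a single record holding the name John "JJ" Smith, the location Portland, OR, and a two-line note would be written like this, with the embedded quotes doubled and the comma- and newline-containing fields quoted:

"John ""JJ"" Smith","Portland, OR","first line
second line"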

In this article, I'll provide simple but fully functional code to read and write a CSV file according to these rules. In memory, the data will be represented as a DataTable, which makes it easy to process; for storage in the file system or transfer over a network, the CSV data will be stored in a String or a Stream.


First, the easy part: writing a DataTable to a CSV file.

using System.Data;
using System.IO;

public class CsvWriter
{
        public static string WriteToString(DataTable table, bool header, bool quoteall)
        {
                StringWriter writer = new StringWriter();
                WriteToStream(writer, table, header, quoteall);
                return writer.ToString();
        }

        public static void WriteToStream(TextWriter stream, DataTable table, bool header, bool quoteall)
        {
                if (header)
                {
                        for (int i = 0; i < table.Columns.Count; i++)
                        {
                                WriteItem(stream, table.Columns[i].Caption, quoteall);
                                if (i < table.Columns.Count - 1)
                                        stream.Write(',');
                                else
                                        stream.Write('\n');
                        }
                }
                foreach (DataRow row in table.Rows)
                {
                        for (int i = 0; i < table.Columns.Count; i++)
                        {
                                WriteItem(stream, row[i], quoteall);
                                if (i < table.Columns.Count - 1)
                                        stream.Write(',');
                                else
                                        stream.Write('\n');
                        }
                }
        }

        private static void WriteItem(TextWriter stream, object item, bool quoteall)
        {
                if (item == null)
                        return;
                string s = item.ToString();
                if (quoteall || s.IndexOfAny("\",\x0A\x0D".ToCharArray()) > -1)
                        stream.Write("\"" + s.Replace("\"", "\"\"") + "\"");
                else
                        stream.Write(s);
        }
}

The methods are static since the whole conversion is done in one method call; there is no need to create object instances or keep any state around.

WriteToString returns the CSV file in a string; it is just a wrapper around the more generic WriteToStream method. Both methods take a DataTable and two Boolean flags that indicate whether to write a header line (using the column captions of the DataTable) and whether to quote all values instead of only the values that need quoting.
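As a quick illustration of how the two flags play together, here is a minimal sketch (the table contents are made up):

using System;
using System.Data;

class CsvWriterDemo
{
        static void Main()
        {
                DataTable table = new DataTable();
                table.Columns.Add("Name");
                table.Columns.Add("Comment");
                table.Rows.Add("Alice", "said \"hi\", then left");

                // header = true writes the column captions first,
                // quoteall = false quotes only the values that need it
                Console.WriteLine(CsvWriter.WriteToString(table, true, false));
                // prints:
                // Name,Comment
                // Alice,"said ""hi"", then left"
        }
}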

Since CSV files do not work well for binary data, your DataTable should not contain any, although the resulting file would still be valid and could be read back in.

WriteToStream just loops through all rows and columns of the DataTable and writes the individual data items to the output stream.

The WriteItem method finally encodes an individual data item and, if necessary or requested, adds quotes around it.

In upcoming parts of this article series, I'll provide and explain code to read a CSV file back into a DataTable, and how to use NUnit to test everything.

Using BlogEngine.net as a General Purpose Content Management System

So I keep running into the same problem: I am building a small website for somebody, and I need to provide them with a way to update the content of their site so I don't have to. Basically, I need a lightweight and flexible content management system that is easy to use.

If The Shoe Fits…

When I first thought of a lightweight CMS, I thought of Graffiti. It sounded like exactly what I needed, so I downloaded the express edition and started evaluating it. It seemed like a nice product and all, but it is not free for commercial use ($399 is the cheapest commercial license), and I can't afford that price tag when building small websites.

Enter BlogEngine.net, my favorite blogging platform. There, I said it. I use it to run my blog, and I am constantly tinkering with the site because I enjoy using BlogEngine.net so much.

I realized that BlogEngine.net has all of the key pieces I needed for my lightweight CMS:

  1. A WYSIWYG Editor
  2. A Metaweblog interface
  3. Tons of extensibility

Basic Idea

I decided to base my CMS implementation on the concept of pages. Most blog engines have two distinct types of content: pages and posts. Posts are the typical type of content that becomes part of your blog's feed, whereas pages are usually static content that can be anything outside of a blog post (for example, an 'About Me' page). BlogEngine.net already has everything I need to get the content of a page created and persisted in a data store (it supports XML and SQL Server out of the box). I decided to write a web control that I can place on any web page to include the contents of a given page from the data store.

I made a control called PageViewer which you can place on the page like this:

<blog:PageViewer ID="view" runat="server" DisplayTitle="false" PageId="167eb7f3-135b-4f90-9756-be25ec10f14c" />

This control basically just looks up the page using the given id (this functionality is all provided by the existing BlogEngine.Core library) and displays its content. Here is the rendering logic:

BlogEngine.Core.Page page = null;
if (PageId != Guid.Empty)
        page = BlogEngine.Core.Page.GetPage(PageId);

if (page != null)
{
        ServingEventArgs arg = new ServingEventArgs(page.Content, ServingLocation.SinglePage);
        BlogEngine.Core.Page.OnServing(page, arg);

        if (arg.Cancel)
                Page.Response.Redirect("error404/", true);

        if (DisplayTitle)
        {
                writer.Write("<h1>");
                writer.Write(page.Title);
                writer.Write("</h1>");
        }

        writer.Write("<div>");
        writer.Write(arg.Body);
        writer.Write("</div>");
}

This code is pretty straightforward: all it does is get an instance of the page and then display its title in an <h1> tag and its body in a <div> tag. This logic comes straight from the page retrieval code that already exists in BlogEngine.net. This web control is pretty much the only new code I had to write; the rest of the project mostly involves moving files around and removing the parts of the BlogEngine.net framework that I don't need.
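For completeness, here is a sketch of how that fragment might be wrapped into the control. The WebControl base class and the RenderContents override are my assumptions about the layout, not necessarily how the actual control is structured:

using System;
using System.Web.UI;
using System.Web.UI.WebControls;
using BlogEngine.Core;

public class PageViewer : WebControl
{
        // these properties are set from the markup shown earlier
        public Guid PageId { get; set; }
        public bool DisplayTitle { get; set; }

        protected override void RenderContents(HtmlTextWriter writer)
        {
                BlogEngine.Core.Page page = null;
                if (PageId != Guid.Empty)
                        page = BlogEngine.Core.Page.GetPage(PageId);
                if (page == null)
                        return;

                // let BlogEngine extensions inspect or cancel the request
                ServingEventArgs arg = new ServingEventArgs(page.Content, ServingLocation.SinglePage);
                BlogEngine.Core.Page.OnServing(page, arg);
                if (arg.Cancel)
                        Page.Response.Redirect("error404/", true);

                if (DisplayTitle)
                {
                        writer.Write("<h1>");
                        writer.Write(page.Title);
                        writer.Write("</h1>");
                }
                writer.Write("<div>");
                writer.Write(arg.Body);
                writer.Write("</div>");
        }
}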

Armed with this control, we are ready to start converting the static pages from the old version of the website into BlogEngine.net pages, which can be stored and retrieved using the BlogEngine.Core classes.

Themes

It’s also worth noting that there are many themes available for BlogEngine.net, for example at themes4blogengine.net.

Wiki

I've been wanting to set up a wiki for some time, and I think I may have found one worth looking at: 'Screw Turn'. It's not the most inspiring name, but I have read good reviews, so in my spare time I'll be taking a look at what it can do and how I can implement it: www.screwturn.eu