SQL Server 2008 Table-Valued Parameters and C# Custom Iterators: A Match Made In Heaven!

The new Table-Valued Parameter (TVP) feature is perhaps the most significant T-SQL enhancement in SQL Server 2008. Many people have already discovered how to use this feature for passing entire sets of rows with a typed schema definition along from one stored procedure to another on the server. This is something that has not been previously possible using either table variables or temp tables, although temp tables can of course be shared between multiple stored procedures executing on the same connection. Conveniently, TVPs can be passed freely between stored procedures and user-defined functions (UDFs) on the server.

A less widely-known capability of TVPs—yet arguably their most compelling facet—is the ability to marshal an entire set of rows across the network, from your ADO.NET client to your SQL Server 2008 database, with a single stored procedure call (only one round-trip) that accepts a single table-valued parameter. Developers have resorted to a variety of clever hacks over the years to reduce multiple round-trips for processing multiple rows—including XML, delimited text, or even (gasp) accepting hundreds (up to 2100!) of parameters. But special logic then needs to be implemented for packaging and unpackaging the parameter values on both sides of the wire. Worse, the code to implement that logic is often gnarly, and tends to impair developer productivity. None of those techniques even come close to the elegance and simplicity of using TVPs as a native solution to this problem.

I won’t cover the basics of TVPs, since many folks know them already and there are numerous blog posts already on the Web that explain the basics (not to mention Chapter 2 in my book) What I will show is how to implement TVPs in ADO.NET client code for passing an entire DataTable object (all its rows and columns) to SQL Server with one stored procedure call.

If you’re a business-object person and not a DataSet/DataTable person, I’ll also show you how to do the same thing with an entire business object collection using C# custom iterators. There are few places (if any) other than this blog post that show the necessary steps to implement TVPs against business object collections (it’s not even covered in my book!). So let’s get started.

Preparing the SQL Server Database

Before diving into the C# code, we need to set up our database for a simple Order Entry scenario. That means creating an Order table and a related OrderDetail table. We’ll also need an OrderUdt and OrderDetailUdt user-defined table type (UDTT) to base our TVPs on, and a single InsertOrders stored procedure that accepts header and detail TVPs containing typed rows to be bulk inserted into the underlying tables:

CREATE TABLE [Order](
 OrderId int NOT NULL,
 CustomerId int NOT NULL,
 OrderedAt date NOT NULL,
 CreatedAt datetime2(0) NOT NULL,
  CONSTRAINT PK_Order PRIMARY KEY CLUSTERED (OrderId ASC))
GO

CREATE TABLE [OrderDetail](
 OrderId int NOT NULL,
 LineNumber int NOT NULL,
 ProductId int NOT NULL,
 Quantity int NOT NULL,
 Price money NOT NULL,
 CreatedAt datetime2(0) NOT NULL,
  CONSTRAINT PK_OrderDetail PRIMARY KEY CLUSTERED (OrderId ASC, LineNumber ASC))
GO

CREATE TYPE OrderUdt AS TABLE(
 OrderId int,
 CustomerId int,
 OrderedAt date)
GO

CREATE TYPE OrderDetailUdt AS TABLE(
 OrderId int,
 LineNumber int,
 ProductId int,
 Quantity int,
 Price money)
GO

CREATE PROCEDURE InsertOrders(
 @OrderHeaders AS OrderUdt READONLY,
 @OrderDetails AS OrderDetailUdt READONLY)
AS
 BEGIN

    -- Bulk insert order header rows from TVP
    INSERT INTO [Order]
     SELECT *, SYSDATETIME() FROM @OrderHeaders

    -- Bulk insert order detail rows from TVP
    INSERT INTO [OrderDetail]
     SELECT *, SYSDATETIME() FROM @OrderDetails

 END
GO

Notice how the schemas of the UDTTs and the actual tables themselves are almost, but not quite, identical. The CreatedAt column in the two tables is not present in the UDTTs, since we won’t be accepting the client’s date and time—as we shouldn’t trust it, not knowing the time zone, not being in sync with the server and other clients, etc. So we’ll accept all required column values for orders and order details from the client, except for CreatedAt, for which our stored procedure will call the SYSDATETIME function to provide the current date and time based on the server clock (just like the GETDATE function, but returns as a datetime2 data type rather than a datetime data type). Independent of CreatedAt, the header table has an OrderedAt column which will be accepted from the client as a date value (the “time-less” date and enhanced datetime2 are new data types in SQL Server 2008 that I’ll cover in a future post).

Passing a DataTable to SQL Server

The easiest way to call this stored procedure from a .NET client is to use a DataTable. You simply pass the entire populated DataTable object as the parameter value and set the parameter’s SqlDbType property to SqlDbType.Structured. Everything else is plain old vanilla ADO.NET:

var headers = new DataTable();
headers.Columns.Add("OrderId", typeof(int));
headers.Columns.Add("CustomerId", typeof(int));
headers.Columns.Add("OrderedAt", typeof(DateTime));

var details = new DataTable();
details.Columns.Add("OrderId", typeof(int));
details.Columns.Add("LineNumber", typeof(int));
details.Columns.Add("ProductId", typeof(int));
details.Columns.Add("Quantity", typeof(decimal));
details.Columns.Add("Price", typeof(int));

headers.Rows.Add(new object[] { 6, 51, DateTime.Today });
details.Rows.Add(new object[] { 6, 1, 12, 2, 15.95m });
details.Rows.Add(new object[] { 6, 2, 57, 1, 59.99m });
details.Rows.Add(new object[] { 6, 3, 36, 10, 8.50m });

headers.Rows.Add(new object[] { 7, 51, DateTime.Today });
details.Rows.Add(new object[] { 7, 1, 23, 2, 79.50m });
details.Rows.Add(new object[] { 7, 2, 78, 1, 3.25m });

using (var conn = new SqlConnection("Data Source=.;Initial Catalog=MyDb;Integrated Security=True;"))
{
  conn.Open();
  using (var cmd = new SqlCommand("InsertOrders", conn))
  {
    cmd.CommandType = CommandType.StoredProcedure;

    var headersParam = cmd.Parameters.AddWithValue("@OrderHeaders", headers);
    var detailsParam = cmd.Parameters.AddWithValue("@OrderDetails", details);

    headersParam.SqlDbType = SqlDbType.Structured;
    detailsParam.SqlDbType = SqlDbType.Structured;

    cmd.ExecuteNonQuery();
  }
  conn.Close();
}

The magic here is that the one line of code that invokes ExecuteNonQuery sends all the data for two orders with five details from the client over to the InsertOrders stored procedure on the server. The data is structured as a strongly-typed set of rows on the client, and winds up as a strongly-typed set of rows on the server in a single round-trip, without any extra work on our part.

This example passes a generic DataTable object to a TVP, but you can also pass a strongly typed DataTable or even a DbDataReader object (that is connected to another data source) to a TVP in exactly the same way. The only requirement is that the columns of the DataTable or DbDataReader correspond to the columns defined for the UDTT that the TVP is declared as.

So TVPs totally rock, eh? But hold on, the TVP story gets even better…

Passing a Collection of Objects to SQL Server

What if you’re working with collections populated with business objects rather than DataTable objects populated with DataRow objects? You might not think at first that business objects and TVPs could work together, but the fact is that they can… and quite gracefully too. All you need to do is implement the IEnumerable<SqlDataRecord> interface in your collection class. This interface requires your collection class to supply a C# custom iterator (sorry, VB .NET doesn’t support custom iterators) method named GetEnumerator which ADO.NET will call for each object contained in the collection when you invoke ExecuteNonQuery. Here are the detailed steps:

First we’ll define the OrderHeader and OrderDetail classes and properties, similar to the way we created the DataTable objects and columns before:

public class OrderHeader
{
  public int OrderId { get; set; }
  public int CustomerId { get; set; }
  public DateTime OrderedAt { get; set; }
}

public class OrderDetail
{
  public int OrderId { get; set; }
  public int LineNumber { get; set; }
  public int ProductId { get; set; }
  public int Quantity { get; set; }
  public decimal Price { get; set; }
}

Ordinarily, List<OrderHeader> and List<OrderDetail> objects might be suitable for serving as collections of OrderHeader and OrderDetail objects in our application. But these collections they won’t suffice on their own as input values for TVPs because List<T> doesn’t implement IEnumerable<SqlDataRecord>. We need to add that ourselves. So we’ll define OrderHeaderCollection and OrderDetailCollection classes that inherit List<OrderHeader> and List<OrderDetail> respectively, and also implement IEnumerable<SqlDataRecord> to “TVP-enable” them:

public class OrderHeaderCollection : List<OrderHeader>, IEnumerable<SqlDataRecord>
{
  IEnumerator<SqlDataRecord> IEnumerable<SqlDataRecord>.GetEnumerator()
  {
    var sdr = new SqlDataRecord(
     new SqlMetaData("OrderId", SqlDbType.Int),
     new SqlMetaData("CustomerId", SqlDbType.Int),
     new SqlMetaData("OrderedAt", SqlDbType.Date));

    foreach (OrderHeader oh in this)
    {
      sdr.SetInt32(0, oh.OrderId);
      sdr.SetInt32(1, oh.CustomerId);
      sdr.SetDateTime(2, oh.OrderedAt);

      yield return sdr;
    }
  }
}

public class OrderDetailCollection : List<OrderDetail>, IEnumerable<SqlDataRecord>
{
  IEnumerator<SqlDataRecord> IEnumerable<SqlDataRecord>.GetEnumerator()
  {
    var sdr = new SqlDataRecord(
     new SqlMetaData("OrderId", SqlDbType.Int),
     new SqlMetaData("LineNumber", SqlDbType.Int),
     new SqlMetaData("ProductId", SqlDbType.Int),
     new SqlMetaData("Quantity", SqlDbType.Int),
     new SqlMetaData("Price", SqlDbType.Money));

    foreach (OrderDetail od in this)
    {
      sdr.SetInt32(0, od.OrderId);
      sdr.SetInt32(1, od.LineNumber);
      sdr.SetInt32(2, od.ProductId);
      sdr.SetInt32(3, od.Quantity);
      sdr.SetDecimal(4, od.Price);

      yield return sdr;
    }
  }
}

I’ll only explain the OrderHeaderCollection class; you can then infer how the OrderDetailCollection class–or any of your own collection classes–implements the custom iterator needed to support TVPs.

First, again, it inherits List<OrderHeader>, so an OrderHeaderCollection object is everything that a List<OrderHeader> object is. This means implicitly, by the way, that it also implements IEnumerable<OrderHeader>, which is what makes any sequence “foreach-able” or “LINQ-able”. But in addition, it explicitly implements IEnumerable<SqlDataRecord> which means it also has a customer iterator method for ADO.NET to consume when an instance of this collection class is assigned to a SqlDbType.Structured parameter for piping over to SQL Server with a TVP.

Every enumerable class requires a matching enumerator method, so not surprisingly implementing IEnumerable<SqlDataRecord> requires providing a GetEnumerator method that returns an IEnumerator<SqlDataRecord>. This method first initializes a new SqlDataRecord object with a schema that matches the UDTTs that the TVPs are declared as. It then enters a loop that iterates all the elements in the collection (possible because List<OrderHeader> implicitly implements IEnumerable<OrderHeader>). On the first iteration, it sets the column property values of the SqlDataRecord object to the property values of the first OrderHeader element, and then issues the magic yield return statement. By definition, any method (like this one) which returns IEnumerator<T> and has a yield return statement in it, is a custom iterator method that is expected to return a sequence of T objects until the method execution path completes (in this case, when the foreach loop finishes).

The crux of this is that we are never calling this method directly. Instead, when we invoke ExecuteNonQuery to run a stored procedure with a SqlDbType.Structured parameter (that is, a TVP), ADO.NET expects the collection passed for the parameter value to implement IEnumerable<SqlDataRecord> so that IEnumerable<SqlDataRecord>.GetEnumerator can be called internally to fetch each new record for piping over to the server. When the first element is fetched from the collection, GetEnumerator is entered, the SqlDataRecord is initialized and is then populated with values using the SetInt32 and SetDateTime methods (there’s a SetXXX method for each data type). That SqlDataRecord “row” is then pushed into the pipeline to the server by yield return. When the next element is fetched from the collection, the GetEnumerator method resumes from the point that it yield returned the previous element, rather than entering GetEnumerator again from the top. This means the SqlDataRecord gets initialized with schema information only once, while its population with one element after another is orchestrated by the controlling ADO.NET code for ExecuteNonQuery that actually ships one SqlDataRecord after another to the server..

The actual ADO.NET code is 100% identical to the code at the top of the post that works with DataTable objects. Substituting a collection for a DataTable object requires no code changes and works flawlessly, provided the collection implements IEnumerator<SqlDataRecord> and provides a GetEnumerator method that maps each object instance to a SqlDataRecord and each object property to a column defined by the UDTT that the TVP is declared as.

How freaking c-o-o-l is that?

Understanding Generic Delegates and Lambda Expressions with LINQ Extension Methods

Generic delegates and lambda expressions underpin LINQ. If you’ve been struggling to get your arms around these new concepts, this post should help demystify matters for you.

Of course, there are a host of other new .NET language features that enable LINQ as well. These include extension methods, type inference, anonymous types, object initializers, and custom iterators, just to name a few! In this post, I’ll address generic delegates and lambda expressions as they pertain to LINQ (that is, as they pertain to the extension methods provided in .NET 3.5 to support LINQ), and I’ll cover some of the other new language features in future posts.

LINQ Query Syntax

There is simply no LINQ without generic delegates and lambda expressions, yet it would seem at first that you can write simple LINQ queries without seeing or touching either a generic delegate or a lambda expression. Here’s a simple example that queries from a collection of customers in custs. In the where clause, we use a boolean allCustomers variable to control whether all customers or only USA customers will be returned.

var allCustomers = false;

var q =
from c in custs
where c.Country == "USA" || allCustomers == true
orderby c.FirstName
select c;

Where’s the generic delegate and the lambda expression in this LINQ query? Oh they’re there all right, but the LINQ query syntax hides those details from the code.

LINQ Method Call Syntax

In reality, the compiler (C# in this case, but VB .NET as well) is supporting a sugar-coated syntax that is strikingly similar to an ordinary SQL query in the RDBMS world, but which really amounts to a chained set of method calls.

What this means is that there is really no such thing as a where clause in LINQ–it’s a Where extension method call.

And guess what? There’s no such thing as an orderby clause either–it’s an OrderBy extension method call, chained to the end of the Where method call.

Meaning that the previous query is treated by the compiler exactly as if you had coded it using LINQ method call syntax as follows:

var q = custs
.Where(c => c.Country == "USA" || allCustomers == true)
.OrderBy(c => c.FirstName);

So this code looks more “conventional,” eh? We’re invoking the Where method on the collection of customers, and then invoking the OrderBy method on the result returned by Where, and obtaining our final filtered and sorted results in q.

LINQ query syntax is obviously cleaner than LINQ method call syntax, but it’s important to understand that only LINQ method call syntax fully implements the complete set of LINQ Standard Query Operators (SQOs), so that you will sometimes be forced to use method call syntax. Also realize that it’s common and normal to use combinations of the two in a single query (I’ll show examples of this using .Concat and other methods in future posts).

Two questions arise at this point:

1) How did the customer collection, which is a List<Customer> in our example, get a Where and OrderBy method?

2) What is the => operator in the method parameters, and what does it do (and how the heck do you pronounce it)?

The answer to the first question is simple: Extension methods. I’ll cover those in more detail in a future post, but for now just understand that the .NET framework 3.5 “extends” any class that implements IEnumerable<T> (virtually any collection, such as our List<Customer>) with the Where and OrderBy methods, as well as many other methods that enable LINQ.

Generic Delegates

The second question is much juicier. First, the => is the lambda operator, and many people verbalize it as ‘goes to’. So ‘c => …’ is read aloud as ‘c goes to…’

To understand lambdas, you need to understand generic delegates and anonymous methods. And to best understand those is to review delegates and generics themselves. If you’ve been programming .NET for any length of time, you’re certain to have encountered delegates. And as of .NET 2.0 (VS 2005), generics have been an important language feature for OOP as well.

A delegate is a data type that defines the “shape” (signature) of a method call. Once you have defined a delegate, you can declare a variable as that delegate. At runtime, the variable can then be pointed to (and then invoke) any available method that matches the signature defined by the delegate. EventHandler is a typical example of a delegate defined by the .NET framework. Here is how .NET defines the EventHandler delegate:

public delegate void EventHandler(object sender, EventArgs e);

This delegate is suitable for pointing to any method that returns no value (void), and accepts two parameters (object sender and EventArgs e); in other words, a typical event handler that has no special event arguments to be passed.

A generic delegate is nothing more than an ordinary delegate, but supports generic type declarations for achieving strongly-typed definitions of the delegate at compile time. This is the same way generics are used for classes. For example, just as the framework provides a List<T> class so that you can have a List of anything, it now also provides a small set of generic delegates named Func that can be used to stongly type many different method signatures; essentially, any method that takes from zero to 4 parameters of any type (T1 through T4) and returns an object of any type (TResult):

public delegate TResult Func<TResult>();
public delegate TResult Func<T, TResult>(T arg);
public delegate TResult Func<T1, T2, TResult>(T1 arg1, T2 arg2);
public delegate TResult Func<T1, T2, T3, TResult>(T1 arg1, T2 arg2, T3 arg3);
public delegate TResult Func<T1, T2, T3, T4, TResult>(T1 arg1, T2 arg2, T3 arg3, T4 arg4);

Delegates as Parameters

With these generic Func delegates baked into the .NET framework, LINQ queries can use them as parameters to extension methods such as Where and OrderBy. Let’s examine one overload of the Where extension method:

public static IEnumerable<TSource> Where<TSource>(
  this IEnumerable<TSource> source,
  Func<TSource, bool> predicate);

What does this mean? Let’s break it down. And to better suit our particular example and simplify the explanation, let’s substitute Customer for TSource:

1) It’s a method called Where<Customer>.

2) It returns an IEnumerable<Customer> sequence.

3) It’s an extension method available to operate on any IEnumerable<Customer>instance (as denoted by the special this keyword in the signature), such as the List<Customer> in our example. The this keyword used in this context means that an extension method for the type following the this keyword is being defined, rather than definining what looks like the first parameter to the method.

4) It takes a single predicate parameter of type Func<Customer, bool>.

Number 4 is key. Func<Customer, bool> is the second Func generic delegate shown above, Func<T, TResult>, where T is Customer and TResult is bool (Boolean). The Func<Customer, bool> delegate refers to any method that accepts a Customer object and returns a bool result.  This means that the Where<Customer> method takes a delegate (function pointer) that points to another method which actually implements the filtering logic. This other method will receive each Customer object as the LINQ query iterates the source sequence List<Customer>, apply the desired filtering criteria on it, and return a true or false result in the bool return value that controls whether this particular Customer object should be selected by the query.

How does all that get expressed simply with: .Where(c => c.Country == “USA” || allCustomers == true)?

The best way to answer that question is to first demonstrate two other ways of invoking the Where method. The first is to explicitly provide a delegate instance of Func<Customer, bool> that points to another method called IsCustomerCountryUSA:

private void MainMethod()
{
    var filterMethod = new Func<Customer, bool>(IsCustomerCountryUSA);
    var q = custs.Where(filterMethod);
}

private bool IsCustomerCountryUSA(Customer c)
{
return (c.Country == "USA");
}

The Where method takes a single parameter, which is a new instance of the Func<TSource, T> generic delegate that points to a method named IsCustomerCountryUSA. This works because IsCustomerCountryUSA accepts a Customer object and returns a bool result, which is the signature defined by Func<Customer, bool> expected as a parameter by Where<Customer>.  Inside the IsCustomerCountryUSA, filtering logic is applied. Note that because we have created IsCustomerCountryUSA as a completely separate method, it cannot access the allCustomers variable defined locally in the calling method (the variable that controls whether all or only USA customers should be selected, as shown at the beginning of this post). Furthermore, because the signature of the IsCustomerCountryUSA method is fixed as determined by Func<Customer, bool>, we can’t just pass the allCustomers variable along as another parameter either.  Thus, this approach would require you to define the allCustomers variable as a private member if you wanted to include it in the filtering logic of the IsCustomerCountryUSA method.

Anonymous Methods

Anonymous methods were introduced at the same time as generics in .NET 2.0 (VS 2005). They are useful in cases such as this, where the method we’re creating for the delegate only needs to be called from one place, so the entire method body is simply coded in-line. Since it’s coded in-line at the one and only place it’s invoked, the method doesn’t need to be given a name, hence the term anonymous method.

To refactor the delegate instance version of the code to use an anonymous method, replace the Func parameter passed to the Where method with the actual method signature and body as follows:

var q = custs.Where(delegate(Customer c) { return c.Country == "USA" || allCustomers == true; });

The keyword delegate indicates you’re pointing the delegate parameter Func<Customer, bool> expected by the Where method to an anonymous method. The anonymous method isn’t named, as mentioned, but it must match the Func<Customer, bool> signature, meaning it must accept a Customer object parameter and return a bool result. The explicit signature following the delegate keyword defines the expected single Customer object parameter, and the expected bool result is inferred from the fact that the method returns an expression that resolves to a bool value. If the anonymous method used any other signature or returned anything other than a bool result, the compiler would fail to build your code  because the Where<Customer> method is defined to accept only a Func<Customer, bool> delegate.

So now the method body containing the filtering logic that was formerly coded in the separate IsCustomerCountryUSA method appears in braces right after the anonymous method signature. But there’s another huge advantage here besides saving the overhead of declaring a separate method for the filtering logic. Because anonymous methods appear inside of the method that calls them, they magically gain access to the local variables of the calling method. Thus, this implementation can also test the allCustomers variable, even though it’s defined in the calling method which would normally be inaccessible to the method being called. Because of this feature, the allCustomers variable defined as a local variable in the calling method can be tested by the anonymous method as part of the filtering criteria. That means it’s not necessary to define private fields to share local variables defined in a method with an anonymous method that the method calls, as demonstrated here). The result is less code, and cleaner code (though you should be aware that the compiler pulls some fancy tricks to make this possible, and there is a degree of additional runtime overhead involved when accessing local variables of the calling method from the anonymous method being called).

Lambda Expressions

Lambdas were introduced in .NET 3.5 (VS 2008) as a more elegant way of implementing anonymous methods, particularly with LINQ. When a lambda is used to represent an anonymous method containing a single expression which returns a single result, that anonymous method is called a lambda expression. Let’s first refactor our query by converting it from an anonymous method to a lambda:

var q = custs.Where((Customer c) => { return c.Country == "USA" || allCustomers == true; });

All we did was drop the delegate keyword and add the => (“goes to”) operator. The rest looks and works the same as before.

This lambda consists of a single expression that returns a single result, which qualifies it to be refactored as a lambda expression like this:

var q = custs.Where(c => c.Country == "USA" || allCustomers == true);

This final version again works exactly the same way, but has become very terse. First, the Customer data type in the signature isn’t needed, since it’s inferred as the TSource in Func<TSource, bool>. The parentheses surrounding the signature are also not needed, so they’re dropped too. Secondly, since lambda expressions by definition consist of only a single statement, the opening and closing braces for the method body aren’t needed and also get dropped. Finally, since lambda expressions by definition return a single result, the return keyword isn’t needed and is also dropped.

Besides offering an even terser syntax, lambda expressions provide two important additional benefits over anonymous methods. First, as just demonstrated, the data type of the parameter in the signature needn’t be specified since it’s inferred automatically by the compiler. That means that anonymous types (such as those often created by projections defined in the select clause of a LINQ query) can also be passed to lambda expressions. Anonymous types cannot be passed to any other type of method, so this capability was absolutely required to support anonymous types with LINQ. Second, lambda expressions can be used to build expression trees at runtime, which essentially means that your query code gets converted into query data which can then be processed by a particular LINQ implementation at runtime. For example, LINQ to SQL builds an expression tree from the lambda expressions in your query, and generates the appropriate T-SQL for querying SQL Server from the expression tree. Future posts will cover anonymous types and expression trees in greater detail.

To Sum It All Up

So this final lambda expression version invokes the Where method on the List<Customer> collection in custs. Because List<Customer> implements IEnumerable<T>, the Where extension method which is defined for any IEnumerable<T> is available to be invoked on it. The Where<Customer> method expects a single parameter of the generic delegate type Func<Customer, bool>. Such a parameter can be satisfied by the in-line lambda expression shown above, which accepts a Customer object as c =>, and returns a bool result by applying a filtering condition on the customer’s Country property and the calling method’s allCustomers variable.

And as expained at the beginning of this post, the method syntax query using our final lambda expression is exactly what the compiler would generate for us if we expressed it using this query syntax:

var q =
    from c in custs
    where c.Country == "USA" || allCustomers == true
    select c;

I hope this helps clear up any confusion there is surrounding lambda expressions in .NET 3.5.

Happy coding!

Disable the Browser Back Button in your ASP.NET Applications

If you’ve just come to accept the Back button as a necessary evil, I’ve got a few simple solutions for you that prevent the user from backing up past a designated “one-time only” page in your application.

For example, if you have a wizard with multiple steps, it’s fine to let the user click the browser’s Back (and then Forward) buttons all they want as they navigate from page to page. But when they click Finish on the last page, and you save information to the database, you don’t want them to be able to click Back to the last wizard page and then click Finish again. This would result in repeating the server side processing (in this example, saving to the database again, but in another scenario this could mean charging a credit card multiple times). The proper approach would be for the user to start the wizard over from the beginning, if the intent is really to submit another new request that’s entirely separate from the last one.

Disabling the browser cache for the page is not a real solution (though I’ll show it anyway), because the user gets an ugly message when they click Back. In this post, I’ll show you how to add just a tiny bit of code to achieve an ideal solution that effectively disables the Back button.

Solution 1: Ask the user nicely and hope they listen

Most developers think there’s no way to disable the browser’s Back button and just don’t even try at something they believe to be futile. And that’s why you see messages like “Please don’t click the browser’s Back button after confirming your purchase as doing so may result in a duplicate charge to your credit card” on practically every e-commerce site on the Web. Surely there’s a better way…

Solution 2: Disable caching for the page

OK, so let’s just not cache the page with the Finish button, the one that does all the work that we don’t want repeated should the user click Back and then resubmit. We can do that by overriding the OnPreInit method (same thing as, but more efficient than, handling the PreInit event) and setting the various caching-related properties on the Response object to prevent the browser from caching the page:

protected override void OnPreInit(EventArgs e)
{
    base.OnPreInit(e);

    Response.Buffer = true;
    Response.ExpiresAbsolute = DateTime.Now.AddDays(-1d);
    Response.Expires = -1500;
    Response.CacheControl = "no-cache";
    Response.Cache.SetCacheability(HttpCacheability.NoCache);
}

See http://forums.asp.net/t/1060173.aspx and http://www.codeproject.com/KB/aspnet/NoCaching.aspx for more information on these property settings.

One problem solved (no more risk that the user will click Finish twice), but a new problem is created (user gets “Webpage has expired” error if they click Back):

Expired

The screen shot above is from Internet Explorer; different browsers (Chrome, FireFox, etc.) handle the condition by prompting the user in different ways. But the bottom line is that this solution results in a degraded user experience because we’re not really disabling the Back button. We’re just preventing the page from being stored in the browser history, so that when the user does click the Back button, they get the expired error/prompt instead of the page they expected to go back to. Is there a way around this? You bet! Read on…

Solution 3: Disable the Back button completely

This is the simplest solution that disables the Back button completely. How simple? Just add onload=”window.history.forward();” to the <body> tag in the page(s) that you don’t want the user to be able to click Back to.

<body onload="window.history.forward();">

With this solution, the page tries to move forward in history as soon as it loads. Of course, for newly served pages this has no effect, since there’s no forward history. But when the user clicks the Back button to reload the same page from the browser history, the page immediately forces the browser to pop back to where the user came from, as though they clicked Forward right after clicking Back. But it happens before the previous page even has a chance to load because the “forward” command is invoked on the onload event of the previous page. Including this in every page in your application will disable the Back button completely (for postbacks too), which may not be something you want. Since users do expect to be able to use the Back button, you should let them do so except when undesirable behavior results (such as charging their credit card more than once).

Solution 4: Use this “hidden field/script/postback” trick

This solution is a little more involved, but is more flexible and opens up other interesting possibilities (for example, it gives you greater control when server-side validation is involved).

The gist of it? — You can’t directly disable or otherwise control the behavior of the Back button. But you can deliver a page that always forces a redirect, which gets stored normally in the browser’s history, such that when the user clicks Back, the redirect will always be forced.

Here’s the recipe:

1) Add a hidden field

In the .aspx markup, add a hidden field server control to your page that starts off with no initial value:

<input type="hidden" id="hidIsCommitted" runat="server" />

This field will get set only when the code-behind performs the processing you don’t want repeated (e.g., database save, credit card charge) if the user clicks Back and then resubmits.

2) Just beneath the hidden field, add the following script:

<script language="javascript" type="text/javascript">
    if (document.getElementById('<%= this.hidIsCommitted.ClientID %>').value == '1')
    {
        location.href = 'ThankYouPage.aspx';
    }
</script>

This simple code tests the value of the hidden field, and if it is set, the code redirects the user to the Thank You page (the page the user is normally redirected to after clicking Finish). Of course, the hidden field is not initially set, since we added it above with no value and no code has set its value yet.

Note the importance of using both document.getElementById and the bee-sting <%= %> server-side replacement markup for the hidden field’s ClientID value. Using getElementById ensures cross-browser compatibility, and the ClientID property resolves to the actual ID generated for the hidden field in the DOM, which includes additional unique identifiers if the element is contained inside a templated control, content placeholder, or other naming container.

3) Just beneath the script, add the following server-side span element with another script block inside of it:

<span id="spnCommitScript" runat<span style="color:#000000;">="server" visible="false">
    <script language="javascript" type="text/javascript"></span>
        document.getElementById('<%= this.hidIsCommitted.ClientID %>').value = '1';
        var theForm = document.forms['form1'];
        if (!theForm) theForm = document.form1;
        theForm.submit();
    </script>
</span>

This code sets the hidden field value and then performs a postback, but because the script block is wrapped inside of a server-side span element named spnCommitScript whose visible property is set to false, this code is not actually sent to the client. Only when the span element is made visible by server-side code will this client script get sent to the client, and the server-side code will not make the span visible until after completing the processing that occurs when the user clicks Finish.

4) In the code-behind, just at the point in time when you have completed processing (saving to the database, charging the credit card, etc.), make the span visible. Basically, this is the same point in time that you would normally issue a Response.Redirect to the Thank You page. But instead, you just make the span visible:

protected void btnFinish_Click(object sender, EventArgs e)
{
    if (!this.PerformAdditionalValidation())
    {
        this.lblValidationErrorMessage.Visible = true;
        return;
    }

    // Update database, charge credit card, etc.
    <span style="color:#993366;"><span style="color:#000000;">this.spnCommitScript.Visible = true;</span>
</span>}

The result? When the page returns to the client, the hidden field value is still not set. So the initial JavaScript test still returns false and the user does not get redirected to the Thank You page yet. But right after that, additional JavaScript code executes that has never execute before because the span has always been invisible. And that JavaScript code goes ahead and sets the hidden field, and then invokes a postback. The page returned to the client after the postback is then stored in the browser’s history cache normally. But this version of the page has the hidden field value set which causes redirect to the Thank You page. So clicking Back from the Thank You page results in reloading a page that forces itself to redirect immedately back again to the Thank You page. If the user clicks the Back button all day long, they won’t be able to back up past the Thank You page. Even if they advance several pages forward in the application, they’ll be able to click Back only up until reaching the Thank You page, but no further back than that.

All of the markup in steps 1, 2, and 3 can be placed in a Master page, so that it can be shared by all Content pages. With this approach, you would implement the redirect page as a variable rather than hard-coded as ThankYouPage.aspx and thus achieve an extremely lightweight framework for handling the Back button in this manner across your entire application. Only the code in step 4 (modified to also set the desired redirect page name) needs to be applied whereever you’d normally code an ordinary Response.Redirect in your code-behind.

 Hope this information saves you some grief trying to cope with the Evil Back Button!