Using SqlFileStream in C# to Access SQL Server FILESTREAM Data

FILESTREAM is a powerful feature in SQL Server that stores varbinary(max) column data (BLOBs) in the file system (where BLOBs belong) rather than in the database’s structured file groups (where BLOBs kill performance). This feature was first introduced in SQL Server 2008, and is now being expanded with the new FileTable feature coming in SQL Server 2012 (code-named “Denali”), which I’ll cover in a near-future post (as well as in my Programming Microsoft SQL Server 2012 book, which I’m writing now).

If you’re not already familiar with FILESTREAM, you can get the necessary background by reading my two earlier blog posts: Introducing FILESTREAM and Enabling and Using FILESTREAM. In this post, I’ll show you how to use the SqlFileStream class to achieve high-performance streaming of SQL Server FILESTREAM data in your C# applications. This is a simplified version of the OpenSqlFilestream code I presented in my Using the OpenSqlFilestream API blog post.

What Is SqlFileStream?

SqlFileStream is a class in the .NET Framework (.NET 3.5 SP1 and higher) that wraps the OpenSqlFilestream function exposed by the SQL Server Native Client API. This lets you stream BLOBs directly between SQL Server and your .NET application written in C#. SqlFileStream is always used within a transaction, just at the point that you want to store and retrieve BLOBs to and from varbinary(max) FILESTREAM columns. At that point in time, SQL Server will “step aside” and let you stream directly against the server’s NTFS file system—a native environment optimized for streaming. This provides you with a streaming “tunnel” between your application and SQL Server’s internally-managed file system. Using SqlFileStream will give your application lightning-fast BLOB performance. Let’s dive in!

Creating the Database

Before getting started, be sure that FILESTREAM is enabled for remote file system access at both the Windows Service and SQL Server instance levels (as explained in Enabling and Using FILESTREAM). Then create a FILESTREAM-enabled database as follows (be sure to create the directory, C:\DB in this example, before creating the database):

CREATE DATABASE PhotoLibrary
 ON PRIMARY
  (NAME = PhotoLibrary_data,
   FILENAME = 'C:\DB\PhotoLibrary_data.mdf'),
 FILEGROUP FileStreamGroup CONTAINS FILESTREAM
  (NAME = PhotoLibrary_blobs,
   FILENAME = 'C:\DB\Photos')
 LOG ON
  (NAME = PhotoLibrary_log,
   FILENAME = 'C:\DB\PhotoLibrary_log.ldf')

Next, use the database and create a table for BLOB storage as follows:

USE PhotoLibrary
GO

CREATE TABLE PhotoAlbum(
 PhotoId int PRIMARY KEY,
 RowId uniqueidentifier ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
 Description varchar(max),
 Photo varbinary(max) FILESTREAM DEFAULT(0x))

In this table, the Photo column is declared as varbinary(max) FILESTREAM, and will hold pictures that will be stored in the file system behind the scenes. (Refer to Enabling and Using FILESTREAM for a complete explanation of the varbinary(max) FILESTREAM and ROWGUIDCOL columns.) Notice the default value we’ve established for the BLOB column. The value 0x represents a zero-length binary stream, which is different than NULL. Think of it as the difference between a zero-length string and a null string in .NET; the two are not the same. Similarly, you won’t be able to use SqlFileStream against NULL instances of varbinary(max) FILESTREAM columns, and you’ll soon see why.

Writing SqlFileStream Code

Start Visual Studio, create a new Class Library project, and add a PhotoData class as follows:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.Drawing;
using System.IO;
using System.Transactions;

namespace PhotoLibraryApp
{
  public class PhotoData
  {
    private const string ConnStr =
      "Data Source=.;Integrated Security=True;Initial Catalog=PhotoLibrary;";

    public static void InsertPhoto
      (int photoId, string desc, string filename)
    {
      const string InsertTSql = @"
        INSERT INTO PhotoAlbum(PhotoId, Description)
          VALUES(@PhotoId, @Description);
        SELECT Photo.PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT()
          FROM PhotoAlbum
          WHERE PhotoId = @PhotoId";

      string serverPath;
      byte[] serverTxn;

      using (TransactionScope ts = new TransactionScope())
      {
        using (SqlConnection conn = new SqlConnection(ConnStr))
        {
          conn.Open();

          using (SqlCommand cmd = new SqlCommand(InsertTSql, conn))
          {
            cmd.Parameters.Add("@PhotoId", SqlDbType.Int).Value = photoId;
            cmd.Parameters.Add("@Description", SqlDbType.VarChar).Value = desc;
            using (SqlDataReader rdr = cmd.ExecuteReader())
            {
              rdr.Read();
              serverPath = rdr.GetSqlString(0).Value;
              serverTxn = rdr.GetSqlBinary(1).Value;
              rdr.Close();
            }
          }
          SavePhotoFile(filename, serverPath, serverTxn);
        }
        ts.Complete();
      }
    }

    private static void SavePhotoFile
      (string clientPath, string serverPath, byte[] serverTxn)
    {
      const int BlockSize = 1024 * 512;

      using (FileStream source =
        new FileStream(clientPath, FileMode.Open, FileAccess.Read))
      {
        using (SqlFileStream dest =
          new SqlFileStream(serverPath, serverTxn, FileAccess.Write))
        {
          byte[] buffer = new byte[BlockSize];
          int bytesRead;
          while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
          {
            dest.Write(buffer, 0, bytesRead);
            dest.Flush();
          }
          dest.Close();
        }
        source.Close();
      }
    }

    public static Image SelectPhoto(int photoId, out string desc)
    {
      const string SelectTSql = @"
        SELECT
            Description,
            Photo.PathName(),
            GET_FILESTREAM_TRANSACTION_CONTEXT()
          FROM PhotoAlbum
          WHERE PhotoId = @PhotoId";

      Image photo;
      string serverPath;
      byte[] serverTxn;

      using (TransactionScope ts = new TransactionScope())
      {
        using (SqlConnection conn = new SqlConnection(ConnStr))
        {
          conn.Open();

          using (SqlCommand cmd = new SqlCommand(SelectTSql, conn))
          {
            cmd.Parameters.Add("@PhotoId", SqlDbType.Int).Value = photoId;

            using (SqlDataReader rdr = cmd.ExecuteReader())
            {
              rdr.Read();
              desc = rdr.GetSqlString(0).Value;
              serverPath = rdr.GetSqlString(1).Value;
              serverTxn = rdr.GetSqlBinary(2).Value;
              rdr.Close();
            }
          }
          photo = LoadPhotoImage(serverPath, serverTxn);
        }

        ts.Complete();
      }

      return photo;
    }

    private static Image LoadPhotoImage(string filePath, byte[] txnToken)
    {
      Image photo;

      using (SqlFileStream sfs =
        new SqlFileStream(filePath, txnToken, FileAccess.Read))
      {
        photo = Image.FromStream(sfs);
        sfs.Close();
      }

      return photo;
    }

  }
}

Let’s explain the code in detail. We’ll start at the top with some required namespace inclusions. The two using statements to take notice of are System.Data.SqlTypes and System.Transactions. The System.Data.SqlTypes namespace defines the SqlFileStream class that we’ll be using to stream BLOBs. No special assembly reference is required to use this class, because it is provided by the System.Data.dll assembly that our project is already referencing (Visual Studio set this reference automatically when it created our project). The System.Transactions namespace defines the TransactionScope class that lets us code implicit transactions against the database. This class is provided by the System.Transactions.dll assembly, which is not referenced automatically. You’ll need to add a reference to it now, or the code will not compile. Right-click the project in Solution Explorer and choose Add Reference. In the Add Reference dialog, click the .NET tab, and scroll to find the System.Transactionscomponent. Then double-click it to add the reference.

At the top of the class, we define a connection string as a hard-coded constant named ConnStr. This is just for demonstration purposes; a real-world application would store the connection string elsewhere (such as in a configuration file, possibly encrypted), but we’re keeping our example simple.

Streaming Into SQL Server

The first method defined in the class is InsertPhoto, which accepts a new photo integer ID, string description, and full path to an image file to be saved to the database. Notice that the InsertTSql string constant defined at the top of the method specifies an INSERT statement that includes the PhotoId and Description columns, but not the actual Photo BLOB column itself. Instead, the INSERT statement is followed immediately by a SELECT statement that retrieves two pieces of information we’ll use to stream the BLOB into the Photo column much more efficiently than using ordinary T-SQL—namely, a logical UNC path name to the file and the transactional context token. These are the two values needed to use SqlFileStream, and you’re about to see how exactly. But all we’ve done so far is define a constant holding two T-SQL statements. The constant is followed by two variables declarations serverPath and serverTxn that will receive the two special values when we later execute those T-SQL statements.

The method then creates and enters a new TransactionScope block. This does not actually begin the database transaction (we’ve not even connected to the database yet), but rather declares that all data access within the block (and in any code called from within the block) must participate in a database transaction. Inside the TransactionScope block, the code creates and opens a new SqlConnection. Being the first data access code inside the TransactionScope block, this also implicitly begins the database transaction. Next, it creates a SqlCommand object associated with the open connection and prepares its command text to contain our T-SQL statements (the INSERT followed by the SELECT).

Invoking the ExecuteReader method executes the T-SQL statements and returns a reader from which we can retrieve the values returned by the SELECT statement. The transaction is still pending at this time. Our INSERT statement does not provide a value for RowId and instead allows SQL Server to automatically generate and assign a new uniqueidentifier ROWGUID value by default. I’ve also pointed out that no value is provided for the Photo column—and this is exactly how the default 0x value that we defined earlier for the Photo column comes into play (I said we’d come back to it, and here we are).

Although the row has been added by the INSERT statement, it will rollback (disappear) if a problem occurs before the transaction is committed. Because we didn’t provide a BLOB value for the Photo column in the new row, SQL Server honors the default value 0x that we established for it in the CREATE TABLE statement for PhotoAlbum. This represents a zero-length binary stream, which is completely different than NULL. Being a varbinary(max) column decorated with the FILESTREAM attribute, an empty file gets created in the file system that SQL Server associates with the new row. At the same time, SQL Server initiates an NTFS file system transaction over this new empty file and synchronizes it with the database transaction. So just like the new row, the new file will disappear if the database transaction does not commit successfully.

Immediately following the INSERT statement, the SELECT statement returns Photo.PathName and GET_FILESTREAM_TRANSACTION_CONTEXT. What we’re essentially doing with the WHERE clause in this SELECT statement is reading back the same row we have just added (but not yet committed) to the PhotoAlbum table in order to reference the BLOB stored in the new file that was just created (also not yet committed) in the file system.

The value returned by Photo.PathName is a fabricated path to the BLOB for the selected PhotoId. The path is expressed in UNC format, and points to the network share name established for the server instance when you first enabled FILESTREAM (by default, this is MSSQLSERVER). It is not a path the file’s physical location on the server, but rather contains information SQL Server can use to derive the file’s physical location. For example, you’ll notice that it always contains the GUID value in the uniqueidentifier ROWGUIDCOL column of the BLOB’s corresponding row. We retrieve the path value from the reader’s first column and store it in the serverPath string variable.

We just explained how SQL Server initiated an NTFS file system transaction over the FILESTREAM data in the new row’s Photo column when we started our database transaction. The GET_FILESTREAM_TRANSACTION_CONTEXT function returns a handle to that NTFS transaction (if you’re not inside a transaction, this function will return NULL and your code won’t work). We obtain the transaction context, which is returned by the reader’s second column as a SqlBinary value, and store it in the byte array named serverTxn.

Armed with the BLOB path reference in serverPath and the transaction context in serverTxn, we have what we need to create a SqlFileStream object and perform direct file access to stream our image into the Photo column. We close the reader, terminate its using block, then terminate the enclosing using block for the SqlConnection as well. This would normally close the database connection implicitly, but that gets deferred in this case because the code is still nested inside the outer using block for the TransactionScope object. So the connection is still open at this time, and the transaction is still pending. It is precisely at this point that we call the SavePhotoFile method to stream the specified image file into the Photo column of the newly inserted PhotoAlbum row, overwriting the empty file just created by default. When control returns from SavePhotoFile, the TransactionScope object’s Complete method is invoked and its using block is terminated, signaling the transaction management API that everything worked as expected. This implicitly commits the database transaction (which in turn commits the NTFS file system transaction) and closes the database connection.

The SavePhotoFile method reads from the source file and writes to the database FILESTREAM storage in 512 KB chunks at a time using ordinary .NET streaming techniques. The method begins by defining a BlockSize integer constant that is set to a reasonable value of 512 KB. Picture files larger than this will be streamed to the server in 512 KB blocks at a time. The local source image file (in clientPath) is then opened on an ordinary read-only FileStream object.

Then the destination file is opened by passing the two special values (serverPath and serverTxn), along with a FileAccess.Write enumeration requesting write access, into the SqlFileStream constructor. Like the source FileStream object, SqlFileStream inherits from System.IO.Stream, so it can be treated just like any ordinary stream. Thus, you attain write access to the destination BLOB on the database server’s NTFS file system. Remember that this output file is enlisted in an NTFS transaction and nothing you stream to it will be permanently saved until the database transaction is committed by the terminating TransactionScope block, after SavePhotoFile completes. The rest of the SavePhotoFile method implements a simple loop that reads from the source FileStream and writes to the destination SqlFileStream, one 512 KB block at a time until the entire source file is processed, and then it closes both streams.

Streaming Out From SQL Server

The rest of the code contains methods to retrieve existing photos and stream their content from the file system into an Image object for display. You’ll find that this code follows the same pattern as the last, only now we’re performing read access.

The SelectPhoto method accepts a photo ID and returns the string description from the database in an output parameter. The actual BLOB itself is returned as the method’s return value in a System.Drawing.Image object. We populate the Image object with the BLOB by streaming into it from the database server’s NTFS file system using SqlFileStream. Once again, we start things off by entering a TransactionScope block and opening a connection. We then execute a simple SELECT statement that queries the PhotoAlbum table for the record specified by the photo ID and returns the description and full path to the image BLOB, as well as the FILESTREAM transactional context token. And once again we use the path name and transactional context with SqlFileStream to tie into the server’s file system in the LoadPhotoImage method.

Just as when we were inserting new photos (only this time using FileAccess.Read instead of FileAccess.ReadWrite), we create a new SqlFileStream object from the logical path name and transaction context. We then pull the BLOB content directly from the NTFS file system on the server into a new System.Drawing.Image object using the static Image.FromStream method against the SqlFileStream object. The populated image can then be passed back up to a Windows Forms application, where it can be displayed using the Image property of a PictureBox control.

Or, to stream a photo over HTTP from a simple ASP.NET service:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.IO;
using System.Transactions;
using System.Web.UI;

namespace PhotoLibraryHttpService
{
  public partial class PhotoService : Page
  {
    private const string ConnStr =
      "Data Source=.;Integrated Security=True;Initial Catalog=PhotoLibrary";

    protected void Page_Load(object sender, EventArgs e)
    {
      int photoId = Convert.ToInt32(Request.QueryString["photoId"]);
      if (photoId == 0)
      {
        return;
      }

      const string SelectTSql = @"
        SELECT Photo.PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT()
         FROM PhotoAlbum
         WHERE PhotoId = @PhotoId";

      using (TransactionScope ts = new TransactionScope())
      {
        using (SqlConnection conn = new SqlConnection(ConnStr))
        {
          conn.Open();

          string serverPath;
          byte[] serverTxn;

          using (SqlCommand cmd = new SqlCommand(SelectTSql, conn))
          {
            cmd.Parameters.Add("@PhotoId", SqlDbType.Int).Value = photoId;

            using (SqlDataReader rdr = cmd.ExecuteReader())
            {
              rdr.Read();
              serverPath = rdr.GetSqlString(0).Value;
              serverTxn = rdr.GetSqlBinary(1).Value;
              rdr.Close();
            }
          }

          this.StreamPhotoImage(serverPath, serverTxn);
        }
        ts.Complete();
      }
    }

    private void StreamPhotoImage(string serverPath, byte[] serverTxn)
    {
      const int BlockSize = 1024 * 512;
      const string JpegContentType = "image/jpeg";

      using (SqlFileStream sfs =
        new SqlFileStream(serverPath, serverTxn, FileAccess.Read))
      {
        byte[] buffer = new byte[BlockSize];
        int bytesRead;
        Response.BufferOutput = false;
        Response.ContentType = JpegContentType;
        while ((bytesRead = sfs.Read(buffer, 0, buffer.Length)) > 0)
        {
          Response.OutputStream.Write(buffer, 0, bytesRead);
          Response.Flush();
        }
        sfs.Close();
      }
    }
  }
}

Conclusion

The OpenSqlFilestream function provides native file streaming capabilities between FILESTREAM storage in the file system managed by SQL Server and any native-code (e.g., C++) application. SqlFileStream provide a managed code wrapper around OpenSqlFilestream that simplifies direct FILESTREAM access from .NET applications (e.g., C# and VB .NET). This post explained how this API works in detail, and showed the complete data access code that uses SqlFileStream for both reading and writing BLOBs to and from SQL Server. That’s everything you need to know to get the most out of FILESTREAM. I hope you enjoyed it!

Introducing SQL Server Developer Tools (code-named “Juneau”)

With Denali CTP3 released just two weeks ago, I’ve been ramping up fast as I write the new edition of my book (Programming Microsoft SQL Server 2012, MS Press) and prepare for upcoming presentations at the NJSQL User Group (Sep 20) and the SDC event in the Netherlands (Oct 3/4). I recently spent a lot of time working with the Juneau CTP3, code name for the new SQL Server Developer Tools (SSDT), set to ship with SQL Server 2012 (code named Denali) next year. I’m extremely impressed with the features of this new tool, and delighted at the notion that most developers will no longer need to toggle between Visual Studio and SQL Server Management Studio (SSMS), because they may very well never again need to use SSMS at all!

This is not to say that SSDT is intended to be a replacement for SSMS—au contraire, SSMS is alive and well in Denali, and continues to evolve as the primary management tool for database administrators respsonible for maintaining running SQL Server installations. But programmers have been using SSMS to conduct development tasks for years (and before 2005, we were using two dba tools—SQL Enterprise Manager and Query Analyzer). It’s always been necessary to switch away from our primary application development tool (Visual Studio) to a database management tool (SSMS) in order to get our database development done.

But no more.

SQL Server Developer Tools finally gives application developers what they need to get everything done without ever leaving Visual Studio—and it delivers many powerful new capabilities as well. In this post, I’ll describe the most important features that developers are certain to enjoy when they start working with this new tool.

Model-Based Development

The key concept in SSDT is that it uses a model-based approach. That is, there is always an in-memory representation of what a database looks like, and all the SSDT tools (designers, validations, IntelliSense, schema compare, etc.) operate on that model. This model can be backed by a live database (on-premises or Azure), an offline database project, or a snapshot taken of an offline database project at any point in time. But to reiterate, the tools are agnostic to the model’s backing, and work exclusively against the model itself. Thus, you enjoy a rich, consistent experience in any scenario—regardless of whether you’re working with on-premises or cloud databases, offline projects, or versioned model snapshots.

SSMS in Server Explorer

The Server Explorer now provides most of the functionality that developers are accustomed to in SSMS. Although the experience is strikingly similar when working against a connected database, remember that (and I’ll risk overstating this) SSDT tools only operate on a model. So when working connected, SSDT is actually creating a model from the database—on the fly—and then allowing you to edit that model.

When you “save” a schema change, SSDT works out the necessary script to update the database to reflect the change made to the model. Of course, the end result is the same as the connected SSMS experience, so it isn’t strictly necessary to understand the distinction if you don’t exploit the capabilities of SSDT beyond connected development (but why wouldn’t you?). Once you do understand it, then SSDT’s offline development and versioning capabilities will immediately feel natural and intuitive, as those scenarios are simply different backing types of the very same model. It’s just that when you’re working in Server Explorer, the backing happens to be a live connected database.

You can also open query windows to compose and execute T-SQL statements directly against the database. Overall, these Server Explorer-based features give you the same connected management experience you are accustomed to in SSMS; which is an imperative, script-based approach to directly managing a stateful database.

Offline Development in Visual Studio

SSDT offers so much more than a mere replacement for the connected SSMS experience, by providing a rich offline experience with the new SQL Server project type in Visual Studio 2010. If you’re familiar with the previous Database Professional editions of Visual Studio (commonly known as “data dude”), you can think of this project type as the next version of that edition (data dude projects will even load and upgrade to SSDT automatically when opened in Visual Studio 2010). The T-SQL script files in a SQL Server project are declarative in nature (all CREATE statements; no ALTER statements), which is an entirely different approach to how you’re accustomed to “developing” databases in SSMS against a live stateful database. Essentially, you get to focus on “this is how the database should look,” and let the tool worry about and figure out the appropriate T-SQL statements to actually update the live database to match your definition.

The SQL Server project enjoys all the same capabilities of any Visual Studio project. This includes not only source control (as mentioned), but all the common code navigation and refactoring paradigms (like Rename, Goto Definition, and Find All References) that have come to be accepted as indispensible tools of the modern IDE. And the model’s rich metadata provides a far better IntelliSense than what SSMS has been offering since SQL Server 2008. You really get that “strongly-typed” feeling. You can also set breakpoints, single-step through T-SQL code, and utilize the Locals window in Visual Studio much like you can when debugging .NET project types. With SSDT, application and database development tooling has now finally been unified under one roof: Visual Studio 2010.

The beauty of the model-based approach is that you can generate models from almost anything. As already explained, when connected directly via Server Explorer, SSDT creates a model from the connected database. When you create a SQL Server project (which can be imported from any existing database, script, or snapshot), you are creating an offline source-controlled project inside Visual Studio that fully describes a database. Now SSDT generates a model backed by your project, and so the design experience offline is just the same as when connected; the designers, IntelliSense, validation checks, and all other SSDT features all work exactly the same way.

As you conduct your database development within the project, you get the same “background compilation” experience that you’re used to experiencing with typical .NET development using C# or VB .NET. For example, making a change in the project that can’t be submitted to the database because of dependency issues will immediately raise errors in the Errors pane. You can click on the errors to navigate directly to the various dependencies so that they can be dealt with. Once all the build errors disappear, you’ll be able to submit the changes to update the database.

SSDT also provides a new local database runtime for testing and debugging. This is a lightweight, single user instance of SQL Server that spins up on demand when you build your SQL Server project. This is a great way to test your offline work before deploying to a live server or the cloud.

Versioning and Snapshots

A database project gives you an offline definition of a database. Like all projects, each database object (table, view, stored procedure) lives as a source file that can be placed under version control. Thus, the project system enables you to preserve and protect the definition of the database, as opposed to having the definition live within the database itself.

At any point in time, and as often as you’d like, you can create a database snapshot. A snapshot is nothing more than a persisted serialized version of the model based on the current project at the time the snapshot is taken. As such, you can use SSDT to develop, deploy, and synchronize database structures across local/cloud databases and differently versioned offline database projects.

Targeting SQL Azure

SSDT projects have a target switch that lets you specify which SQL Server edition or platform the project is intended to be deployed to. All the validation checks against the project-backed model are based on this setting, so it’s simply a matter of choosing SQL Azure as the target to ensure that your database can be deployed to the cloud without any problems. If your database project defines something that is not supported in Azure (a table with no clustered index, for example), it will get flagged as an error automatically when you attempt to publish.

The New BIDS

Are you not sufficiently impressed yet? But wait! SSDT is even bigger than what I’ve described so far. As announced at SQL PASS at the end of 2010, SSDT is an umbrella, a packaging of what was the VS2008-based Business Intelligence Developer Studio (BIDS) tools for SSAS, SSRS, and SSIS, combined with a new service, the “database services”. So they’ve really now brought together under the SQL Server umbrella all of the database development experiences inside of Visual Studio 2010. Here’s the picture:

Get Started Now, and Meet Me in New Jersey

SSDT is the next generation database development tool Visual Studio developers, and its coming soon! Hopefully this post inspires you to give SSDT a test drive. Download the Juneau CTP3 bits from http://www.microsoft.com/web/gallery/install.aspx?appid=JUNEAU10 and check it out for yourself.

I’ll be presenting a session at the NJ SQL User Group (http://njsql.org/default.aspx) with lots of demos that show off some of the coolest things about SSDT. The meeting  is being held at SetFocus in Parsippany (directions here: http://www.setfocus.com/about/headquarters.aspx) on Sept 20 from 6 – 8:30. Hope to see you then!