New Metadata Discovery Features in SQL Server 2012

It has always been possible to interrogate SQL Server for metadata (schema) information. You can easily discover all the objects in a database (tables, views, stored procedures, and so on) and their types by directly querying system tables (not recommended, as they can change from one version of SQL Server to another) or information schema views (which are consistent in each SQL Server version). It is significantly more challenging, however, to discover the result set schema for T-SQL statements or stored procedures that contain conditional logic. Using SET FMTONLY ON/OFF has been the common technique in the past for discovering the schema of a query’s result set without actually executing the query itself. For example, consider the following code:

USE AdventureWorks2012
GO

SET FMTONLY ON
SELECT * FROM HumanResources.Employee;
SET FMTONLY OFF

This SELECT statement, which would normally return all the rows from the HumanResources.Employee table, returns no rows at all. It just reveals the columns. The SET FMTONLY ON statement prevents queries from returning rows of data so that their schemas can be discovered, and this behavior remains in effect until SET FMTONLY OFF is encountered. SQL Server 2012 introduces several new system stored procedures and table-valued functions (TVFs) that provide significantly richer metadata discovery than what can be discerned using the relatively inelegant (and now deprecated) SET FMTONLY ON/OFF approach. These new procedures and functions are:

  • sys.sp_describe_first_result_set
  • sys.dm_exec_describe_first_result_set
  • sys.dm_exec_describe_first_result_set_for_object
  • sys.sp_describe_undeclared_parameters

In this blog post, I’ll explain how to use these new objects to discover schema information in SQL Server 2012.

sys.sp_describe_first_result_set

The sys.sp_describe_first_result_set stored procedure accepts a T-SQL statement and produces a highly detailed schema description of the first possible result set returned by that statement. The following code retrieves schema information for the same SELECT statement you used earlier to get information on all the columns in the HumanResources.Employee table:

EXEC sp_describe_first_result_set
 @tsql = N'SELECT * FROM HumanResources.Employee'

The following screenshot shows the wealth of information that SQL Server returns about each column in the result set returned by the sp_describe_first_result_set call:

sys.dm_exec_describe_first_result_set

There is also a data management function named sys.dm_exec_describe_first_result_set that works very similar to sys.sp_describe_first_result_set. But because it is implemented as a table-valued function (TVF), it is easy to query against it and limit the metadata returned. For example, the following query examines the same T-SQL statement, but returns just the name and data type of nullable columns:

SELECT name, system_type_name
 FROM sys.dm_exec_describe_first_result_set(
  'SELECT * FROM HumanResources.Employee', NULL, 1)
 WHERE is_nullable = 1

Here is the output:

name               system_type_name
-----------------  ----------------
OrganizationNode   hierarchyid
OrganizationLevel  smallint

Parameterized queries are also supported, if you supply an appropriate parameter signature after the T-SQL. The T-SQL in the previous example had no parameters, so it passed NULL for the “parameters parameter.” The following example discovers the schema of a parameterized query.

SELECT name, system_type_name, is_hidden
 FROM sys.dm_exec_describe_first_result_set('
  SELECT OrderDate, TotalDue
   FROM Sales.SalesOrderHeader
   WHERE SalesOrderID = @OrderID',
  '@OrderID int', 1)

Here is the output:

name             system_type_name  is_hidden
---------------  ----------------  ---------
OrderDate        datetime          0
TotalDue         money             0
SalesOrderID     int               1

You’d be quick to question why the SalesOrderID column is returned for a SELECT statement that returns only OrderDate and TotalDue. The answer lies in the last parameter passed to the data management function. A bit value of 1 (for true) tells SQL Server to return the identifying SalesOrderID column, because it is used to “browse” the result set. Notice that it is marked true (1) for is_hidden. This informs the client that the SalesOrderID column is not actually revealed by the query, but can be used to uniquely identify each row in the query’s result set.

What if multiple result sets are possible? There’s no problem with this as long as they all have the same schema. In fact, SQL Server will even try to forgive cases where multiple possible schemas are not exactly identical. For example, if the same column is nullable in one result set and non-nullable in the other, schema discovery will succeed and indicate the column as nullable. It will even tolerate cases where the same column has a different name (but same type) between two possible result sets, and indicate NULL for the column name, rather than arbitrarily choosing one of the possible column names or failing altogether.

The following code demonstrates this with a T-SQL statement that has two possible result sets depending on the value passed in for the @SortOrder parameter. Because both result sets have compatible schemas, the data management function succeeds in returning schema information.

SELECT name, system_type_name
 FROM sys.dm_exec_describe_first_result_set('
    IF @SortOrder = 1
      SELECT OrderDate, TotalDue
       FROM Sales.SalesOrderHeader
       ORDER BY SalesOrderID ASC
    ELSE IF @SortOrder = -1
      SELECT OrderDate, TotalDue
       FROM Sales.SalesOrderHeader
       ORDER BY SalesOrderID DESC',
   '@SortOrder AS tinyint', 0) 

Here is the output:

name         system_type_name
-----------  ----------------
OrderDate    datetime
TotalDue     money

Discovery won’t succeed if SQL Server detects incompatible schemas. In this next example, the call to the system stored procedure specifies a T-SQL statement with two possible result sets, but one returns three columns while the other returns only two columns.

EXEC sys.sp_describe_first_result_set
  @tsql = N'
    IF @IncludeCurrencyRate = 1
      SELECT OrderDate, TotalDue, CurrencyRateID
       FROM Sales.SalesOrderHeader
    ELSE
      SELECT OrderDate, TotalDue
       FROM Sales.SalesOrderHeader'

In this case, the system stored procedure raises an error that clearly explains the problem:

Msg 11509, Level 16, State 1, Procedure sp_describe_first_result_set, Line 53

The metadata could not be determined because the statement 'SELECT OrderDate, TotalDue, CurrencyRateID FROM Sales.SalesOrderHeader' is not compatible with the statement 'SELECT OrderDate, TotalDue FROM Sales.SalesOrderHeader'.

It is noteworthy to mention that the data management function copes with this scenario much more passively. Given conflicting result set schemas, it simply returns NULL and does not raise an error.

sys.dm_exec_describe_first_result_set_for_object

The data management function sys.dm_exec_describe_first_result_set_for_object can be used to achieve the same discovery against any object in the database. It accepts just an object ID and the Boolean “browse” flag to specify if hidden ID columns should be returned. You can use the OBJECT_ID function to obtain the ID of the desired object. The following code demonstrates this by returning schema information for the stored procedure GetOrderInfo.

CREATE PROCEDURE GetOrderInfo(@OrderID AS int) AS
  SELECT OrderDate, TotalDue
   FROM Sales.SalesOrderHeader
   WHERE SalesOrderID = @OrderID
GO

SELECT name, system_type_name, is_hidden
 FROM sys.dm_exec_describe_first_result_set_for_object(OBJECT_ID('GetOrderInfo'), 1)

Here is the output:

name             system_type_name   is_hidden
---------------  -----------------  ---------
OrderDate        datetime           0
TotalDue         money              0
SalesOrderID     int                1

sys.sp_describe_undeclared_parameters

Finally, the sys.sp_describe_undeclared_parameters stored procedure parses a T-SQL statement to discover type information about the parameters expected by the statement, as the following code demonstrates:

EXEC sys.sp_describe_undeclared_parameters
 N'IF @IsFlag = 1 SELECT 1 ELSE SELECT 0'

Here is the output:

parameter_ordinal name    suggested_system_type_id suggested_system_type_name ...
----------------- ------- ------------------------ -------------------------- -------
1                 @IsFlag 56                       int                        ... 

In this example, SQL Server detects the @IsFlag parameter, and suggests the int data type based on the usage in the T-SQL statement it was given to parse.

Download Visual Studio Live! New York Slides and Code

A very successful Visual Studio Live! just wrapped up two weeks ago here in my hometown of Brooklyn, NY. Thanks again to all the good folks that attended my sessions.

As promised, I’ve posted all the slides and code from my sessions for you to download. You can grab the stuff here:

May 14, 2012 SQL Server Workshop for Developers http://sdrv.ms/VSLiveNY2012SQL
May 17, 2012 Introducing SQL Server Data Tools http://sdrv.ms/VSLiveNY2012SSDT
May 17, 2012 T-SQL Enhancements in SQL Server 2012 http://sdrv.ms/VSLiveNY2012TSql

Looking forward to Visual Studio Live! in Redmond this coming August! 🙂