2005.07.29 05:30 PM

SQL Server Column Wildcard Perils

Using column wildcards in SQL Server views and functions carries certain risks. Having survived an ugly debugging session relating to this today, I now believe that no one should use them without first having an understanding of the potential danger.

Let's start by creating a SQL Server table and adding some rows:

CREATE TABLE dbo.TestResults (
  StudentId int NOT NULL,
  TestId int NOT NULL,
  Results int NOT NULL,
  Comment varchar(25)
  CONSTRAINT [PK_TestResults] PRIMARY KEY CLUSTERED (
    StudentId,
    TestId
  ) ON [PRIMARY]
) ON [PRIMARY]
GO

INSERT INTO TestResults VALUES(1, 1, 100, 'Not bad')
INSERT INTO TestResults VALUES(2, 1, 90, 'Good improvement')
INSERT INTO TestResults VALUES(3, 1, 60, 'What happened?')

For simplicity, we'll skip defining the table's relations (just use your imagination).

Next let's create a view based on the table that encapsulates the JOINs required on its various (imaginary) relations:

CREATE VIEW dbo.TestResultsWithRelations 
AS
  SELECT
    StudentId,
    -- joined student name,
    TestId,
    -- joined test name,
    Results
  FROM 
    TestResults
    -- join, join, join

Finally, let's create a table-valued function that returns all of the view's columns using a wildcard and additionally 1) calculates a new column (Grade), and 2) reduces the rows selected to those matching criteria based on a passed parameter:

CREATE FUNCTION dbo.GetGradedTestResults(@MinResults int) RETURNS TABLE 
AS 
RETURN (
  SELECT 
    *,
    CASE 
      WHEN Results > 90 THEN 'A'
      WHEN Results > 80 THEN 'B'
      WHEN Results > 70 THEN 'C'
      WHEN Results > 60 THEN 'D'
      ELSE 'F'
    END as Grade
  FROM
    TestResultsWithRelations
  WHERE
    Results >= @MinResults
)

If we select using our new function:

SELECT * FROM dbo.GetGradedTestResults(70)

Here's what we get:

StudentIdTestIdResultsGrade
11100A
2190B

So where's the peril? Well, let's modify the view to include the Comment column we originally left out:

ALTER VIEW dbo.TestResultsWithRelations 
AS
  SELECT
    StudentId,
    -- joined student name,
    TestId,
    -- joined test name,
    Results,
    Comment
  FROM 
    TestResults
    -- join, join, join

And perform the same select again:

StudentIdTestIdResultsGrade
11100Not bad
2190Good improvement

Yikes. SQL Server has executed our function using the view schema info it collected during the function's creation and mixed it in a rather unpredictable way with our calculated column. Luckily, in this case, the results are obviously flawed and can be quickly spotted and fixed, but it's not hard to imagine a scenario where the results could escape detection, causing incalculable havoc downstream.

Note that views are not immune to this either. If we backed this example up one level, created a table without a Comment column, used a column wildcard in the view (and added the Grade logic from the function) and then added a Comment column to the table, we would see the same problem with the view's results.

In the end, it pays to be explicit with view and function column references. It also doesn't hurt to use the SCHEMABINDING option when creating/altering views and functions to prevent schema changes from being made to related database objects (this will also prevent you from using column wildcards in the first place, even accidently).


Comments


TrackBack

TrackBack URL:  http://www.typepad.com/services/trackback/6a00d8341c7bd453ef00d8351cf88453ef

Listed below are links to weblogs that reference SQL Server Column Wildcard Perils: