2007.04.13 11:30 PM

Access TRANSFORM (Crosstab) Queries and Missing Rows

Last December, reader jf0027 left the following comment on my Access SQL TRANSFORM Statement post from January 2005:

Very good. Exactly what I needed. There is one extra thing though. Here is my query.

TRANSFORM Count([Customer].C_ID) AS CountOfC_ID
SELECT Format([C_Date],"mm") AS Expr1
FROM [Customer]
GROUP BY Format([C_Date],"mm")
PIVOT Format([C_Date],"yyyy");

What I'm doing is count the number of customer per month, and compare different years. My only problem is if there is no customer for one month, it will not show this month in the query. So i will only have 11 months instead of 12. I'm using this table for a graph with the X-axis as months and years for the series. Is there a work around for this?

Thanks

That's a good question, and I'm sorry I took so long to respond. Well, actually, I did respond the next day, but only to leave an unfortunate suggestion that I would look into it in a couple of days. Apparently what I really meant was "in four months". Oops. To paraphrase an old New Yorker cartoon involving cavemen, rain, and a snuffed fire: Hey, this work stuff puts out this blog stuff.

Here's a simple Customer table to get us started:

C_ID    C_Date         C_Name
1       1/15/2006      First customer
2       3/1/2006       Second customer
3       3/20/2006      Third customer
4       4/18/2006      Fourth customer
5       7/22/2006      Fifth customer
6       8/1/2006       Sixth customer
7       10/1/2006      Seventh customer
8       10/25/2006     Eighth customer
9       12/1/2006      Ninth customer
10      2/5/2007       Tenth customer

Using this data, jf0027's query produces the following results:

Expr1  2006  2007
01     1	
02           1
03     2	
04     1	
07     1	
08     1	
10     2	
12     1	

That's accurate, but incomplete, as there are no rows for months 5, 6, 9, and 11. What jf0027 wants is this:

Expr1  2006  2007
01     1	
02           1
03     2	
04     1	
05
06
07     1	
08     1	
09
10     2	
11
12     1	

The easiest way to expand the rows resulting from a TRANSFORM query like this is to LEFT JOIN the query with another table (or query) that has all of the rows you want. To do this, the TRANSFORM query must first be saved because Access doesn't support TRANSFORM-based subqueries. If jf0027's query is saved as Q1 and the database contains another table named Months with a numeric Month key column and a row for each month, the following query will expand Q1's rows to include all months:

SELECT Format(Months.Month, "00") as Month, Q1.*
FROM Months
LEFT JOIN Q1 on Val(Q1.Expr1) = Months.Month
ORDER BY Months.Month

Note that Months.Month is formatted to match the results of jf0027's query. Also, because the months in jf0027's query were formatted as strings ("mm"), they have to be converted back to numbers for the join (or, I suppose, I could have removed the formatting from jf0027's query, but I didn't want to muddy the sample by changing the original). The new query produces the following results:

Month  Expr1  2006  2007
01     01     1	
02     02           1
03     03     2	
04     04     1	
05
06
07     07     1	
08     08     1	
09
10     10     2	
11
12     12     1	

The good news is there's now a row for every month. The bad news is there are two month columns: the GROUP BY column from the TRANSFORM query (Expr1), and the LEFT JOINed Months.Month. This might be okay, depending on how the results will be used. If the results will be exported to Excel to make graphs, or manually manipulated and printed, or programmatically evaluated, the TRANSFORM query's unnecessary GROUP BY column won't matter. However, if it isn't okay, there are only a couple of ways to get rid of it without resorting to code.

The first approach requires knowing in advance what PIVOT columns the TRANSFORM query will produce. If the columns are known, they can be explicitly named in the query, replacing the Q1.* column wildcard, like this:

SELECT Format(Months.Month, "00") as Month, Q1.[2006], Q1.[2007]
FROM Months
LEFT JOIN Q1 on Val(Q1.Expr1) = Months.Month
ORDER BY Months.Month

However, this is not a general-purpose solution. Care must be taken to refer only to columns that will appear in the TRANSFORM query's results. One way to ensure this (or to expand the columns to include values that don't actually appear in the data, similar to what we're doing for the rows) is to explicitly name the expected column values in the TRANSFORM query's PIVOT clause, like this:

TRANSFORM Count([Customer].C_ID) AS CountOfC_ID
SELECT Format([C_Date],"mm") AS Expr1
FROM Customer
GROUP BY Format([C_Date],"mm")
PIVOT YEAR([C_Date]) in (2005, 2006, 2007)

The other approach it to back up and get all of the desired rows into the data in the first place. There are probably as many ways to do this as there are TRANSFORM queries, so I'm just going to focus on jf0027's needs.

In jf0027's case, missing month rows can't just be manufactured and added to the original Customer data using JOINs or UNIONs, because adding rows will affect the TRANSFORM query's counts. Customer rows must be grouped and counted first. These counts can then be expanded to include any missing months before being handed to the TRANSFORM query. In this case, it isn't even necessary to be careful with the addition of months (in other words, we don't have to fill the month gaps for each year), because the year and month groupings are separated in the TRANSFORM query anyway. All that's needed is a full set of months for any one year appearing in the results. The following query does this using the last Customer year:

SELECT A.Year, A.Month, SUM(A.Total) as Total
FROM (
  SELECT 
    Year(C_Date) as Year, 
    Month(C_Date) as Month, 
    COUNT(*) as Total
  FROM Customer
  GROUP BY Year(C_Date), Month(C_Date)
  UNION
  SELECT
    (SELECT MAX(Year(C_Date)) FROM Customer) as Year,
    Month,
    Null
  FROM Months
) A
GROUP BY A.Year, A.Month

This query can then be fed to a TRANSFORM query to GROUP BY month and PIVOT on year, using the aggregate function First() to pull the appropriate values:

TRANSFORM First(Total) AS CountOfC_ID
SELECT Format(Month,"00") AS Expr1
FROM (

  SELECT A.Year, A.Month, SUM(A.Total) as Total
  FROM (
    SELECT 
      Year(C_Date) as Year, 
      Month(C_Date) as Month, 
      COUNT(*) as Total
    FROM Customer
    GROUP BY Year(C_Date), Month(C_Date)
    UNION
    SELECT
      (SELECT MAX(Year(C_Date)) FROM Customer) as Year,
      Month,
      Null
    FROM Months
  ) A
  GROUP BY A.Year, A.Month

)
GROUP BY Format(Month,"00")
PIVOT Year;

The result? Just what jf0027 wanted:

Expr1  2006  2007
01     1	
02           1
03     2	
04     1	
05
06
07     1	
08     1	
09
10     2	
11
12     1	

Sorry again for the delay. Please let me know if you have any questions. I promise to get back to you before August.


Comments

Hi Eric,
I was thinking a bout this - well the first post actully the other day when a problem came up, looking back at it now it would't of helped me, but it's worth knowing anyway, cheers
Ross

ross | 2007.04.16 05:33 AM

I have a multitables in a FIC and I would like to get a distibution in a grid as :
column(0) = name of tables(in a RESTAURANT)
ROW(0) 0 = MONTH of THE CURRENT YEAR
MANY THANKS FOR YOUR REPLY
ROGER

BAUQUEL ROGER | 2007.08.20 08:03 AM

I have a multitables in a FIC and I would like to get a distibution in a grid as :
column(0) = name of tables(in a RESTAURANT)
ROW(0) 0 = MONTH of THE CURRENT YEAR
MANY THANKS FOR YOUR REPLY
ROGER

BAUQUEL ROGER | 2007.08.20 08:09 AM

Bauquel, I'd like to help, but I'm afraid I'll need more info to do so. Like, for instance: What columns comprise the table(s) you're wanting to query? What's a "FIC"? What table/month-related value do you intend to aggregate ($ served, # of tickets, etc.)? And, how do you want them aggregated (summed, averaged, etc.)?

ewbi.develops | 2007.08.21 07:58 PM

I'm trying to do something similar. If you add a numeric c_month table to customer (and populate it, of course), the problem could be solved with:
TRANSFORM Count(Customer.C_ID) AS CountOfC_ID
SELECT Months.Month
FROM Customer RIGHT JOIN Months ON Customer.C_Month = Months.Month
GROUP BY Months.Month
PIVOT Format([C_Date],"yyyy");

StevenHB | 2008.04.30 09:28 AM

But the following doesn't work the way that I'd expect (and I'd like to understand why):
TRANSFORM Count(Customer.C_ID) AS CountOfC_ID
SELECT Months.Month
FROM Customer RIGHT JOIN Months ON Customer.C_Month = Months.Month
WHERE (((Customer.C_Name)<>"Sixth customer"))
GROUP BY Months.Month
PIVOT Format([C_Date],"yyyy");

This query omits all of the months with no customers!

Steven | 2008.04.30 10:41 AM

Hi Steven,

I don't believe this has to do with the transform. If you execute just the group by query (from the SELECT to just before the PIVOT), but add the full C_Month and C_Date year (or just C_Date), you'll see that you aren't getting back months for every year. If it's not in the GROUP BY query, it won't be in the TRANSFORM query either. To get every month, you need to SELECT from Months and LEFT JOIN Customer. However, even in this case the possibility exists that PIVOTing on the year portion of C_Date could result in missing years in the columns.

Good luck!

ewbi.develops | 2008.04.30 11:35 AM


TrackBack

TrackBack URL:  http://www.typepad.com/services/trackback/6a00d8341c7bd453ef00d83462a55e69e2

Listed below are links to weblogs that reference Access TRANSFORM (Crosstab) Queries and Missing Rows: