2007.11.28 09:37 PM

SQL Server POWERSUM() in One Query

Update #2 11/29/2007
Oops. Robyn Page points out in a comment that I've more SELECTs than needed. I'm afraid that in my rush I did the first line and copied the remainder without thinking to pull the SELECTs and replace them with CASEs over a single SELECT. Here is Ms. Page's much cleaner solution:

select
 cast(
   left(
char(sum(distinct case when col/8=0 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=1 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=2 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=3 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=4 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=5 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=6 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=7 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=8 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=9 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=10 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=11 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=12 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=13 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=14 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=15 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=16 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=17 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=18 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=19 then power(2, Col%8) else 0 end))
+ char(sum(distinct case when col/8=20 then power(2, Col%8) else 0 end))
,max(col/8)+1)as varbinary)
from @data 

Btw, in a follow-up comment, Mr. Clarke suggests that his solution is different still than this. Can't wait to see.


Update 11/29/2007
A new comment-clue from Mr. Clarke suggests that the solution should work in SQL Server 2000 and that it will only handle input values beyond 127 with a little cut-and-paste. That led me to the following query:

select
  cast(
    left(
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 0), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 1), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 2), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 3), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 4), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 5), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 6), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 7), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 8), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 9), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 10), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 11), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 12), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 13), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 14), 0)) +
      char(COALESCE((select sum(distinct power(2, Col%8)) from @Data where (Col/8) = 15), 0))
    , (select max(Col/8)+1 from @Data))
  as varbinary(max))

Simple, but not as elegant as the recursive CTEs below. I think it satisfies the puzzle, though. Will await the solution.


In today's Daily Grind, Mike Gunderloy (curse his hide) linked to SQL Puzzle 8 by Red Gate Software engineer Lionel Clarke challenging readers to craft a one-query replacement for SQL Server's POWERSUM() function, which was apparently "removed" from SQL Server 2008 in the November CTP, as described by András, another Red Gate Software engineer. I'd never heard of nor used POWERSUM(), but I can't seem to leave challenges like this alone and so decided to take a look over lunch.

András' short explanation of what the undocumented POWERSUM() function does fell a bit short of the mark, but after a lot of testing I was able to get my head around it. And now, having done that, I'm not going to bother repeating here what I learned, unless someone really, really wants to know. After all, it's an undocumented and (now) dead function. I'll just say that it mostly does what András says, it just does more, and it does it in a really non-intuitive way.

Mr. Clarke's challenge was "to write a single select statement that will return the same results as POWERSUM when run on the supplied table variable", which in this case Mr. Clarke named @Data with a single INT column named Col. In a comment below his post, Mr. Clarke clarified that variables weren't allowed. And it's probably safe to assume that "single select statement" means no user-defined functions, too. However, Mr. Clarke never mentions whether he considers CTEs (Common Table Expressions) a legitimate extension of a "single select statement". I'm hoping he does, as I couldn't figure out how to do this without recursion, both for the production of a full range of bytes and for concatenation of the VARBINARY bytes. Perhaps in SQL Server 2008 there are new range-producing and/or aggregate-concatenating functions that would allow for the elimination of the CTEs, but with only SQL Server 2005 (and a lunch break) to work with, this was the best I could do.

The query below matches POWERSUM() for the values Mr. Clarke provides, as well as for every set of values I threw at it (with one exception, described below the query). It also satisfies Mr. Clarke's wish for a solution "that will scale to beyond" input values up to 127. As written below, the query will support input values to 807 before it runs into the default MAXRECURSION level of 100. If you need values greater than 807, add an "option (MAXRECURSION x)" hint to the end of the final SELECT, where "x" represents the number of levels you need. Or, just use a MAXRECURSION of 0 to allow recursion to any level, within the limits of your machine. I tested it with a MAXRECURSION of 0 and input values up to 20,000. My machine huffed and puffed, but after a few seconds did produce results that matched POWERSUM().

So, here's the query:

;with
psdata(value, byte) as (
  select distinct 
    Col, 
    (Col / 8) 
  from 
    @Data
),
psbyte(byte) as (
  select 
    max(byte) 
  from 
    psdata
 union all
  select 
    byte - 1 
  from 
    psbyte 
  where 
    byte > 0
),
psbytevalue(value, byte) as (
  select 
    cast(sum(b.bytevalue) as varbinary(1)), 
    b.byte 
  from (
    select 
      psdata.value, 
      case 
        when psdata.value is null then 0 
        else power(2, psdata.value - (8 * psdata.byte)) 
      end as bytevalue,
      psbyte.byte 
    from
      psbyte 
    left join 
      psdata on psdata.byte = psbyte.byte
  ) b 
  group by 
    b.byte
),
ps(value, byte) as (
  select 
    cast(value as varbinary(max)), 
    byte 
  from 
    psbytevalue 
  where 
    byte = 0
 union all
  select 
    cast(a.value + b.value as varbinary(max)), 
    b.byte 
  from 
    psbytevalue b 
  join 
    ps a on a.byte = b.byte - 1 
  where 
    b.byte > 0
)
select top 1 
  value 
from 
  ps 
order by 
  byte desc

So what's the one set of values that causes this query to produce a result different than POWERSUM()? An empty set. If zero rows are fed to this query it produces an empty result set. However, POWERSUM() somehow returns a zero-length VARBINARY value, which looks like "0x". There may be a way to do this with an expression, but I couldn't find it.

Anyhow, that's it. I can't wait to see what the "real" solution is. I have a feeling it will be significantly shorter and more succinct than what I've put together. Perhaps with more time I could collapse some of the CTEs out of this query and simplify it some, but there's no more time to spare.


Comments

Hi ! I tried to understand your solution and Simple-talk.com's puzzle, but I am lost. How can I find more details about PowerSum(), mean meaning and usage. I tried to use it on SQL 2000 and got the following error message:

Server: Msg 195, Level 15, State 10, Line 51
'POWERSUM' is not a recognized function name.

Please let me know.

Thanks,
Reetesh

Reetesh | 2007.11.30 12:16 PM

Hi Reetesh,

Unfortunately, as an undocumented function, there's little documentation available. I never did find any, so just ran it over a large number of values to figure out what it did. I'm not certain, but your tests seem to confirm that it was introduced with SQL Server 2005, and then taken away with SQL Server 2008.

In short, what POWERSUM() does it translate one or more positive numbers into a specially formatted bitmap. For instance, the number 0 results in a one byte value with the first bit set. The number 6 results in a one byte value with the 7th bit set. The numbers 0 and 6 together result in a one byte value with the first and 7th bits set (i.e., 01000001, or 0x41). The bytes are ordered left to right, so a POWERSUM of the numbers 0, 6, and 24 would get you a 4 byte value (i.e., 01000001 00000000 00000000 00000001, or 0x41 00 00 01), as would a POWERSUM of just the number 24 (i.e., 00000000 00000000 00000000 00000001, or 0x00 00 00 01).

Hope that helps. Good luck!

ewbi.develops | 2007.11.30 03:46 PM



Post a Comment

 
  (optional)
  (no html)
 
   


TrackBack

TrackBack URL:  http://www.typepad.com/services/trackback/6a00d8341c7bd453ef00e54f9087c78833

Listed below are links to weblogs that reference SQL Server POWERSUM() in One Query: