Friday, October 9, 2009

Pivot with out aggregate

Pivot without Aggregating with consistent and inconsistent formats:

One of the most common tasks which I came across at least once in almost every project is to convert the structure of the table data into a PIVOT form. But, I don’t have anything to aggregate in the table; all I want is convert the structure as below.

Image-1

image

Before SQL Server 2005, in the earlier versions we used to perform this through using cursors and temp tables which takes a lot of I/O. Starting with SQL Server 2005 which provides a new operator called PIVOT which actually does this, but the main purpose of using PIVOT is to rotate the table data from vertical to horizontal as shown in above picture, and performs AGGREGATIONS on required columns which will be shown in the final data set.

As per the syntax PIVOT only works with an aggregate:

SELECT <non-pivoted column>,

[first pivoted column] AS <column name>,

[second pivoted column] AS <column name>,

...

[last pivoted column] AS <column name>

FROM

(<SELECT query that produces the data>)

AS <alias for the source query>

PIVOT

(

<aggregation function>(<column being aggregated>)

FOR

[<column that contains the values that will become column headers>]

IN ( [first pivoted column], [second pivoted column],

... [last pivoted column])

) AS <alias for the pivot table>

<optional ORDER BY clause>;

But I don’t want to do any aggregate operation as all I want is to transpose the data into a tabular format.

There are 2 approaches to do this based on the requirements:

1. If the column names are consistent across the data then follow the 1st approach

2. If the column names are NOT consistent then the 2nd approach will solve the purpose using a schema table.

Let’s create a table with the data as shown in the Image-1

Approach- 1 :- columns (NAME,ADDRESS,PHONE ,EMAIL) is consistent across all the sets

create table customers ( IID INT IDENTITY(1,1),col0 varchar(90),col1 varchar(90))

GO

-- 1st set

insert into customers values ( 'NAME','A_NM')

insert into customers values ( 'ADDRESS','A_ADR')

insert into customers values ( 'PHONE','A_PH')

insert into customers values ( 'EMAIL','A_EMAIL')

-- 2nd set

insert into customers values ( 'NAME','B_BM')

insert into customers values ( 'ADDRESS','B_ADR')

insert into customers values ( 'PHONE','B_XYZ')

insert into customers values ( 'EMAIL','B_PH')

-- 3rd set

insert into customers values ( 'NAME','C_BM')

insert into customers values ( 'ADDRESS','C_ADR')

insert into customers values ( 'PHONE','C_XYZ')

insert into customers values ( 'EMAIL','C_EMAIL')

To transpose the data into a pivot report, I have used the partition function and included the results in a CTE and select the data in the CTE using PIVOT operator
script –A1

WITH CUST_CTE(iid, COL0,COL1,RID)

AS

(

SELECT iid, COL0,COL1 , ROW_NUMBER() OVER (PARTITION BY (COL0)ORDER BY IID) AS RID FROM CUSTOMERS

)

SELECT NAME,ADDRESS,PHONE,EMAIL

FROM

(SELECT COL0,COL1,RID

FROM CUST_CTE)C

PIVOT

(

max(COL1)

FOR COL0 IN (NAME,[ADDRESS],PHONE,EMAIL)

) AS PivotTable;

Output:

clip_image004

Explanation: the key to the whole solution is to partition the sets in the data, the column RID has the partitioned values which actually works with the PIVOT. To make it more clear, look at the output of the CTE. You can see the data sets are partitioned with a column in RID. So when you execute the script A1, the RID makes the rows to unique.

WITH CUST_CTE(iid, COL0,COL1,RID)

AS

(

SELECT iid, COL0,COL1 , ROW_NUMBER() OVER (PARTITION BY (COL0)ORDER BY IID) AS RID FROM CUSTOMERS

)

SELECT * FROM CUST_CTE ORDER BY 1,4

clip_image006

Approach- 2:- considering the column NAME acts as a primary key and present across all the sets , but rest of the columns (ADDRESS, PHONE, and EMAIL) are NOT consistent across all the sets.
Lets create the table with the sample set to simulate the problem.

TRUNCATE TABLE CUSTOMERS

GO

-- 1st set

insert into customers values ( 'NAME','A_NM')

insert into customers values ( 'ADDRESS','A_ADR')

insert into customers values ( 'PHONE','A_PH')

insert into customers values ( 'EMAIL','A_EMAIL')

-- 2nd set

insert into customers values ( 'NAME','B_BM')

insert into customers values ( 'ADDRESS','B_ADR')

--insert into customers values ( 'PHONE','B_XYZ')

insert into customers values ( 'EMAIL','B_PH')

-- 3rd set

insert into customers values ( 'NAME','C_BM')

insert into customers values ( 'ADDRESS','C_ADR')

insert into customers values ( 'PHONE','C_XYZ')

insert into customers values ( 'EMAIL','C_EMAIL')

GO

SELECT * FROM CUSTOMERS

When you execute the Script-A, you will see an incorrect results in the transposed data.
Script –A2

WITH CUST_CTE(iid, COL0,COL1,RID)

AS

(

SELECT iid, COL0,COL1 , ROW_NUMBER() OVER (PARTITION BY (COL0)ORDER BY IID) AS RID FROM CUSTOMERS

)

SELECT NAME,ADDRESS,PHONE,EMAIL

FROM

(SELECT COL0,COL1,RID

FROM CUST_CTE)C

PIVOT

(

max(COL1)

FOR COL0 IN (NAME,[ADDRESS],PHONE,EMAIL)

) AS PivotTable;

Output:

clip_image008

The value “C_XYZ should belong to C_BM , but its moved to B_BM.

As I mentioned, the key is the partition column RID, let’s look at what is there in the RID by executing only the statement in CTE

WITH CUST_CTE(iid, COL0,COL1,RID)

AS

(

SELECT iid, COL0,COL1 , ROW_NUMBER() OVER (PARTITION BY (COL0)ORDER BY IID) AS RID FROM CUSTOMERS

)

SELECT * FROM CUST_CTE ORDER BY 1,4

clip_image008[1]clip_image009                  clip_image011

So, the partitioned value RID for C_XYZ should be 3 as it belongs to 3rd set (look at the insert statement above) , but here its 2 , because I had partitioned the col0 order by IID.
So, if I can partition the data based on the sets, then it will resolve the problem, but there is no such function exists in SQL Server to partition based on defined sets. So I followed the below technique to partition the data based on the column value (NAME).

with cte_group(RID,IID,COL0,COL1)

AS

(

select row_number() over (ORDER BY IID )AS RID,* from

--customers ~ select statement with adding one row at the end with value Name

(SELECT * FROM customers UNION SELECT MAX(IID)+1,'NAME','' FROM customers)A

where col0 = 'NAME'

)

-- sent data to a temp table

SELECT * INTO #GROUPS FROM CUSTOMERS a

join (SELECT A.RID, A.IID as strt ,B.IID as ends FROM cte_group A JOIN cte_group B ON B.RID - A.RID = 1) b

on a.iid >= b.strt and a.iid <b.ends

select * from #GROUPS

Execute the below script which shows the partitioned column based on sets without using partition function

select from #GROUPS

Output:

clip_image013

Now, you will get the desired output as the RID is able to show the correct data based on the sets by using the below script.

SELECT NAME,ADDRESS,PHONE,EMAIL

FROM

(SELECT COL0,COL1,RID

FROM #GROUPS)C

PIVOT

(

max(COL1)

FOR COL0 IN (NAME,[ADDRESS],PHONE,EMAIL)

) AS PivotTable;

OUTPUT:

clip_image015

37 comments:

  1. WHAT IF IT IS NOT KNOWN HOW MANY COLUMNS WOULD BE THERE. SO FOR EXAMPLE THERE MAY BE MORE VALUES OTHER THEN PHONE, NAME ETC WHICH IS NOT KNOWN AT COMPILE TIME

    ReplyDelete
  2. How to use Pivot in this scenerio? I want to avoid using aggregate function in my pivot statement. Is there any workaround.

    ReplyDelete
  3. you cant use the PIVOT function with out using an aggregate and again its all depends on your exact requirement. post your exact requirement , we will look at it.

    ReplyDelete
  4. I tried it on my tables, I'm getting an error:
    Incorrect syntax near the keyword 'with'.
    This is what I've tried:
    select * from IdentityProperties i, Properties p where i.IdentityID = p.ID
    with CTE (T_ID, ID, Name, Value)
    AS
    (
    SELECT
    row_number() OVER(PARTITION BY Name ORDER BY ID) AS T_ID, ID,Name , Value
    FROM dbo.IdentityProperties i inner join dbo.Properties p on i.PropertyID = p.ID
    )

    ReplyDelete
  5. -- the line before the with must contain semicolon(;), use the below script

    select * from IdentityProperties i, Properties p where i.IdentityID = p.ID;
    with CTE (T_ID, ID, Name, Value)
    AS
    (
    SELECT
    row_number() OVER(PARTITION BY Name ORDER BY ID) AS T_ID, ID,Name , Value
    FROM dbo.IdentityProperties i inner join dbo.Properties p on i.PropertyID = p.ID
    )

    ReplyDelete
  6. I need to use your example to populate 49 columns for diagnoses and another 49 columns for proceedures. The common field will be account. Is this feasible?

    ReplyDelete
  7. Hi,

    Can you please give me how to use pivot when you don't know the number of columns and with out aggregate function?
    can we use dynamic sql to do this?

    ReplyDelete
  8. yes, you can use dynamic pivot. checkout the article by madhavan at the below link.
    http://sqlblogcasts.com/blogs/madhivanan/archive/2008/08/27/dynamic-pivot-in-sql-server-2005.aspx

    ReplyDelete
  9. If your data set in the first instance has data which has an ID which forces grouping on it, you dont need to use ROW_NUMBER() PARTITION etc. The use of the CTE is unnecessary and adds complexity here it is simply be used as an Alias, when a a few more brackets would do the trick.


    ReplyDelete
    Replies
    1. chandra sekhar pathivadaFebruary 28, 2013 at 8:44 AM

      We still need the CTE to generate TID(partition)

      Delete
  10. Hi

    This is a simple solution, for the case when you don't know the number of columns

    http://valonkastrati.wordpress.com/2013/05/29/sql-rows-into-columns-without-using-pivot/

    Happy coding!

    ReplyDelete
    Replies
    1. Hi Valonk,
      I like your post , but i think you need to modify a bit as its missing the format still.
      if you take the table in this poast as an example "create table customers..." then you will see the difference.
      Happy coding.

      Delete
  11. Check this link for pivot without aggregate function::

    http://sandipgsql.blogspot.in/2013/06/pivot-without-aggregate-function-in-sql.html

    ReplyDelete
    Replies
    1. purpose of pivot function is to eliminate cursor. can you compare performance results from both the approaches

      Delete
  12. Nice explanation.. good job...

    ReplyDelete
  13. suppose the pivoted column has say 5 different valuesand we need all of those(and not just the max or min value).

    ReplyDelete
  14. Thank a lot - your post saved me hours of work! I used Approach 1 (By using LEFT OUTER JOIN I managed to compensate MISSING values) along with user-defined table type (instead of CTE) and dynamic SQL. Now I have a "GENERIC" stored procedure that could return a recordset with "OPEN SCHEMA" with normal representation of row. Finally tamed EAV "beast"!

    ReplyDelete
  15. o happy day! thanks1

    ReplyDelete
  16. Thanks for the post. Very useful.

    ReplyDelete
  17. Thanks - this helped a lot!

    ReplyDelete
  18. Thanks for thhe auspicious writeup. It actually was once a amusement account it.
    Glance advanced to more introduced agreeable from you!
    However, how could we keep in touch?

    ReplyDelete
  19. It's truly very complex in this full of activity life to listen news
    on TV, so I just use world wide web for that reason, and take the most recent information.

    ReplyDelete
  20. Asking questions are in fact fastidious thing if you are not understanding something totally,
    but this post provides pleasant understanding even.

    ReplyDelete
  21. I am not sure where you're getting your info, but good topic.
    I needs to spend some time learning much more or understanding more.
    Thanks for great info I was looking for this info
    for my mission.

    ReplyDelete
  22. What's up everybody, here every one is sharing such know-how, thus it's
    fastidious to read this weblog, and I used to pay a visit
    this website every day.

    ReplyDelete
  23. Great weblog right here! Also your web site a
    lot up fast! What web host are you using? Can I am getting your associate hyperlink to
    your host? I want my web site loaded up as quickly as yours lol

    ReplyDelete
  24. I love your blog.. very ice colors & theme. Did you create
    this website yourself or did you hire someone
    to do iit for you? Plz respond aas I'm looking tto design my
    own blog and would like to find out where u got this from.
    appreciate it

    ReplyDelete
  25. Thanks for your personal marvelous posting!
    I actually enjoyed reading it, you are a great
    author.I will remember to bookmark your blog and may come back down the road.
    I want to encourage you to continue your great job, have
    a nice holiday weekend!

    ReplyDelete
  26. Great delivery. Solid arguments. Keep up the amazing spirit.

    ReplyDelete
  27. My bdother recommended I might like this website.
    He was totally right. Thiss post truly made my day. You ccan not imagine simply
    hoow much time I haad spent for this information!
    Thanks!

    ReplyDelete
  28. Saved as a favorite, I love your web site!

    ReplyDelete
  29. you're in reality a just right webmaster.
    The site loading pace is amazing. It seems tha you're doing
    any unique trick. Furthermore, The contents aare masterpiece.

    you've performed a excellent job in this topic!

    ReplyDelete
  30. Hi there friends, nice paragraph and fastidious arguments commented
    at this place, I am actually enjoying by these.

    ReplyDelete
  31. Everyone loves it when folks get together and share opinions.
    Great website, keep it up!

    ReplyDelete
  32. Wow, marvelous blog structure! Howw ⅼong havee you evеr beеn running a blog fοr?

    yoս mаke blogging glance easy. Thee ⲟverall look of youг website іѕ magnificent, ɑs weⅼl ass
    the ϲontent!

    ReplyDelete

Featured Post

SQL Server AWS Migration BCP

stored procedure to generate BCP scritps to migrate the SQL Server database. Developed this stored procedure on my labs to simulate t...

Contributors