GROUP的功能配合加總函數,可對資料分群後,以群為單位加總,如果資料當中,用以分群的欄位數超過一個以上,對分群欄位如何做到個別加總及最後的完整加總? ROLLUP、CUBE和GROUPING SET提供了很好的方法。
參考文件:文件一、
文件二、
文件三
以下範例假設有某資料表,有三個屬性欄位(如部門及子部門等)和一個數值欄位(如薪水或價格等),該資料表的SQL請參考文件一。
對前三個欄位做群組,列出欄位數及合計
SELECT YEAR,
AREA,
DEPARTMENT_ID,
COUNT(*) AS num_rows,
SUM(sales) AS TOTAL_SALES
FROM total_sales;
ROLLUP : 分群部分加總函數
ROLLUP 函數可對一個以上的欄位進行分項加總,進行加總時,從最右邊的欄位進行分項加總。若 ROLLUP 有 N 個欄位,會產生 N+1 個階層的子項加總,以下針對年度與區域做子項加總,求分年和分區域的總和,查詢部分截圖如下,完整結果請參考
結果檔案。
SELECT YEAR,
AREA,
SUM(sales) AS TOTAL_SALES
FROM total_sales
GROUP BY ROLLUP (YEAR, AREA)
ORDER BY YEAR, AREA;
使用三個欄位進行分項加總,共有4個階層的子加總,其
結果檔案和截圖如下所示。
SELECT YEAR,
AREA,
DEPARTMENT_ID,
SUM(sales) AS TOTAL_SALES
FROM total_sales
GROUP BY ROLLUP (YEAR, AREA, DEPARTMENT_ID)
ORDER BY YEAR, AREA, DEPARTMENT_ID;
分項加總也可以搭配一般的群組GROUP,其結果如
檔案所示。
SELECT YEAR, AREA, DEPARTMENT_ID,
SUM(sales) AS TOTAL_SALES
FROM total_sales
GROUP BY YEAR, ROLLUP (AREA, DEPARTMENT_ID)
ORDER BY YEAR, AREA, DEPARTMENT_ID;
使用GROUP BY 和 UNION ALL 取代 ROLLUP 函數
上述分項加總的ROLLUP函數也有替代方案。若是三個欄位進行ROLLUP,則需四組SQL進行 UNION ALL 運算,第一組列出所有欄位,第二組從ROLLUP最右邊的欄位起,將該欄位以NULL取代,依此類推。
下列ROLLUP查詢
SELECT YEAR,
AREA,
DEPARTMENT_ID,
SUM(sales) AS TOTAL_SALES
FROM total_sales
GROUP BY ROLLUP (YEAR, AREA, DEPARTMENT_ID)
ORDER BY YEAR, AREA, DEPARTMENT_ID;
與使用替代方案的UNION ALL 的下列查詢,產生完全相同的
結果 。
Select YEAR,AREA,DEPARTMENT_ID,sum(SALES) AS TOTAL_SALES from TOTAL_SALES group by YEAR,AREA,DEPARTMENT_ID
union all
Select YEAR,AREA,null,sum(SALES) AS TOTAL_SALES from TOTAL_SALES group by YEAR,AREA
union all
Select YEAR,null,null,sum(SALES) AS TOTAL_SALES from TOTAL_SALES group by YEAR
union all
Select null,null,null,sum(SALES) AS TOTAL_SALES from TOTAL_SALES
order by 1,2,3;
CUBE 函數
使用 CUBE 函數,是對函數內的參數,列舉完整的加總結果。若是 N 個參數的 CUBE 函數,會產生 2^N 個子加總的組合。
SELECT YEAR, AREA,
SUM(sales) AS TOTAL_SALES
FROM total_sales
GROUP BY CUBE (YEAR, AREA)
ORDER BY YEAR, AREA;
上述CUBE函數的
結果。
當參數增加,子加總的結果也會大幅增加,前述兩個參數共產生18筆,若改為三個參數如下,則共產生198筆
資料
SELECT YEAR, AREA, DEPARTMENT_ID,
SUM(sales) AS TOTAL_SALES
FROM total_sales
GROUP BY CUBE (YEAR, AREA, DEPARTMENT_ID)
ORDER BY YEAR, AREA,DEPARTMENT_ID;
使用GROUP BY 和 UNION ALL 取代 CUBE 函數的替代方法
前述三個參數的 CUBE 函數,若是使用 GROUP BY 和 UNION ALL,則需要 2^3 個 SQL,也就是 8 個SQL來組成替代方案。
Select YEAR,AREA,DEPARTMENT_ID,sum(SALES) AS TOTAL_SALES from TOTAL_SALES group by YEAR,AREA,DEPARTMENT_ID
union all
Select YEAR,AREA,null,sum(SALES) AS TOTAL_SALES from TOTAL_SALES group by YEAR, AREA
union all
Select YEAR,null,DEPARTMENT_ID,sum(SALES) AS TOTAL_SALES from TOTAL_SALES group by YEAR, DEPARTMENT_ID
union all
Select YEAR,null,null,sum(SALES) AS TOTAL_SALES from TOTAL_SALES group by YEAR
union all
Select null,AREA,DEPARTMENT_ID,sum(SALES) AS TOTAL_SALES from TOTAL_SALES group by AREA, DEPARTMENT_ID
union all
Select null,AREA,null,sum(SALES) AS TOTAL_SALES from TOTAL_SALES group by AREA
union all
Select null,null,DEPARTMENT_ID,sum(SALES) AS TOTAL_SALES from TOTAL_SALES group by DEPARTMENT_ID
union all
Select null,null,null,sum(SALES) AS TOTAL_SALES from TOTAL_SALES;
上述UNION的結果,共產生 198 筆
資料。
GROUPING Functions
在所產生的子項加總資料列,所進行加總的欄位以NULL顯示,但是若原有資料本身就有NULL值,會難以分辨是原有的NULL還是因為加總而產生的NULL,GROUPING函數用來顯示該欄位是否因為執行加總(ROLLUP、CUBE)而產生NULL值,它接受一個參數(欄位),1表示該欄位的NULL由加總產生,0 表示欄位是其他的值。
SELECT YEAR,
AREA,
SUM(sales) AS TOTAL_SALES,
GROUPING(YEAR) as Grouping_Year,
GROUPING(AREA) as Grouping_Area,
FROM total_sales
GROUP BY ROLLUP (YEAR, AREA)
ORDER BY YEAR, AREA;
上述查詢
結果多了兩個欄位Grouping_Year與Grouping_Area。當該欄位顯示為1,表示該分項子加總由該欄位產生,而其對應的欄位亦為NULL。
下圖顯示,Grouping_Area 的值為1,表示欄位Area的NULL是因為加總而產生,而子項加總 27435.34 也是由2015年各區域的總和而產生。
有此欄位,就可以對子項加總的結果,篩選你要的結果,比方說,你只想看2015年各分項加總以及總結的數量,可用下列方式,從113筆資料當中,篩選出7筆2015年的分項總計。
SELECT YEAR, AREA, DEPARTMENT_ID, SUM(sales) AS TOTAL_SALES,
GROUPING(YEAR) as Grouping_Year,
GROUPING(AREA) as Grouping_Area,
GROUPING(DEPARTMENT_ID) as Grouping_Dept
FROM total_sales
WHERE YEAR=2015
GROUP BY ROLLUP (YEAR, AREA, DEPARTMENT_ID)
HAVING GROUPING(YEAR)=1 or GROUPING(AREA)=1 or GROUPING(DEPARTMENT_ID)=1
ORDER BY YEAR, AREA, DEPARTMENT_ID;
其分項結果如下圖所示。
GROUPING_ID
此函數接受一個以上的欄位為參數,通常與ROLLUP或是CUBE的參數相同,它用來顯示加總欄位的階層,0表示第一階。以下使用兩個欄位以CUBE函數查詢所有分項加總及其組合。由
結果顯示共有4個階層。
SELECT YEAR, AREA,
SUM(sales) AS TOTAL_SALES,
GROUPING_ID(YEAR, AREA) AS GROUPING_LEVEL
FROM total_sales
GROUP BY CUBE (YEAR, AREA)
ORDER BY YEAR, AREA;
GROUPING SETS
有時我們只想得到某些欄位組合的分項加總, GROUPING SETS用於此時,可將想得到的欄位組合以括號組合在一起,就只會列出該組合的子加總,例如,若只想得到(年度,區域)及(年度,部門)這兩種組合的子加總時,便可用 GROUPING SETS((YEAR, AREA), (YEAR, DEPARTMENT_ID))來取得,如下範例所示,其
結果請參考。
SELECT YEAR, AREA, DEPARTMENT_ID,
SUM(sales) AS TOTAL_SALES,
GROUPING_ID(YEAR, AREA, DEPARTMENT_ID) AS GROUPING_LEVEL
FROM total_sales
GROUP BY GROUPING SETS ((YEAR, AREA),(YEAR, DEPARTMENT_ID))
ORDER BY YEAR, AREA,DEPARTMENT_ID;
Composite Columns
在ROLLUP與CUBE函數中的參數可用欄位組合方式來取得特定組合的加總,例如ROLLUP((YEAR, AREA), DEPARTMENT_ID)或是CUBE((YEAR, AREA), DEPARTMENT_ID),如此,這些以括號放在一起的欄位就會一起列出。其
結果請參考。
SELECT YEAR, AREA, DEPARTMENT_ID,
SUM(sales) AS TOTAL_SALES,
GROUPING_ID(YEAR, AREA, DEPARTMENT_ID) AS GROUPING_LEVEL
FROM total_sales
GROUP BY CUBE ((YEAR, AREA), DEPARTMENT_ID)
ORDER BY YEAR, AREA,DEPARTMENT_ID;
Concatenated Groupings
這是把組合的欄位,以交錯的方式組合輸出。多個GROUPING SETS,透過逗點組合在一起,例如GROUPING SETS(id1, id2), GROUPING SETS(id3, id4) ,會產生的組合是 :
(id1, id3)
(id1, id4)
(id2, id3)
(id2, id4)
依此類推,以下範例就會產生出這些欄位的組合和子項加總 ,其結果請參考
(YEAR, DEPARTMENT_ID)
(YEAR, MANAGER_ID)
(AREA, DEPARTMENT_ID)
(AREA, MANAGER_ID)
SELECT YEAR, AREA, DEPARTMENT_ID,MANAGER_ID,
SUM(sales) AS TOTAL_SALES,
GROUPING_ID(YEAR, AREA, DEPARTMENT_ID) AS GROUPING_LEVEL
FROM total_sales
GROUP BY GROUPING SETS (YEAR, AREA),GROUPING SETS (DEPARTMENT_ID, MANAGER_ID)
ORDER BY YEAR, AREA,DEPARTMENT_ID;