-- 由ROW_NUMBER()函数所想到的一点点东东 --
SQL code-- 近几天,观察了一下公司的统计平台,经常看到类似如下的ROW_NUMBER()函数相关的SQL语句:SELECT ... ROW_NUMBER() OVER (PARTITION BY T74.WEEK_NAME, T5816.SESSION_ID ORDER BY T74.WEEK_NAME DESC, T5816.SESSION_ID DESC) ...FROM ...-- 很明显:ROW_NUMBER() OVER()函数中的 ORDER BY 部分完全与 PARTITION BY 部分相同。-- 我当时很疑惑:这样查询出来的结果能否准确呢?是否是你想要的数据呢?-- 个人提示:用 ROW_NUMBER() OVER()函数的时候,其ORDER BY 所指定的字段,不应该与 PARTITION BY 所指定的字段完全相同,-- 准确地说:已经在PARTITION BY 字段中指定的字段,就不应该在后续的 ORDER BY 字段中出现。-- 举例:-- Emp表字段说明: ----------------------------------------- EMPNO NOT NULL NUMBER(4) -- 员工号 ENAME VARCHAR2(10) -- 员工名称 JOB VARCHAR2(9) -- 员工职位 MGR NUMBER(4) -- 员工上级领导工号 HIREDATE DATE -- 员工入职日期 SAL NUMBER(7,2) -- 员工薪水 COMM NUMBER(7,2) -- 员工提成 DEPTNO NUMBER(2) -- 员工所在部门的部门号scott@TDODS> SELECT * FROM EMP; EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO---------- ---------- --------- ---------- --------- ---------- ---------- ---------- 7369 SMITH CLERK 7902 17-DEC-80 800 20 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30 7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30 7566 JONES MANAGER 7839 02-APR-81 2975 20 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30 7698 BLAKE MANAGER 7839 01-MAY-81 2850 30 7782 CLARK MANAGER 7839 09-JUN-81 2450 10 7788 SCOTT ANALYST 7566 19-APR-87 3000 20 7839 KING PRESIDENT 17-NOV-81 5000 10 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30 7876 ADAMS CLERK 7788 23-MAY-87 1100 20 7900 JAMES CLERK 7698 03-DEC-81 950 30 7902 FORD ANALYST 7566 03-DEC-81 3000 20 7934 MILLER CLERK 7782 23-JAN-82 1300 1014 rows selected.SELECT EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO, ROW_NUMBER() OVER (PARTITION BY DEPTNO ORDER BY DEPTNO DESC) AS cntFROM EMP; EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO CNT---------- ---------- --------- ---------- --------- ---------- ---------- ---------- ---------- 7782 CLARK MANAGER 7839 09-JUN-81 2450 10 1 7839 KING PRESIDENT 17-NOV-81 5000 10 2 7934 MILLER CLERK 7782 23-JAN-82 1300 10 3 7566 JONES MANAGER 7839 02-APR-81 2975 20 1 7902 FORD ANALYST 7566 03-DEC-81 3000 20 2 7876 ADAMS CLERK 7788 23-MAY-87 1100 20 3 7369 SMITH CLERK 7902 17-DEC-80 800 20 4 7788 SCOTT ANALYST 7566 19-APR-87 3000 20 5 7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30 1 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30 2 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30 3 7900 JAMES CLERK 7698 03-DEC-81 950 30 4 7698 BLAKE MANAGER 7839 01-MAY-81 2850 30 5 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30 6-- 上面的查询,我一个一个仔细看,这个 ORDER BY DEPTNO DESC 是起作用啦,将查询的整体结果按DEPTNO升序排序,也不符合要求:ORDER BY DEPTNO DESC-- 进一步分析上面的结果:CNT 字段到底是按什么顺序排序的呢?-- 是按 EMPNO 排序? NO-- 是按 ENAME 排序? NO-- 是按 JOB 排序? NO-- 是按 MGR 排序? NO-- 是按 HIREDATE 排序? NO-- 是按 SAL 排序? NO-- 是按 COMM 排序? NO-- 除了整体结果是按的DEPTNO 升序排序外,看不出什么其他规律。-- 那么:是不是按照 rowid排序的呢?我们再来验证一下:-- 验证是否是按照rowid全局排序:SELECT EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO, rowidFROM EMPORDER BY rowid ASC; EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO ROWID---------- ---------- --------- ---------- --------- ---------- ---------- ---------- ------------------ 7369 SMITH CLERK 7902 17-DEC-80 800 20 AAAR3xAAEAAAACXAAA 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30 AAAR3xAAEAAAACXAAB 7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30 AAAR3xAAEAAAACXAAC 7566 JONES MANAGER 7839 02-APR-81 2975 20 AAAR3xAAEAAAACXAAD 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30 AAAR3xAAEAAAACXAAE 7698 BLAKE MANAGER 7839 01-MAY-81 2850 30 AAAR3xAAEAAAACXAAF 7782 CLARK MANAGER 7839 09-JUN-81 2450 10 AAAR3xAAEAAAACXAAG 7788 SCOTT ANALYST 7566 19-APR-87 3000 20 AAAR3xAAEAAAACXAAH 7839 KING PRESIDENT 17-NOV-81 5000 10 AAAR3xAAEAAAACXAAI 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30 AAAR3xAAEAAAACXAAJ 7876 ADAMS CLERK 7788 23-MAY-87 1100 20 AAAR3xAAEAAAACXAAK 7900 JAMES CLERK 7698 03-DEC-81 950 30 AAAR3xAAEAAAACXAAL 7902 FORD ANALYST 7566 03-DEC-81 3000 20 AAAR3xAAEAAAACXAAM 7934 MILLER CLERK 7782 23-JAN-82 1300 10 AAAR3xAAEAAAACXAAN14 rows selected.-- 验证是否是按照DEPTNO分组,然后各组按照rowid排序:SELECT EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO, ROW_NUMBER() OVER (PARTITION BY DEPTNO ORDER BY rowid DESC) AS cntFROM EMP; EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO CNT---------- ---------- --------- ---------- --------- ---------- ---------- ---------- ---------- 7934 MILLER CLERK 7782 23-JAN-82 1300 10 1 7839 KING PRESIDENT 17-NOV-81 5000 10 2 7782 CLARK MANAGER 7839 09-JUN-81 2450 10 3 7902 FORD ANALYST 7566 03-DEC-81 3000 20 1 7876 ADAMS CLERK 7788 23-MAY-87 1100 20 2 7788 SCOTT ANALYST 7566 19-APR-87 3000 20 3 7566 JONES MANAGER 7839 02-APR-81 2975 20 4 7369 SMITH CLERK 7902 17-DEC-80 800 20 5 7900 JAMES CLERK 7698 03-DEC-81 950 30 1 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30 2 7698 BLAKE MANAGER 7839 01-MAY-81 2850 30 3 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30 4 7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30 5 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30 614 rows selected.-- 果然也不是按照 rowid 排序的,我们都清楚,rowid 只是行记录的一个物理地址,所以就算是按照rowid这样排序出来的CNT字段已经失去了意义。-- 综上所述:-- 所以:用 ROW_NUMBER() OVER()函数的时候,其ORDER BY 所指定的字段,不应该与 PARTITION BY 所指定的字段完全相同,-- 准确地说:已经在PARTITION BY 字段中指定的字段,就不应该在后续的 ORDER BY 字段中出现。-- 再举例:例如:我要查询员工的详细信息,且按部门分组,按工资排名:SELECT EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO, ROW_NUMBER() OVER (PARTITION BY DEPTNO ORDER BY sal DESC) AS cntFROM EMP; EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO CNT---------- ---------- --------- ---------- --------- ---------- ---------- ---------- ---------- 7839 KING PRESIDENT 17-NOV-81 5000 10 1 7782 CLARK MANAGER 7839 09-JUN-81 2450 10 2 7934 MILLER CLERK 7782 23-JAN-82 1300 10 3 7788 SCOTT ANALYST 7566 19-APR-87 3000 20 1 7902 FORD ANALYST 7566 03-DEC-81 3000 20 2 7566 JONES MANAGER 7839 02-APR-81 2975 20 3 7876 ADAMS CLERK 7788 23-MAY-87 1100 20 4 7369 SMITH CLERK 7902 17-DEC-80 800 20 5 7698 BLAKE MANAGER 7839 01-MAY-81 2850 30 1 7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30 2 7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30 3 7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30 4 7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30 5 7900 JAMES CLERK 7698 03-DEC-81 950 30 614 rows selected.-- 可以看到: 数据非常准确: 10号部门工资排在第1位的是 KING 员工,薪水是5000; 20号部门工资排在第1位的是 SCOTT 员工,薪水是3000; 30号部门工资排在第1位的是 BLAKE 员工,薪水是2850;