问题描述:有一张签到表,需要将连续或相同的时间段合并,即把多条记录查询合并成一条连续的时间段记录。
数据表如下:
DROP TABLE IF EXISTS `timesheets`;CREATE TABLE `timesheets` ( `task_id` varchar(10) NOT NULL DEFAULT '', `start_date` date DEFAULT NULL, `end_date` date DEFAULT NULL, PRIMARY KEY (`task_id`)) ENGINE=InnoDB DEFAULT CHARSET=latin1;-- ------------------------------ Records of timesheets-- ----------------------------INSERT INTO `timesheets` VALUES ('1', '2014-03-01', '2014-03-03');INSERT INTO `timesheets` VALUES ('10', '2014-03-17', '2014-03-17');INSERT INTO `timesheets` VALUES ('2', '2014-03-02', '2014-03-04');INSERT INTO `timesheets` VALUES ('3', '2014-03-04', '2014-03-05');INSERT INTO `timesheets` VALUES ('4', '2014-03-06', '2014-03-09');INSERT INTO `timesheets` VALUES ('5', '2014-03-09', '2014-03-09');INSERT INTO `timesheets` VALUES ('6', '2014-03-09', '2014-03-09');INSERT INTO `timesheets` VALUES ('7', '2014-03-12', '2014-03-15');INSERT INTO `timesheets` VALUES ('8', '2014-03-13', '2014-03-14');INSERT INTO `timesheets` VALUES ('9', '2014-03-14', '2014-03-14');
解答:
解法一:
首先需要明确,查询的结果肯定是通过表自身的连接得到的,因为查询的两个字段来自不同的记录行,不妨分组得到可能的记录再筛选:
SELECT a.start_date,b.end_date FROM timesheets a,timesheets b ,timesheets c WHERE a.end_date<=b.end_date GROUP BY a.start_date,b.end_date;
哪些记录需要排除呢,结果字段在原表同一条记录的两值之间的记录都得干掉,起点和起点可相同,终点和终点可相同:
SELECT a.start_date,b.end_date FROM timesheets a,timesheets b ,timesheets c WHERE a.end_date<=b.end_date GROUP BY a.start_date,b.end_date HAVING MAX(CASE WHEN (a.start_date>c.start_date and a.start_date<=c.end_date) OR (b.end_date>=c.start_date and b.end_date<c.end_date) then 1 ELSE 0 END) =0 ;
最后,从这个结果集中分组得到最终的结果,起始时间和组内最小的终止时间:
SELECT d.start_date,MIN(d.end_date) from (SELECT a.start_date,b.end_date FROM timesheets a,timesheets b ,timesheets c WHERE a.end_date<=b.end_date GROUP BY a.start_date,b.end_date HAVING MAX(CASE WHEN (a.start_date>c.start_date and a.start_date<=c.end_date) OR (b.end_date>=c.start_date and b.end_date<c.end_date) then 1 ELSE 0 END) =0) d GROUP BY d.start_date;
解法二:
通过左连接分别得到起始时间和终止时间,然后通过内连接合并结果:
#干掉起点时间在某记录起止时间之间的SELECT a.start_date FROM timesheets a LEFT OUTER JOIN timesheets b ON a.start_date>b.start_date AND a.start_date<=b.end_date GROUP BY a.start_date HAVING COUNT(b.start_date)=0;#干掉终点时间在某记录起止时间之间的SELECT a.end_date FROM timesheets a LEFT OUTER JOIN timesheets b ON a.end_date>=b.start_date AND a.end_date<b.end_date GROUP BY a.end_date HAVING COUNT(b.start_date)=0;SELECT x.start_date,MIN(y.end_date) FROM (SELECT a.start_date FROM timesheets a LEFT OUTER JOIN timesheets b ON a.start_date>b.start_date AND a.start_date<=b.end_date GROUP BY a.start_date HAVING COUNT(b.start_date)=0) x INNER JOIN (SELECT a.end_date FROM timesheets a LEFT OUTER JOIN timesheets b ON a.end_date>=b.start_date AND a.end_date<b.end_date GROUP BY a.end_date HAVING COUNT(b.start_date)=0) y ON x.start_date<=y.end_date GROUP BY x.start_date;结果如图: