有些时候,我们可能想要比较一下两个数据表,以找到其中不同的数据。比如,在进行数据移植的时候,或是在合并数据的时候,或是在比对验证数据的时候。当然比较两个表,需要这两个表结构是一样的。
我们先假设一下有如下表结构:
CREATETABLEjajal ( user_id integerNOTNULL, first_name charactervarying(255), last_name charactervarying(255), grade character(1), CONSTRAINTjajal_pkey PRIMARYKEY(user_id) )
然后,我们有两张表——jajal和jajal_copy,其内容如下:
jajal
user_id | first_name | last_name | grade |
---|---|---|---|
1 | Some | Dude | A |
2 | Other | Guy | B |
3 | You are | Welcome | B |
4 | What | Other | A |
5 | INeed | You | C |
6 | Mixed | Nuts | Z |
7 | Kirk | Land | B |
8 | Bit | Shooter | A |
9 | Sun | Microsystem | C |
10 | Extra | Fancy | B |
jajal_copy
user_id | first_name | last_name | grade |
---|---|---|---|
1 | Some | Dude | A |
2 | Other | Guy | B |
3 | You are | Welcome | B |
4 | What | Other | A |
5 | INeed | You | C |
6 | Mixed | Nuts | C |
7 | Kirk | Land | B |
8 | Bit | Shooter | A |
9 | Sun | Microsystem | C |
10 | Extra | Fancy | B |
要比较这两张表的数据,找出不一样的数据行。我们可以使用outer join 技术。我给outer join做了一个链接,是Wikipedia的,如果你对这个技术不是很清楚,还请你行看看其技术细节。
下面是具体的SQL语句:
使用FULL OUTER JOIN
SELECT * FROM jajal j FULLOUTERJOINjajal_copy jc ONjc.first_name = j.first_name ANDjc.last_name = j.last_name ANDjc.grade = j.grade ANDjc.user_id = j.user_id WHERE j.user_id ISNULL ORjc.user_id ISNULL
运行结果如下:
user_id | first_name | last_name | grade | user_id | first_name | last_name | grade |
---|---|---|---|---|---|---|---|
[NULL] | [NULL] | [NULL] | [NULL] | 6 | Mixed | Nuts | C |
6 | Mixed | Nuts | Z | [NULL] | [NULL] | [NULL] | [NULL] |
使用NATURAL FULL OUTER JOIN
关于natural join,你可以看看Wikipedia是怎么说的。
SELECT * FROM jajal j NATURAL FULLOUTERJOINjajal_copy jc WHERE j.user_id ISNULL ORjc.user_id ISNULL
运行结果如下:
user_id | first_name | last_name | grade |
---|---|---|---|
6 | Mixed | Nuts | C |
6 | Mixed | Nuts | Z |
MySQL SQL 代码
MySQL 并不支持 FULL OUTER JOIN,但是我们可以使用LEFT JOIN 和 RIGHT JOIN 来实现这一功能。如下所示。
SELECT * FROM jajal j LEFTJOINjajal_copy jc ONjc.first_name = j.first_name ANDjc.last_name = j.last_name ANDjc.grade = j.grade ANDjc.user_id = j.user_id WHERE jc.user_id ISNULL UNIONALL SELECT * FROM jajal j RIGHTJOINjajal_copy jc ONjc.first_name = j.first_name ANDjc.last_name = j.last_name ANDjc.grade = j.grade ANDjc.user_id = j.user_id WHERE j.user_id ISNULL
或者你更喜欢NATURAL JOIN 版本
SELECT * FROM jajal j NATURAL LEFTJOINjajal_copy jc WHERE jc.user_id ISNULL UNIONALL SELECT * FROM jajal j NATURAL RIGHTJOINjajal_copy jc WHERE j.user_id ISNULL
当然,如果你需要一个MySQL的存储过程的话,下面是一个示例:
DELIMITER $$ CREATEPROCEDURE`db_schema`.`tablediff` (schema_name VARCHAR(64), table1 VARCHAR(64), table2 VARCHAR(64)) BEGIN DECLAREdone INTDEFAULT0; DECLAREsql_statement TEXT DEFAULT''; DECLAREsql_statement_where TEXT DEFAULT''; DECLAREsql_statement_pk TEXT DEFAULT''; DECLAREcol_name VARCHAR(64); DECLAREcol_name_cur CURSORFOR SELECT COLUMN_NAME FROM information_schema.COLUMNS WHERE TABLE_SCHEMA = schema_name ANDTABLE_NAME = table1 ; DECLARECONTINUEHANDLER FORNOTFOUND SETdone = 1; OPENcol_name_cur; traverse_columns: LOOP FETCHcol_name_cur INTOcol_name; IF done THEN CLOSEcol_name_cur; LEAVE traverse_columns; ENDIF; SETsql_statement_where = CONCAT(sql_statement_where, ' AND a.', col_name, ' = b.', col_name); SETsql_statement_pk = CONCAT(sql_statement_pk, 'AND b.', col_name, ' IS NULL'); ENDLOOP; SELECT COLUMN_NAME INTOcol_name FROM information_schema.KEY_COLUMN_USAGE WHERE CONSTRAINT_SCHEMA = schema_name ANDCONSTRAINT_NAME = 'PRIMARY' ANDTABLE_NAME = table1 LIMIT 1 ; IF col_name ISNOTNULLTHEN SETsql_statement_pk = CONCAT('AND b.', col_name, ' IS NULL'); ENDIF; SETsql_statement = CONCAT('SELECT * FROM ', schema_name, '.', table1, ' a LEFT JOIN ', schema_name, '.', table2, ' b ON TRUE'); SETsql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk); SETsql_statement = CONCAT(sql_statement, ' UNION ALL SELECT * FROM ', schema_name, '.', table1, ' b RIGHT JOIN ', schema_name, '.', table2, ' a ON TRUE'); SETsql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk); SET@s = sql_statement; PREPAREstmt1 FROM@s; EXECUTEstmt1; DEALLOCATEPREPAREstmt1; END$$ DELIMITER ;
PostgreSQL 下的SQL语句
下面是PostgreSQL的一个存储过程:
CREATEORREPLACEFUNCTIONtablediff ( INschema_name VARCHAR(64), INtable1 VARCHAR(64), INtable2 VARCHAR(64) ) RETURNSBIGINTAS $BODY$ DECLARE the_result BIGINTDEFAULT0; sql_statement TEXT DEFAULT''; sql_statement_where TEXT DEFAULT''; sql_statement_pk TEXT DEFAULT''; col_name VARCHAR(64); col_name_cur CURSORFOR SELECT column_name FROM information_schema.columns WHERE table_catalog = schema_name ANDtable_schema = 'public' ANDtable_name = table1 ; BEGIN OPENcol_name_cur; LOOP FETCHcol_name_cur INTOcol_name; IF NOTFOUND THEN EXIT; ENDIF; sql_statement_where := sql_statement_where || ' AND a.'|| col_name || ' = b.'|| col_name; ENDLOOP; SELECT column_name INTOcol_name FROM information_schema.table_constraints tc JOINinformation_schema.constraint_column_usage ccu ON ccu.constraint_name = tc.constraint_name WHERE tc.table_catalog = schema_name ANDtc.table_schema = 'public' ANDtc.table_name = table1 LIMIT 1 ; IF col_name ISNOTNULLTHEN sql_statement_pk := ' a.'|| col_name || ' IS NULL'; sql_statement_pk := sql_statement_pk || ' OR b.'|| col_name || ' IS NULL'; ENDIF; sql_statement := 'SELECT COUNT(*) FROM '|| schema_name || '.public.'|| table1 || ' a FULL OUTER JOIN '|| schema_name || '.public.'|| table2 || ' b ON TRUE'; sql_statement := sql_statement || sql_statement_where || ' WHERE '|| sql_statement_pk; EXECUTEsql_statement INTOthe_result; RETURNthe_result; END;$BODY$ LANGUAGE 'plpgsql'STABLE;