当前位置: 代码迷 >> 高性能数据库开发 >> 如何比较两个数据表
  详细解决方案

如何比较两个数据表

热度:968   发布时间:2014-03-12 23:10:45.0

有些时候,我们可能想要比较一下两个数据表,以找到其中不同的数据。比如,在进行数据移植的时候,或是在合并数据的时候,或是在比对验证数据的时候。当然比较两个表,需要这两个表结构是一样的。

我们先假设一下有如下表结构:

CREATETABLEjajal
(
    user_id integerNOTNULL,
    first_name charactervarying(255),
    last_name charactervarying(255),
    grade character(1),
    CONSTRAINTjajal_pkey PRIMARYKEY(user_id)
)


然后,我们有两张表——jajal和jajal_copy,其内容如下:

jajal

user_idfirst_namelast_namegrade
1SomeDudeA
2OtherGuyB
3You areWelcomeB
4WhatOtherA
5INeedYouC
6MixedNutsZ
7KirkLandB
8BitShooterA
9SunMicrosystemC
10ExtraFancyB

jajal_copy

user_idfirst_namelast_namegrade
1SomeDudeA
2OtherGuyB
3You areWelcomeB
4WhatOtherA
5INeedYouC
6MixedNutsC
7KirkLandB
8BitShooterA
9SunMicrosystemC
10ExtraFancyB

要比较这两张表的数据,找出不一样的数据行。我们可以使用outer join 技术。我给outer join做了一个链接,是Wikipedia的,如果你对这个技术不是很清楚,还请你行看看其技术细节。

下面是具体的SQL语句:

使用FULL OUTER JOIN

SELECT
     *
FROM
     jajal j
     FULLOUTERJOINjajal_copy jc ONjc.first_name = j.first_name
     ANDjc.last_name = j.last_name
     ANDjc.grade = j.grade
     ANDjc.user_id = j.user_id
WHERE
     j.user_id ISNULL
     ORjc.user_id ISNULL

运行结果如下:

user_idfirst_namelast_namegradeuser_idfirst_namelast_namegrade
[NULL][NULL][NULL][NULL]6MixedNutsC
6MixedNutsZ[NULL][NULL][NULL][NULL]

使用NATURAL FULL OUTER JOIN

关于natural join,你可以看看Wikipedia是怎么说的。

SELECT
       *
FROM
       jajal j
       NATURAL FULLOUTERJOINjajal_copy jc
WHERE
       j.user_id ISNULL
       ORjc.user_id ISNULL

运行结果如下:

user_idfirst_namelast_namegrade
6MixedNutsC
6MixedNutsZ

MySQL SQL 代码

MySQL 并不支持 FULL OUTER JOIN,但是我们可以使用LEFT JOIN 和 RIGHT JOIN 来实现这一功能。如下所示。
SELECT
*
FROM
jajal j
LEFTJOINjajal_copy jc ONjc.first_name = j.first_name
ANDjc.last_name = j.last_name
ANDjc.grade = j.grade
ANDjc.user_id = j.user_id
WHERE
jc.user_id ISNULL
UNIONALL
SELECT
*
FROM
jajal j
RIGHTJOINjajal_copy jc ONjc.first_name = j.first_name
ANDjc.last_name = j.last_name
ANDjc.grade = j.grade
ANDjc.user_id = j.user_id
WHERE
j.user_id ISNULL

或者你更喜欢NATURAL JOIN 版本

SELECT
*
FROM
jajal j
NATURAL LEFTJOINjajal_copy jc
WHERE
jc.user_id ISNULL
UNIONALL
SELECT
*
FROM
jajal j
NATURAL RIGHTJOINjajal_copy jc
WHERE
j.user_id ISNULL

当然,如果你需要一个MySQL的存储过程的话,下面是一个示例:

DELIMITER $$
 
CREATEPROCEDURE`db_schema`.`tablediff`
    (schema_name VARCHAR(64), table1 VARCHAR(64), table2 VARCHAR(64))
BEGIN
    DECLAREdone INTDEFAULT0;
    DECLAREsql_statement TEXT DEFAULT'';
    DECLAREsql_statement_where TEXT DEFAULT'';
    DECLAREsql_statement_pk TEXT DEFAULT'';
    DECLAREcol_name VARCHAR(64);
    DECLAREcol_name_cur CURSORFOR
        SELECT
            COLUMN_NAME
        FROM
            information_schema.COLUMNS
        WHERE
            TABLE_SCHEMA = schema_name
            ANDTABLE_NAME = table1
    ;
    DECLARECONTINUEHANDLER FORNOTFOUND SETdone = 1;
 
    OPENcol_name_cur;
    traverse_columns: LOOP
        FETCHcol_name_cur INTOcol_name;
 
        IF done THEN
            CLOSEcol_name_cur;
            LEAVE traverse_columns;
        ENDIF;
 
        SETsql_statement_where = CONCAT(sql_statement_where,
            ' AND a.', col_name, ' = b.', col_name);
        SETsql_statement_pk = CONCAT(sql_statement_pk,
            'AND b.', col_name, ' IS NULL');
    ENDLOOP;
 
    SELECT
        COLUMN_NAME INTOcol_name
    FROM
        information_schema.KEY_COLUMN_USAGE
    WHERE
        CONSTRAINT_SCHEMA = schema_name
        ANDCONSTRAINT_NAME = 'PRIMARY'
        ANDTABLE_NAME = table1
    LIMIT 1
    ;
    IF col_name ISNOTNULLTHEN
        SETsql_statement_pk = CONCAT('AND b.', col_name, ' IS NULL');
    ENDIF;
 
    SETsql_statement = CONCAT('SELECT * FROM ', schema_name, '.', table1, ' a LEFT JOIN ', schema_name, '.', table2, ' b ON TRUE');
    SETsql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk);
    SETsql_statement = CONCAT(sql_statement, ' UNION ALL SELECT * FROM ', schema_name, '.', table1, ' b RIGHT JOIN ', schema_name, '.', table2, ' a ON TRUE');
    SETsql_statement = CONCAT(sql_statement, sql_statement_where, ' WHERE TRUE ', sql_statement_pk);
 
    SET@s = sql_statement;
    PREPAREstmt1 FROM@s;
    EXECUTEstmt1;
    DEALLOCATEPREPAREstmt1;
 
END$$
DELIMITER ;

PostgreSQL 下的SQL语句

下面是PostgreSQL的一个存储过程:

CREATEORREPLACEFUNCTIONtablediff (
    INschema_name VARCHAR(64),
    INtable1 VARCHAR(64),
    INtable2 VARCHAR(64)
) RETURNSBIGINTAS
$BODY$
DECLARE
    the_result BIGINTDEFAULT0;
    sql_statement TEXT DEFAULT'';
    sql_statement_where TEXT DEFAULT'';
    sql_statement_pk TEXT DEFAULT'';
    col_name VARCHAR(64);
    col_name_cur CURSORFOR
        SELECT
            column_name
        FROM
            information_schema.columns
        WHERE
            table_catalog = schema_name
            ANDtable_schema = 'public'
            ANDtable_name = table1
    ;
BEGIN
    OPENcol_name_cur;
 
    LOOP
        FETCHcol_name_cur INTOcol_name;
        IF NOTFOUND THEN
            EXIT;
        ENDIF;
 
        sql_statement_where := sql_statement_where || ' AND a.'|| col_name || ' = b.'|| col_name;
    ENDLOOP;
 
    SELECT
        column_name INTOcol_name
    FROM
        information_schema.table_constraints tc
        JOINinformation_schema.constraint_column_usage ccu ON
            ccu.constraint_name = tc.constraint_name
    WHERE
        tc.table_catalog = schema_name
        ANDtc.table_schema = 'public'
        ANDtc.table_name = table1
    LIMIT 1
    ;
 
    IF col_name ISNOTNULLTHEN
        sql_statement_pk := ' a.'|| col_name || ' IS NULL';
        sql_statement_pk := sql_statement_pk || ' OR b.'|| col_name || ' IS NULL';
    ENDIF;
 
    sql_statement := 'SELECT COUNT(*) FROM '|| schema_name || '.public.'|| table1 || ' a FULL OUTER JOIN '|| schema_name || '.public.'|| table2 || ' b ON TRUE';
    sql_statement := sql_statement || sql_statement_where || ' WHERE '|| sql_statement_pk;
 
    EXECUTEsql_statement INTOthe_result;
 
    RETURNthe_result;
END;$BODY$
LANGUAGE 'plpgsql'STABLE;

  相关解决方案