ORACLE LIVE EXAMPLES: Deleting Duplicate Records from a Table

Friday, December 9, 2011

Deleting Duplicate Records from a Table

Many Time DBA faces a situation , in which they have to remove duplicate records from a table. This is somewhat difficult but we have some methods that will make it somewhat easier to resolve.

Method 1:

Delete all rowids that is BIGGER than the SMALLEST rowid value (for a given key):

SQL> DELETE FROM table_name A 
  2  WHERE ROWID > ( SELECT min(rowid) 
  3                  FROM table_name B
  4                  WHERE A.key_values = B.key_values );

Method 2:

This method is usually faster. However, remember to recreate all indexes, constraints, triggers, etc. on the table when done.

SQL> create table table_name2 as select distinct * from table_name1;
SQL> drop table table_name1;
SQL> rename table_name2 to table_name1;

Method 3:

SQL> DELETE FROM my_table t1
  2  WHERE EXISTS ( SELECT 'x' FROM my_table t2
  3                 WHERE t2.key_value1 = t1.key_value1
  4                   AND t2.key_value2 = t1.key_value2
  4                   AND t2.rowid      > t1.rowid );

Note: One can eliminate unnecessary operations by creating an index on the joined fields in the inner loop (no need to loop through the entire table on each pass by a record). This will speed-up the deletion process.

Note 2: If you are comparing NULL columns, use the NVL function. Remember that NULL is not equal to NULL. This should not be a problem as all key columns should be NOT NULL by definition.

Method 4:

This method collects the first row (order by rowid) for each key values and delete the rows that are not in this set:

SQL> DELETE FROM my_table t1
  1  WHERE rowid NOT IN ( SELECT min(rowid) 
  2                       FROM my_table t2
  3                       GROUP BY key_value1, key_value2 );

Note: IF key_value1 is null or key_value2 is null, this still works correctly.

Method 5:

This method is the easiest way to remove duplicate records from a table. But this is not feasible method for large table


1) Put duplicates in a temp table.

   

   Create Table table_dups as Select * From evil_table

   Group  By col1, col2, col3.... Having count(*) > 1;

2) Remove duplicates from table.




   Delete From evil_table where  (col1, col2, col3...) in (Select col1, col2,

   col3  From   table_dups);




3) Put duplicate rows back.




   Insert into evil_table (col1, col2, col3...) Select col1, col2, col3...

   From   table_dups;

ORACLE LIVE EXAMPLES

Friday, December 9, 2011

Deleting Duplicate Records from a Table

No comments:

Post a Comment

Our Site Map

About Me

Followers