MySQL: Remove Duplicate Entries

Remove duplicate entries is fairly hardcore, if the identify and delete approach is used – an approach described in detail here. An other approach to the problem of duplicates is to simply add a UNIQUE index to the current table and thereby let MySQL do the hardcore identification and deletion job.

Short example of the index approach.

Assume the following table and data

CREATE TABLE myDATA (
id int(10) NOT NULL auto_increment,
a int,
b int,
c int,
updated timestamp(14),
PRIMARY KEY (id)
);

insert into myData (a,b,c) values (1,2,3);
insert into myData (a,b,c) values (1,2,3);
insert into myData (a,b,c) values (1,5,4);
insert into myData (a,b,c) values (1,6,4);


mysql> select * from dupTest;
select * from dupTest;
+------+------+------+------+---------------------+
| pkey | a | b | c | timeEnter |
+------+------+------+------+---------------------+
| 1 | 1 | 2 | 3 | 2004-04-16 10:55:35 |
| 2 | 1 | 2 | 3 | 2004-04-16 10:55:35 |
| 3 | 1 | 5 | 4 | 2004-04-16 10:55:35 |
| 4 | 1 | 6 | 4 | 2004-04-16 10:55:35 |
+------+------+------+------+---------------------+
4 rows in set (0.00 sec)

mysql>

Note, the first two rows contains duplicates in columns a and b. It contains
other duplicates; but, leaves the other duplicates alone.

mysql> ALTER IGNORE TABLE dupTest ADD UNIQUE INDEX(a,b);

mysql> select * from dupTest;
select * from dupTest;
+——+——+——+——+———————+
| pkey | a | b | c | timeEnter |
+——+——+——+——+———————+
| 1 | 1 | 2 | 3 | 2004-04-16 11:11:42 |
| 3 | 1 | 5 | 4 | 2004-04-16 11:11:42 |
| 4 | 1 | 6 | 4 | 2004-04-16 11:11:42 |
+——+——+——+——+———————+
3 rows in set (0.00 sec)

Related posts:

  1. Mysql Data Import
  2. MDB Tools to export (migrate) from mdb (Microsoft Access format) to MySQL
  3. Sun acquires MySQL
  4. MySQL Optimization and Performance Tips
  5. Optimize MySQL Performance With MySQLTuner
  6. Optimize MySQL for Low Memory Use
  7. MySQL Concatenate: Adding String At The End Of Field Data
  8. Howto: How to Reset the MySQL Root Password
  9. Set MySQL Auto Increment Manually
  10. Compressing mysqldump output – mysql, gzip, bzip2 and LZMA (7z)

Popular Related Items »

10 Comments »

  1. Tony said,

    March 20, 2008 @ 13:37

    I just wanted to say thank you… I looked everywhere trying to find a way to move duplicates in sql without having to write alot of code or create new database backups etc..

    ALTER IGNORE TABLE dupTest ADD UNIQUE INDEX(a,b);

    lovely jublee
    thank you

  2. Tomy said,

    June 15, 2008 @ 17:21

    Great! Cheers! :)

  3. myrddin said,

    December 30, 2008 @ 0:53

    Nice, very nice ! Thank you, very fast !!

  4. Jack M said,

    February 1, 2009 @ 14:00

    if index cannot ne redefined:

    1)
    {
    $x = 1;
    while($x <= $i)// we loop inside the loop – this can mean quite a lot of operations so think carefully about the limit above
    {
    if($entry[$i] == $entry[$x])// checking string against previous entries
    {
    $delete_duplicate = mysql_query(“DELETE FROM `table_name` WHERE `id`=’”.$id['id'].”‘”); //delete duplicate
    /*
    or if wish to check first then add another field in the table (e.g. `check`) and change query into
    something like this:
    $mark_duplicate = mysql_query(“UPDATE `table_name` SET `check`=’1′”);
    then if happy run something lie this:
    $delete = mysql_query(“DELETE FROM `table_name` WHERE `check`=’1′”)
    */
    echo ‘Entry #’.$id[$i].’ was the same as entry ‘.$id[$x].’ and has been deleted.’;
    }
    $x++;
    }
    }
    $i++;
    }
    ?>

  5. zasssssssss said,

    February 20, 2009 @ 7:45

    xxxxsZZZZZZZZZZZZZZZZZZZZ

  6. delilah said,

    March 6, 2009 @ 4:12

    Thanks, good stuff!

  7. kaamkaaj said,

    April 15, 2009 @ 8:31

    Is there any short way of doing this, it is too complex for a beginner like me.

  8. MySQL: Zeilen mit doppelten Werten löschen | What I know is what you get said,

    June 28, 2009 @ 13:42

    [...] Erkenntnis ist allerdings nicht auf meinen Mist gewachsen, sondern stammt von http://mediakey.dk/~cc/mysql-remove-duplicate-entries/. Subscribe to comments Comment | Trackback | Post Tags: duplicate rows, [...]

  9. Marcin said,

    October 6, 2009 @ 11:49

    You solution can remove good entries and keep old ones.

    For example there can be two duplicates with id 5, and 200 and obviously 200 is most up-to-date, but when you execute your SQL 200 will be removed and 5 will stay.

  10. HOW-TO: Delete Duplicate SQL - MySQL | ShamHardy.com said,

    February 15, 2010 @ 18:11

    [...] http://mediakey.dk/~cc/mysql-remove-duplicate-entries/ mysql > ALTER IGNORE TABLE tablename ADD UNIQUE INDEX(value1, [...]

RSS feed for comments on this post · TrackBack URI

Leave a Comment