> "André Hänsel" <andrewebkr.de> wrote in message
>> Bill Karwin wrote:
>>> "André Hänsel" <andrewebkr.de> wrote in message
>>>> is there a way to find all duplicates in a table without a primary
>>> SELECT COUNT(col), col
>>> FROM myTable
>>> GROUP BY col
>>> HAVING COUNT(col) > 1
>> No, this will give me distinct values of col, not the actual rows
>> that are double.
> I see what you mean. How does one distinguish between two identical
> rows, if the table has no primary key?
> Here's a different method, using self-joins. It works nicely for
> testing for duplicates across multiple columns:
> CREATE TABLE foo (col INTEGER, col2 INTEGER, col3 INTEGER);
> INSERT INTO foo VALUES (1,1,1), (2,2,2), (2,2,2),
> (3,3,3), (3,3,3), (3,3,3), (4,4,4), (4,4,4), (4,4,4), (4,4,4);
> SELECT COUNT(*), a.*
> FROM foo AS a INNER JOIN foo AS b
> ON (a.col = b.col AND a.col2 = b.col2 AND a.col3 = b.col3)
> GROUP BY a.col, a.col2, a.col3
> HAVING COUNT(*) > 1;
> The result shows count of 4 for (2,2,2), 9 for (3,3,3), and 16 for
> (4,4,4). This is the square of the actual number of copies, because
> there's no way to prevent rows from being joined to themselves when
> there's no primary key.
> Now say you want to delete the duplicate rows and leave only one copy
> of each?
> One solution is to use DELETE with a LIMIT clause. For each row
> returned by the above SELECT, run this statement once, and provide
> the values as parameters.
> DELETE FROM myTable WHERE col = ? AND col2 = ? AND col3 = ? LIMIT 1
> So you would execute this once with parameter values 2,2,2, once with
> values 3,3,3, and once with values 4,4,4.
> Then run the SELECT again and see if the duplicates have been
> eliminated. In this case, the 2,2,2 result should vanish. The 3,3,3
> should find a count of 4, and the 4,4,4 should find a count of 9.
> You still have some duplicates, but they are fewer in number.
> If all the duplicates occured in sets of two, then the job should be
> complete after one pass. If the duplicates rows have three or more
> copies, then you may have to loop through the SELECT & DELETE test
> several times. Keep repeating this until the SELECT returns no rows.
> Does this help?