c# - A method to compare and delete all similar rows in a DataTable

I am trying to write a method that, given a table, deletes its similar rows. My idea is a double foreach loop that compares every row in the table with all the other rows.

private void comparaeapagarowsiguais(DataTable table1)
    {
        foreach (DataRow row1 in table1.Rows)
        {
            foreach (DataRow row2 in table1.Rows)
            {
                var array1 = row1.ItemArray;
                var array2 = row2.ItemArray;

                if (array1.SequenceEqual(array2))
                {
                    table1.Rows.Remove(row2);
                }

            }
        }


    }

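The problem is that at some point every row gets compared with itself, so it tries to delete itself (and I would end up with no rows at all). But I want to keep at least one of each distinct row.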

How can I loop over the rows while avoiding comparing any given row with itself?

Edit: partial solution

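I came up with a solution that fits my case. It works if the similar rows are all next to each other. (It does not work if the similar rows are scattered across the table.)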

 private void comparalinhaseapaga(DataTable table1)
    {   //I will explain the try in the end
        try
        {
            //you will run i for as long as you like
            for (int i = 0; ; )
            {
                //you create 2 arrays from i, and i+1, this means you will compare
                //the first 2 lines of the data table
                var array1 = table1.Rows[i].ItemArray;
                var array2 = table1.Rows[i + 1].ItemArray;

                 //if they are similar, it removes row at 1, and will go back to the cycle and 
                 //proceed to compare row at 0 with the previously row at 2
                if (array1.SequenceEqual(array2))
                {
                    table1.Rows.RemoveAt(i + 1);
                }
                else
                {
                    //if they are not equal, it move next to row at 1, and compare it with row at 2
                    //once it gets here, the row 0 and row 1 are already different
                    //that's why it only works when the similar rows are adjacent to another
                    i++;
                }
            }
        }
        catch { }
    }

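The try/catch is there because at some point there is no row at position i + 1 left to compare against, and that raises an error. The try/catch swallows the error and lets execution continue, which is acceptable here because all adjacent similar rows have already been removed by then. Tested and working ;)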
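The same adjacent-row comparison can be written without relying on an exception to end the loop. This is only a sketch: the class and method names are made up for illustration, and like the version above it only removes duplicates that sit next to each other.

```csharp
using System.Data;
using System.Linq;

public static class AdjacentDuplicates
{
    // Hypothetical helper (name is an assumption, not from the original post).
    // Removes a row when it is equal to the row directly above it.
    public static void RemoveAdjacentDuplicates(DataTable table)
    {
        int i = 0;

        // Stop as soon as there is no row at i + 1 left to compare against,
        // so no out-of-range access ever happens.
        while (i < table.Rows.Count - 1)
        {
            var current = table.Rows[i].ItemArray;
            var next = table.Rows[i + 1].ItemArray;

            if (current.SequenceEqual(next))
            {
                // Drop the duplicate and compare row i with the new row i + 1.
                table.Rows.RemoveAt(i + 1);
            }
            else
            {
                // Rows differ, so move on to the next pair.
                i++;
            }
        }
    }
}
```

Checking the row count in the loop condition means a genuine error inside the loop is no longer silently swallowed by an empty catch.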

Hope this is useful to someone.

Edit 2: I found a clean solution that needs only this code:

 private DataTable RemoveDuplicatesRecords(DataTable dt)
    {
        //Returns just unique rows
        var UniqueRows = dt.AsEnumerable().Distinct(DataRowComparer.Default);
        DataTable dt2 = UniqueRows.CopyToDataTable();
        return dt2;
    }

Best answer

I think this can be reduced to a group-by, which can be done with LINQ. Then you just select the first item of each group.

This turns your DataTable into an enumerable, which then lets you group by the value that should not repeat:

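public IEnumerable<DataRow> test(DataTable myTable)
    {
        // Group rows by the value in the second column (index 1),
        // then keep only the first row of each group.
        var results = myTable.AsEnumerable()
            .GroupBy(datarow => datarow.ItemArray[1])
            .Select(group => group.First());

        return results;
    }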

Or, if rows have to match on all fields:

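public DataTable test(DataTable myTable)
    {
        // DataRowComparer.Default compares rows by value; a bare Distinct()
        // would compare DataRow references and remove nothing.
        var results = myTable.AsEnumerable()
            .Distinct(DataRowComparer.Default)
            .CopyToDataTable();

        return results;
    }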
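If the duplicates have to be removed from the existing table rather than copied into a new one, which is what the question originally asked for, one possible approach is sketched below. The class and method names are assumptions for illustration. It uses a HashSet with DataRowComparer.Default to detect rows already seen, and removes the duplicates only after the enumeration has finished, so the row collection is never modified while it is being iterated.

```csharp
using System.Collections.Generic;
using System.Data;

public static class InPlaceDedup
{
    // Hypothetical helper (name is an assumption, not from the original post).
    // Keeps the first occurrence of each distinct row and removes the rest.
    // Returns the number of rows removed.
    public static int RemoveDuplicateRows(DataTable table)
    {
        // Value-based comparison across all columns.
        var seen = new HashSet<DataRow>(DataRowComparer.Default);
        var duplicates = new List<DataRow>();

        // First pass: only collect duplicates, do not touch table.Rows yet.
        foreach (DataRow row in table.Rows)
        {
            if (!seen.Add(row))
            {
                duplicates.Add(row);
            }
        }

        // Second pass: remove the collected rows from the table.
        foreach (DataRow row in duplicates)
        {
            table.Rows.Remove(row);
        }

        return duplicates.Count;
    }
}
```

Unlike the adjacent-row loop, this also handles similar rows that are scattered across the table, and a row is never compared against itself.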

Regarding "c# - A method to compare and delete all similar rows in a DataTable", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/29268156/
