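c# - Remove non-ASCII characters (using Microsoft.Office.Interop.Excel)

Tags: c# .net excel

I am trying to remove all non-ASCII characters from an Excel/CSV file. After reading and searching online, I found a post that gave me the code xlWorksheet.UsedRange.Replace("[^\\u0000-\\u007F]", string.Empty) to remove the characters, but every time I run it the characters are still in the file.

A dialog box also appears, saying: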

We couldn't find anything to replace. Click Options for more ways to search.

FYI: It's possible the data you're trying to replace is in a protected sheet. Excel can't replace data in protected sheets.

I am not sure how to proceed. I have been looking and reading online, but so far I have not found anything useful.

Thanks for your help.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading.Tasks;
using Excel = Microsoft.Office.Interop.Excel;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Excel.Application xlApp = new Excel.Application();
            Excel.Workbook xlWorkbook = xlApp.Workbooks.Open(@"C:\Users\username\Desktop\Error Records.csv");
            Excel.Worksheet xlWorksheet = xlWorkbook.Sheets[1];
            Excel.Range xlRange = xlWorksheet.UsedRange;

            int lastUsedRow = xlWorksheet.Cells.Find("*", System.Reflection.Missing.Value,
                System.Reflection.Missing.Value, System.Reflection.Missing.Value,
                Excel.XlSearchOrder.xlByRows, Excel.XlSearchDirection.xlPrevious,
                false, System.Reflection.Missing.Value, System.Reflection.Missing.Value).Row;

            int lastUsedColumn = xlWorksheet.Cells.Find("*", System.Reflection.Missing.Value,
                System.Reflection.Missing.Value, System.Reflection.Missing.Value,
                Excel.XlSearchOrder.xlByColumns, Excel.XlSearchDirection.xlPrevious,
                false, System.Reflection.Missing.Value, System.Reflection.Missing.Value).Column;

//            int lastColumnCount = lastUsedColumn;
//;
//            for (int i = 1; i <= lastUsedColumn; i++)
//            {
//                for (int j = 1; j <= lastUsedRow; j++)
//                {
//                    xlWorksheet.Cells[j, (lastColumnCount+1)] = "Testing data 134";
//                }
//            }

            xlWorksheet.Cells[1, (lastUsedColumn + 1)] = "Title";
            xlWorksheet.UsedRange.Replace("[^\\u0000-\\u007F]", string.Empty);

            xlWorkbook.Save();
            //cleanup
            GC.Collect();
            GC.WaitForPendingFinalizers();

            //rule of thumb for releasing com objects:
            //  never use two dots, all COM objects must be referenced and released individually
            //  ex: [something].[something].[something] is bad

            //release com objects to fully kill excel process from running in the background
            Marshal.ReleaseComObject(xlRange);
            Marshal.ReleaseComObject(xlWorksheet);

            //close and release
            xlWorkbook.SaveAs("C:\\Users\\username\\Desktop\\Errors_four.csv".Trim(), Excel.XlFileFormat.xlCSV);
            xlWorkbook.Close();
            Marshal.ReleaseComObject(xlWorkbook);

            //quit and release
            xlApp.Quit();
            Marshal.ReleaseComObject(xlApp);

        }
    }
}

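Best answer

For each cell in the range, you can use the following function to replace the cell's current string value with a cleaned ASCII version. I am not aware of any ASCII conversion function built into the Excel interop library. Out of curiosity, can you provide any examples of what you are trying to convert?

Also keep in mind that an Excel worksheet has functions (formulas), and then it has values. It is not clear from your question which of the two you are trying to work with. You mention CSV, which makes me think these are purely VALUES operations.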

public string ReturnCleanASCII(string s)
{
    StringBuilder sb = new StringBuilder(s.Length);
    foreach(char c in s.ToCharArray())
    {
       if((int)c > 127) // you probably don't want 127 either
          continue;
       if((int)c < 32)  // I bet you don't want control characters 
          continue;
       if(c == ',')
          continue;
       if(c == '"')
          continue;
       sb.Append(c);
    }
    return sb.ToString();
}
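
As far as I know, Range.Replace treats its first argument as literal text (with Excel's own wildcards), not as a regular expression, which would be consistent with the "We couldn't find anything to replace" dialog. The pattern from the question does work with .NET's own Regex class, though. Below is a minimal sketch of that alternative; the names AsciiCleaner and StripNonAscii are made up for illustration. Unlike ReturnCleanASCII above, it keeps control characters, commas, and double quotes, and removes only characters above U+007F.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class AsciiCleaner
{
    // Matches any UTF-16 code unit outside the 7-bit ASCII range.
    // The verbatim string passes \u0000 and \u007F to the regex engine,
    // which understands \uXXXX escapes itself.
    private static readonly Regex NonAscii = new Regex(@"[^\u0000-\u007F]");

    // Returns s with every non-ASCII character removed.
    public static string StripNonAscii(string s)
    {
        return NonAscii.Replace(s, string.Empty);
    }
}
```

Which variant fits depends on whether commas and quotes should survive; for data that will be written back out as CSV, stripping them changes the cell contents, so that choice is worth making deliberately.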

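Here is an example usage of ReturnCleanASCII. Keep in mind that you will need to work out for yourself how to index the cells; this example only handles cell 1,1. Two more useful tips: cells are 1-indexed, and calling Value2 instead of Value may be faster.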

// get the value from a cell
string dirty = excelSheet.Cells[1, 1].Value.ToString(); // Value2 may be faster!

// convert to clean ascii
string clean = ReturnCleanASCII(dirty);

// set the cell value
excelSheet.Cells[1, 1].Value = clean;
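
To go beyond cell 1,1, one option is to read the whole used range into an array, clean it in memory, and write it back in a single assignment. The sketch below is an assumption-laden illustration, not the answerer's code: RangeCleaner, StripNonAscii, and CleanInPlace are hypothetical names, and the Interop calls appear only in comments so the block compiles without Excel installed. It relies on Range.Value2 returning a 1-based object[,] for a multi-cell range (a single-cell range returns a scalar instead), and it leaves numbers and empty cells untouched.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of the original answer.
public static class RangeCleaner
{
    // Keeps only characters in the 7-bit ASCII range (0 to 127).
    public static string StripNonAscii(string s)
    {
        StringBuilder sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (c <= 127)
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Cleans every string element of a two-dimensional array in place.
    // The loops use the array's own bounds, so both 0-based arrays and
    // the 1-based arrays that Excel returns are handled.
    public static void CleanInPlace(object[,] values)
    {
        for (int r = values.GetLowerBound(0); r <= values.GetUpperBound(0); r++)
        {
            for (int c = values.GetLowerBound(1); c <= values.GetUpperBound(1); c++)
            {
                string text = values[r, c] as string;
                if (text != null)
                    values[r, c] = StripNonAscii(text);
            }
        }
    }
}

// Possible use with Interop (assumes the used range spans more than one cell):
//   Excel.Range used = xlWorksheet.UsedRange;
//   object[,] values = (object[,])used.Value2;
//   RangeCleaner.CleanInPlace(values);
//   used.Value2 = values;
```

Writing the array back replaces any formulas in the range with their values, which should not matter for a file opened from CSV but would for a regular workbook.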

Regarding c# - Remove non-ASCII characters (using Microsoft.Office.Interop.Excel), a similar question was found on Stack Overflow: https://stackoverflow.com/questions/44828935/
