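I am using string.Split() in my C# code to read a tab-separated file, and I am getting an OutOfMemoryException at the line marked in the code sample below.
Why does this problem occur when the file is only 16 MB in size?
Is this the right approach?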
using (StreamReader reader = new StreamReader(_path))
{
    // Load the first line of the file (the header).
    string headerLine = reader.ReadLine();
    MeterDataIPValueList objMeterDataList = new MeterDataIPValueList();
    string[] seperator = new string[1]; // used to separate the lines of the file
    seperator[0] = "\r\n";
    // Load the records of the file into a string array and remove all empty lines.
    string[] line = reader.ReadToEnd().Split(seperator, StringSplitOptions.RemoveEmptyEntries);
    int noOfLines = line.Count();
    if (noOfLines == 0)
    {
        mFileValidationErrors.Append(ConstMsgStrings.headerOnly + Environment.NewLine);
    }
    // The file contains records in addition to the header line.
    else
    {
        string[] headers = headerLine.Split('\t');
        int noOfColumns = headers.Count();
        // Create the table structure.
        objValidateRecordsTable.Columns.Add("SerialNo");
        objValidateRecordsTable.Columns.Add("SurveyDate");
        objValidateRecordsTable.Columns.Add("Interval");
        objValidateRecordsTable.Columns.Add("Status");
        objValidateRecordsTable.Columns.Add("Consumption");
        // Fill objValidateRecordsTable from the string array contents.
        int recordNumber; // used for logging
        #region Fill objValidateRecordsTable
        seperator[0] = "\t";
        for (int lineNo = 0; lineNo < noOfLines; lineNo++)
        {
            recordNumber = lineNo + 1;
            string[] recordFields = line[lineNo].Split(seperator, StringSplitOptions.RemoveEmptyEntries); // the exception is thrown here, when splitting the columns
            if (recordFields.Count() == noOfColumns)
            {
                // Do processing
            }
Best answer
Split is poorly implemented and has serious performance problems when applied to large strings. Refer to this article for details on the memory requirements of the Split function:
What happens when you do a split on a string containing 1355049 comma separated strings of 16 characters each, having total character length of 25745930 ?
An Array of pointers to string object: Contiguous virtual address space of 4 (address pointer)*1355049 = 5420196 (arrays size) + 16 (for book keeping) = 5420212.
Non-contiguous virtual address space for 1355049 strings, each of 54 bytes. It does not mean all those 1.3 million strings would be scattered all across the heap, but they will not be allocated on LOH. GC will allocate them on bunches on Gen0 heap.
The Split function will create an internal array of System.Int32[] of size 25745930, consuming (102983736 bytes) ~98MB of LOH, which is very expensive.
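One way to avoid the large allocations described above is to not call ReadToEnd().Split(...) on the whole file at all, and instead read and split one line at a time. The following is only a minimal sketch under that assumption, not the asker's or the article's code: the names TabFileReader and ReadRecords are hypothetical, and it keeps the question's RemoveEmptyEntries behaviour for both lines and fields.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of the original post): streams a
// tab-separated file record by record instead of loading it whole.
public static class TabFileReader
{
    private static readonly char[] Tab = { '\t' };

    // Lazily yields the fields of each non-empty line. Only one line and
    // its fields are alive at a time, so Split never has to build an
    // index array sized to the entire file contents.
    public static IEnumerable<string[]> ReadRecords(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length == 0)
            {
                continue; // skip empty lines, as RemoveEmptyEntries did for lines
            }

            // Same option as in the question: empty fields are dropped.
            yield return line.Split(Tab, StringSplitOptions.RemoveEmptyEntries);
        }
    }
}
```

In the question's code, this would be used by reading the header with reader.ReadLine() as before and then iterating `foreach (string[] recordFields in TabFileReader.ReadRecords(reader))`, comparing recordFields.Length with the header's column count. Because StreamReader derives from TextReader, the existing reader can be passed in unchanged.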
Regarding c# - string.split() "Out of memory exception" when reading a tab-separated file, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/1404435/