Problem description
I want to export more than one million data rows to Excel (.csv).
To avoid an OutOfMemory exception, I export in batches and clean up memory after each batch.
But GC.Collect() may waste a lot of time...
Can you give me some advice?
Here is my code:
public string ExportByTasks()
{
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();
    string webrootPath = _hostingEnvironment.ContentRootPath;
    var folderPath = webrootPath + "\\Files\\" + string.Format("{0:yyyy_MM}", DateTime.Now) + "\\";
    if (!Directory.Exists(folderPath))
    {
        Directory.CreateDirectory(folderPath);
    }
    string newFileName = Guid.NewGuid().ToString() + ".csv";
    var filePath = folderPath + newFileName;
    var stream = new FileStream(filePath, FileMode.Create);
    var txt = new StreamWriter(stream, Encoding.UTF8);
    var thread = 8;
    var times = 10; // 10 batches use roughly 400-600 MB
    // int count = _context.user.Count() / (thread * times) + 1;
    int count = 1000000 / (thread * times) + 1;
    List<User> resultList = new List<User>(count * thread);
    var j = 0;
    for (int t = 0; t < times; t++)
    {
        var tasks = new Task<List<User>>[thread];
        for (int i = 0; i < thread; i++)
        {
            tasks[i] = Task.Run(async () =>
            {
                Stopwatch sw = new Stopwatch();
                sw.Restart();
                int x = Interlocked.Increment(ref j);
                int i1 = (x - 1) * count;
                int i2 = x * count;
                using (var db = new BlogContext(options))
                {
                    var result = await db.user
                        .Where(b => b.UserId > i1 && b.UserId <= i2)
                        .ToListAsync();
                    return result;
                }
            });
        }
        Task.WaitAll(tasks);
        for (int i = 0; i < thread; i++)
        {
            foreach (var item in tasks[i].Result)
            {
                var a = GetPropertyReflector(item.GetType());
                foreach (var child in a)
                {
                    txt.Write($"{child.GetValue(item)},");
                }
                txt.WriteLine();
            }
            tasks[i].Result.Clear();
            GC.Collect();
        }
    }
    txt.Dispose();
    return filePath;
}