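So I have this code. This is the main function: a parallel for loop that iterates over all the data that needs to be posted and calls a function.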
ParallelOptions pOpt = new ParallelOptions();
pOpt.MaxDegreeOfParallelism = 30;
Parallel.For(0, maxsize, pOpt, (index, loopstate) => {
    // Calls the function where all the web requests are made
    CallRequests(data1, data2);
    if (isAborted)
        loopstate.Stop();
});
This function is called inside the parallel loop:
public static void CallRequests(string data1, string data2)
{
    var cookie = new CookieContainer();
    var postData = Parameters[23] + data1 +
                   Parameters[24] + data2;
    HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(Parameters[25]);
    getRequest.Accept = Parameters[26];
    getRequest.KeepAlive = true;
    getRequest.Referer = Parameters[27];
    getRequest.CookieContainer = cookie;
    getRequest.UserAgent = Parameters[28];
    getRequest.Method = WebRequestMethods.Http.Post;
    getRequest.AllowWriteStreamBuffering = true;
    getRequest.ProtocolVersion = HttpVersion.Version10;
    getRequest.AllowAutoRedirect = false;
    getRequest.ContentType = Parameters[29];
    getRequest.ReadWriteTimeout = 5000;
    getRequest.Timeout = 5000;
    getRequest.Proxy = null;
    byte[] byteArray = Encoding.ASCII.GetBytes(postData);
    getRequest.ContentLength = byteArray.Length;
    Stream newStream = getRequest.GetRequestStream(); // open connection
    newStream.Write(byteArray, 0, byteArray.Length); // send the data
    newStream.Close();
    HttpWebResponse getResponse = (HttpWebResponse)getRequest.GetResponse();
    if (getResponse.Headers["Location"] == Parameters[30])
    {
        // These are simple GET requests to retrieve the source code using the same format as above.
        // I need to preserve the cookie
        GetRequets(data1, data2, Parameters[31], Parameters[13], cookie);
        GetRequets(data1, data2, Parameters[32], Parameters[15], cookie);
    }
}
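From what I have seen and been told, making these requests asynchronous is better than using a parallel loop. My approach is also heavy on the processor. I would like to know how to make these requests asynchronous while keeping the multithreaded aspect. I also need to preserve the cookie after the POST request completes.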
Best answer
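Converting the CallRequests method to async is really just a matter of swapping the synchronous method calls for their asynchronous counterparts with the await keyword, and changing the method signature to return Task.
Something like this: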
public static async Task CallRequestsAsync(string data1, string data2)
{
    var cookie = new CookieContainer();
    var postData = Parameters[23] + data1 +
                   Parameters[24] + data2;
    HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(Parameters[25]);
    getRequest.Accept = Parameters[26];
    getRequest.KeepAlive = true;
    getRequest.Referer = Parameters[27];
    getRequest.CookieContainer = cookie;
    getRequest.UserAgent = Parameters[28];
    getRequest.Method = WebRequestMethods.Http.Post;
    getRequest.AllowWriteStreamBuffering = true;
    getRequest.ProtocolVersion = HttpVersion.Version10;
    getRequest.AllowAutoRedirect = false;
    getRequest.ContentType = Parameters[29];
    getRequest.ReadWriteTimeout = 5000;
    getRequest.Timeout = 5000;
    getRequest.Proxy = null;
    byte[] byteArray = Encoding.ASCII.GetBytes(postData);
    getRequest.ContentLength = byteArray.Length;
    Stream newStream = await getRequest.GetRequestStreamAsync(); // open connection
    await newStream.WriteAsync(byteArray, 0, byteArray.Length); // send the data
    newStream.Close();
    // Await the response as well; the synchronous GetResponse() would block the thread.
    // The using block disposes the response so its connection is released.
    using (HttpWebResponse getResponse = (HttpWebResponse)await getRequest.GetResponseAsync())
    {
        if (getResponse.Headers["Location"] == Parameters[30])
        {
            // These are simple GET requests to retrieve the source code using the same format as above.
            // I need to preserve the cookie
            // GetRequets is still synchronous here; ideally it is converted to async in the same way.
            GetRequets(data1, data2, Parameters[31], Parameters[13], cookie);
            GetRequets(data1, data2, Parameters[32], Parameters[15], cookie);
        }
    }
}
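However, this on its own doesn't really get you anywhere, because you still need to wait on the returned tasks in your main method. A very simple (if somewhat blunt) approach is to call Task.WaitAll() (or await Task.WhenAll() if the calling method is itself going to become async). Something like this: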
var tasks = Enumerable.Range(0, maxsize).Select(index => CallRequestsAsync(data1, data2));
Task.WaitAll(tasks.ToArray());
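However, this is pretty blunt and gives you no control over how many iterations run in parallel, among other things. I prefer to use the TPL Dataflow library for this kind of work. The library provides a way to chain asynchronous (or synchronous) operations together in parallel and pass items from one "processing block" to the next. It has a myriad of options for tuning the degree of parallelism, buffer sizes, and so on.
A detailed exposition is beyond the scope of this answer, so I encourage you to read up on it, but one possible approach is to simply push the work to an action block, like so: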
var actionBlock = new ActionBlock<int>(async index =>
{
    await CallRequestsAsync(data1, data2);
}, new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = 30,
    BoundedCapacity = 100,
});
for (int i = 0; i < maxsize; i++) // same range as Parallel.For(0, maxsize, ...)
{
    // Post() returns false and drops the item once a bounded block is full,
    // so wait for space with SendAsync (await it if the calling method is also async).
    actionBlock.SendAsync(i).Wait();
}
actionBlock.Complete();
actionBlock.Completion.Wait(); // or await actionBlock.Completion if calling method is also async
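If pulling in the Dataflow package is not an option, a similar cap on concurrency can be sketched with SemaphoreSlim from the base class library. This is a swapped-in technique, not part of the approach above. ThrottleSketch, RunThrottledAsync and the work delegate are hypothetical names; the delegate stands in for a call such as CallRequestsAsync(data1, data2). The peak counter exists only so the sketch can show that the limit is respected.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleSketch
{
    // Runs work(i) for i in [0, count) with at most maxConcurrency calls in flight.
    // Returns the highest number of calls observed running at the same time.
    public static async Task<int> RunThrottledAsync(int count, int maxConcurrency, Func<int, Task> work)
    {
        var gate = new SemaphoreSlim(maxConcurrency);
        var sync = new object();
        int running = 0, peak = 0;

        var tasks = Enumerable.Range(0, count).Select(async i =>
        {
            await gate.WaitAsync(); // waits without blocking a thread until a slot is free
            try
            {
                lock (sync) { running++; if (running > peak) peak = running; }
                try
                {
                    await work(i); // hypothetical stand-in for CallRequestsAsync(data1, data2)
                }
                finally
                {
                    lock (sync) { running--; }
                }
            }
            finally
            {
                gate.Release(); // always free the slot, even if work(i) throws
            }
        }).ToArray(); // ToArray starts every task; the semaphore does the throttling

        await Task.WhenAll(tasks);
        return peak;
    }
}
```

A call such as `await ThrottleSketch.RunThrottledAsync(maxsize, 30, i => CallRequestsAsync(data1, data2))` would mirror the MaxDegreeOfParallelism = 30 setting used above, though unlike a bounded ActionBlock it creates all the tasks up front.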
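There are a few additional points, beyond the scope of my answer, that I should mention in passing:
- It looks like your CallRequests method is updating some external variables with its results. Where possible, it's best to avoid this pattern and have the method return its results for collation later (the TPL Dataflow library handles this through TransformBlock<>). If updating external state is unavoidable, make sure you have thought through the multithreading implications (deadlocks, race conditions, etc.), which are beyond the scope of my answer.
- I assume index serves some useful purpose that was lost when you reduced the question to a minimal description? Does it index into a list of parameters or something similar? If so, you can always iterate over those directly and change ActionBlock<int> to ActionBlock<{--whatever the type of your parameter is--}>
- Make sure you understand the difference between multithreaded/parallel execution and asynchronous execution. There are certainly some similarities and overlap, but simply making something asynchronous does not make it multithreaded, nor is the converse true.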
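To illustrate the first point, here is a minimal sketch of returning results through a TransformBlock<> instead of writing to shared variables. The names TransformBlockSketch, FetchAsync and RunAsync are hypothetical, and FetchAsync only simulates a request with Task.Delay; in the real code it would be a version of CallRequestsAsync that returns whatever it currently stores externally.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class TransformBlockSketch
{
    // Hypothetical stand-in for a request method that returns its result
    // rather than updating external state.
    private static async Task<string> FetchAsync(string input)
    {
        await Task.Delay(10); // simulated network latency (assumption)
        return input.ToUpperInvariant();
    }

    public static async Task<List<string>> RunAsync(IEnumerable<string> inputs)
    {
        // Up to 30 requests run concurrently; output order follows input order by default.
        var transform = new TransformBlock<string, string>(
            input => FetchAsync(input),
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 30 });

        // The collecting block uses the default parallelism of 1,
        // so the plain List<string> is only touched by one item at a time.
        var results = new List<string>();
        var collect = new ActionBlock<string>(r => results.Add(r));

        transform.LinkTo(collect, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var input in inputs)
            await transform.SendAsync(input);

        transform.Complete();
        await collect.Completion; // completes once every result has been collected
        return results;
    }
}
```

Because all collation happens in the single-threaded collecting block, the request code itself needs no locks.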
This question, about speeding up parallel web requests in C# with async, comes from Stack Overflow: https://stackoverflow.com/questions/45051847/