c# - How to call WebBrowser Navigate to go through a number of URLs?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:48:02

To collect information from a web page, I can use the WebBrowser.Navigated event.

First, navigate to the URL:

    WebBrowser wbCourseOverview = new WebBrowser();
    wbCourseOverview.ScriptErrorsSuppressed = true;
    wbCourseOverview.Navigate(url);
    wbCourseOverview.Navigated += wbCourseOverview_Navigated;

Then process the web page when Navigated is raised:

    void wbCourseOverview_Navigated(object sender, WebBrowserNavigatedEventArgs e)
    {
        // Find the control and invoke "Click" event...
    }

The difficult part comes when I try to go through an array of URL strings.

    foreach (var u in courseUrls)
    {
        WebBrowser wbCourseOverview = new WebBrowser();
        wbCourseOverview.ScriptErrorsSuppressed = true;
        wbCourseOverview.Navigate(u);

        wbCourseOverview.Navigated += wbCourseOverview_Navigated;
    }

Here, because the page takes time to load, wbCourseOverview_Navigated is never reached.

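I tried to use async/await in C# 5. Tasks and the Event-based Asynchronous Pattern (EAP) are described here. Another example can be found in The Task-based Asynchronous Pattern.

The problem is that WebClient has async methods such as DownloadDataAsync and DownloadStringAsync, but WebBrowser has no NavigateAsync.

Can anyone give me some advice? Thanks.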


There is a post on StackOverflow (here). But does anyone know how to implement the struct described in its answer?


Update again.

Another post here on StackOverflow suggests:

    public static Task WhenDocumentCompleted(this WebBrowser browser)
    {
        var tcs = new TaskCompletionSource<bool>();
        browser.DocumentCompleted += (s, args) => tcs.SetResult(true);
        return tcs.Task;
    }

So I have:

    foreach (var c in courseBriefs)
    {
        wbCourseOverview.Navigate(c.Url);
        await wbCourseOverview.WhenDocumentCompleted();
    }

It looks fine until my web browser visits the second URL. Then I get:

An attempt was made to transition a task to a final state when it had already completed.

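I know I must have made a mistake in the foreach loop, because the DocumentCompleted event has not yet been raised when the loop reaches its second round. What is the correct way to write this await inside a foreach loop?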
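One likely cause of the exception above can be reproduced without a real browser. In the sketch below, FakeBrowser and its Completed event are hypothetical stand-ins for WebBrowser and DocumentCompleted, and the fake raises its event synchronously, so the helper subscribes before navigating. The leaky helper mirrors WhenDocumentCompleted: its handler is never removed, so the handler from the first round runs again in the second round and calls SetResult on a task that has already completed. The one-shot helper removes its own handler and uses TrySetResult.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for WebBrowser: raises Completed synchronously on Navigate.
public sealed class FakeBrowser
{
    public event EventHandler Completed;

    public void Navigate(string url)
    {
        EventHandler handlers = Completed;
        if (handlers != null) handlers(this, EventArgs.Empty);
    }
}

public static class CompletionDemo
{
    // Mirrors the extension method from the question: the handler stays subscribed.
    public static Task LeakyWhenCompleted(FakeBrowser browser)
    {
        var tcs = new TaskCompletionSource<bool>();
        browser.Completed += (s, e) => tcs.SetResult(true);
        return tcs.Task;
    }

    // One-shot variant: the handler unsubscribes itself before completing the task.
    public static Task OneShotWhenCompleted(FakeBrowser browser)
    {
        var tcs = new TaskCompletionSource<bool>();
        EventHandler handler = null;
        handler = (s, e) =>
        {
            browser.Completed -= handler;
            tcs.TrySetResult(true); // tolerates a repeated event instead of throwing
        };
        browser.Completed += handler;
        return tcs.Task;
    }

    // Runs two rounds of subscribe + navigate and reports whether
    // an InvalidOperationException escaped from the event handlers.
    public static bool SecondNavigationThrows(Func<FakeBrowser, Task> subscribe)
    {
        var browser = new FakeBrowser();
        try
        {
            for (int i = 0; i < 2; i++)
            {
                Task pending = subscribe(browser);
                browser.Navigate("page" + i);
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

With the real control, DocumentCompleted can also be raised more than once for a single page (for example, once per frame), which is another reason to prefer the one-shot form.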

Best answer

There is a post in StackOverflow (here). But, does anyone know how to implement that struct in its answer?

OK, so you need some code with an awaiter. I've made two pieces of code. The first one uses the TPL's built-in awaiter:

    using System;
    using System.Runtime.CompilerServices;
    using System.Threading.Tasks;
    using System.Windows.Forms;

    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            ProcessUrlsAsync(new[] { "http://google.com", "http://microsoft.com", "http://yahoo.com" })
                .Start();
        }

        private Task ProcessUrlsAsync(string[] urls)
        {
            return new Task(() =>
            {
                foreach (string url in urls)
                {
                    TaskAwaiter<string> awaiter = ProcessUrlAsync(url);
                    // or the next line, in case we use method *
                    // TaskAwaiter<string> awaiter = ProcessUrlAsync(url).GetAwaiter();
                    string result = awaiter.GetResult(); // blocks until the document has completed

                    MessageBox.Show(result);
                }
            });
        }

        // Awaiter inside
        private TaskAwaiter<string> ProcessUrlAsync(string url)
        {
            TaskCompletionSource<string> taskCompletionSource = new TaskCompletionSource<string>();
            var handler = new WebBrowserDocumentCompletedEventHandler((s, e) =>
            {
                // TODO: put custom processing of document right here
                taskCompletionSource.SetResult(e.Url + ": " + webBrowser1.Document.Title);
            });
            webBrowser1.DocumentCompleted += handler;
            // Unsubscribe once the task has completed, so each URL gets a fresh handler.
            taskCompletionSource.Task.ContinueWith(s => { webBrowser1.DocumentCompleted -= handler; });

            webBrowser1.Navigate(url);
            return taskCompletionSource.Task.GetAwaiter();
        }

        // (*) Task<string> instead of Awaiter
        //private Task<string> ProcessUrlAsync(string url)
        //{
        //    TaskCompletionSource<string> taskCompletionSource = new TaskCompletionSource<string>();
        //    var handler = new WebBrowserDocumentCompletedEventHandler((s, e) =>
        //    {
        //        taskCompletionSource.SetResult(e.Url + ": " + webBrowser1.Document.Title);
        //    });
        //    webBrowser1.DocumentCompleted += handler;
        //    taskCompletionSource.Task.ContinueWith(s => { webBrowser1.DocumentCompleted -= handler; });

        //    webBrowser1.Navigate(url);
        //    return taskCompletionSource.Task;
        //}
    }
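For comparison, here is a minimal sketch of the same loop written with await instead of a blocking GetResult call. It is not part of the original answer. SequentialLoader, ProcessSequentiallyAsync and FakeLoadAsync are hypothetical names, and FakeLoadAsync merely simulates a page load with a short delay so the sketch can run without a browser.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialLoader
{
    // Awaits each load before starting the next one, so loads never overlap.
    public static async Task<List<string>> ProcessSequentiallyAsync(
        IEnumerable<string> urls, Func<string, Task<string>> loadAsync)
    {
        var results = new List<string>();
        foreach (string url in urls)
        {
            // The loop resumes here only after the task for this URL has completed.
            results.Add(await loadAsync(url));
        }
        return results;
    }

    // Hypothetical loader standing in for a Task<string>-returning page load.
    public static async Task<string> FakeLoadAsync(string url)
    {
        await Task.Delay(10);
        return url + ": loaded";
    }
}
```

In a form, a Task<string>-returning method such as the commented-out variant of ProcessUrlAsync could presumably be passed as loadAsync from an async click handler, which would keep the loop on the UI thread rather than blocking a worker thread.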

The next example contains a sample implementation of the awaiter struct that Eric Lippert talks about here.

    using System;
    using System.Runtime.CompilerServices;
    using System.Threading.Tasks;
    using System.Windows.Forms;

    public partial class Form1 : Form
    {
        public struct WebBrowserAwaiter
        {
            private readonly WebBrowser _webBrowser;
            private readonly string _url;

            private readonly TaskAwaiter<string> _innerAwaiter;

            public bool IsCompleted
            {
                get
                {
                    return _innerAwaiter.IsCompleted;
                }
            }

            public WebBrowserAwaiter(WebBrowser webBrowser, string url)
            {
                _url = url;
                _webBrowser = webBrowser;
                _innerAwaiter = ProcessUrlAwaitable(_webBrowser, url);
            }

            public string GetResult()
            {
                return _innerAwaiter.GetResult();
            }

            public void OnCompleted(Action continuation)
            {
                _innerAwaiter.OnCompleted(continuation);
            }

            // Static because a struct constructor cannot call instance methods
            // before all of its fields have been assigned.
            private static TaskAwaiter<string> ProcessUrlAwaitable(WebBrowser webBrowser, string url)
            {
                TaskCompletionSource<string> taskCompletionSource = new TaskCompletionSource<string>();
                var handler = new WebBrowserDocumentCompletedEventHandler((s, e) =>
                {
                    // TODO: put custom processing of document here
                    taskCompletionSource.SetResult(e.Url + ": " + webBrowser.Document.Title);
                });
                webBrowser.DocumentCompleted += handler;
                taskCompletionSource.Task.ContinueWith(s => { webBrowser.DocumentCompleted -= handler; });

                webBrowser.Navigate(url);
                return taskCompletionSource.Task.GetAwaiter();
            }
        }

        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            ProcessUrlsAsync(new[] { "http://google.com", "http://microsoft.com", "http://yahoo.com" })
                .Start();
        }

        private Task ProcessUrlsAsync(string[] urls)
        {
            return new Task(() =>
            {
                foreach (string url in urls)
                {
                    var awaiter = new WebBrowserAwaiter(webBrowser1, url);
                    string result = awaiter.GetResult();

                    MessageBox.Show(result);
                }
            });
        }
    }
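The struct above is consumed by calling GetResult directly. To use a custom awaiter with the await keyword, the C# compiler additionally expects the awaiter to implement INotifyCompletion and the awaited expression to expose a GetAwaiter method. The following is a minimal sketch of that shape, independent of WebBrowser. StringAwaiter, StringAwaitable and AwaiterDemo are hypothetical names, and the delayed SetResult only simulates an event that arrives later.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical awaiter that forwards to the built-in TaskAwaiter<string>.
public struct StringAwaiter : INotifyCompletion
{
    private readonly TaskAwaiter<string> _inner;

    public StringAwaiter(Task<string> task)
    {
        _inner = task.GetAwaiter();
    }

    public bool IsCompleted { get { return _inner.IsCompleted; } }

    public string GetResult() { return _inner.GetResult(); }

    public void OnCompleted(Action continuation) { _inner.OnCompleted(continuation); }
}

// Hypothetical awaitable: GetAwaiter is what makes "await" compile.
public struct StringAwaitable
{
    private readonly Task<string> _task;

    public StringAwaitable(Task<string> task) { _task = task; }

    public StringAwaiter GetAwaiter() { return new StringAwaiter(_task); }
}

public static class AwaiterDemo
{
    public static async Task<string> RunAsync()
    {
        var tcs = new TaskCompletionSource<string>();
        // Complete the task a little later, as an event handler would.
        _ = Task.Delay(10).ContinueWith(t => tcs.SetResult("done"));
        return await new StringAwaitable(tcs.Task);
    }
}
```

Since the wrapper only forwards to TaskAwaiter<string>, awaiting the underlying Task<string> directly would behave the same; the custom types mainly show which members the compiler looks for.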

Hope this helps.

Regarding "c# - How to call WebBrowser Navigate to go through a number of URLs?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/15932659/
