
java - Web scraping into AppleScript variables

Reposted. Author: 行者123. Updated: 2023-11-30 07:46:15

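I live less than two minutes from a bridge over a canal that I cross every day. There is a website that shows the ship schedule, but it is hard to pull the values out of it. I plan to use those values in my Indigo Home Automation system with the iFindStuff plugin to estimate whether I will have to wait at the bridge, so that I know when to take another route.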

My question is how to get the N and S times for each bridge listed on this site into defined AppleScript variables: http://www.greatlakes-seaway.com/R2/jsp/mNiaBrdgStatus_mb.jsp?language=E

I know there are many different ways to do this, but I am trying one that works in the background.

 do shell script "curl 'http://www.greatlakes-seaway.com/R2/jsp/mNiaBrdgStatus_mb.jsp?language=E' | sed -n '/0-9/,/NewPP/p' | sed -n '/^<tr/ s/^.*title=.\\([^\"]*\\).*$/\\1/p' | perl -n -mHTML::Entities -e ' ; print HTML::Entities::decode_entities($_);'" 

I can't get any results from this, and I don't know how to put the output into variables. Thanks in advance for your help.

Best answer

This was a lot of fun to solve :-) Give it a try and read the comments in the code:

on run {}
	set resultSet to bridgeStatus()
end run

on bridgeStatus()
	-- Empty return list
	set bridgeStatusList to {}
	
	-- Getting the page content
	-- The web site had problems answering curl! Pretending to be Safari works :-)
	set webContent to paragraphs of (do shell script "curl -A 'Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/xxx.x (KHTML like Gecko) Safari/12x.x' 'http://www.greatlakes-seaway.com/R2/jsp/mNiaBrdgStatus_mb.jsp?language=E'")
	
	-- Parsing the page content
	repeat with i from 2 to count webContent
		-- Only work with lines that contain the N (Northbound) or S (Southbound) info
		-- Also only collect the info if the line before contains the needed bridge info
		if (item i of webContent contains "&nbsp;&nbsp;S</td>" or item i of webContent contains "&nbsp;&nbsp;N</td>") and item (i - 1) of webContent contains "bridge" then
			-- work with the four lines of info we want and strip HTML from it
			-- and collect all info in a record {bridge:xxx, nextArrival:xxx, bridgeStatus:xxx, subsequentArrival:xxx}
			set foundStatus to {bridge:stripHTML(item (i - 1) of webContent), nextArrival:stripHTML(item i of webContent), bridgeStatus:stripHTML(item (i + 1) of webContent), subsequentArrival:stripHTML(item (i + 2) of webContent)}
			-- fill the return list with the found info
			copy foundStatus to end of bridgeStatusList
		end if
	end repeat
	return bridgeStatusList
end bridgeStatus

on stripHTML(anyText)
	-- easy way to trash HTML-Code, thanks to http://stackoverflow.com/a/33771977/4081207
	return (do shell script "echo " & quoted form of ("<!DOCTYPE HTML PUBLIC><meta charset=\"UTF-8\">" & anyText) & " | sed 's#<br />##' | sed 's#&nbsp;&nbsp;# #' | textutil -convert txt -stdin -stdout | xargs")
end stripHTML

I ran this script a few minutes ago and it returned this list:

{
{bridge:"Lakeshore Rd St. Catharines (Bridge 1)",
nextArrival:"06:55 N",
bridgeStatus:"Available",
subsequentArrival:"07:15 S"},
{bridge:"Carlton St. St. Catharines (Bridge 3A)",
nextArrival:"05:45 N",
bridgeStatus:"Unavailable (Fully Raised)",
subsequentArrival:"05:50 S"},
{bridge:"Queenston St. St. Catharines (Bridge 4)",
nextArrival:"05:27 N",
bridgeStatus:"Available",
subsequentArrival:"06:20 S"},
{bridge:"Glendale Ave. St. Catharines (Bridge 5)",
nextArrival:"06:25 S",
bridgeStatus:"Available",
subsequentArrival:"07:21 S"},
{bridge:"Highway 20 Thorold (Bridge 11)",
nextArrival:"10:17 S",
bridgeStatus:"Available",
subsequentArrival:"10:59 N"},
{bridge:"Main St. Port Colborne (Bridge 19)",
nextArrival:"08:35 N",
bridgeStatus:"Unavailable (--Work in Progress--)",
subsequentArrival:"09:25 N"},
{bridge:"Mellanby Ave. Port Colborne (Bridge 19A)",
nextArrival:"06:17 S",
bridgeStatus:"Available",
subsequentArrival:"08:00 N"},
{bridge:"Clarence St. Port Colborne (Bridge 21)",
nextArrival:"06:40 S",
bridgeStatus:"Available",
subsequentArrival:"07:53 N"}
}
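
Since the question asks for the N and S times in AppleScript variables, here is a minimal sketch of one way to pull them out of a list shaped like the one above. It is not part of the original answer: the handler name splitArrival and the variables northTimes and southTimes are hypothetical, and the two sample records are copied from the output above so the snippet can run on its own. It assumes every arrival string has the form "HH:MM N" or "HH:MM S".

```applescript
-- Sample records copied from the result list above.
-- In real use, replace these three lines with: set resultSet to bridgeStatus()
set bridgeOne to {bridge:"Lakeshore Rd St. Catharines (Bridge 1)", nextArrival:"06:55 N", bridgeStatus:"Available", subsequentArrival:"07:15 S"}
set bridgeTwo to {bridge:"Carlton St. St. Catharines (Bridge 3A)", nextArrival:"05:45 N", bridgeStatus:"Unavailable (Fully Raised)", subsequentArrival:"05:50 S"}
set resultSet to {bridgeOne, bridgeTwo}

-- Hypothetical collector lists: one {bridge name, time} entry per arrival
set northTimes to {}
set southTimes to {}

repeat with bridgeRef in resultSet
	-- Dereference the loop variable to get the plain record
	set bridgeRecord to contents of bridgeRef
	set bridgeName to bridge of bridgeRecord
	-- Look at both the next and the subsequent arrival of this bridge
	repeat with arrivalRef in {nextArrival of bridgeRecord, subsequentArrival of bridgeRecord}
		set {arrivalTime, arrivalDirection} to splitArrival(contents of arrivalRef)
		if arrivalDirection is "N" then
			copy {bridgeName, arrivalTime} to end of northTimes
		else if arrivalDirection is "S" then
			copy {bridgeName, arrivalTime} to end of southTimes
		end if
	end repeat
end repeat

return {northbound:northTimes, southbound:southTimes}

-- Hypothetical helper: splits "06:55 N" into {"06:55", "N"}.
-- If there is no space in the text, the direction comes back as "".
on splitArrival(arrivalText)
	set savedDelimiters to AppleScript's text item delimiters
	set AppleScript's text item delimiters to space
	set arrivalParts to text items of arrivalText
	set AppleScript's text item delimiters to savedDelimiters
	if (count of arrivalParts) < 2 then return {arrivalText, ""}
	return {item 1 of arrivalParts, item 2 of arrivalParts}
end splitArrival
```

To combine this with the script above, the loop would go inside the existing run handler after resultSet is set, because AppleScript does not allow loose top-level statements next to an explicit on run handler. Entries whose direction is neither N nor S are skipped here, which may or may not be what you want for rows with no scheduled ship.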

I hope it helps... I hate being stuck in traffic ;-) Michael / Hamburg

Regarding "java - Web scraping into AppleScript variables", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/33909331/
