xml - How do I wrap the entire list of matches from HXT with a data type constructor?

Reposted. Author: 数据小太阳. Updated: 2023-10-29 02:40:08

I'm currently learning HXT by using it to parse GPX files. An example is here. So far I have the following:

import Data.Time
import Text.XML.HXT.Core

data Gpx = Gpx [Trk] deriving (Show)
data Trk = Trk [TrkSeg] deriving (Show)
data TrkSeg = TrkSeg [TrkPt] deriving (Show)
data TrkPt = TrkPt Double Double deriving (Show)

parseGpx =
    getChildren >>> isElem >>> hasName "gpx" >>>
    getChildren >>> isElem >>> hasName "trk" >>>
    parseGpxTrk >>> arr Gpx

parseGpxTrk = undefined
parseGpxTrkSegs = undefined

You can see that it's incomplete, but it should still type check. Unfortunately, I'm already running into an error:

Couldn't match type ‘Trk’ with ‘[Trk]’
Expected type: Trk -> Gpx
Actual type: [Trk] -> Gpx
In the first argument of ‘arr’, namely ‘Gpx’
In the second argument of ‘(>>>)’, namely ‘arr Gpx’

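This error says that I'm passing each match from the parseGpxTrk arrow through the arr Gpx constructor, but what I actually want is to pass the entire list of matches through the arr Gpx constructor.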

So, how do I get HXT (or arrows in general?) to pass the matches as a list through my arr Gpx constructor, rather than passing each entry in the list through it separately?

Best answer

Here is a solution that seems pretty nice to me:

{-# LANGUAGE Arrows #-}

import Control.Applicative
import Control.Monad (forM_)  -- needed for the pretty-printing loops in main
import Data.Maybe
import Text.Read
import Text.XML.HXT.Core

data Gpx = Gpx [Trk] deriving (Show)
data Trk = Trk [TrkSeg] deriving (Show)
data TrkSeg = TrkSeg [TrkPt] deriving (Show)
data TrkPt = TrkPt Double Double deriving (Show)

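The trickiest part is probably parseTrkPt, because to do it properly you have to handle parsing Strings into Doubles, which can fail. I've decided to have it return a Maybe TrkPt instead and deal with that further down the line: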

elemsNamed :: ArrowXml cat => String -> cat XmlTree XmlTree
elemsNamed name = isElem >>> hasName name

parseTrkPt :: ArrowXml cat => cat XmlTree (Maybe TrkPt)
parseTrkPt = elemsNamed "trkpt" >>>
    proc trkpt -> do
        lat <- getAttrValue "lat" -< trkpt
        lon <- getAttrValue "lon" -< trkpt
        returnA -< TrkPt <$> readMaybe lat <*> readMaybe lon

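I also used proc syntax here, since I think it comes out cleaner. The expression TrkPt <$> readMaybe lat <*> readMaybe lon has type Maybe TrkPt and returns Nothing if either of the readMaybe calls returns Nothing. We can now aggregate all the successful results: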

parseTrkSeg :: (ArrowXml cat, ArrowList cat) => cat XmlTree TrkSeg
parseTrkSeg =
    elemsNamed "trkseg" >>>
    (getChildren >>> parseTrkPt >>. catMaybes) >. TrkSeg
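
To see how the Maybe-returning expression at the end of parseTrkPt behaves on its own, here is a minimal sketch that uses only base, without HXT. The helper name mkTrkPt and the sample strings are assumptions made for illustration; the answer inlines the expression rather than naming it.

```haskell
import Text.Read (readMaybe)

data TrkPt = TrkPt Double Double deriving (Show, Eq)

-- Hypothetical helper: the same expression parseTrkPt returns,
-- pulled out so it can be tried without any XML.
mkTrkPt :: String -> String -> Maybe TrkPt
mkTrkPt lat lon = TrkPt <$> readMaybe lat <*> readMaybe lon

main :: IO ()
main = do
    print (mkTrkPt "45.5" "-122.6")  -- Just (TrkPt 45.5 (-122.6))
    print (mkTrkPt "45.5" "oops")    -- Nothing: the longitude does not parse
    print (mkTrkPt "" "-122.6")      -- Nothing: an empty string does not parse
```

The empty-string case matters because getAttrValue yields an empty string when the attribute is missing, so a trkpt without lat or lon ends up as Nothing and is later dropped by catMaybes.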

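The parentheses in parseTrkSeg are important, and it took me a while to figure that part out. Depending on where you put them you get different results, such as [TrkSeg [TrkPt a b], TrkSeg [TrkPt c d]] instead of [TrkSeg [TrkPt a b, TrkPt c d]]. The next parsers all follow a similar pattern: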

parseTrk :: ArrowXml cat => cat XmlTree Trk
parseTrk =
    elemsNamed "trk" >>>
    (getChildren >>> parseTrkSeg) >. Trk

parseGpx :: ArrowXml cat => cat XmlTree Gpx
parseGpx =
    elemsNamed "gpx" >>>
    (getChildren >>> parseTrk) >. Gpx

Then you can run it pretty simply, although you still have to drill down through the root element:

main :: IO ()
main = do
    gpxs <- runX $ readDocument [withRemoveWS yes] "ana.gpx"
                >>> getChildren
                >>> parseGpx
    -- Pretty print the document
    forM_ gpxs $ \(Gpx trks) -> do
        putStrLn "GPX:"
        forM_ trks $ \(Trk segs) -> do
            putStrLn "\tTRK:"
            forM_ segs $ \(TrkSeg pts) -> do
                putStrLn "\t\tSEG:"
                forM_ pts $ \pt -> do
                    putStr "\t\t\t"
                    print pt

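The trick is to use the methods of the ArrowList type class, in particular >., which has type a b c -> ([c] -> d) -> a b d. It collects all the results of a list arrow, passes them as one list to a function that converts them to a new type d, and produces a new list arrow that yields that single value of type d.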
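
The behaviour of >., and why the placement of parentheses changes the result, can be sketched with a toy list arrow built from base alone. This is a simplified model, not HXT's implementation: ListArrow, children, double and Seg are names invented for this sketch, and the fixities are chosen on the assumption that HXT declares >. as infixl 8 while >>> is infixr 1.

```haskell
infixr 1 >>>
infixl 8 >.

-- Toy list arrow: a function that yields zero or more results.
newtype ListArrow a b = ListArrow { runListArrow :: a -> [b] }

-- Sequential composition: feed every result of the first arrow into the second.
(>>>) :: ListArrow a b -> ListArrow b c -> ListArrow a c
ListArrow f >>> ListArrow g = ListArrow (concatMap g . f)

-- Collect all results of an arrow into one list and convert that list
-- into a single result.
(>.) :: ListArrow a b -> ([b] -> c) -> ListArrow a c
ListArrow f >. collect = ListArrow (\x -> [collect (f x)])

-- Emits each element of the input list separately (stand-in for getChildren).
children :: ListArrow [a] a
children = ListArrow id

-- Emits exactly one result per input (stand-in for a per-element parser).
double :: ListArrow Int Int
double = ListArrow (\n -> [n * 2])

newtype Seg = Seg [Int] deriving (Show, Eq)

-- Parenthesised: all results are collected first, then wrapped once.
grouped :: ListArrow [Int] Seg
grouped = (children >>> double) >. Seg

-- Unparenthesised: parses as children >>> (double >. Seg),
-- so every element is wrapped on its own.
perItem :: ListArrow [Int] Seg
perItem = children >>> double >. Seg

main :: IO ()
main = do
    print (runListArrow grouped [1, 2, 3])  -- [Seg [2,4,6]]
    print (runListArrow perItem [1, 2, 3])  -- [Seg [2],Seg [4],Seg [6]]
```

In this model the unparenthesised form wraps each element separately because >. binds tighter than >>>, which matches the difference between [TrkSeg [TrkPt a b], TrkSeg [TrkPt c d]] and [TrkSeg [TrkPt a b, TrkPt c d]] described in the answer.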

If you want, you can even abstract the last 3 parsers a bit:

nestedListParser :: ArrowXml cat => String -> cat XmlTree a -> ([a] -> b) -> cat XmlTree b
nestedListParser name subparser constructor
    =   elemsNamed name
    >>> (getChildren >>> subparser)
    >.  constructor

parseTrkSeg :: (ArrowXml cat, ArrowList cat) => cat XmlTree TrkSeg
parseTrkSeg = nestedListParser "trkseg" (parseTrkPt >>. catMaybes) TrkSeg

parseTrk :: ArrowXml cat => cat XmlTree Trk
parseTrk = nestedListParser "trk" parseTrkSeg Trk

parseGpx :: ArrowXml cat => cat XmlTree Gpx
parseGpx = nestedListParser "gpx" parseTrk Gpx

This may come in handy if you want to cover the rest of the GPX file grammar.

Regarding "xml - How do I wrap the entire list of matches from HXT with a data type constructor?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30108630/
