gpt4 book ai didi

parsing - 提高 FParsec 解析器的可读性

转载 作者:行者123 更新时间:2023-12-01 09:06:02 24 4
gpt4 key购买 nike

我有一个用 C# 编写的手写 CSS 解析器,它变得难以管理,并试图在 FParsec 中实现它以使其更易于维护。这是一个解析用正则表达式制作的 css 选择器元素的片段:

var tagRegex = @"(?<Tag>(?:[a-zA-Z][_\-0-9a-zA-Z]*|\*))";
var idRegex = @"(?:#(?<Id>[a-zA-Z][_\-0-9a-zA-Z]*))";
var classesRegex = @"(?<Classes>(?:\.[a-zA-Z][_\-0-9a-zA-Z]*)+)";
var pseudoClassRegex = @"(?::(?<PseudoClass>link|visited|hover|active|before|after|first-line|first-letter))";
var selectorRegex = new Regex("(?:(?:" + tagRegex + "?" + idRegex + ")|" +
"(?:" + tagRegex + "?" + classesRegex + ")|" +
tagRegex + ")" +
pseudoClassRegex + "?");

var m = selectorRegex.Match(str);

if (m.Length != str.Length) {
cssParserTraceSwitch.WriteLine("Unrecognized selector: " + str);
return null;
}

string tagName = m.Groups["Tag"].Value;

string pseudoClassString = m.Groups["PseudoClass"].Value;
CssPseudoClass pseudoClass;
if (pseudoClassString.IsEmpty()) {
pseudoClass = CssPseudoClass.None;
} else {
switch (pseudoClassString.ToLower()) {
case "link":
pseudoClass = CssPseudoClass.Link;
break;
case "visited":
pseudoClass = CssPseudoClass.Visited;
break;
case "hover":
pseudoClass = CssPseudoClass.Hover;
break;
case "active":
pseudoClass = CssPseudoClass.Active;
break;
case "before":
pseudoClass = CssPseudoClass.Before;
break;
case "after":
pseudoClass = CssPseudoClass.After;
break;
case "first-line":
pseudoClass = CssPseudoClass.FirstLine;
break;
case "first-letter":
pseudoClass = CssPseudoClass.FirstLetter;
break;
default:
cssParserTraceSwitch.WriteLine("Unrecognized selector: " + str);
return null;
}
}

string cssClassesString = m.Groups["Classes"].Value;
string[] cssClasses = cssClassesString.IsEmpty() ? EmptyArray<string>.Instance : cssClassesString.Substring(1).Split('.');
allCssClasses.AddRange(cssClasses);

return new CssSelectorElement(
tagName.ToLower(),
cssClasses,
m.Groups["Id"].Value,
pseudoClass);

我的第一次尝试得到了这个:

type CssPseudoClass =
| None = 0
| Link = 1
| Visited = 2
| Hover = 3
| Active = 4
| Before = 5
| After = 6
| FirstLine = 7
| FirstLetter = 8

type CssSelectorElement =
{ Tag : string
Id : string
Classes : string list
PseudoClass : CssPseudoClass }
with
static member Default =
{ Tag = "";
Id = "";
Classes = [];
PseudoClass = CssPseudoClass.None; }

open FParsec

let ws = spaces
let str = skipString
let strWithResult str result = skipString str >>. preturn result

let identifier =
let isIdentifierFirstChar c = isLetter c || c = '-'
let isIdentifierChar c = isLetter c || isDigit c || c = '_' || c = '-'
optional (str "-") >>. many1Satisfy2L isIdentifierFirstChar isIdentifierChar "identifier"

let stringFromOptional strOption =
match strOption with
| Some(str) -> str
| None -> ""

let pseudoClassFromOptional pseudoClassOption =
match pseudoClassOption with
| Some(pseudoClassOption) -> pseudoClassOption
| None -> CssPseudoClass.None

let parseCssSelectorElement =
let tag = identifier <?> "tagName"
let id = str "#" >>. identifier <?> "#id"
let classes = many1 (str "." >>. identifier) <?> ".className"
let parseCssPseudoClass =
choiceL [ strWithResult "link" CssPseudoClass.Link;
strWithResult "visited" CssPseudoClass.Visited;
strWithResult "hover" CssPseudoClass.Hover;
strWithResult "active" CssPseudoClass.Active;
strWithResult "before" CssPseudoClass.Before;
strWithResult "after" CssPseudoClass.After;
strWithResult "first-line" CssPseudoClass.FirstLine;
strWithResult "first-letter" CssPseudoClass.FirstLetter]
"pseudo-class"
// (tag?id|tag?classes|tag)pseudoClass?
pipe2 ((pipe2 (opt tag)
id
(fun tag id ->
{ CssSelectorElement.Default with
Tag = stringFromOptional tag;
Id = id })) |> attempt
<|>
(pipe2 (opt tag)
classes
(fun tag classes ->
{ CssSelectorElement.Default with
Tag = stringFromOptional tag;
Classes = classes })) |> attempt
<|>
(tag |>> (fun tag -> { CssSelectorElement.Default with Tag = tag })))
(opt (str ":" >>. parseCssPseudoClass) |> attempt)
(fun selectorElem pseudoClass -> { selectorElem with PseudoClass = pseudoClassFromOptional pseudoClass })

但我不太喜欢它的形成方式。我期待提出一些更容易理解的东西,但是解析 (tag?id|tag?classes|tag)pseudoClass 的部分?有几个 pipe2 和尝试真的很糟糕。

有没有在 FParsec 方面有更多经验的人来指导我如何更好地完成此任务?我正在考虑尝试 FSLex/Yacc 或 Boost.Spirit 而不是 FParsec 看看我是否可以用它们提出更好的代码

最佳答案

您可以将该复杂解析器的某些部分提取为变量,例如:

let tagid = 
pipe2 (opt tag)
id
(fun tag id ->
{ CssSelectorElement.Default with
Tag = stringFromOptional tag
Id = id })

您也可以尝试使用 applicative interface ,我个人觉得比 pipe2 更容易使用和思考:

let tagid =
(fun tag id ->
{ CssSelectorElement.Default with
Tag = stringFromOptional tag
Id = id })
<!> opt tag
<*> id

关于parsing - 提高 FParsec 解析器的可读性,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/7500075/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com