
scala - How to parse placeholders from text without throwing away your sword so you can fight off the marauders with a lampshade

Reposted. Author: 行者123. Updated: 2023-12-04 21:25:49

I need to parse placeholders out of text, for example abc $$FOO$$ cba. I hacked something together with Scala's parser combinators, but I am not happy with the solution.

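In particular, I resorted to a zero-width matcher in the regular expression, (?=(\$\$|\z)), to stop parsing text and start parsing a placeholder. This sounds perilously close to the shenanigans that were discussed, and colorfully dismissed, on the scala mailing list. (That discussion inspired the title of this question.)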
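To see what the zero-width matcher does in isolation, here is a minimal sketch that applies the same pattern with the standard library's `scala.util.matching.Regex`, outside of any parser. The names `LookaheadDemo` and `firstRun` are hypothetical and exist only for this illustration.

```scala
import scala.util.matching.Regex

object LookaheadDemo {
  // Same pattern as the `text` parser in the question: lazily consume at
  // least one character and stop just before "$$" or at the end of input.
  // The lookahead (?=...) checks what follows without consuming it.
  val textRun: Regex = """(?ims).+?(?=(\$\$|\z))""".r

  // Hypothetical helper: the text run found at the very start of the input.
  def firstRun(input: String): Option[String] = textRun.findPrefixOf(input)

  def main(args: Array[String]): Unit = {
    println(firstRun("abc $$FOO$$ cba")) // Some(abc )
    println(firstRun("cba"))             // Some(cba)
    println(firstRun("$$FOO$$ cba"))     // Some($$FOO)
  }
}
```

Note the last case: the pattern also matches input that begins with a delimiter, because `.+?` must consume at least one character before the lookahead is tested. The parser in the question still produces a placeholder there only because `|||` prefers the longer of the two matches.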

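So, the challenge: fix my parser so that it works without this hack. I would like to see a clear progression from the problem to your solution, so that I can replace my current strategy of randomly assembling combinators until the tests pass.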

import scala.util.parsing.combinator.RegexParsers

object PlaceholderParser extends RegexParsers {
  sealed abstract class Element
  case class Text(text: String) extends Element
  case class Placeholder(key: String) extends Element

  // Whitespace is significant inside text runs, so do not skip it.
  override def skipWhitespace = false

  def parseElements(text: String): List[Element] = parseAll(elements, text) match {
    case Success(es, _) => es
    case NoSuccess(msg, _) => sys.error("Could not parse: [%s]. Error: %s".format(text, msg))
  }

  def parseElementsOpt(text: String): ParseResult[List[Element]] = parseAll(elements, text)

  lazy val elements: Parser[List[Element]] = rep(element)
  // ||| tries both alternatives and keeps the longer match.
  lazy val element: Parser[Element] = placeholder ||| text
  // The hack: a zero-width lookahead ends a text run just before "$$" or at end of input.
  lazy val text: Parser[Text] = """(?ims).+?(?=(\$\$|\z))""".r ^^ Text.apply
  lazy val placeholder: Parser[Placeholder] = delimiter ~> """[\w. ]+""".r <~ delimiter ^^ Placeholder.apply
  lazy val delimiter: Parser[String] = literal("$$")
}


import org.junit.{Assert, Test}
// Bring Element, Text, Placeholder and parseElements into scope.
import PlaceholderParser._

class PlaceholderParserTest {
  @Test
  def parse1: Unit = check("a quick brown $$FOX$$ jumped over the lazy $$DOG$$")(Text("a quick brown "), Placeholder("FOX"), Text(" jumped over the lazy "), Placeholder("DOG"))

  @Test
  def parse2: Unit = check("a quick brown $$FOX$$!")(Text("a quick brown "), Placeholder("FOX"), Text("!"))

  @Test
  def parse3: Unit = check("a quick brown $$FOX$$!\n!")(Text("a quick brown "), Placeholder("FOX"), Text("!\n!"))

  @Test
  def parse4: Unit = check("a quick brown $$F.O X$$")(Text("a quick brown "), Placeholder("F.O X"))

  def check(text: String)(expected: Element*): Unit = Assert.assertEquals(expected.toList, parseElements(text))
}

Best answer

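I found another way. There is no regex hack any more, but the code is a bit longer. It parses the whole string into a list whose items are either single characters or Placeholder objects. The compact function then compresses that list: it merges each run of consecutive strings into one Text object and leaves the Placeholder objects untouched: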

import scala.util.parsing.combinator.RegexParsers

object PlaceholderParser extends RegexParsers {
  sealed abstract class Element
  case class Text(text: String) extends Element
  case class Placeholder(key: String) extends Element

  override def skipWhitespace = false

  def parseElements(text: String): List[Element] = parseAll(elements, text) match {
    case Success(es, _) => es
    case NoSuccess(msg, _) => sys.error("Could not parse: [%s]. Error: %s".format(text, msg))
  }

  def parseElementsOpt(text: String): ParseResult[List[Element]] = parseAll(elements, text)

  // Merge runs of single-character strings into Text; pass Placeholders through.
  def compact(l: List[Any]): List[Element] = {
    val builder = new StringBuilder()
    val r = l.foldLeft(List.empty[Element])((l, e) => e match {
      case s: String =>
        // Buffer the character until the next placeholder or the end of input.
        builder.append(s)
        l
      case p: Placeholder =>
        // Flush any buffered text before emitting the placeholder.
        val t = if (builder.nonEmpty) {
          val k = l ++ List(Text(builder.toString))
          builder.clear()
          k
        } else {
          l
        }
        t ++ List(p)
    })
    // Flush trailing text that follows the last placeholder.
    if (builder.nonEmpty) r ++ List(Text(builder.toString)) else r
  }

  lazy val elements: Parser[List[Element]] = (placeholder ||| text).+ ^^ compact
  // Any single character, including newlines because of the (?s) flag.
  lazy val text: Parser[String] = """(?ims).""".r
  lazy val placeholder: Parser[Placeholder] = delimiter ~> """[\w. ]+""".r <~ delimiter ^^ Placeholder.apply
  lazy val delimiter: Parser[String] = literal("$$")
}

It is not a perfect solution, but maybe it gives you something to start from.
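The compaction step can also be looked at on its own, without the parser. The following is a minimal sketch of one possible variant, not the answer's code: it assumes the tokens arrive as `Either[String, Placeholder]` instead of `List[Any]`, and it replaces the mutable `StringBuilder` with a right fold. `CompactSketch` and its copies of the `Element` classes are hypothetical and only make the snippet self-contained.

```scala
object CompactSketch {
  sealed abstract class Element
  case class Text(text: String) extends Element
  case class Placeholder(key: String) extends Element

  // Assumption: Left holds a fragment of plain text (for example one
  // character), Right holds an already parsed placeholder.
  def compact(tokens: List[Either[String, Placeholder]]): List[Element] =
    tokens.foldRight(List.empty[Element]) {
      // A text fragment directly before a Text element is merged into it.
      case (Left(s), Text(rest) :: tail) => Text(s + rest) :: tail
      // Otherwise the fragment starts a new Text element.
      case (Left(s), acc) => Text(s) :: acc
      // Placeholders are passed through unchanged.
      case (Right(p), acc) => p :: acc
    }

  def main(args: Array[String]): Unit = {
    val tokens = List(Left("a"), Left("b"), Right(Placeholder("FOX")), Left("!"))
    println(compact(tokens)) // List(Text(ab), Placeholder(FOX), Text(!))
  }
}
```

Folding from the right means each step only prepends to the accumulator, which avoids the repeated list appends (`l ++ List(...)`) of the fold-left version. The string concatenation per fragment is still linear in the length of the run, so this is meant as an illustration of the idea rather than a tuned implementation.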

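Regarding "scala - How to parse placeholders from text without throwing away your sword so you can fight off the marauders with a lampshade", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/3371334/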
