
A Ruby regex that finds a JSON object in the middle of a string

Reposted. Author: 数据小太阳. Updated: 2023-10-29 08:37:48

I have this regex:

 /(
# define subtypes and build up the json syntax, BNF-grammar-style
# The {0} is a hack to simply define them as named groups here but not match on them yet
# I added some atomic grouping to prevent catastrophic backtracking on invalid inputs
(?<number> -?(?=[1-9]|0(?!\d))\d+(\.\d+)?([eE][+-]?\d+)?){0}
(?<boolean> true | false | null ){0}
(?<string> " (?>[^"\\\\]* | \\\\ ["\\\\bfnrt\/] | \\\\ u [0-9a-f]{4} )* " ){0}
(?<array> \[ (?> \g<json> (?: , \g<json> )* )? \s* \] ){0}
(?<pair> \s* \g<string> \s* : \g<json> ){0}
(?<object> \{ (?> \g<pair> (?: , \g<pair> )* )? \s* \} ){0}
(?<json> \s* (?> \g<number> | \g<boolean> | \g<string> | \g<array> | \g<object> ) \s* ){0}
)
\A \g<json> \Z
/uix

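I have an API that is supposed to return JSON, but some of my clients have installed other plugins alongside it, so the responses now contain extra non-JSON characters. The JSON itself is still somewhere inside the response string.

I think this regex fails because escaped characters are not recognized by the <string> pattern: if the text that should match <string> contains a quote, the pattern no longer matches. This happens when a value is an HTML string and one of its elements has an attribute, like this: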

<div itemscope itemtype=\\\"http:\\/\\/schema.org\\/Recipe\\\" id=\\\"zlrecipe-container\\\" class=\\\"serif zlrecipe\\\"></div>

Here is an example of a response I receive. I want to extract my JSON block and ignore the rest.

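Best answer

First, you forgot a " in your test JSON string, right after <\/a><\/div>, so it is not valid JSON.

I tested with the following string, which is your example, corrected and unescaped: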

b<---------------->{"status":"ok","plugin_version":"1.2.6","post":{"id":7598,"type":"post","slug":"honeycrisp-apple-sangria-recipe","url":"http:\/\/www.bigbigbutts.com\/2013\/08\/honeycrisp-apple-sangria-recipe\/","status":"publish","title":"Honeycrisp Apple Sangria Recipe","title_plain":"Honeycrisp Apple Sangria Recipe","content":"<div class=\"pin-it-btn-wrapper\"><a href=\"\/\/www.pinterest.com\/pin\/create\/button\/?url=http%3A%2F%2Fwww.bigbigbutts.c…crisp-apple-sangria.jpg&description=Honeycrisp%20Apple%20Sangria%20Recipe\" data-pin-do=\"buttonBookmark\" data-pin-config=\"none\"     rel=\"nobox\"><\/a><\/div>","raw_content":"","excerpt":"","date":"2013-08-24T11:18:07+00:00","modified":"2014-04-24T09:45:00+00:00","author":{"id":2,"slug":"gia","name":"gia","first_name":"gia","last_name":"Wenner chia","nickname":"gia","url":"http:\/\/giawennerchia.com","description":"gia Wenner chia is a writer and mom who gets paid to obsess over Pinterest and blogs for Ahalogy, a Cincinnati-based startup. 
She lives in her hometown of West Chester, Ohio, with her husband, two young children, and their dog."},"attachments":[{"id":7599,"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria.jpg","slug":"honeycrisp-apple-sangria","title":"honeycrisp-apple-sangria","description":"","caption":"","parent":7598,"mime_type":"image\/jpeg","images":{"full":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria.jpg","width":580,"height":406},"thumbnail":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria-150x150.jpg","width":150,"height":150},"medium":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria-300x210.jpg","width":300,"height":210},"large":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria.jpg","width":580,"height":406},"Mini Square":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria-70x70.jpg","width":70,"height":70},"Square":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria-115x115.jpg","width":115,"height":115},"Featured 
Tabs":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria-150x225.jpg","width":150,"height":225}}}],"featured_image":{"id":7599,"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria.jpg","slug":"honeycrisp-apple-sangria","title":"honeycrisp-apple-sangria","description":"","caption":"","parent":7598,"mime_type":"image\/jpeg","images":{"full":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria.jpg","width":580,"height":406},"thumbnail":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria-150x150.jpg","width":150,"height":150},"medium":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria-300x210.jpg","width":300,"height":210},"large":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria.jpg","width":580,"height":406},"Mini Square":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria-70x70.jpg","width":70,"height":70},"Square":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria-115x115.jpg","width":115,"height":115},"Featured Tabs":{"url":"http:\/\/www.bigbigbutts.com\/wp-content\/uploads\/2013\/08\/honeycrisp-apple-sangria-150x225.jpg","width":150,"height":225}}}}}<random shit><dafkdjkfjdak

Next, the regex. The \A and \Z anchors are wrong here, because they make the pattern match only when the JSON is the entire content of the string.
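To illustrate the anchoring point, here is a minimal sketch. The flat {...} pattern is a simplified stand-in for the full JSON grammar, and the constant names and sample response are made up for this example.

```ruby
# Simplified stand-in for the full JSON pattern: one flat {...} object.
ANCHORED   = /\A\{[^{}]*\}\Z/
UNANCHORED = /\{[^{}]*\}/

# Hypothetical response with junk before and after the JSON.
NOISY_RESPONSE = 'junk{"status":"ok"}more junk'

# With \A and \Z the whole string must be JSON, so the noisy response is rejected.
puts ANCHORED.match?(NOISY_RESPONSE)        # prints "false"
puts ANCHORED.match?('{"status":"ok"}')     # prints "true"

# Without anchors the engine scans forward and returns the embedded object.
puts NOISY_RESPONSE[UNANCHORED]             # prints {"status":"ok"}
```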

Then, you put too many backslashes in the string subpattern. Replace \\\\ with \\.
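A small sketch of why the doubled escapes fail, assuming the pattern is written as a Ruby regex literal. The constant names are invented for this example.

```ruby
# Single-quoted Ruby string holding two characters: a backslash and a quote.
ESCAPED_QUOTE = '\"'

# In a regex literal, \\ matches ONE literal backslash,
# so \\\\ demands TWO backslashes in a row.
OVER_ESCAPED = /\\\\"/
CORRECT      = /\\"/

puts ESCAPED_QUOTE.length                  # prints "2"
puts OVER_ESCAPED.match?(ESCAPED_QUOTE)    # prints "false"
puts CORRECT.match?(ESCAPED_QUOTE)         # prints "true"

# The doubling is only needed when the pattern is built from a
# double-quoted string, which consumes one level of escaping itself.
FROM_STRING = Regexp.new("\\\\\"")
puts FROM_STRING.match?(ESCAPED_QUOTE)     # prints "true"
```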

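Another problem is the [^"\\]* part of the string subpattern. Replace * with ++, because the whole atomic group already has a * quantifier on it. With *, that first alternative can always succeed by matching nothing, so the atomic group never gets to try the escape alternatives.

Here is the working regex, in PCRE flavor: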
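The effect of that quantifier can be checked in isolation. The sketch below pulls out only the string subpattern, written compactly without the x flag and anchored with \A and \z so the comparison is strict. The constant names and the sample are assumptions made for this example.

```ruby
# Original shape: [^"\\]* inside an atomic group that is itself repeated.
# The first alternative can match the empty string, so the atomic group
# commits to it and the escape alternatives are never tried.
STAR_VERSION = /\A"(?>[^"\\]*|\\["\\bfnrt\/]|\\u[0-9a-f]{4})*"\z/i

# Corrected shape: ++ must consume at least one character, so at a
# backslash the engine falls through to the escape alternatives.
PLUS_VERSION = /\A"(?:[^"\\]++|\\["\\bfnrt\/]|\\u[0-9a-f]{4})*"\z/i

# A JSON string literal containing escaped quotes (single-quoted in Ruby,
# so the backslashes are kept literally).
SAMPLE = '"a \"quoted\" word"'

puts STAR_VERSION.match?(SAMPLE)   # prints "false"
puts PLUS_VERSION.match?(SAMPLE)   # prints "true"
```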

(?(DEFINE)
(?<number> -?(?=[1-9]|0(?!\d))\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)
(?<boolean> true | false | null )
(?<string> " (?:[^"\\]++ | \\ ["\\bfnrt\/] | \\ u [0-9a-f]{4} )* " )
(?<array> \[ (?> \g<json> (?: , \g<json> )* )? \s* \] )
(?<pair> \s* \g<string> \s* : \g<json> )
(?<object> \{ (?> \g<pair> (?: , \g<pair> )* )? \s* \} )
(?<json> \s* (?> \g<number> | \g<boolean> | \g<string> | \g<array> | \g<object> ) \s*)
)
\g<json>

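Demo: http://regex101.com/r/tS8cW7/1

I still think some of the atomic groups are unnecessary, but they are harmless.

Now, since you are using Ruby (Oniguruma), you cannot use the (?(DEFINE)...) syntax. Your {0} trick is fine, but applying it in one place is enough: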

(?:
(?<number> -?(?=[1-9]|0(?!\d))\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)
(?<boolean> true | false | null )
(?<string> " (?:[^"\\]++ | \\ ["\\bfnrt\/] | \\ u [0-9a-f]{4} )* " )
(?<array> \[ (?> \g<json> (?: , \g<json> )* )? \s* \] )
(?<pair> \s* \g<string> \s* : \g<json> )
(?<object> \{ (?> \g<pair> (?: , \g<pair> )* )? \s* \} )
(?<json> \s* (?> \g<number> | \g<boolean> | \g<string> | \g<array> | \g<object> ) \s*)
){0}
\g<json>
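As a usage sketch, the pattern above can be dropped into a Ruby regex literal and combined with JSON.parse. The /ix flags, the constant name, the extract_json helper, and the sample response are assumptions made for this example and are not part of the original answer.

```ruby
require "json"

# The Ruby pattern from above; x ignores the layout whitespace,
# i keeps the case-insensitivity of the question's original flags.
JSON_IN_TEXT = /
  (?:
    (?<number>  -?(?=[1-9]|0(?!\d))\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)
    (?<boolean> true | false | null )
    (?<string>  " (?:[^"\\]++ | \\ ["\\bfnrt\/] | \\ u [0-9a-f]{4} )* " )
    (?<array>   \[ (?> \g<json> (?: , \g<json> )* )? \s* \] )
    (?<pair>    \s* \g<string> \s* : \g<json> )
    (?<object>  \{ (?> \g<pair> (?: , \g<pair> )* )? \s* \} )
    (?<json>    \s* (?> \g<number> | \g<boolean> | \g<string> | \g<array> | \g<object> ) \s*)
  ){0}
  \g<json>
/ix

# Hypothetical helper: returns the first JSON value found in text,
# parsed into Ruby objects, or nil when nothing matches.
def extract_json(text)
  fragment = text[JSON_IN_TEXT]
  fragment && JSON.parse(fragment)
end

# Made-up response with junk on both sides of the JSON.
response = 'b<---->{"status":"ok","post":{"id":7598}}<random><junk'
data = extract_json(response)
puts data["status"]   # prints "ok"
```

Because the final call is \g<json>, the first JSON value of any kind wins, so a stray number or quoted string in the leading junk would be returned instead of the object. If only objects are wanted, ending the pattern with \g<object> instead is one possible adjustment.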

Source: the Stack Overflow question at https://stackoverflow.com/questions/25273624/
