
JavaScript: matching similar patterns across multiple lines using lookahead

Reposted. Author: 行者123. Updated: 2023-12-03 03:23:09

I am trying to come up with a regular expression that extracts the tables from a Cucumber sample using JavaScript. The Cucumber sample is as follows:

Feature: Sample Feature File

Scenario: An international coffee shop must handle currencies
Given the price list for an international coffee shop
| product | currency | price |
| coffee | EUR | 1 |
| donut | SEK | 18 |
When I buy 1 coffee and 1 donut
Then should I pay 1 EUR and 18 SEK

Scenario Outline: eating
Given there are <start> cucumbers
When I eat <eat> cucumbers
Then I should have <left> cucumbers

Examples:
| start | eat | left |
| 12 | 5 | 7 |
| 20 | 5 | 15 |

The regex should return the following as two separate matches:

1)

  | product | currency | price |
| coffee | EUR | 1 |
| donut | SEK | 18 |

2)

  | start | eat | left |
| 12 | 5 | 7 |
| 20 | 5 | 15 |

Once I have a block, I will split it by line to get the number of rows in the table. I tried a negative lookahead to solve this. My attempt is as follows:

/(\|)[\s\S]*\|(?!\s+\|)/gm

But it returns:

| product | currency | price |
| coffee | EUR | 1 |
| donut | SEK | 18 |
When I buy 1 coffee and 1 donut
Then should I pay 1 EUR and 18 SEK

Scenario Outline: eating
Given there are <start> cucumbers
When I eat <eat> cucumbers
Then I should have <left> cucumbers

Examples:
| start | eat | left |
| 12 | 5 | 7 |
| 20 | 5 | 15 |

If I remove the second scenario, the regex works as expected and returns only:

  | product | currency | price |
| coffee | EUR | 1 |
| donut | SEK | 18 |

Any suggestions on where my regex goes wrong? Thanks a lot.

Best answer

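The [\s\S]* pattern matches any zero or more characters, as many as possible. It therefore runs up to the last | in the string that is not followed by one or more whitespace characters and another |. Since matches are searched for from left to right, starting at the first |, it is logical that you get a single match that spans both tables and everything in between.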
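The greedy behavior described above can be reproduced on a shortened input. This is a minimal sketch: the sample text and the variable names (sample, greedy, greedyMatches) are made up for illustration, and only the regex itself comes from the question.

```javascript
// Shortened sample: two tables separated by non-table lines (illustrative data).
const sample = [
  "Given the price list",
  "| product | price |",
  "| coffee | 1 |",
  "When I buy 1 coffee",
  "Examples:",
  "| start | left |",
  "| 12 | 7 |",
].join("\n");

// The regex from the question: the greedy [\s\S]* first consumes the rest of the
// input, then backtracks only as far as the last | that passes the lookahead.
const greedy = /(\|)[\s\S]*\|(?!\s+\|)/gm;
const greedyMatches = sample.match(greedy);

console.log(greedyMatches.length);                              // 1
console.log(greedyMatches[0].split("\n").length);               // 6
console.log(greedyMatches[0].includes("When I buy 1 coffee"));  // true
```

The single match starts at the first | and ends at the very last one, so the non-table lines between the two tables are swallowed as well.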

I suggest unrolling the pattern like this:

/^[^\S\r\n]*\|.*\|(?:[^\S\r\n]*[\r\n]+[^\S\r\n]*\|.*\|)*/gm

A demo of this pattern is linked in the original answer.

Note that you can make it readable if you build it dynamically:

var h = "[^\\S\r\n]*";     // horizontal whitespace block
var rx = new RegExp("^" + // start of a line
h + "\\|.*\\|" + // hor. whitespace, |, 0+ chars other than line breaks, |
"(?:" + h + "[\r\n]+" + // 0+ sequences of hor. whitespace, line breaks,
h + "\\|.*\\|)*", // hor. whitespace, |, 0+ chars other than line breaks, |
"gm"); // Global (find multiple matches) and multiline (^ matches line start)
var s = "Feature: Sample Feature File\r\n\r\n Scenario: An international coffee shop must handle currencies\r\n Given the price list for an international coffee shop\r\n | product | currency | price |\r\n | coffee | EUR | 1 |\r\n | donut | SEK | 18 |\r\n When I buy 1 coffee and 1 donut\r\n Then should I pay 1 EUR and 18 SEK\r\n\r\n Scenario Outline: eating\r\n Given there are <start> cucumbers\r\n When I eat <eat> cucumbers\r\n Then I should have <left> cucumbers\r\n\r\n Examples:\r\n | start | eat | left |\r\n | 12 | 5 | 7 |\r\n | 20 | 5 | 15 |";
console.log(s.match(rx));
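Since the question goes on to split each block by line to count the rows, here is one way that follow-up step might look. It is only a sketch: extractTables is a hypothetical helper name, the feature text is a shortened version of the sample, and splitting cells on | assumes that no cell contains an escaped pipe.

```javascript
// Same pattern as in the answer, written as a regex literal.
const tableRx = /^[^\S\r\n]*\|.*\|(?:[^\S\r\n]*[\r\n]+[^\S\r\n]*\|.*\|)*/gm;

// Shortened feature text (illustrative), joined with CRLF line endings.
const feature = [
  "Scenario: An international coffee shop must handle currencies",
  "  Given the price list for an international coffee shop",
  "    | product | currency | price |",
  "    | coffee  | EUR      | 1     |",
  "    | donut   | SEK      | 18    |",
  "  When I buy 1 coffee and 1 donut",
  "",
  "  Examples:",
  "    | start | eat | left |",
  "    | 12    | 5   | 7    |",
  "    | 20    | 5   | 15   |",
].join("\r\n");

// Hypothetical helper: returns one array of rows per table,
// where each row is an array of trimmed cell strings.
function extractTables(text) {
  const blocks = text.match(tableRx) || [];
  return blocks.map((block) =>
    block
      .split(/\r\n?|\n/)                  // one entry per table line
      .map((line) => line.trim())
      .filter((line) => line !== "")      // skip blank lines inside a block
      .map((line) =>
        line.slice(1, -1)                 // drop the outer pipes
          .split("|")
          .map((cell) => cell.trim()))
  );
}

const tables = extractTables(feature);
console.log(tables.length);                 // 2
console.log(tables.map((t) => t.length));   // [ 3, 3 ]
console.log(tables[0][1]);                  // [ 'coffee', 'EUR', '1' ]
```

The row count of each table is then just the length of the corresponding inner array, with the header row included.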

Details

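  • ^ - start of a line
  • [^\S\r\n]* - 0+ horizontal whitespace characters
  • \| - a |
  • .* - any 0+ characters other than line break characters, as many as possible
  • \| - a |
  • (?:[^\S\r\n]*[\r\n]+[^\S\r\n]*\|.*\|)* - zero or more sequences of:
    • [^\S\r\n]* - 0+ horizontal whitespace characters
    • [\r\n]+ - 1 or more CR and/or LF characters (if you only want to match a single line break, use (?:\r\n?|\n) here)
    • [^\S\r\n]*\|.*\| - 0+ horizontal whitespace characters, a |, any 0+ characters other than line break characters, as many as possible, and a |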
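The difference between [\r\n]+ and the single-line-break alternative mentioned in the list shows up when two tables are separated only by a blank line. The sketch below compares the two; the names loose, strict, and twoTables and the input are invented for illustration.

```javascript
const h = "[^\\S\\r\\n]*";          // horizontal whitespace
const row = h + "\\|.*\\|";         // one table row: optional indent, |, ..., |

// Variant from the answer: any run of CR/LF characters may separate rows.
const loose = new RegExp("^" + row + "(?:" + h + "[\\r\\n]+" + row + ")*", "gm");
// Variant from the note: exactly one line break (CRLF, CR, or LF) between rows.
const strict = new RegExp("^" + row + "(?:" + h + "(?:\\r\\n?|\\n)" + row + ")*", "gm");

// Two tables separated only by a blank line (illustrative input).
const twoTables = [
  "| a | b |",
  "| 1 | 2 |",
  "",
  "| c | d |",
  "| 3 | 4 |",
].join("\n");

console.log(twoTables.match(loose).length);   // 1
console.log(twoTables.match(strict).length);  // 2
```

With [\r\n]+ the blank line is absorbed and both tables come back as one block. The stricter form stops at the blank line, which may be preferable if a feature file can contain adjacent tables.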

This question and answer come from a similar question found on Stack Overflow: https://stackoverflow.com/questions/46474659/
