I would like to split this string:
lg:[:after]:hover:color-blue
The two conditions are:
- Split by :
- If there's a [], get the content (even if there's a : inside)
The result would be
lg
:after
hover
color-blue
Possible inputs would be:
[:after]:hover:color-blue
hover:color-blue
lg:hover:color-blue
What I have so far:
const regex = /(?:([^\:\[\]]+)|\[([^\[\]]+)\])/g;
const matches = [...string.matchAll(regex)].map((match) =>
  typeof match[2] !== "undefined" ? match[2] : match[1]
);
It works well, but the map feels hacky.
Is there a way to get the desired output directly from the regex?
Depending on your input, you could use split() too, with something like :?\[|\]:?|:(?![^\]\[]*]) though this would leave empty elements if the input has a colon or bracket at the start or end of the string, and those would need to be removed.
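A minimal sketch of the split() idea from the comment above, assuming only the inputs listed in the question. The function name splitModifiers is hypothetical, and filter(Boolean) is one possible way to drop the empty elements the comment warns about.

```javascript
// Separator taken from the comment: an optional colon plus "[",
// a "]" plus an optional colon, or a colon that is not inside brackets.
const separator = /:?\[|\]:?|:(?![^\]\[]*])/;

// Hypothetical helper: split, then drop empty strings that appear when
// the input starts or ends with a bracket or colon.
function splitModifiers(input) {
  return input.split(separator).filter(Boolean);
}

console.log(splitModifiers("lg:[:after]:hover:color-blue"));
console.log(splitModifiers("[:after]:hover:color-blue"));
```

Without the filter step, the second call would start with an empty string, because the input begins with a bracket.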
that's another approach, thanks
Is the string enclosed in brackets ([ and ]) always immediately preceded and followed by a colon (:)?
If they appear before the final :, yes. But I've found a special case, e.g.: hover-[:after]:bg-color-[primary-light]-100 ... the [primary-light] doesn't need to be matched but the [:after] does.
Normally you should not substantially modify your question after one or more answers have been posted, lest your changes invalidate those answers. In this case, however, there is only one answer, and it has been updated to address your comment immediately above. Consequently, I suggest you edit your question to include the content of that comment. Readers should not be required to read through the comments to understand the question.
Yes:
(?<=\[)[^[\]]+(?=\])   # 1+ non-square-bracket characters inside a pair of brackets
|                      # or
(?<=^|:)               # a segment
(?:                    # that consists of multiple subsegments,
  \[[^[\]]+\]          #   either bracketed
  |                    #   or
  [^[:\]]+             #   non-bracketed,
){2,}                  # 2 or more times,
(?=:|$)                # followed by a colon or the end of the string.
Try it on regex101.com.
This can be shortened to [^[\]\n]+(?=\])|(?:\[[^[\]\n]+\]|[^[:\]\n]+){2,}, which looks less intimidating, provided it does not match anything unexpected in your input.
The regex above is, however, prone to backtracking. A "better" version of it would be:
(?<=\[)[^[\]\n]+(?=\])
|
(?<=^|:)
(?:                      # Before trying to match 2+ subsegments
  (?=([^[:\]\n]+))       # check if the next non-bracketed subsegment
  \1(?=:|$)              # is a full segment, in which case we match
  |                      # and skip the 2nd alternative entirely.
  (?:
    \[[^[\]\n]+\]
    |
    [^[:\]\n]+
  ){2,}
)
Try it on regex101.com.
Since ECMAScript flavor doesn't support atomic groups, we simulated the atomic behaviour using a capturing group inside a lookahead along with the corresponding backreference.
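To illustrate the trick in isolation, here is a small sketch on a toy pattern that is not part of the answer's regex. The names plain and atomicLike are made up for this example. A lookahead does not give back characters once it has succeeded, so capturing inside it and then consuming the capture with \1 behaves like an atomic group.

```javascript
// Plain greedy group: a+ first takes "aaa", then backtracks to "aa"
// so that the trailing "a" can match.
const plain = /a+a/;

// Emulated atomic group: the lookahead captures the longest run of "a",
// \1 consumes exactly that run, and the engine cannot shorten it afterwards.
const atomicLike = /(?=(a+))\1a/;

console.log(plain.test("aaa"));      // true
console.log(atomicLike.test("aaa")); // false
```

The second pattern fails on "aaa" because, at every starting position, the captured run swallows all remaining "a" characters and nothing is left for the final "a".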
Note that we need to use .match() instead of .split():
string.match(/[^[\]\n]+(?=\])|(?:\[[^[\]\n]+\]|[^[:\]\n]+){2,}/g)
Try it:
console.config({ maximize: true });
const testcases = [
  'lg:[:after]:hover:color-blue',
  '[:after]:hover:color-blue',
  'hover:color-blue',
  'lg:hover:color-blue',
  'hover:[:after]:bg-color-[primary-light]-100',
  'foo:[bar]:[start-but]-not-end',
  'baz-qux:end-[but:not]:start'
];
const regex = /[^[\]\n]+(?=\])|(?:\[[^[\]\n]+\]|[^[:\]\n]+){2,}/g;
for (const testcase of testcases) {
  console.log(testcase, testcase.match(regex));
}
<script src="https://gh-canon.github.io/stack-snippet-console/console.min.js"></script>
Or... you can just split by (?<!\[[^[\]]*): and handle things from there; that would be the "shortest" and clearest solution:
string.split(/(?<!\[[^[\]]*):/).map(
  token => token.match(/^\[.+]$/) ? token.slice(1, -1) : token
)
Try it:
console.config({ maximize: true });
const testcases = [
  'lg:[:after]:hover:color-blue',
  '[:after]:hover:color-blue',
  'hover:color-blue',
  'lg:hover:color-blue',
  'hover:[:after]:bg-color-[primary-light]-100',
  'foo:[bar]:[start-but]-not-end',
  'baz-qux:end-[but:not]:start'
];
const regex = /(?<!\[[^[\]]*):/;
for (const testcase of testcases) {
  console.log(
    testcase,
    testcase.split(regex).map(
      token => token.match(/^\[.+]$/) ? token.slice(1, -1) : token
    )
  );
}
<script src="https://gh-canon.github.io/stack-snippet-console/console.min.js"></script>
Hi, I've found a special case, e.g. hover:[:after]:bg-color-[primary-light]-100 ... [primary-light] doesn't need to be matched but [:after] does.
@JoseCarlosRamírez See updated answer.
Thank you for such a thorough answer. After posting the question, I discovered that I need to know whether the "modifier" comes from a [] or not, so the split will work better in that case. But it's good to have a solution for the original question. Thanks!
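One possible way to keep track of whether a modifier came from brackets, as the comment above asks, is to build on the split approach from the answer. This is only a sketch: the function name parseModifiers and the { value, bracketed } result shape are assumptions, not something stated in the thread.

```javascript
// Separator from the answer: a colon that is not inside an open bracket.
const SEPARATOR = /(?<!\[[^[\]]*):/;

// Hypothetical helper: returns one object per token, flagging tokens
// that were fully wrapped in [ ] and stripping the brackets from them.
function parseModifiers(input) {
  return input.split(SEPARATOR).map((token) => {
    const bracketed = /^\[.+\]$/.test(token);
    return { value: bracketed ? token.slice(1, -1) : token, bracketed };
  });
}

console.log(parseModifiers("lg:[:after]:hover:color-blue"));
console.log(parseModifiers("hover:[:after]:bg-color-[primary-light]-100"));
```

In the second call, bg-color-[primary-light]-100 stays a single unbracketed token, since it is not wrapped in brackets as a whole.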