
java - Complex group regex pattern in Java

Reposted. Author: 行者123. Updated: 2023-12-02 13:32:57

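I have developed a regex pattern to parse bibliographic references in scientific articles. We use the AMA citation style, and a journal citation can look like this: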

"Nielsen MK, Neergaard MA, Jensen AB, Bro F, Guldin MB. Psychological distress, health, and socio-economic factors in caregivers of terminally ill patients: a nationwide population-based cohort study. Support Care Cancer. 2016; 24(7): 3057-3067."

Or without an issue number:

"Nielsen MK, Neergaard MA, Jensen AB, Bro F, Guldin MB. Psychological distress, health, and socio-economic factors in caregivers of terminally ill patients: a nationwide population-based cohort study. Support Care Cancer. 2016; 24: 3057-3067."

Or with only a first page (an electronic article number):

"Nielsen MK, Neergaard MA, Jensen AB, Bro F, Guldin MB. Psychological distress, health, and socio-economic factors in caregivers of terminally ill patients: a nationwide population-based cohort study. Support Care Cancer. 2016; 24(7): 3057."

Or with only a volume number (if published ahead of print):

"Nielsen MK, Neergaard MA, Jensen AB, Bro F, Guldin MB. Psychological distress, health, and socio-economic factors in caregivers of terminally ill patients: a nationwide population-based cohort study. Support Care Cancer. 2016; 24."

My pattern matches all of these cases and captures every field in a group (the backslashes are doubled because of Java string escaping):

(.*?)\\.(.*?)\\.(.*?)(?<year>\\d+)\\s*?;?\\s*?(?:(?<volume>\\d+))?(?:\\((?<issue>\\d+)\\))?\\s*?(?::\\s*?(?<fpage>\\d+|[A-Za-z]+\\d+))?(?:[\\-\\–](?<lpage>\\d+))?\\.
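To make the named groups concrete, here is a minimal sketch of how the pattern above might be compiled and read back in Java. The class name `CitationGroupsDemo`, the `parse` helper, and the `SAMPLE` constant are hypothetical names introduced only for illustration, and the en dash in the character class is written as `\\u2013` so the source file does not depend on its encoding.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original question.
class CitationGroupsDemo {
    // The question's pattern as a Java string literal: every regex backslash is doubled.
    // "\\u2013" stands in for the literal en dash (U+2013) of the original character class.
    static final Pattern CITATION = Pattern.compile(
        "(.*?)\\.(.*?)\\.(.*?)(?<year>\\d+)\\s*?;?\\s*?(?:(?<volume>\\d+))?"
        + "(?:\\((?<issue>\\d+)\\))?\\s*?(?::\\s*?(?<fpage>\\d+|[A-Za-z]+\\d+))?"
        + "(?:[\\-\\u2013](?<lpage>\\d+))?\\.");

    // The first citation from the question (volume, issue, and a full page range).
    static final String SAMPLE =
        "Nielsen MK, Neergaard MA, Jensen AB, Bro F, Guldin MB. Psychological distress, "
        + "health, and socio-economic factors in caregivers of terminally ill patients: "
        + "a nationwide population-based cohort study. Support Care Cancer. "
        + "2016; 24(7): 3057-3067.";

    // Returns the named groups in a fixed order; the map is empty if nothing matches.
    static Map<String, String> parse(String citation) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = CITATION.matcher(citation);
        if (m.find()) {
            for (String name : new String[] {"year", "volume", "issue", "fpage", "lpage"}) {
                // group(name) is null when an optional group did not take part in the match
                fields.put(name, m.group(name));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // prints {year=2016, volume=24, issue=7, fpage=3057, lpage=3067}
        System.out.println(parse(SAMPLE));
    }
}
```

Optional groups that do not take part in a match come back as `null`, so a citation with only a volume number leaves `issue`, `fpage`, and `lpage` unset.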

The problem is that authors keep putting spaces between the first and last page numbers. Could the pattern be changed to match this as well?

"Nielsen MK, Neergaard MA, Jensen AB, Bro F, Guldin MB. Psychological distress, health, and socio-economic factors in caregivers of terminally ill patients: a nationwide population-based cohort study. Support Care Cancer. 2016; 24(7): 3057 - 3067."

Here is an example showing that the pattern does not match this correctly.

Best answer

The correct regex is

(.*?)\.(.*?)\.(.*?)(?<year>\d+)\s*?;?\s*?(?:(?<volume>\d+))?(?:\((?<issue>\d+)\))?\s*?(?::\s*?(?<fpage>\d+|[A-Za-z]+\d+))?(?:[ ]*[\-\–][ ]*(?<lpage>\d+))?\.

This demo solves your problem: https://regex101.com/r/RAdNgb/2. Please check it.
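The answer's change is the optional `[ ]*` on either side of the dash before the `lpage` group. The following is a rough Java sketch that compares the two variants on the spaced citation from the question; `SpacedPageRangeDemo`, `HEAD`, `ORIGINAL`, `FIXED`, `SPACED`, and `describe` are hypothetical names used only here. The sketch writes the dash class without a `|`, since inside a character class `|` is a literal pipe rather than alternation, and it uses `\\u2013` for the en dash.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original answer.
class SpacedPageRangeDemo {
    // Shared prefix of both patterns, up to and including the optional first page.
    static final String HEAD =
        "(.*?)\\.(.*?)\\.(.*?)(?<year>\\d+)\\s*?;?\\s*?(?:(?<volume>\\d+))?"
        + "(?:\\((?<issue>\\d+)\\))?\\s*?(?::\\s*?(?<fpage>\\d+|[A-Za-z]+\\d+))?";

    // Question's variant: the dash must directly follow the first page.
    static final Pattern ORIGINAL =
        Pattern.compile(HEAD + "(?:[\\-\\u2013](?<lpage>\\d+))?\\.");

    // Answer's variant: optional spaces are allowed on both sides of the dash.
    static final Pattern FIXED =
        Pattern.compile(HEAD + "(?:[ ]*[\\-\\u2013][ ]*(?<lpage>\\d+))?\\.");

    // The spaced citation from the question.
    static final String SPACED =
        "Nielsen MK, Neergaard MA, Jensen AB, Bro F, Guldin MB. Psychological distress, "
        + "health, and socio-economic factors in caregivers of terminally ill patients: "
        + "a nationwide population-based cohort study. Support Care Cancer. "
        + "2016; 24(7): 3057 - 3067.";

    // Summarizes the named groups of the first match, or reports that there is none.
    static String describe(Pattern pattern, String citation) {
        Matcher m = pattern.matcher(citation);
        if (!m.find()) {
            return "no match";
        }
        return "year=" + m.group("year") + " volume=" + m.group("volume")
            + " issue=" + m.group("issue")
            + " pages=" + m.group("fpage") + "-" + m.group("lpage");
    }

    public static void main(String[] args) {
        System.out.println("original: " + describe(ORIGINAL, SPACED));
        System.out.println("fixed:    " + describe(FIXED, SPACED));
    }
}
```

With `FIXED`, the spaced citation should yield year 2016, volume 24, issue 7, and pages 3057-3067. With `ORIGINAL`, a match is still found, but the fields are wrong: tracing the pattern by hand, the lazy third group appears to run on until `3067` is taken as the year, which fits the question's remark that the pattern matches this input incorrectly.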

Regarding java - Complex group regex pattern in Java, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43119616/
