
python - Regex to extract multiple fields from a pattern

Reposted. Author: 太空宇宙. Updated: 2023-11-03 18:08:11

I have a pattern like this in a txt file:

["kiarix moreno","116224357500406255237","z120gbkosz2oc3ckv23bc10hhwrudlcjy04",1409770337,"com.youtube.www/watch?v\u003dp1JPKLa-Ofc:https","es"]

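I need a regular expression to extract each field in Python. Every field can contain any character (not only alphanumerics), except the fourth field, which is a long number. How can I do this? Thanks a lot.

Edit: the file contains other HTML elements, which is why I can't parse it directly into a Python list.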

Best answer

Here are three different options for getting the data:

>>> TEXT = '["kiarix moreno","116224357500406255237","z120gbkosz2oc3ckv23bc10hhwrudlcjy04",1409770337,"com.youtube.www/watch?v\u003dp1JPKLa-Ofc:https","es"]'
>>> import json, ast, re
>>> json.loads(TEXT)
['kiarix moreno', '116224357500406255237', 'z120gbkosz2oc3ckv23bc10hhwrudlcjy04', 1409770337, 'com.youtube.www/watch?v=p1JPKLa-Ofc:https', 'es']
>>> ast.literal_eval(TEXT)
['kiarix moreno', '116224357500406255237', 'z120gbkosz2oc3ckv23bc10hhwrudlcjy04', 1409770337, 'com.youtube.www/watch?v=p1JPKLa-Ofc:https', 'es']
>>> re.search(r'\["(?P<name>[^"]*)","(?P<number1>[^"]*)","(?P<data>[^"]*)",(?P<number2>\d*),"(?P<website>[^"]*)","(?P<language>[^"]*)"\]', TEXT).groupdict()
{'name': 'kiarix moreno', 'number1': '116224357500406255237', 'data': 'z120gbkosz2oc3ckv23bc10hhwrudlcjy04', 'number2': '1409770337', 'website': 'com.youtube.www/watch?v=p1JPKLa-Ofc:https', 'language': 'es'}
>>>
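
One caveat about the session above, offered as a sketch rather than a definitive statement: TEXT is an ordinary Python string literal, so under Python 3 the `\u003d` escape has already become `=` before any of the three options runs. Text read from a file keeps the backslash sequence as literal characters. The raw string `RAW` below is an assumed stand-in for such file content (the name is hypothetical); it suggests that `json.loads` and `ast.literal_eval` decode the escape, while the regex returns the field exactly as written and returns the number as a string.

```python
import ast
import json
import re

# Raw string: mimics the characters as they would sit in the txt file,
# with a literal backslash-u sequence instead of an already-decoded "=".
RAW = r'["kiarix moreno","116224357500406255237","z120gbkosz2oc3ckv23bc10hhwrudlcjy04",1409770337,"com.youtube.www/watch?v\u003dp1JPKLa-Ofc:https","es"]'

# Same pattern as in the answer, split across lines for readability.
PATTERN = (
    r'\["(?P<name>[^"]*)","(?P<number1>[^"]*)","(?P<data>[^"]*)",'
    r'(?P<number2>\d*),"(?P<website>[^"]*)","(?P<language>[^"]*)"\]'
)

from_json = json.loads(RAW)        # decodes \u003d, number becomes an int
from_ast = ast.literal_eval(RAW)   # same result for this particular input
from_regex = re.search(PATTERN, RAW).groupdict()  # every value stays a str

print(from_json[4])           # URL with the escape decoded
print(from_regex["website"])  # URL with the escape still in place
```

If the decoded URL and a real integer matter, the first two options save a post-processing step; the regex is mainly useful for locating the record in the first place.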

In particular, the regex you are looking for is: r'\["(?P<name>[^"]*)","(?P<number1>[^"]*)","(?P<data>[^"]*)",(?P<number2>\d*),"(?P<website>[^"]*)","(?P<language>[^"]*)"\]'
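
Since the question's edit says the records sit among other HTML, one possible way to combine the options is to use a regex only to locate each bracketed record and then hand the matched text to `json.loads`. The following is a minimal sketch under assumptions: `RECORD_RE`, `FIELD_NAMES`, `extract_records`, and the `sample` markup are hypothetical names and data, the pattern uses `\d+` rather than `\d*` so an empty number does not match, and `[^"]*` will not cope with fields that contain escaped double quotes.

```python
import json
import re

# Locates a six-field record; the field layout follows the sample in the question.
RECORD_RE = re.compile(r'\["[^"]*","[^"]*","[^"]*",\d+,"[^"]*","[^"]*"\]')

# Field names borrowed from the named groups in the answer's regex.
FIELD_NAMES = ("name", "number1", "data", "number2", "website", "language")


def extract_records(text):
    """Return one dict per record found anywhere in text."""
    records = []
    for match in RECORD_RE.finditer(text):
        # The matched span is valid JSON, so json.loads handles
        # unescaping and converts the fourth field to an int.
        values = json.loads(match.group(0))
        records.append(dict(zip(FIELD_NAMES, values)))
    return records


# Made-up surrounding markup, only to imitate a file with other HTML in it.
sample = (
    '<div class="c">before</div>'
    r'["kiarix moreno","116224357500406255237","z120gbkosz2oc3ckv23bc10hhwrudlcjy04",1409770337,"com.youtube.www/watch?v\u003dp1JPKLa-Ofc:https","es"]'
    '<p>after</p>'
)

for record in extract_records(sample):
    print(record["name"], record["number2"], record["website"])
```

Using `finditer` instead of `search` means a file with several such records yields all of them, not just the first.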

Regarding python - Regex to extract multiple fields from a pattern, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/26467748/
