
python - How to extract a substring by defining the delimiters before and after it

Reposted. Author: 行者123. Updated: 2023-12-04 00:00:04

I have a DataFrame containing URLs, and I want to extract a piece from the middle of each one.

df
URL
https://storage.com/vision/Glass2020/2020-02-04_B8I8FZHl-xJ_2236301468348443721.jpg
https://storage.com/vision/Carpet5020/2020-02-04_B8I8FZHl-xJ_2236301468348443721.jpg
https://storage.com/vision/Metal8020/2020-02-04_B8I8FZHl-xJ_2236301468348443721.jpg

The desired output looks like this:

URL                                                                                   Type
https://storage.com/vision/Glass2020/2020-02-04_B8I8FZHl-xJ_2236301468348443721.jpg   Glass2020
https://storage.com/vision/Carpet5020/2020-02-04_B8I8FZHl-xJ_2236301468348443721.jpg  Carpet5020
https://storage.com/vision/Metal8020/2020-02-04_B8I8FZHl-xJ_2236301468348443721.jpg   Metal8020

I would use df['URL'].str.extract, but I don't understand how to define the delimiters before and after the part I want.

Best answer

One idea is to use Series.str.split and select the second-to-last value by indexing:

df['Type'] = df['URL'].str.split('/').str[-2]
print(df)
                                                 URL        Type
0  https://storage.com/vision/Glass2020/2020-02-0...   Glass2020
1  https://storage.com/vision/Carpet5020/2020-02-...  Carpet5020
2  https://storage.com/vision/Metal8020/2020-02-0...   Metal8020
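
For anyone who wants to run the split approach end to end, here is a minimal self-contained sketch. It rebuilds the DataFrame from the three URLs in the question. The variable name urls is only an illustration, and the snippet assumes pandas is installed.

```python
import pandas as pd

# The three sample URLs from the question.
urls = [
    "https://storage.com/vision/Glass2020/2020-02-04_B8I8FZHl-xJ_2236301468348443721.jpg",
    "https://storage.com/vision/Carpet5020/2020-02-04_B8I8FZHl-xJ_2236301468348443721.jpg",
    "https://storage.com/vision/Metal8020/2020-02-04_B8I8FZHl-xJ_2236301468348443721.jpg",
]
df = pd.DataFrame({"URL": urls})

# Split every URL on "/" into a list of parts, then take the
# second-to-last part, i.e. the folder directly above the file name.
df["Type"] = df["URL"].str.split("/").str[-2]

print(df["Type"].tolist())
```

This relies only on the position of the value in the path, so it keeps working if the folder before it is renamed, but it breaks if the file sits at a different depth.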

Edit: To specify the text that comes before and after the expected output, use Series.str.extract:

df['Type'] = df['URL'].str.extract('vision/(.+)/2020')
print(df)
                                                 URL        Type
0  https://storage.com/vision/Glass2020/2020-02-0...   Glass2020
1  https://storage.com/vision/Carpet5020/2020-02-...  Carpet5020
2  https://storage.com/vision/Metal8020/2020-02-0...   Metal8020
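
The "before" and "after" delimiters can also be passed in as parameters, which may be closer to what the question asks for. The sketch below is one possible way to do that and is not part of the original answer. The helper name extract_between is hypothetical. It escapes both delimiters with re.escape so they are matched literally, and it uses a non-greedy group (.+?) so the match stops at the first occurrence of the closing delimiter.

```python
import re

import pandas as pd


def extract_between(series, before, after):
    """Return the text between the literal strings `before` and `after`.

    Hypothetical helper for illustration; rows without a match become NaN.
    """
    # re.escape makes both delimiters literal; (.+?) is non-greedy, so the
    # capture ends at the first `after` that follows `before`.
    pattern = re.escape(before) + r"(.+?)" + re.escape(after)
    # expand=False returns a Series (not a one-column DataFrame)
    # when the pattern has a single capture group.
    return series.str.extract(pattern, expand=False)


df = pd.DataFrame({
    "URL": [
        "https://storage.com/vision/Glass2020/2020-02-04_B8I8FZHl-xJ_2236301468348443721.jpg",
        "https://storage.com/vision/Carpet5020/2020-02-04_B8I8FZHl-xJ_2236301468348443721.jpg",
        "https://storage.com/vision/Metal8020/2020-02-04_B8I8FZHl-xJ_2236301468348443721.jpg",
    ]
})

# Everything after "vision/" up to the next "/".
df["Type"] = extract_between(df["URL"], "vision/", "/")
print(df["Type"].tolist())
```

Using "/" as the closing delimiter avoids hard-coding the year, which the pattern 'vision/(.+)/2020' depends on.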

Regarding python - How to extract a substring by defining the delimiters before and after it, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/63248053/
