I have a file like this:
301 my name is joe
303 whatsup
306 how are you doing today
308 what happened?
308 going home
309 let's go
I want to convert the labels 301, 303, 306, 308, 308, 309
into 1, 2, 3, 4, 4, 5.
How can I renumber these labels in order, so that identical labels get the same number?
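Use a dictionary to store the mapping from original labels to new labels. For a label that has not been mapped yet, use the dictionary's current len plus one as its new number; setdefault does the lookup and the insertion in one call.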
>>> labels = 301, 303, 306, 308, 308, 309
>>> names = {}
>>> for l in labels:
...     names.setdefault(l, len(names)+1)
...
>>> names
{301: 1, 303: 2, 306: 3, 308: 4, 309: 5}
A more complete example:
text = """301 my name is joe
303 whatsup
306 how are you doing today
308 what happened?
308 going home
309 let's go""".splitlines()
import re
names = {}
replacer = lambda x: str(names.setdefault(x.group(), len(names) + 1))
for line in text:
    replaced = re.sub(r'^\d+', replacer, line)
    print(replaced)
Output:
1 my name is joe
2 whatsup
3 how are you doing today
4 what happened?
4 going home
5 let's go
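Since the question starts from a file, here is a minimal sketch of the same setdefault idea wrapped in a generator that accepts any iterable of lines, such as an open file object. The function name renumber_labels and the io.StringIO stand-in for a real file are assumptions made for illustration, and the sketch assumes every line begins with a label followed by a single space.

```python
import io

def renumber_labels(lines):
    """Yield each line with its leading label replaced by a sequential number."""
    mapping = {}  # original label -> new number, in order of first appearance
    for line in lines:
        # Split off the first space-separated token as the label.
        label, sep, rest = line.rstrip("\n").partition(" ")
        # setdefault returns the existing number, or stores and returns a new one.
        new_label = mapping.setdefault(label, len(mapping) + 1)
        yield f"{new_label}{sep}{rest}"

# io.StringIO stands in for an open file here (hypothetical sample data).
sample = io.StringIO(
    "301 my name is joe\n"
    "303 whatsup\n"
    "308 what happened?\n"
    "308 going home\n"
)
result = list(renumber_labels(sample))
print("\n".join(result))
```

With a real file, the same function could be called as renumber_labels(open(path)) for whatever path holds the data. Splitting on the first space instead of using a regular expression is only a design choice for this sketch; the re.sub version above works equally well.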