
python - Sorting a large number of PDF files by the common part of their names using Python 3.5.1

Reposted. Author: 太空宇宙. Updated: 2023-11-04 10:13:17

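I need to sort a large number (about 20,000) of PDF files by the most common part of their names. Every file has much the same structure: XXX_1500004898_CommonPART.pdf (some files use "_" as the separator, others use "-").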

This is the code I am using:

files = []
for root, dirnames, files in os.walk(r'C:PATH/TO/FILES'):
    for file in fnmatch.filter(files, '*0000*.pdf'):
        print (file)
        files.append(os.path.join(root, file))
        time.sleep(2)
sorted_files = sorted(files, key=lambda x: str(x.split('-')[2]))

But when I run it, the only thing I get is a traceback:

Traceback (most recent call last):
  File "C:\PATH\Sorting.py", line 14, in <module>
    sorted_files = sorted(files, key=lambda x: str(x.split('-')[2]))
  File "C:\PATH\Sorting.py", line 14, in <lambda>
    sorted_files = sorted(files, key=lambda x: str(x.split('-')[2]))
IndexError: list index out of range

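I'm new to Python, so I may come across as inexperienced, and I still don't know how to tell Python to create folders from these common parts and move the files into them.

Can you help me with this?

Thank you very much!

Updated code: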

files_result = []
for root, dirnames, files in os.walk(r'C:\PATH\TESTT'):
    for file in fnmatch.filter(files, '*0000*.pdf'):
        print (file)
        files_result.append(os.path.join(root, file))
        time.sleep(2)
sorted_files = sorted(file.replace("_", "-").split("-")[2] for file in files_result if (file.count("-")+file.count("_") == 2))
print (sorted_files)

This is the result:

['ALOISE emma.pdf', 'ALOISEEMMA.pdf', 'ARETEIA.pdf', 'ASSEL.pdf', 'AVV.BELLOMI.pdf', 'BRACI E ABBRACCI.pdf', 'CERRATA D..pdf', 'CERRATA REFRIGERAZIONE.pdf', etc.....]

Here are some typical file names:

ANI-150000000106SD_approvato.pdf
ANI-1500000006-CENTROCHIRURGIAAMBULATORIALEsrl_approvato.pdf
ANI-1500000007-EUROMED ECOLOGICA_APPROVATO.pdf
ANI-1500000008-TELECOM_APPROVATO.pdf
ANI-1500000009-TELECOM_APPROVATO.pdf
ANI-15000000100-ALOISE EMMA_approvato.pdf
ANI-15000000101-centro.chirurgia.ambulatoriale_approvato.pdf
ANI-15000000102-TELECOM_APPROVATO.pdf
ANI-15000000103-MCLINK_APPROVATO.pdf
ani-15000000104-idrafer.pdf
ANI-15000000105EUROMEDECOLOGICA_approvata.pdf
ANI-15000000107LAGSERVICE.pdf
ANI-15000000109TCHR_approvato.pdf
ANI-1500000011-COOPSERVICEn9117011288 approvate (2).pdf
ANI-1500000011-COOPSERVICEn°9117011288.pdf
ANI-15000000110-TELECOM_APPROVATO.pdf
ANI-15000000113-SECURLAB_approvato.pdf
ANI-15000000114-SECURLAB_approvato.pdf
ANI-15000000115-COOPSERVICE_approvato.pdf
ANI-15000000116-COOPSERVICE_approvato.pdf
ANI-15000000117-REPOWER_approvato.pdf
ANI-15000000118-CECCHINIlaura_approvato.pdf
ANI-15000000119-DESENA_approvato.pdf
ANI-1500000012-TCHRSERCICES.R.L._approvato (1).pdf
ANI-15000000121-ALOISE_approvato.pdf
ANI-15000000122-LAGSERVICE.pdf
ANI-15000000123-SECURLAB_approvata.pdf
ANI-15000000125-QUERZOLA_approvato.pdf
ANI-15000000129-TC HR_apprpvato.pdf
ANI-1500000013-TAV_approvato.pdf
ANI-15000000130-LAGSERVICE.pdf
ANI-15000000131EUROMEDecologica_approvato.pdf
ANI-15000000132-LAV.pdf
ANI-15000000133-REPOWER.pdf
ANI-15000000134-MCLINK.pdf
ANI-15000000135-COOPSERVICE_approvato.pdf
ANI-15000000136-COOPSERVICE_approvato.pdf
ANI-15000000138-TCHR._approvatopdf.pdf
ANI-15000000139-ALOISEEMMA.pdf
ANI-1500000014-OFFICEDEPOT_approvato.pdf
ANI-15000000140_TELECOM.pdf
ANI-15000000141-CHIRURGIAAMBULATORIALE_approvato.pdf
ANI-15000000142-LAG.pdf
ANI-15000000143-LAG.pdf
ANI-15000000145-TELECOM.pdf
ANI-15000000146-LAG.pdf
ANI-15000000147-WERFEN.pdf
ani-15000000148-enigas.pdf
ANI-15000000153TCHR_approvato.pdf
ANI-15000000154-ASSEL.pdf
ANI-15000000155-DIGIUSEPPEgiancarlo.pdf
ANI-15000000156-SD.pdf
ANI-15000000157-SAS.pdf
ani-15000000158-energeticSOURCE.pdf
ANI-15000000159-chirurgia ambulatoriale.pdf
ANI-1500000016-THEMIX_approvato.pdf
ANI-15000000160-CERRATA REFRIGERAZIONE.pdf
ANI-15000000162-ALOISE emma.pdf
ANI-1500000017-ASSEL_approvato.pdf
ANI-1500000018-QUERZOLA_approvato.pdf
ANI-1500000019-BDO_approvato.pdf
ANI-1500000020-THEMIXfatt_ approvato.134.pdf
ANI-1500000021-SECURLAB_approvato.pdf
ANI-1500000022-LYRECO+DDT_approvato.pdf
ANI-1500000023-COOPSERVICE approvato (1).pdf
ANI-1500000024-REPOWER135812_approvato.pdf
ANI-1500000025-DR.BRANDIMARTE-fatt.35_approvato (1).pdf
ANI-1500000026-D.SSA AMBRUZZI_approvato.pdf
ANI-1500000027-COOPSERVICE9117034433 approvato (1).pdf
ANI-1500000031-TAVf.314_approvato.pdf
ANI-1500000032-d.ALOISEmaggio2015_approvato.pdf
ANI-1500000033-CENTROchirurgiaAMBULATORIALEf201500306_approvato.pdf
ANI-1500000034-WINDf.7407817176_approvato.pdf
ANI-1500000035-avv.BELLOMI.pdf
ANI-1500000038-TOPCARf._approvato.pdf
ANI-1500000039-TCHRf.000544_approvato.pdf
ANI-1500000040-THEMIX_approvato.pdf
ANI-1500000041-DESENA_approvato.pdf
ANI-1500000042-TCHRSERVICESf.000565_approvato.pdf
ANI-1500000043-QUERZOLAf.109_approvato.pdf
ANI-1500000047-TELEPASS.pdf
ANI-1500000049-WIND_approvato.pdf
ANI-1500000051-MCLINKf.109493_approvato.pdf
ANI-1500000052-MCLINKf.88508_approvato.pdf
ANI-1500000053-OFFICEDEPOT_approvato.pdf
ANI-1500000054-COOPSERVICEapprovatof 9117037004.pdf
ANI-1500000055-COOPSERVICEf 9117039325approvato.pdf
ANI-1500000056-SD_approvato.pdf
ANI-1500000057-REPOWER_approvato.pdf
ANI-1500000058-MCLINK_approvato.pdf
ANI-1500000059-LAG.pdf
ANI-1500000059WERFEN_approvato.pdf
ANI-1500000060WERFEN_approvato.pdf
ANI-1500000063-CENTROCHIRURGIAAMBULATORIALE_approvato.pdf
ANI-1500000064-dott.ALOISEemma_approvato.pdf
ANI-1500000066-MERCURI_approvato.pdf
ANI-1500000067-QUERZOLA_approvato.pdf
ANI-1500000070-TIM_approvato.pdf
ANI-1500000071LIFEBRAIN.pdf
ANI-1500000072-TC HR_approvato.pdf
ANI-1500000073-LAVAGGIO E GOMMISTA_approvato.pdf
ANI-1500000075-THEMIX_approvato.pdf
ANI-1500000076-EUROMEDecologica_approvato.pdf
ANI-1500000077-REPOWER_approvato.pdf
ANI-1500000078-SAS_approvato.pdf
ANI-1500000079-LAGSERVICE.pdf
ANI-1500000080-COOPSERVICE appr.pdf
ANI-1500000081-COOPSERVICE appr.pdf
ANI-1500000083-TAV_approvato.pdf
ANI-1500000084-aloise emma_approvato.pdf
ANI-1500000085-centro.chirurgia.ambulatoriale_approvato.pdf
ANI-1500000088-lagSERVICE.pdf
ANI-1500000089-FARMACIACAMERUCCI.pdf
ANI-1500000091-LAGservice.pdf
ANI-1500000092-ASSEL_approvata.pdf
ANI-1500000093-COOPSERVICE_approvato.pdf
ANI-1500000095-TCHR_approvato.pdf
ANI-1500000097-SAS (2)_approvato.pdf
ANI-1500000099-REPOWER_approvato.pdf
ARE-1500000001SAS_approvato.pdf
ARE-1500000002ACEA_approvato.pdf
ARE-1500000004VERGARI_approvato.pdf
ARE-1500000005PINTO_approvato.pdf
ARE-1500000006COSMOPOL_approvato.pdf
ARE-1500000007LAGSERVICE.pdf
ARE-1500000009 OFFICE DEPOT_ARETEIA.pdf
ARE-1500000010 SERVIZI ABITAZIONE_aqpprovato.pdf
ARE-1500000011 TELECOM_approvato.pdf
ARE-1500000012 TELECOM_approvato.pdf
ARE-1500000013 THEMIX_approvato.pdf
ARE-1500000014 QUERZOLA_approvato.pdf
ARE-1500000015 DA.CA. ESTINTORI_approvato.pdf
ARE-1500000016 COOPSERVICE approvato.pdf
ARE-1500000017-SAS.pdf
ARE-1500000017-SAS_approvato.pdf
ARE-1500000018-DR.BRANDIMARTE_approvato.pdf
ARE-1500000019-COOPSERVICE approvato.pdf
ARE-1500000020-BRACI E ABBRACCI.pdf
ARE-1500000021-COSMOPOL_approvato.pdf
ARE-1500000023-SAS_approvato.pdf
ARE-1500000024-MESCHINI_approvato.pdf
ARE-1500000025-VERGARI_approvato.pdf
ARE-1500000026-AVV.BELLOMI.pdf
ARE-1500000027-PINTO_approvato.pdf
ARE-1500000032-DA.CA_approvato.pdf
ARE-1500000033-SERVIZI ABITAZIONE_approvato.pdf
ARE-1500000034-QUERZOLA_approvato.pdf
ARE-1500000035-CERRATA D_approvato..pdf
ARE-1500000036-SECURLAB_approvata.pdf
ARE-1500000037-COSMOPOL_approvato.pdf
ARE-1500000038-OFFICE DEPOT_approvato.pdf
ARE-1500000039-MONIGEST_approvato.pdf
ARE-1500000040-MONIGEST_approvato.pdf
ARE-1500000041-COOPSERVICE approvato.pdf
ARE-1500000042-COOPSERVICE approvato.pdf
ARE-1500000043-SECURLAB_APPROVATO.pdf
ARE-1500000044-MESCHINI_APPROVATO.pdf
ARE-1500000045-ACEA_approvato.pdf
ARE-1500000047-PINTO_approvato.pdf
ARE-1500000050-VERGARI_approvato.pdf
ARE-1500000052-QUERZOLA_approvato.pdf
ARE-1500000053-CONTI ROSELLA_approvato.pdf.pdf
ARE-1500000057-DE SENA_approvato.pdf
ARE-1500000058-SERVIZI ABITAZIONE_approvato.pdf
ARE-1500000059-SECURLAB_approvato.pdf
ARE_1500000048_TELECOM_approvato.pdf
ARE_1500000049_TELECOM_approvato.pdf
ARE_1500000144_CERRATA D..pdf
BIO_1500000048_GIROLAMO LUCIANA_APPROVATO.pdf
BIO_1500000049_SPORTELLI MARIO_APPROVATO20150505_10081133.pdf
BIO_1500000050_LEGROTTAGLIE BENEDETTO_APPROVATO.pdf
BIO_1500000051_ANTIFORTUNISTICA MERIDIONALE_APPROVATO.pdf
BIO_1500000052_SAIL_APPROVATO.pdf
BIO_1500000053_SAIL_APPROVATO.pdf
BIO_1500000056_PRONTO UFFICIO_APPROVATO.pdf
BIO_1500000057_H3G SPA_APPROVATO.pdf
BIO_1500000060_RITELLA BENEDETTA_APPROVATO.pdf
BIO_1500000061_POSTA 7_APPROVATO.pdf
BIO_1500000062_POSTASETTESAS_APPROVATO.pdf
BIO_1500000063_PIGNATELLI_APPROVATO.pdf
BIO_1500000064_DIALINE SRL_APPROVATO.pdf
BIO_1500000065_L2 SRL SRL_APPROVATO.pdf
BIO_1500000066_FARMACIA TREROTOLI_APPROVATO.pdf
BIO_1500000067_FARMACIA TREROTOLI_APPROVATO.pdf
BIO_1500000068_BIOGROUP_APPROVATO.pdf
BIO_1500000069_VITO RINALDI_APPROVATO.pdf
BIO_1500000070_EUROCOMPUTERS_APPROVATO.pdf
BIO_1500000071_SERVIZI DIAGNOSTICI_APPROVATO.pdf
BIO_1500000072_SERVIZI DIAGNOSTICI_APPROVATO.pdf
BIO_1500000073_SERVIZI DIAGNOSTICI_APPROVATO.pdf

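Best answer

You are using the same name (files) for both the result list and the os.walk loop variable. Here is the code with corrected variable names: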

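import os
import fnmatch

files_result = []
for root, dirnames, files in os.walk(r'C:\PATH\TESTT'):
    for f in fnmatch.filter(files, '*0000*.pdf'):
        print(f)
        files_result.append(os.path.join(root, f))

# Sort the collected paths (files_result), not the os.walk loop variable (files).
# The key looks only at the file name, so separators in the directory path are ignored.
#sorted_files = sorted(files_result, key=lambda x: os.path.basename(x).split('-')[1])
sorted_files = sorted(files_result, key=lambda x: os.path.basename(x).replace("_", "-").split('-')[1])  # as Byte Commander suggested
print(sorted_files)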
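To see where the IndexError in the question comes from, it may help to count how many pieces a few of the listed file names produce when split. This is only a small sketch; `split_name` is a hypothetical helper introduced for illustration and is not part of the original code.

```python
def split_name(name, normalize=True):
    """Split a file name on '-', optionally turning '_' into '-' first."""
    if normalize:
        name = name.replace("_", "-")
    return name.split("-")


# Three names taken from the list in the question.
samples = [
    "ANI-1500000008-TELECOM_APPROVATO.pdf",  # two hyphens
    "ANI-15000000107LAGSERVICE.pdf",         # one hyphen only
    "BIO_1500000052_SAIL_APPROVATO.pdf",     # underscores only
]

for name in samples:
    raw = split_name(name, normalize=False)
    normalized = split_name(name)
    # Number of pieces without and with the underscore replacement.
    print(name, len(raw), len(normalized))
```

The second name gives only two pieces either way, so index 2 does not exist for it, and the third gives a single piece unless the underscores are replaced first. Index 1 exists for all three names once the underscores are normalized.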

The underscore replacement is Byte Commander's suggestion.
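The question also asks how to create folders from the common parts and move the files into them, which the code above does not cover. One possible sketch follows, under several assumptions: the regular expression `NAME_RE`, the helper names, the 'UNSORTED' fallback folder, and the rule "take the text after the number, up to the first underscore, upper-cased" are all made up for illustration. Spelling variants in the sample list (for example "ALOISE emma" and "ALOISEEMMA") would still land in separate folders.

```python
import os
import re
import shutil
from collections import defaultdict

# Assumed name layout: letters, optional separator, digits, optional separator, rest.
NAME_RE = re.compile(r'^[A-Za-z]+[-_ ]*\d+[-_ ]*(.*)$')


def common_part(path):
    """Return a guessed folder label for a file, or None if the name does not fit."""
    stem = os.path.splitext(os.path.basename(path))[0]
    match = NAME_RE.match(stem)
    if match is None:
        return None
    # Keep the text up to the first underscore, e.g. 'TELECOM' from 'TELECOM_APPROVATO'.
    label = match.group(1).split('_')[0].strip(' .').upper()
    return label or None


def group_by_common_part(paths):
    """Map each label to the list of paths that share it."""
    groups = defaultdict(list)
    for path in paths:
        groups[common_part(path) or 'UNSORTED'].append(path)
    return groups


def move_into_folders(paths, target_root):
    """Create one folder per label under target_root and move the files there."""
    for label, members in group_by_common_part(paths).items():
        folder = os.path.join(target_root, label)
        os.makedirs(folder, exist_ok=True)
        for path in members:
            shutil.move(path, os.path.join(folder, os.path.basename(path)))
```

With the list built above, a call could look like `move_into_folders(files_result, target)`, where `target` is a destination directory of your choosing. It is probably worth printing the result of `group_by_common_part(files_result)` first and checking the labels before moving 20,000 files.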

Regarding "python - Sorting a large number of PDF files by the common part of their names using Python 3.5.1", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36999010/
