I am trying to parse an XML file into several different files.
Sample XML:
<integration-outbound:IntegrationEntity
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<integrationEntityHeader>
<integrationTrackingNumber>281#963-4c1d-9d26-877ba40a4b4b#1583507840354</integrationTrackingNumber>
<referenceCodeForEntity>25428</referenceCodeForEntity>
<attachments>
<attachment>
<id>d6esd1d518b06019e01</id>
<name>durance.pdf</name>
<size>0</size>
</attachment>
<attachment>
<id>182e60164ddd4236b5bd96109</id>
<name>ssds</name>
<size>0</size>
</attachment>
</attachments>
<source>SIM</source>
<entity>SUPPLIER</entity>
<action>CREATE</action>
<timestampUTC>20200306T151721</timestampUTC>
<zDocBaseVersion>2.0</zDocBaseVersion>
<zDocCustomVersion>0</zDocCustomVersion>
</integrationEntityHeader>
<integrationEntityDetails>
<supplier>
<requestId>2614352</requestId>
<controlBlock>
<dataProcessingInfo>
<key>MODE</key>
<value>Onboarding</value>
</dataProcessingInfo>
<dataProcessingInfo>
<key>Supplier_Type</key>
<value>Operational</value>
</dataProcessingInfo>
</controlBlock>
<id>1647059</id>
<facilityCode>0001</facilityCode>
<systemCode>1</systemCode>
<supplierType>Operational</supplierType>
<systemFacilityDetails>
<systemFacilityDetail>
<facilityCode>0001</facilityCode>
<systemCode>1</systemCode>
<FacilityStatus>ACTIVE</FacilityStatus>
</systemFacilityDetail>
</systemFacilityDetails>
<status>ACTIVE</status>
<companyDetails>
<displayGSID>254232128</displayGSID>
<legalCompanyName>asdasdsads</legalCompanyName>
<dunsNumber>03-175-2493</dunsNumber>
<legalStructure>1</legalStructure>
<website>www.aaadistributor.com</website>
<noEmp>25</noEmp>
<companyIndicator1099>No</companyIndicator1099>
<taxidAndWxformRequired>NO</taxidAndWxformRequired>
<taxidFormat>Fed. Tax</taxidFormat>
<wxForm>182e601649ade4c38cd4236b5bd96109</wxForm>
<taxid>27-2204474</taxid>
<companyTypeFix>SUPPLIER</companyTypeFix>
<fields>
<field>
<id>LOW_CUURENT_SERV</id>
<value>1</value>
</field>
<field>
<id>LOW_COI</id>
<value>USA</value>
</field>
<field>
<id>LOW_STATE_INCO</id>
<value>US-PA</value>
</field>
<field>
<id>CERT_INSURANCE</id>
<value>d6e6e460fe8958564c1d518b06019e01</value>
</field>
<field>
<id>COMP_DBA</id>
<value>asdadas</value>
</field>
<field>
<id>LOW_AREUDIVE</id>
<value>N</value>
</field>
<field>
<id>LOW_BU_SIZE1</id>
<value>SMLBUS</value>
</field>
<field>
<id>EDI_CAP</id>
<value>Y</value>
</field>
<field>
<id>EDI_WEB</id>
<value>N</value>
</field>
<field>
<id>EDI_TRAD</id>
<value>N</value>
</field>
</fields>
</companyDetails>
<allLocations>
<location>
<addressInternalid>1704342</addressInternalid>
<isDelete>false</isDelete>
<internalSupplierid>1647059</internalSupplierid>
<acctGrpid>HQ</acctGrpid>
<address1>2501 GRANT AVE</address1>
<country>USA</country>
<state>US-PA</state>
<city>PHILADELPHIA</city>
<zip>19114</zip>
<phone>(215) 745-7900</phone>
</location>
</allLocations>
<contactDetails>
<contactDetail>
<contactInternalid>12232</contactInternalid>
<isDelete>false</isDelete>
<addressInternalid>1704312142</addressInternalid>
<contactType>Main</contactType>
<firstName>Raf</firstName>
<lastName>jas</lastName>
<title>Admin</title>
<email>abcd@gmail.com</email>
<phoneNo>123-42-23-23</phoneNo>
<createPortalLogin>yes</createPortalLogin>
<allowedPortalSideProducts>SIM,iSource,iContract</allowedPortalSideProducts>
</contactDetail>
<contactDetail>
<contactInternalid>1944938</contactInternalid>
<isDelete>false</isDelete>
<addressInternalid>1704342</addressInternalid>
<contactType>Rad</contactType>
<firstName>AVs</firstName>
<lastName>asd</lastName>
<title>Founder</title>
<email>as@sds.com</email>
<phoneNo>21521-2112-7900</phoneNo>
<createPortalLogin>yes</createPortalLogin>
<allowedPortalSideProducts>SIM,iContract,iSource</allowedPortalSideProducts>
</contactDetail>
</contactDetails>
<myLocation>
<addresses>
<myLocationsInternalid>1704342</myLocationsInternalid>
<isDelete>false</isDelete>
<addressInternalid>1704342</addressInternalid>
<usedAt>N</usedAt>
</addresses>
</myLocation>
<bankDetails>
<fields>
<field>
<id>LOW_BANK_KEY</id>
<value>123213</value>
</field>
<field>
<id>LOW_EFT</id>
<value>123123</value>
</field>
</fields>
</bankDetails>
<forms>
<form>
<id>CATEGORY_PRODSER</id>
<records>
<record>
<Internalid>24348</Internalid>
<isDelete>false</isDelete>
<fields>
<field>
<id>CATEGOR_LEVEL_1</id>
<value>MR</value>
</field>
<field>
<id>LOW_PRODSERV</id>
<value>RES</value>
</field>
<field>
<id>LOW_LEVEL_2</id>
<value>keylevel221</value>
</field>
<field>
<id>LOW_LEVEL_3</id>
<value>keylevel3127</value>
</field>
<field>
<id>LOW_LEVEL_4</id>
<value>keylevel4434</value>
</field>
<field>
<id>LOW_LEVEL_5</id>
<value>keylevel5545</value>
</field>
</fields>
</record>
<record>
<Internalid>24349</Internalid>
<isDelete>false</isDelete>
<fields>
<field>
<id>CATEGOR_LEVEL_1</id>
<value>MR</value>
</field>
<field>
<id>LOW_PRODSERV</id>
<value>RES</value>
</field>
<field>
<id>LOW_LEVEL_2</id>
<value>keylevel221</value>
</field>
<field>
<id>LOW_LEVEL_3</id>
<value>keylevel3125</value>
</field>
<field>
<id>LOW_LEVEL_4</id>
<value>keylevel4268</value>
</field>
<field>
<id>LOW_LEVEL_5</id>
<value>keylevel5418</value>
</field>
</fields>
</record>
<record>
<Internalid>24350</Internalid>
<isDelete>false</isDelete>
<fields>
<field>
<id>CATEGOR_LEVEL_1</id>
<value>MR</value>
</field>
<field>
<id>LOW_PRODSERV</id>
<value>RES</value>
</field>
<field>
<id>LOW_LEVEL_2</id>
<value>keylevel221</value>
</field>
<field>
<id>LOW_LEVEL_3</id>
<value>keylevel3122</value>
</field>
<field>
<id>LOW_LEVEL_4</id>
<value>keylevel425</value>
</field>
<field>
<id>LOW_LEVEL_5</id>
<value>keylevel5221</value>
</field>
</fields>
</record>
</records>
</form>
<form>
<id>OTHER_INFOR</id>
<records>
<record>
<isDelete>false</isDelete>
<fields>
<field>
<id>S_EAST</id>
<value>N</value>
</field>
<field>
<id>W_EST</id>
<value>N</value>
</field>
<field>
<id>M_WEST</id>
<value>N</value>
</field>
<field>
<id>N_EAST</id>
<value>N</value>
</field>
<field>
<id>LOW_AREYOU_ASSET</id>
<value>-1</value>
</field>
<field>
<id>LOW_SWART_PROG</id>
<value>-1</value>
</field>
</fields>
</record>
</records>
</form>
<form>
<id>ABDCEDF</id>
<records>
<record>
<isDelete>false</isDelete>
<fields>
<field>
<id>LOW_COD_CONDUCT</id>
<value>-1</value>
</field>
</fields>
</record>
</records>
</form>
<form>
<id>CODDUC</id>
<records>
<record>
<isDelete>false</isDelete>
<fields>
<field>
<id>LOW_SUPPLIER_TYPE</id>
<value>2</value>
</field>
<field>
<id>LOW_DO_INT_BOTH</id>
<value>1</value>
</field>
</fields>
</record>
</records>
</form>
</forms>
</supplier>
</integrationEntityDetails>
</integration-outbound:IntegrationEntity>
The goal is a generic XML-to-CSV conversion. Driven by the mapping file below, the XML should be flattened, split into multiple CSV files, and stored.
XPATH,ColumName,CSV_File_Name,ParentKey
/integration-outbound:IntegrationEntity/integrationEntityHeader/integrationTrackingNumber,integrationTrackingNumber,integrationEntityHeader.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/referenceCodeForEntity,referenceCodeForEntity,integrationEntityHeader.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/attachments/attachment[]/id,id,integrationEntityHeader.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/attachments/attachment[]/name,name,integrationEntityHeader.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/attachments/attachment[]/size,size,integrationEntityHeader.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/source,source,integrationEntityHeader.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/entity,entity,integrationEntityHeader.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/action,action,integrationEntityHeader.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/timestampUTC,timestampUTC,integrationEntityHeader.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/zDocBaseVersion,zDocBaseVersion,integrationEntityHeader.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/zDocCustomVersion,zDocCustomVersion,integrationEntityHeader.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/integrationTrackingNumber,integrationTrackingNumber,integrationEntityDetailsControlBlock.csv,Y
/integration-outbound:IntegrationEntity/integrationEntityHeader/referenceCodeForEntity,referenceCodeForEntity,integrationEntityDetailsControlBlock.csv,Y
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/requestId,requestId,integrationEntityDetailsControlBlock.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/controlBlock/dataProcessingInfo[]/key,key,integrationEntityDetailsControlBlock.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/controlBlock/dataProcessingInfo[]/value,value,integrationEntityDetailsControlBlock.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/id,supplier_id,integrationEntityDetailsControlBlock.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/id,id,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/records/record[]/Internalid,Internalid,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/records/record[]/isDelete,FormId,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/records/record[]/fields/field[]/id,SupplierFormRecordFieldId,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/records/record[]/fields/field[]/value,SupplierFormRecordFieldValue,integrationEntityDetailsForms.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/integrationTrackingNumber,integrationTrackingNumber,integrationEntityDetailsForms.csv,Y
/integration-outbound:IntegrationEntity/integrationEntityHeader/referenceCodeForEntity,referenceCodeForEntity,integrationEntityDetailsForms.csv,Y
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/requestId,requestId,integrationEntityDetailsForms.csv,Y
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/id,supplier_id,integrationEntityDetailsForms.csv,Y
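As a rough sketch of how such a mapping could drive the conversion, the rows can be grouped by target CSV file before any XML is touched. The function name `load_mapping` and the keys `is_parent_key` and `repeats` are hypothetical names chosen for illustration, and only a few of the mapping rows above are repeated here to keep the example short.

```python
import csv
import io
from collections import defaultdict

# A shortened copy of the mapping file above (header spelled as in the source).
MAPPING = """XPATH,ColumName,CSV_File_Name,ParentKey
/integration-outbound:IntegrationEntity/integrationEntityHeader/source,source,integrationEntityHeader.csv,
/integration-outbound:IntegrationEntity/integrationEntityHeader/attachments/attachment[]/id,id,integrationEntityHeader.csv,
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/id,supplier_id,integrationEntityDetailsForms.csv,Y
/integration-outbound:IntegrationEntity/integrationEntityDetails/supplier/forms/form[]/id,id,integrationEntityDetailsForms.csv,
"""

def load_mapping(text):
    """Group mapping rows by target CSV file, preserving row order."""
    plan = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        plan[row["CSV_File_Name"]].append({
            "xpath": row["XPATH"],
            "column": row["ColumName"],
            # 'Y' marks a parent value that is copied into every child row.
            "is_parent_key": (row["ParentKey"] or "").strip().upper() == "Y",
            # '[]' in the path marks an element that may repeat.
            "repeats": "[]" in row["XPATH"],
        })
    return dict(plan)

plan = load_mapping(MAPPING)
for file_name, columns in plan.items():
    print(file_name, [c["column"] for c in columns])
```

Each entry in `plan` then describes one output file: the repeating paths determine how many rows it gets, and the parent keys are the values repeated on every row.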
I need to create three CSV output files from this mapping.
Step 1 - convert the XML to JSON:
import json
import pandas as pd
import xmltodict

# Parse the XML into nested dicts; the with-block closes the file on exit
with open("/home/s0998hws/test.xml") as xml_file:
    data_dict = xmltodict.parse(xml_file.read())

# Serialize the parsed data to a JSON string
json_data = json.dumps(data_dict)

# Write the JSON data to the output file
with open("data.json", "w") as json_file:
    json_file.write(json_data)

# Load it back as plain Python dicts and lists
with open("data.json") as f:
    d = json.load(f)
Step 2 - normalize using the pandas json_normalize function:
df_1=pd.json_normalize(data=d['integration-outbound:IntegrationEntity'])
df_2=df_1[['integrationEntityHeader.integrationTrackingNumber','integrationEntityDetails.supplier.requestId','integrationEntityHeader.referenceCodeForEntity','integrationEntityDetails.supplier.id','integrationEntityDetails.supplier.forms.form']]
df_3=df_2.explode('integrationEntityDetails.supplier.forms.form')
df_3['integrationEntityDetails.supplier.forms.form.id']=df_3['integrationEntityDetails.supplier.forms.form'].apply(lambda x: x.get('id'))
df_3['integrationEntityDetails.supplier.forms.form.records']=df_3['integrationEntityDetails.supplier.forms.form'].apply(lambda x: x.get('records'))
I tried to use the metadata in the CSV file to drive these steps, but the challenge is that
df_3['integrationEntityDetails.supplier.forms.form.records.record.Internalid']=df_3['integrationEntityDetails.supplier.forms.form.records.record'].apply(lambda x: x.get('Internalid'))
fails with this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib64/python3.6/site-packages/pandas/core/series.py", line 3848, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas/_libs/lib.pyx", line 2327, in pandas._libs.lib.map_infer
File "<stdin>", line 1, in <lambda>
AttributeError: 'list' object has no attribute 'get'
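The reason is that the values in the pandas DataFrame are sometimes a list and sometimes a single dict (xmltodict produces a list only when an element repeats), so they cannot be inspected uniformly with the approach above.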
integrationEntityHeader.integrationTrackingNumber integrationEntityDetails.supplier.requestId integrationEntityHeader.referenceCodeForEntity integrationEntityDetails.supplier.id integrationEntityDetails.supplier.forms.form integrationEntityDetails.supplier.forms.form.id integrationEntityDetails.supplier.forms.form.records
0 281#999eb16e-242c-4239-b33e-ae6f5296fb15#10c7338c-ab63-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 {'id': 'CATEGORY_PRODSER', 'records': {'record': [{'Internalid': '24348', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3127'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel4434'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5545'}]}}, {'Internalid': '24349', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3125'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel4268'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5418'}]}}, {'Internalid': '24350', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3122'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel425'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5221'}]}}]}} CATEGORY_PRODSER {'record': [{'Internalid': '24348', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3127'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel4434'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5545'}]}}, {'Internalid': '24349', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3125'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel4268'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5418'}]}}, {'Internalid': '24350', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 
'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3122'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel425'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5221'}]}}]}
0 281#999eb16e-242c-4239-b33e-ae6f5296fb15#10c7338c-ab63-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 {'id': 'OTHER_INFOR', 'records': {'record': {'isDelete': 'false', 'fields': {'field': [{'id': 'S_EAST', 'value': 'N'}, {'id': 'W_EST', 'value': 'N'}, {'id': 'M_WEST', 'value': 'N'}, {'id': 'N_EAST', 'value': 'N'}, {'id': 'LOW_AREYOU_ASSET', 'value': '-1'}, {'id': 'LOW_SWART_PROG', 'value': '-1'}]}}}} OTHER_INFOR {'record': {'isDelete': 'false', 'fields': {'field': [{'id': 'S_EAST', 'value': 'N'}, {'id': 'W_EST', 'value': 'N'}, {'id': 'M_WEST', 'value': 'N'}, {'id': 'N_EAST', 'value': 'N'}, {'id': 'LOW_AREYOU_ASSET', 'value': '-1'}, {'id': 'LOW_SWART_PROG', 'value': '-1'}]}}}
0 281#999eb16e-242c-4239-b33e-ae6f5296fb15#10c7338c-ab63-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 {'id': 'CORPORATESUSTAINABILITY', 'records': {'record': {'isDelete': 'false', 'fields': {'field': {'id': 'LOW_COD_CONDUCT', 'value': '-1'}}}}} CORPORATESUSTAINABILITY {'record': {'isDelete': 'false', 'fields': {'field': {'id': 'LOW_COD_CONDUCT', 'value': '-1'}}}}
0 281#999eb16e-242c-4239-b33e-ae6f5296fb15#10c7338c-ab63-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 {'id': 'PRODUCTSERVICES', 'records': {'record': {'isDelete': 'false', 'fields': {'field': [{'id': 'LOW_SUPPLIER_TYPE', 'value': '2'}, {'id': 'LOW_DO_INT_BOTH', 'value': '1'}]}}}} PRODUCTSERVICES {'record': {'isDelete': 'false', 'fields': {'field': [{'id': 'LOW_SUPPLIER_TYPE', 'value': '2'}, {'id': 'LOW_DO_INT_BOTH', 'value': '1'}]}}}
Expected output:
integrationTrackingNumber requestId referenceCodeForEntity supplier.id integrationEntityDetails.supplier.forms.form.id InternalId isDelete SupplierFormRecordFieldId SupplierFormRecordFieldValue
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CATEGORY_PRODSER 24348 FALSE CATEGOR_LEVEL_1 MR
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CATEGORY_PRODSER 24348 FALSE LOW_PRODSERV RES
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CATEGORY_PRODSER 24348 FALSE LOW_LEVEL_2 keylevel221
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CATEGORY_PRODSER 24348 FALSE LOW_LEVEL_3 keylevel3127
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CATEGORY_PRODSER 24348 FALSE LOW_LEVEL_4 keylevel4434
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CATEGORY_PRODSER 24348 FALSE LOW_LEVEL_5 keylevel5545
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CATEGORY_PRODSER 24350 FALSE CATEGOR_LEVEL_1 MR
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CATEGORY_PRODSER 24350 FALSE LOW_PRODSERV RES
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CATEGORY_PRODSER 24350 FALSE LOW_LEVEL_2 keylevel221
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CATEGORY_PRODSER 24350 FALSE LOW_LEVEL_3 keylevel3122
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CATEGORY_PRODSER 24350 FALSE LOW_LEVEL_4 keylevel425
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CATEGORY_PRODSER 24350 FALSE LOW_LEVEL_5 keylevel5221
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 OTHER_INFOR FALSE S_EAST N
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 OTHER_INFOR FALSE W_EST N
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 OTHER_INFOR FALSE M_WEST N
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 OTHER_INFOR FALSE N_EAST N
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 OTHER_INFOR FALSE LOW_AREYOU_ASSET -1
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CORPORATESUSTAINABILITY FALSE LOW_SWART_PROG -1
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 CORPORATESUSTAINABILITY FALSE LOW_COD_CONDUCT -1
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 PRODUCTSERVICES FALSE LOW_SUPPLIER_TYPE 2
281#963-4c1d-9d26-877ba40a4b4b#1583507840354 2614352 25428 1647059 PRODUCTSERVICES FALSE LOW_DO_INT_BOTH 1
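The `AttributeError` described above comes from `record` and `field` being a list in some forms and a single dict in others. One possible way around it, sketched here without pandas, is to wrap every such value in a list before iterating. The helper names `as_list` and `flatten_forms` are hypothetical, and the `forms` literal is a hand-written stand-in for the xmltodict output rather than the full document.

```python
import csv
import io

def as_list(value):
    """Pass lists through, wrap a lone dict in a list, and map None to []."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]

def flatten_forms(forms, parent):
    """Yield one flat row per form/record/field, copying parent keys into each row."""
    for form in as_list(forms):
        records = (form.get("records") or {}).get("record")
        for record in as_list(records):
            fields = (record.get("fields") or {}).get("field")
            for field in as_list(fields):
                yield {
                    **parent,
                    "form_id": form.get("id"),
                    "Internalid": record.get("Internalid", ""),
                    "isDelete": record.get("isDelete"),
                    "SupplierFormRecordFieldId": field.get("id"),
                    "SupplierFormRecordFieldValue": field.get("value"),
                }

# Stand-in data: 'record' and 'field' are lists in the first form
# and single dicts in the second, as in the DataFrame dump above.
forms = [
    {"id": "CATEGORY_PRODSER", "records": {"record": [
        {"Internalid": "24348", "isDelete": "false", "fields": {"field": [
            {"id": "CATEGOR_LEVEL_1", "value": "MR"},
            {"id": "LOW_PRODSERV", "value": "RES"}]}},
        {"Internalid": "24349", "isDelete": "false", "fields": {"field": [
            {"id": "CATEGOR_LEVEL_1", "value": "MR"}]}},
    ]}},
    {"id": "ABDCEDF", "records": {"record": {"isDelete": "false", "fields": {
        "field": {"id": "LOW_COD_CONDUCT", "value": "-1"}}}}},
]
parent = {"requestId": "2614352", "supplier_id": "1647059"}
rows = list(flatten_forms(forms, parent))

# Write the rows as CSV text; a real run would open a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

xmltodict's `parse` also accepts a `force_list` argument (for example `force_list=("form", "record", "field")`), which should make those elements lists at parse time and remove the need for `as_list`.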
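Best answer
Consider XSLT, a special-purpose language designed to transform XML files, for example by flattening them in certain sections. Python's third-party module lxml can run XSLT 1.0 scripts and XPath 1.0 expressions.
Specifically, XSLT can handle your XPath extractions. Then, from the single transformed result tree, build the three data frames you need. The posted XML never declares the integration-outbound prefix, so for well-formedness the following assumes this root and data structure: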
<integration-outbound:IntegrationEntity
xmlns:integration-outbound="http://example.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
...same content...
</integration-outbound:IntegrationEntity>
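A quick way to see why the namespace declaration matters is to parse both variants. The sketch below uses the standard library's `xml.etree.ElementTree` rather than lxml, and the helper name `parses` and the trimmed-down documents are illustrative only; `http://example.com` is the same placeholder URI assumed above.

```python
import xml.etree.ElementTree as ET

# Root as posted in the question: the 'integration-outbound' prefix is never declared.
BROKEN = (
    '<integration-outbound:IntegrationEntity '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<integrationEntityHeader/>'
    '</integration-outbound:IntegrationEntity>'
)

# Same root with the prefix bound to a placeholder namespace URI.
FIXED = BROKEN.replace(
    'xmlns:xsi=',
    'xmlns:integration-outbound="http://example.com" xmlns:xsi=',
    1,
)

def parses(xml_text):
    """Return True if the text is namespace-well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(parses(BROKEN))
print(parses(FIXED))
root_tag = ET.fromstring(FIXED).tag
print(root_tag)
```

Once the prefix is bound, the stylesheet must declare the same URI for `integration-outbound`, since XSLT matches on the namespace URI and not on the prefix text.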
XSLT (save as an .xsl file, which is itself a special kind of .xml file)
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:integration-outbound="http://example.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="integration-outbound:IntegrationEntity">
<data>
<xsl:apply-templates select="integrationEntityHeader/descendant::attachment"/>
<xsl:apply-templates select="integrationEntityDetails/descendant::dataProcessingInfo"/>
<xsl:apply-templates select="integrationEntityDetails/descendant::forms/descendant::field"/>
</data>
</xsl:template>
<xsl:template match="attachment">
<integrationEntityHeader>
<xsl:copy-of select="ancestor::integrationEntityHeader/*[name()!='attachments']"/>
<xsl:copy-of select="*"/>
</integrationEntityHeader>
</xsl:template>
<xsl:template match="dataProcessingInfo">
<integrationEntityDetailsControlBlock>
<xsl:copy-of select="ancestor::integration-outbound:IntegrationEntity/integrationEntityHeader/*[position() &lt;= 2]"/>
<requestId><xsl:value-of select="ancestor::supplier/requestId"/></requestId>
<supplier_id><xsl:value-of select="ancestor::supplier/id"/></supplier_id>
<xsl:copy-of select="*"/>
</integrationEntityDetailsControlBlock>
</xsl:template>
<xsl:template match="field">
<integrationEntityDetailsForms>
<form_id><xsl:value-of select="ancestor::form/id"/></form_id>
<xsl:copy-of select="ancestor::record/*[name()!='fields']"/>
<SupplierFormRecordFieldId><xsl:value-of select="id"/></SupplierFormRecordFieldId>
<SupplierFormRecordFieldValue><xsl:value-of select="value"/></SupplierFormRecordFieldValue>
<xsl:copy-of select="ancestor::integration-outbound:IntegrationEntity/integrationEntityHeader/*[position() &lt;= 2]"/>
<requestId><xsl:value-of select="ancestor::supplier/requestId"/></requestId>
<supplier_id><xsl:value-of select="ancestor::supplier/id"/></supplier_id>
</integrationEntityDetailsForms>
</xsl:template>
</xsl:stylesheet>
Python (run the transformation with lxml, then build the data frames)
import lxml.etree as et
import pandas as pd
# LOAD XML AND XSL
doc = et.parse('Input.xml')
style = et.parse('Script.xsl')
# INITIALIZE AND RUN TRANSFORMATION
transformer = et.XSLT(style)
flat_doc = transformer(doc)
# BUILD THREE DATA FRAMES
df_header = pd.DataFrame([{i.tag: i.text for i in el}
                          for el in flat_doc.xpath('integrationEntityHeader')])
df_detailsControlBlock = pd.DataFrame([{i.tag: i.text for i in el}
                                       for el in flat_doc.xpath('integrationEntityDetailsControlBlock')])
df_detailsForms = pd.DataFrame([{i.tag: i.text for i in el}
                                for el in flat_doc.xpath('integrationEntityDetailsForms')])
This post, "python - Create CSV from XML/JSON using Python Pandas", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/62766200/