
python - Pandas: sort columns and find differences

Reposted. Author: 太空宇宙. Updated: 2023-11-03 16:39:42

I have a DataFrame that I want to sort so that column a == column b. Anything without a match should go into a column C.

My data looks like this:

filenamesLocal    FilenamesServer
filea.csv         fileab.csv
filec.csv         filea.csv
fileab.csv        filec.csv
filexyz.csv
fileyh.csv

I want the rows sorted so that filenamesLocal = FilenamesServer, with the remaining names in another column:

filenamesLocal    FilenamesServer    Difference
filea.csv         filea.csv          filexyz.csv
filec.csv         filec.csv          fileyh.csv
fileab.csv        fileab.csv

My code so far:

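import pandas as pd

ldsdata = pd.read_csv('filelist.csv', sep=" ", header=None)
# data.to_csv("filelist.csv", index=False)
dataproj = pd.read_csv('edslist.txt', sep=" ", header=None)
dataproj.columns = ["fileNameEdsComputer"]
result = pd.concat([ldsdata, dataproj], axis=1, ignore_index=True)
# NOTE: `path` is not defined in this snippet (presumably a column name set elsewhere)
result.columns = ['fileNameLDS', path]
# DataFrame.sort was removed from pandas; sort_values is the current equivalent
result.sort_values(['fileNameLDS', path], ascending=[True, False], inplace=True)
result.to_csv('list.csv', index=False)
checkDifferences()  # helper not shown in this snippet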

Best answer

Setup

import pandas as pd
from io import StringIO  # Python 3 location of StringIO

text = """filenamesLocal FilenamesServer
filea.csv fileab.csv
filec.csv filea.csv
fileab.csv filec.csv
filexyz.csv
fileyh.csv"""

# Whitespace-separated columns; rows with a single name get NaN in FilenamesServer
df = pd.read_csv(StringIO(text), sep=r"\s+")

# Index each column by its own values so the two can be aligned on file name
fnl = df.iloc[:, [0]].set_index(['filenamesLocal'], drop=False).dropna()
fns = df.iloc[:, [1]].set_index(['FilenamesServer'], drop=False).dropna()

print(fnl)

               filenamesLocal
filenamesLocal
filea.csv           filea.csv
filec.csv           filec.csv
fileab.csv         fileab.csv
filexyz.csv       filexyz.csv
fileyh.csv         fileyh.csv

print(fns)

                FilenamesServer
FilenamesServer
fileab.csv           fileab.csv
filea.csv             filea.csv
filec.csv             filec.csv

Align fnl and fns

# sort=True orders the combined index, which recent pandas no longer does by default
aligned = pd.concat([fnl, fns], axis=1, sort=True)

print(aligned)

            filenamesLocal FilenamesServer
filea.csv        filea.csv       filea.csv
fileab.csv      fileab.csv      fileab.csv
filec.csv        filec.csv       filec.csv
filexyz.csv    filexyz.csv             NaN
fileyh.csv      fileyh.csv             NaN
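The alignment step relies on pd.concat matching rows by index label when axis=1 is used. A minimal sketch of that behaviour on its own, using made-up toy Series (the names left and right are illustrative, not from the original answer):

```python
import pandas as pd

# Hypothetical toy data, not from the original post.
left = pd.Series(["a", "b", "c"], index=["a", "b", "c"], name="left")
right = pd.Series(["c", "a"], index=["c", "a"], name="right")

# axis=1 places the Series side by side and matches rows by index label;
# a label missing from one side becomes NaN in that column.
side_by_side = pd.concat([left, right], axis=1, sort=True)
print(side_by_side)
```

Label "b" exists only in left, so its right cell is NaN, which is the same effect that marks filexyz.csv and fileyh.csv as unmatched.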

master = aligned.filenamesLocal.combine_first(aligned.FilenamesServer)

print(master)

filea.csv        filea.csv
fileab.csv      fileab.csv
filec.csv        filec.csv
filexyz.csv    filexyz.csv
fileyh.csv      fileyh.csv
Name: filenamesLocal, dtype: object
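combine_first keeps the calling Series' values and fills its missing entries from the other Series, so master ends up holding every file name seen on either side. A small sketch with hypothetical data (local and server here are toy Series, not the answer's variables):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for illustration only.
local = pd.Series(["x.csv", np.nan, "z.csv"], index=["x", "y", "z"])
server = pd.Series(["x.csv", "y.csv", np.nan], index=["x", "y", "z"])

# Values from `local` win; its NaN gaps are filled from `server`.
merged = local.combine_first(server)
print(merged)
```

In the data above only FilenamesServer has gaps, so the result equals filenamesLocal, but the call would also cover names that exist only on the server.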

Assign the differences

aligned['Difference'] = master[aligned.isnull().any(axis=1)]

print(aligned)

            filenamesLocal FilenamesServer   Difference
filea.csv        filea.csv       filea.csv          NaN
fileab.csv      fileab.csv      fileab.csv          NaN
filec.csv        filec.csv       filec.csv          NaN
filexyz.csv    filexyz.csv             NaN  filexyz.csv
fileyh.csv      fileyh.csv             NaN   fileyh.csv

Regarding python - Pandas: sort columns and find differences, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36922830/
