
python - Finding the unique characters when comparing two columns

Reposted. Author: 太空宇宙. Updated: 2023-11-04 09:27:01

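I want to compare column1 with column2 and get the unique values that make column2 differ from column1. In this case, the answers I should get are "Residence - Location", "-12", "NAN" and "NA" (empty).

Also, can the result be created and stored in another column?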

Result
index  column1         column2               diff
1.     Admission Date  Residence - Location  Residence - Location
2.     Malnutrition    Malnutrition-12       -12
3.     TB              NAN                   NAN
4.     Anaemia         NA                    NA

The code can be in R or Python; I don't mind.

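import numpy as np
import pandas as pd

def FindDifference(Row):
    x = Row['column1']
    y = Row['column2']

    Difference = ""
    # Treat a missing or placeholder value in column2 as missing
    if pd.isnull(y) or y == "nan" or y == "NA":
        return np.nan
    # Collect the characters of the longer string that never occur in the shorter one
    if len(x) <= len(y):
        for i in y:
            if i not in x:
                Difference += str(i)
    else:
        for i in x:
            if i not in y:
                Difference += str(i)
    return Difference

# Copy the slice so that adding a column does not modify a view of Final_df
ReadDataT = Final_df[['column1', 'column2']].copy()
ReadDataT['diff'] = ReadDataT.apply(FindDifference, axis=1)
ReadDataT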

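The problem with this code is that it compares the two strings character by character and returns only the characters that do not appear in both columns. For example, the first row gives 'RC-Lc' as the difference.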
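
To make the issue concrete, here is a minimal sketch of the same per-character membership test applied to two of the sample rows. The helper name `char_diff` is hypothetical and not part of the question's code. The test looks correct when column2 only appends characters that never occur in column1, and it breaks down as soon as the two strings share letters.

```python
def char_diff(x: str, y: str) -> str:
    """Return the characters of y that do not occur anywhere in x.

    Hypothetical helper that mirrors the inner loop of FindDifference.
    """
    return "".join(ch for ch in y if ch not in x)


# Row 2: '-', '1' and '2' never occur in column1, so the suffix survives intact.
print(char_diff("Malnutrition", "Malnutrition-12"))

# Row 1: most letters of column2 also occur somewhere in column1,
# so only a few scattered characters are left instead of the whole value.
print(char_diff("Admission Date", "Residence - Location"))
```

A membership test on single characters ignores position and order, which is why a substring-based comparison is a better fit here.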

Best answer

library(dplyr); library(stringr)
df %>% mutate(diff = str_remove(column2, column1))

  index column1        column2              diff
1     1 Admission Date Residence - Location Residence - Location
2     2 Malnutrition   Malnutrition-12      -12
3     3 TB             NAN                  NAN
4     4 Anaemia        <NA>                 <NA>

Edit: the same without dplyr:

df$diff = stringr::str_remove(df$column2, df$column1)
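
Since the question also allows Python, the same idea can be sketched with pandas. This is only an illustration under stated assumptions: the sample frame is rebuilt from the table in the question, the helper name `remove_first` is hypothetical, and `str.replace` matches column1 literally, whereas `str_remove` interprets its pattern as a regular expression.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed values).
df = pd.DataFrame({
    "column1": ["Admission Date", "Malnutrition", "TB", "Anaemia"],
    "column2": ["Residence - Location", "Malnutrition-12", "NAN", None],
})


def remove_first(row):
    """Remove the first literal occurrence of column1 from column2."""
    x, y = row["column1"], row["column2"]
    if pd.isna(x) or pd.isna(y):
        return None  # keep missing values missing
    return y.replace(x, "", 1)


# Store the result in a new column, as the question asks.
df["diff"] = df.apply(remove_first, axis=1)
print(df)
```

When column1 does not occur in column2 at all, column2 is returned unchanged, which matches rows 1 and 3 of the expected result.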

Regarding "python - Finding the unique characters when comparing two columns", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57163512/
