
python-3.x - pandas df: add column if it doesn't exist, add values to the new column from a dictionary

Reposted. Author: 行者123. Updated: 2023-12-04 01:54:06

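I am new to pandas, but I am trying to build a large dataframe in which I organize information about a large number of sequences by sequence ID (Seq_ID), and then add further information about each sequence to the dataframe. Currently the df looks like this: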

      Seq_ID    mol_type
0     4_cDNA_v  RNA
1     2_133+_v  RNA
2     5_BM4D_g  RNA
.     .
.     .
1301  4_PB_g    RNA

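I want to write a function that looks at my current df, source_df, and adds the column "Seq_Source" if it does not exist. To fill in the "Seq_Source" column, I have a set of key-value pairs called cell_type. I want to search the Seq_ID column to see whether any of the values is found within a Seq_ID and, if so, add the key to the corresponding row of the new "Seq_Source" column, so that it looks like this: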

      Seq_ID    mol_type  Seq_Source
0     4_cDNA_v  RNA       PB
1     2_133+_v  RNA       HSPC
2     5_BM4D_g  RNA       BMMC
.
.
1301  4_CD4_g   RNA       PBMC

I wrote some pseudocode to help explain the approach I have in mind.

cell_type = {
    'PBMC': ['CD4', 'NK', 'CD8'],
    'HSPC': ['133+', '133+F'],
    'PB': ['cDNA', 'cDNAA', 'cDNAB', 'cDNAC'],
    'BMMC': ['cDNABM', '34D_Vc', 'BM4_Vs', 'BM4_Vc', 'BM4n_Vs']
}


def find_cell_source(dictionary, df, reference, new_header):
    '''
    Takes a dictionary where each key corresponds to a list of values.
    If new_header does not exist, the new column is created.
    If a value from a key:value pair is found within any of the string
    entries under the reference column, the key is added to that row
    under new_header.
    '''
    # add new_header if it does not exist
    if new_header not in df:
        df[new_header] = None

    # check each row of the reference column for any value from the dict;
    # on a match, write the key to that row under new_header
    for idx, ref in df[reference].items():
        for k, v in dictionary.items():
            for j in v:
                if j in ref:
                    df.loc[idx, new_header] = k
    return df


find_cell_source(cell_type, source_df, 'Seq_ID', 'Seq_Source')

Best answer

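There are several ways to extract the relevant part of Seq_ID. In this case it looks like you can simply use .str.split and then map the values. If splitting on _ is not enough, a regex may work instead.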

d = dict((k,v) for v, x in cell_type.items() for k in x)
df['Seq_Source'] = df.Seq_ID.str.split('_', expand=True)[1].map(d)
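To make the two lines above easier to follow, here is a minimal self-contained sketch. It assumes a small sample frame built from the rows shown in the question, and it adds a guard so the column is only created when it is missing (the guard is an addition for illustration, not part of the answer). The inverted dictionary d maps each token to its source, for example 'cDNA' to 'PB'.

```python
import pandas as pd

cell_type = {
    'PBMC': ['CD4', 'NK', 'CD8'],
    'HSPC': ['133+', '133+F'],
    'PB': ['cDNA', 'cDNAA', 'cDNAB', 'cDNAC'],
    'BMMC': ['cDNABM', '34D_Vc', 'BM4_Vs', 'BM4_Vc', 'BM4n_Vs'],
}

# Sample rows taken from the question (assumed here for illustration).
df = pd.DataFrame({
    'Seq_ID': ['4_cDNA_v', '2_133+_v', '5_BM4D_g', '4_CD4_g'],
    'mol_type': ['RNA'] * 4,
})

# Invert the mapping: each token in a list points back to its key.
d = {token: source for source, tokens in cell_type.items() for token in tokens}

# Only create the column if it is not already there.
if 'Seq_Source' not in df.columns:
    # The second '_'-separated field of Seq_ID is looked up in d;
    # tokens that are not in d become NaN.
    df['Seq_Source'] = df['Seq_ID'].str.split('_', expand=True)[1].map(d)

print(df)
```

One limitation of this sketch: tokens that themselves contain an underscore, such as 'BM4_Vs', can never match, because the split field for such an ID would be just 'BM4'.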

Output:

      Seq_ID    mol_type  Seq_Source
0     4_cDNA_v  RNA       PB
1     2_133+_v  RNA       HSPC
2     5_BM4D_g  RNA       NaN
1301  4_CD4_g   RNA       PBMC

Note that because BM4D is not in any of the lists in cell_type, it is mapped to NaN.
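The answer mentions a regex as a fallback when splitting on _ is not enough. One possible sketch of that idea is below; it is not the answerer's code. It assumes tokens appear as whole underscore-delimited fields of Seq_ID, escapes each token with re.escape (needed for '133+'), and tries longer tokens first. The ID '7_BM4_Vs_x' is a hypothetical example added to show a token that contains an underscore.

```python
import re
import pandas as pd

cell_type = {
    'PBMC': ['CD4', 'NK', 'CD8'],
    'HSPC': ['133+', '133+F'],
    'PB': ['cDNA', 'cDNAA', 'cDNAB', 'cDNAC'],
    'BMMC': ['cDNABM', '34D_Vc', 'BM4_Vs', 'BM4_Vc', 'BM4n_Vs'],
}

# Token -> source lookup, as in the answer.
token_to_source = {t: src for src, tokens in cell_type.items() for t in tokens}

# Alternation of all tokens, longest first, each escaped for regex use.
alternation = '|'.join(
    re.escape(t) for t in sorted(token_to_source, key=len, reverse=True)
)
# Require the token to be bounded by '_' or the start/end of the string.
pattern = r'(?:^|_)(' + alternation + r')(?:_|$)'

# '7_BM4_Vs_x' is a made-up ID used only for illustration.
df_regex = pd.DataFrame({
    'Seq_ID': ['4_cDNA_v', '2_133+_v', '5_BM4D_g', '7_BM4_Vs_x', '4_CD4_g'],
})

# Extract the first matching token per row, then map it to its source.
matched = df_regex['Seq_ID'].str.extract(pattern, expand=False)
df_regex['Seq_Source'] = matched.map(token_to_source)

print(df_regex)
```

The delimiter anchoring is a design choice: without it, short tokens such as 'NK' could match inside unrelated parts of an ID. If the real IDs do not follow the underscore-delimited layout, the pattern would need to be adjusted.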

Regarding "python-3.x - pandas df: add column if it doesn't exist, add values to the new column from a dictionary", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51546445/
