
python-3.x - Pandas merge gives two different results with the same code and input data

Reposted. Author: 行者123. Updated: 2023-12-04 01:39:26

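I have two dataframes to merge. When I run the program with the same input data and the same code, I get one of two outcomes: either the merge succeeds, or the columns of the merged data that come from 'annotate' are NaN.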

raw_df2 = pd.merge(annotate,raw_df,on='gene',how='right').fillna("unkown")
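To make the symptom concrete, here is a minimal sketch of what that statement does when a gene in 'raw_df' has no match in 'annotate'. The two small frames are hypothetical stand-ins built from values in the tables of this question, not the real input data.

```python
import pandas as pd

# Hypothetical stand-ins for the question's 'annotate' and 'raw_df' frames.
annotate = pd.DataFrame({"type": ["misc_RNA", "misc_RNA"], "gene": ["St18", "Carf"]})
raw_df = pd.DataFrame({
    "gene": ["St18", "Carf", "0610040J01Rik"],
    "value_1": [1.90988, 0.846988, 2.02125],
})

# how='right' keeps every row of raw_df; genes absent from annotate get NaN in
# the annotate columns, which fillna then replaces with the placeholder string.
raw_df2 = pd.merge(annotate, raw_df, on="gene", how="right").fillna("unkown")

# Genes whose annotation could not be found.
unmatched = raw_df2.loc[raw_df2["type"] == "unkown", "gene"].tolist()
print(unmatched)
```

If every row ends up with the placeholder, none of the keys matched, which points at the 'gene' values themselves rather than at the merge call.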

Then I ran a test:

count = 10001
while count > 10000:
    raw_df2 = pd.merge(annotate, raw_df, on='gene', how='right').fillna("unkown")
    count = len(raw_df2[raw_df2["type"] == "unkown"])
    print(count)

If the merge fails, "raw_df" keeps failing for the rest of that run, so I have to resubmit the script, after which the result may succeed.
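The cause was never identified in this thread. Because the loop repeats the same merge on the same in-memory frames, the difference between runs presumably comes from how the inputs were loaded. One generic way to narrow it down is to pass `indicator=True` to the merge and count how many rows actually matched. In the sketch below, the helper name `diagnose_merge` and the trailing-space key are assumptions for illustration only, not something the question reports.

```python
import pandas as pd

def diagnose_merge(annotate, raw_df, key="gene"):
    """Right-merge with an indicator column and count matched/unmatched rows."""
    merged = pd.merge(annotate, raw_df, on=key, how="right", indicator=True)
    matched = int((merged["_merge"] == "both").sum())
    right_only = int((merged["_merge"] == "right_only").sum())
    return matched, right_only

# Hypothetical data: the key in 'annotate' carries a stray trailing space.
annotate = pd.DataFrame({"type": ["misc_RNA"], "gene": ["St18 "]})
raw_df = pd.DataFrame({"gene": ["St18"], "value_1": [1.90988]})

before = diagnose_merge(annotate, raw_df)

# Normalise the key (string type, no surrounding whitespace) and try again.
annotate["gene"] = annotate["gene"].astype(str).str.strip()
after = diagnose_merge(annotate, raw_df)

print("before:", before, "after:", after)
```

Comparing `annotate['gene'].dtype` with `raw_df['gene'].dtype` on a failing run is another cheap check, since keys of different types never match.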

[The first two columns come from 'annotate'; the rest come from 'raw_df']
Failed result:

+--------+---------------+--------------------------+----------+----------+--------+---------+----------+
|  type  |     gene      |          locus           | sample_1 | sample_2 | status | value_1 | value_2  |
+--------+---------------+--------------------------+----------+----------+--------+---------+----------+
| unknow | 0610040J01Rik | chr5:63812494-63899619   | Ctrl     | SPION10  | OK     | 2.02125 | 0.652688 |
| unknow | 1110008F13Rik | chr2:156863121-156887078 | Ctrl     | SPION10  | OK     | 87.7115 | 49.8795  |
+--------+---------------+--------------------------+----------+----------+--------+---------+----------+

Successful result:

+--------+----------+------------------------+----------+----------+--------+----------+---------+
| gene   | type     | locus                  | sample_1 | sample_2 | status | value_1  | value_2 |
+--------+----------+------------------------+----------+----------+--------+----------+---------+
| St18   | misc_RNA | chr1:6487230-6860940   | Ctrl     | SPION10  | OK     | 1.90988  | 3.91643 |
| Arid5a | misc_RNA | chr1:36307732-36324029 | Ctrl     | SPION10  | OK     | 1.33796  | 2.21057 |
| Carf   | misc_RNA | chr1:60076867-60153953 | Ctrl     | SPION10  | OK     | 0.846988 | 1.47619 |
+--------+----------+------------------------+----------+----------+--------+----------+---------+

Best answer

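I have a solution, but I still don't know what caused the original problem. Set the column you want to merge on as the index in both dataframes, then merge the two dataframes on that index. I ran the script more than 10 times and the result was never wrong again.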

import pandas as pd

# args: the script's parsed command-line arguments, defined elsewhere
# the first dataframe: two unnamed columns, named after loading
DataQiime = pd.read_csv(args.FileTranseq, header=None, sep=',')
DataQiime.columns = ['Feature.ID', 'Frequency']
DataQiime_index = DataQiime.set_index('Feature.ID', inplace=False, drop=True)
# the second dataframe: tab-separated, with a header row
DataTranseq = pd.read_table(args.FileQiime, header=0, sep='\t', encoding='utf-8')
DataTranseq_index = DataTranseq.set_index('Feature.ID', inplace=False, drop=True)
# merge by index, using the indexed frames rather than the raw ones
DataMerge = pd.merge(DataQiime_index, DataTranseq_index, left_index=True, right_index=True, how="inner")
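The same index-based approach can be tried without any input files. The sketch below is self-contained under stated assumptions: the in-memory CSV/TSV text, the `Taxon` column, and the variable names are hypothetical stand-ins for the answer's two files. The `validate="one_to_one"` argument is an optional extra, not part of the original answer; it makes pandas raise an error if either index contains duplicate keys.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-ins for the two input files.
freq_csv = "f1,10\nf2,20\nf3,30\n"
taxa_tsv = "Feature.ID\tTaxon\nf1\tBacteria\nf3\tArchaea\n"

freq = pd.read_csv(io.StringIO(freq_csv), header=None,
                   names=["Feature.ID", "Frequency"])
taxa = pd.read_csv(io.StringIO(taxa_tsv), sep="\t")

# Move the shared key into the index of both frames.
freq_idx = freq.set_index("Feature.ID")
taxa_idx = taxa.set_index("Feature.ID")

# Inner merge on the index; only keys present in both frames survive.
merged = pd.merge(freq_idx, taxa_idx, left_index=True, right_index=True,
                  how="inner", validate="one_to_one")
print(merged)
```

Here `f2` is dropped because it has no counterpart in the second frame, which is what `how="inner"` means.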

Regarding "python-3.x - Pandas merge gives two different results with the same code and input data", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51838810/
