python - How to merge two programs with scheduled execution

Reposted. Author: 太空宇宙. Last updated: 2023-11-03 17:35:02

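I am trying to merge two programs, or to write a third program that calls the two as functions. They should run one after the other, separated by an interval of a certain number of minutes. It should work something like a makefile, and more programs will be included later. I am not able to merge them, nor to put them into a form that I can call from a new main program.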

program_master_id.py picks up the *.csv files from a folder location and, after some computation, appends to the master_ids.csv file in a folder at another location.

Program_master_count.py divides the count by the count of the respective timeseriesId.

Program_1 master_id.py

import pandas as pd
import numpy as np

# csv file contents
# Need to change to path as the Transition_Data has several *.CSV files

csv_file1 = 'Transition_Data/Test_1.csv'
csv_file2 = '/Transition_Data/Test_2.csv'

#master file to be appended only

master_csv_file = 'Data_repository/master_lac_Test.csv'

csv_file_all = [csv_file1, csv_file2]

# read csv into df using list comprehension
# I use buffer here, replace stringIO with your file path

df_all = [pd.read_csv(csv_file) for csv_file in csv_file_all]

# processing
# =====================================================
# concat along axis=0, outer join on axis=1
merged = pd.concat(df_all, axis=0, ignore_index=True, join='outer').set_index('Ids')

# custom function to handle/merge duplicates on Ids (axis=0)
def apply_func(group):
    # forward-fill missing values within the group, then keep the last row
    return group.ffill().iloc[-1]

# remove Ids duplicates
merged_unique = merged.groupby(level='Ids').apply(apply_func)

# do the subtraction

df_master = pd.read_csv(master_csv_file, index_col=['Ids']).sort_index()

# select matching records and horizontal concat
df_matched = pd.concat([df_master,merged_unique.reindex(df_master.index)], axis=1)

# use broadcasting
df_matched.iloc[:, 1:] = df_matched.iloc[:, 1:].sub(df_matched.iloc[:, 0], axis=0)

print(df_matched)

Program_2 master_count.py # This gives no error, but it gives no output either.

import pandas as pd
import numpy as np

csv_file1 = '/Data_repository/master_lac_Test.csv'
csv_file2 = '/Data_repository/lat_lon_master.csv'

df1 = pd.read_csv(csv_file1).set_index('Ids')

# need to sort index in file 2
df2 = pd.read_csv(csv_file2).set_index('Ids').sort_index()

# df1 and df2 has a duplicated column 00:00:00, use df1 without 1st column
temp = df2.join(df1.iloc[:, 1:])

# do the division by number of occurence of each Ids
# and add column 00:00:00
def my_func(group):
    num_obs = len(group)
    # process with column name after 00:30:00 (inclusive)
    group.iloc[:,4:] = (group.iloc[:,4:]/num_obs).add(group.iloc[:,3], axis=0)
    return group

result = temp.groupby(level='Ids').apply(my_func)

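I am trying to write a main program that first calls master_ids.py and then master_count.py. Is there a way to merge both into one program, or to write them as functions and call those functions from a new program? Please suggest.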

Best answer

Well, suppose you have program1.py:

import pandas as pd
import numpy as np

def main_program1():
    csv_file1 = 'Transition_Data/Test_1.csv'
    ...
    return df_matched

And then program2.py:

import pandas as pd
import numpy as np

def main_program2():
    csv_file1 = '/Data_repository/master_lac_Test.csv'
    ...
    result = temp.groupby(level='Ids').apply(my_func)
    return result

You can now use them in a separate Python program, for example main.py:

import time
import program1 # imports program1.py
import program2 # imports program2.py

df_matched = program1.main_program1()
print(df_matched)
# wait
min_wait = 1
time.sleep(60*min_wait)
# call the second one
result = program2.main_program2()
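
Since the question mentions that more programs will be added later, the same call-then-wait pattern could be generalized to a list of steps. The following is only a sketch: `run_in_sequence`, `steps`, `wait_minutes`, and the injectable `sleep` parameter are hypothetical names introduced here for illustration and are not part of the original answer.

```python
import time


def run_in_sequence(steps, wait_minutes=1, sleep=time.sleep):
    """Call each function in `steps` in order, pausing between calls.

    `steps` is a list of zero-argument callables (hypothetical name).
    `sleep` is injectable so the pause can be skipped or recorded in tests.
    """
    results = []
    for index, step in enumerate(steps):
        if index > 0:
            # wait only between steps, not before the first one
            sleep(60 * wait_minutes)
        results.append(step())
    return results
```

With the modules above, this might be called as `run_in_sequence([program1.main_program1, program2.main_program2], wait_minutes=1)`; adding a further program then only means appending another function to the list.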

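There are many ways to "improve" this, but hopefully it shows you the gist. In particular, I recommend using the `if __name__ == "__main__":` idiom (see the question "What does if __name__ == "__main__": do?") in each file, so that each one can easily be executed from the command line or called from Python.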
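
As a minimal sketch of that idiom, program1.py could end as shown here. The function body is a placeholder (the dictionary is a hypothetical stand-in for the real DataFrame computation), so only the final two lines matter.

```python
def main_program1():
    # Placeholder body: the real function would read the CSV files
    # and return df_matched. The dict below is a hypothetical stand-in.
    result = {"status": "done"}
    return result


if __name__ == "__main__":
    # Runs only when the file is executed directly (python program1.py),
    # not when it is imported with `import program1`.
    print(main_program1())
```

With this guard in place, `import program1` in main.py defines the function without running it, while `python program1.py` still works on its own.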

Another option is a shell script which, for your "master_id.py" and "master_count.py", becomes (in its simplest form)

python master_id.py
sleep 60
python master_count.py

saved in "main.sh", which can be executed as

sh main.sh
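
If a shell is not available (for example on Windows), roughly the same thing could be done from Python with the standard `subprocess` module. This is a sketch under assumptions: `run_scripts` and `wait_seconds` are hypothetical names, and the scripts are assumed to be runnable with the same interpreter that runs the driver.

```python
import subprocess
import sys
import time


def run_scripts(script_paths, wait_seconds=60):
    """Run each script as a separate Python process, pausing in between."""
    for index, path in enumerate(script_paths):
        if index > 0:
            time.sleep(wait_seconds)
        # check=True raises CalledProcessError if a script exits with a
        # non-zero status, so later scripts do not run after a failure.
        subprocess.run([sys.executable, path], check=True)
```

For the two scripts in the question this would be called as `run_scripts(["master_id.py", "master_count.py"])`. Unlike the plain shell script, this version stops at the first failing script.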

Regarding "python - How to merge two programs with scheduled execution", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31314792/
