
python - Binning a list in Python

Repost. Author: 行者123. Last updated: 2023-11-30 23:31:18

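First of all, I'd like to say that I'm new to Python, and this code was put together from the advice and suggestions of users on Stack Overflow. The code is shown below: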

f = open('E:\Python27\WASP DATA\Sample Data.txt',"r")
num=0
line = f.readlines()

X = []
for n, lines in enumerate(line, 0): #6621
    # make it 109 to remove the first line "['# Column 3: Magnitude error\n']"
    if (n > 109):
        linSplit = lines.split(' ')
        joined = ' '.join(linSplit)
        # apply the float function to every item in joined.split
        # create a new list of floats in tmp variable
        tmp = map((lambda x: float(x)), joined.split())
        X.append(tmp)

#print X[0] # print first element in the list

Period_1 = float(line[28][23:31])
Epoch_1 = float(line[27][22:31])
Period_2 = float(line[44][23:31])
Epoch_2 = float(line[43][22:31])
#Period_3 = float(line[60][23:31])
#Epoch_3 = float(line[59][22:31])
#Period_4 = float(line[76][23:31])
#Epoch_4 = float(line[75][22:31])
#Period_5 = float(line[108][23:31])
#Epoch_5 = float(line[91][22:31])

print("The time periods are:")
print Period_1
print Period_2
#print Period_3
#print Period_4
#print Period_5

print("\nThe Epoch times are:")
print Epoch_1
print Epoch_2
#print Epoch_3
#print Epoch_4
#print Epoch_5
print('respectively.')

P = []
phase_var = float

for j in range(0,len(X),1):
    phase_var = (X[j][0] + (10*Period_1) - Epoch_1)/Period_1
    P.append(phase_var)

print P[0]

for m in range(0,len(P),1):
    P[m]=float(P[m]-int(P[m]))

#print P[0]

Mag = []

for n in range(0,len(X),1):
    temp = X[n][1]
    Mag.append(temp)

#print Mag[0]
#print X[0]

from pylab import *

#Plotting the first scatter diagram to see if data is phased correctly.

#plot(P, Mag)
scatter(P, Mag)
xlabel('Phase (Periods)')
ylabel('Magnitude')
#title('Dunno yet')
grid(True)
savefig("test.png")
show()

#Bin the data to create graph where magnitudes are averaged, and B lets us mess around with the binning resolution, and reducing effect of extraneous data points.

B = 2050
minv = min(P)
maxv = max(P)
bincounts = []
for i in range(B+1):
    bincounts.append(0)
for d in P:
    b = int((d - minv) / (maxv - minv) * B)
    bincounts[b] += 1

# plot new scatter

scatter(bincounts, Mag)
show()

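The original plot is a scatter plot of P against Mag. However, there are multiple Mag points for each time period. I'd like to create a new scatter plot in which I take all of those Y values and average them for each individual X value, giving a tighter graph with two dips.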

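I've tried looking at various ways of binning the data, but whichever method I use, the graph containing the binned data doesn't seem to display correctly. The X values should run from 0 to 1, as in the plot of the pre-binned data.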

Here is the data I'm working with, in case you need to see it.

http://pastebin.com/60E84azv

Can anyone offer any advice or suggestions on how to create the binned data graph? My knowledge of data binning is very minimal.

Thank you for your time!

Best answer

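This actually addresses quite a few issues, not only the binning part. I've included code to parse the blocks at the start of the data file, so that you can get all of the peak data: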

import numpy
import re
import matplotlib.pyplot as plt

f = open('sample_data.txt')
f.next()

pair = re.compile(r'# (.*?)[ \t]*:[ \t]*([0-9e\.-]+).*')

blocks = []
block = {}
blocks.append(block)

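for line in f:
    # The header ends at the first line that does not start with '#'
    if line[0] <> '#':
        blocks.append(block)
        break
    line = line.strip()
    m = pair.match(line)
    if m:
        print line
        key, valstr = m.groups()
        print key, valstr
        # Store numeric values as floats and anything else as the raw string
        try:
            value = float(valstr)
        except ValueError:
            value = valstr
        block[key] = value

    # A bare '#' line separates one header block from the next
    if (line == "#") and len(block) > 0:
        blocks.append(block)
        block = {}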

peaks = sorted([block for block in blocks if 'PEAK' in block],
               key=lambda b: b['PEAK'])
print peaks

colnames = ['HJD', 'Tamuz-corrected magnitude', 'Magnitude error']
data = numpy.loadtxt(f, [(colname, 'float64') for colname in colnames])

Nbins = 50
for peak in peaks:
    plt.figure()
    phase, _ = numpy.modf((data['HJD'] + 10*peak['Period (days)'] - peak['Epoch'])/peak['Period (days)'])
    mag = data['Tamuz-corrected magnitude']

    # use numpy.histogram to calculate the sum and the number of points in the bins
    sums, _ = numpy.histogram(phase, bins=Nbins, weights=mag)
    N, bin_edges = numpy.histogram(phase, bins=Nbins)

    # We'll plot the value at the center of each bin
    centers = (bin_edges[:-1] + bin_edges[1:])/2

    plt.scatter(phase, mag, alpha=0.2)
    plt.plot(centers, sums/N, color='red', linewidth=2)
    plt.show()
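
The averaging step above can be tried in isolation, without the data file. The following is a minimal sketch rather than the answerer's code: `fold_and_bin` is a hypothetical helper name, and the period, epoch, and magnitudes are synthetic values invented for illustration. It also makes two substitutions that should be read as assumptions: `np.mod` is used to fold the times instead of `numpy.modf` with the `10*Period` offset, and bins that contain no points are reported as NaN instead of being divided by zero.

```python
import numpy as np

def fold_and_bin(times, mags, period, epoch, nbins=50):
    """Fold times onto phase in [0, 1] and average mags within each phase bin.

    Hypothetical helper for illustration only.
    """
    times = np.asarray(times, dtype=float)
    mags = np.asarray(mags, dtype=float)

    # np.mod keeps the phase non-negative even when a time precedes the epoch
    phase = np.mod((times - epoch) / period, 1.0)

    # Two histograms over the same fixed range: one weighted by magnitude
    # (sum per bin), one unweighted (number of points per bin)
    sums, edges = np.histogram(phase, bins=nbins, range=(0.0, 1.0), weights=mags)
    counts, _ = np.histogram(phase, bins=nbins, range=(0.0, 1.0))

    # The x value for each bin is the midpoint between its edges
    centers = (edges[:-1] + edges[1:]) / 2

    # Mean = sum / count, computed only where the bin is not empty
    means = np.full(nbins, np.nan)
    filled = counts > 0
    means[filled] = sums[filled] / counts[filled]
    return centers, means, counts

# Synthetic light curve with two dips per period (made-up numbers)
rng = np.random.RandomState(0)
period, epoch = 2.5, 100.0
t = epoch + rng.uniform(0, 200, size=5000)
true_phase = np.mod((t - epoch) / period, 1.0)
mag = 10.0 + 0.3 * np.cos(4 * np.pi * true_phase) + rng.normal(0, 0.05, size=t.size)

centers, means, counts = fold_and_bin(t, mag, period, epoch, nbins=20)
```

Plotting `centers` against `means` gives an x axis that runs from 0 to 1, like the unbinned plot. Plotting the per-bin counts against the raw magnitudes, as in the question, mixes two arrays that describe different things and generally have different lengths.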

Regarding "python - Binning a list in Python", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/20017267/
