I'm trying to compute the Brillouin diversity index per group in Python with pandas and numpy, grouping on column "a", but something goes wrong.
import pandas as pd
import numpy as np

def Brillouin_Index(x):
    for i in range(len(x)):
        x["Brillouin_Index"] = (np.log10(np.math.factorial(np.sum(x))) - np.sum(np.log10(np.math.factorial(x[i])))) / np.sum(x)
    return x

a = list("ABCDEADECS")
b = [12,23,12,12,32,34,21,2,10,5]
c = {"a":a,"b":b}
data = pd.DataFrame(c)
data
data.groupby("a").apply(Brillouin_Index)
Running the code above produces two errors:
TypeError: cannot convert the series to <class 'int'>
AttributeError: 'int' object has no attribute 'log10'
The formula is described at the following link: Brillouin’s Diversity Index
I computed the value for each group with other software:
- H_A = 0.2965
- H_B = 0
- H_C = 0.264
- H_D = 0.259
- H_E = 0.08085
- H_S = 0
Thanks a lot!
I computed the Brillouin diversity index by group in R; the code is as follows:
library(data.table)

Brillouin_Diversity_Index <- function(x) {
  N <- sum(x)  # total count within the group
  (log10(factorial(N)) - sum(log10(factorial(x)))) / N
}

dt <- data.table(x = c("A","B","C","D","E","A","D","E","C","S"),
                 y = c(12,23,12,12,32,34,21,2,10,5))
dt[, Brillouin_Diversity_Index(y), by = .(x)]
   x         V1
   A 0.23021887
   B 0.00000000
   C 0.26412121
   D 0.25909105
   E 0.08085185
   S 0.00000000
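For a Python version, here is a minimal sketch that mirrors the R function above. As far as I can tell, the two errors in the question come from the factorial and log calls. `math.factorial` (which `np.math.factorial` merely aliased, and `np.math` is no longer available in recent NumPy releases) accepts a single integer, not a Series. `np.log10` fails on the very large Python integers that a factorial returns. The sketch therefore works on plain Python ints with the standard `math` module, and passes only column "b" to the function. The name `brillouin_index` and the variable `result` are my own choices, not part of any library.

```python
import math

import pandas as pd


def brillouin_index(counts):
    """Brillouin index (log base 10) for the counts of a single group."""
    # Convert to plain Python ints so math.factorial accepts them
    # and the factorials can grow without overflowing.
    counts = [int(k) for k in counts]
    n_total = sum(counts)
    # math.log10 handles arbitrarily large Python integers.
    log_fact_total = math.log10(math.factorial(n_total))
    log_fact_parts = sum(math.log10(math.factorial(k)) for k in counts)
    return (log_fact_total - log_fact_parts) / n_total


data = pd.DataFrame({
    "a": list("ABCDEADECS"),
    "b": [12, 23, 12, 12, 32, 34, 21, 2, 10, 5],
})

# Select column "b" first so the function receives only the counts of each group.
result = data.groupby("a")["b"].apply(brillouin_index)
print(result.round(8))
```

The values should agree with the R output above. The sketch assumes every group has a positive total count; a group summing to zero would raise a division error.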