gpt4 book ai didi

vba - 分析数组中的数据,删除,然后重新分析VBA

转载 作者:行者123 更新时间:2023-12-03 03:34:14 25 4
gpt4 key购买 nike

正如现在的代码,我可以分析数据集并根据存储在 LowConf 和 HighConf 中的值检测异常值,这些值基于数据集的大小。然而,我只能确定其中一个异常值。我想扩展代码或添加代码,以便我可以执行以下操作:

  1. 首先确定最外围的元素
  2. 判断是否为异常值,如果是,则从DataSet中删除该元素,如果不是,则结束程序。
  3. 更改样本大小(减去 1)并再次分析数据集以查找下一个最外围元素并循环 k 次。

我的想法是创建一个新数组,它是每个 DataSet 元素与平均值的距离(绝对值),然后找到它的 UBound 并在下面的 If 语句中进行分析。我的问题是,一旦我确定它是否是异常值,如何返回 DataSet 中的相应元素并将其删除?如果是这样,有没有更简单的方法来解决这个问题?另外,我是编码新手,因此任何有关可接受格式/如何清理代码的提示也值得赞赏。

Sub CalculateOutliers()

Dim n As Integer
Dim mean As Double
Dim SD As Double
Dim k As Integer
Dim DataSet As Variant
Dim LowConf As Single
Dim HighConf As Single


'--------------------------------------------------------
DataSet = Selection.Value
'Copies highlighted data into DataSet variable
'Cell A1 is (1,1) Because it starts at 0 which is out of range
'--------------------------------------------------------



'--------------------------------------------------------
n = Selection.CountLarge
'Counts number of entries
'--------------------------------------------------------



'--------------------------------------------------------
'DEFINES 95(LowConf) AND 99(HighConf) PERCENT CONFIDENCES BASED ON
'SAMPLE SIZE

If n <= 5 Then

LowConf = 1.72
HighConf = 1.76
End If

If n = 6 Then

LowConf = 1.89
HighConf = 1.97

End If

If n = 7 Then

LowConf = 2.02
HighConf = 2.14

End If

If n = 8 Then

LowConf = 2.13
HighConf = 2.28

End If

If n = 9 Then

LowConf = 2.21
HighConf = 2.39

End If

If n = 10 Then

LowConf = 2.29
HighConf = 2.48

End If

If n = 11 Then

LowConf = 2.36
HighConf = 2.56

End If

If n = 12 Then

LowConf = 2.41
HighConf = 2.64

End If

If n = 13 Then

LowConf = 2.46
HighConf = 2.7

End If

If n = 14 Then

LowConf = 2.51
HighConf = 2.75

End If

If n = 15 Then

LowConf = 2.55
HighConf = 2.81

End If

If n = 16 Then

LowConf = 2.59
HighConf = 2.85

End If

If n = 17 Then

LowConf = 2.62
HighConf = 2.9

End If

If n = 18 Then

LowConf = 2.65
HighConf = 2.93

End If

If n = 19 Then

LowConf = 2.68
HighConf = 2.97

End If

If n = 20 Then

LowConf = 2.71
HighConf = 3

End If

If n = 21 Then

LowConf = 2.73
HighConf = 3.03

End If

If n = 22 Then

LowConf = 2.76
HighConf = 3.06

End If

If n = 23 Then

LowConf = 2.78
HighConf = 3.08

End If

If n = 24 Then

LowConf = 2.8
HighConf = 3.11

End If

If n = 25 Then

LowConf = 2.82
HighConf = 3.14

End If

If n = 26 Then

LowConf = 2.84
HighConf = 3.16

End If

If n = 27 Then

LowConf = 2.86
HighConf = 3.18

End If

If n = 28 Then

LowConf = 2.88
HighConf = 3.2

End If

If n = 29 Then

LowConf = 2.89
HighConf = 3.22

End If

If n = 30 Then

LowConf = 2.91
HighConf = 3.24

End If

If n <= 35 And n > 30 Then

LowConf = 2.98
HighConf = 3.32

End If

If n <= 40 And n > 35 Then

LowConf = 3.04
HighConf = 3.38

End If

If n <= 45 And n > 40 Then

LowConf = 3.09
HighConf = 3.44

End If

If n <= 50 And n > 45 Then

LowConf = 3.13
HighConf = 3.48

End If

If n <= 60 And n > 50 Then

LowConf = 3.2
HighConf = 3.56

End If

If n <= 70 And n > 60 Then

LowConf = 3.26
HighConf = 3.62

End If

If n <= 80 And n > 70 Then

LowConf = 3.31
HighConf = 3.67

End If

If n <= 90 And n > 80 Then

LowConf = 3.35
HighConf = 3.72

End If

If n <= 100 And n > 90 Then

LowConf = 3.38
HighConf = 3.75

End If

If n <= 150 And n > 100 Then

LowConf = 3.52
HighConf = 3.89

End If

If n <= 200 And n > 150 Then

LowConf = 3.61
HighConf = 3.98

End If

If n <= 300 And n > 200 Then

LowConf = 3.72
HighConf = 4.09

End If

If n <= 400 And n > 300 Then

LowConf = 3.8
HighConf = 4.17

End If

If n <= 500 And n > 400 Then

LowConf = 3.86
HighConf = 4.32

End If

If n > 500 Then

MsgBox "Sample size cannot exceed 500."

End If
'--------------------------------------------------------



'--------------------------------------------------------
If n < 50 Then

k = Int(n / 10)

Else

k = 5

End If
'determines k = number of possible outliers
'--------------------------------------------------------



'--------------------------------------------------------
mean = Application.WorksheetFunction.Average(DataSet)
'Calculates mean of Data Set
'--------------------------------------------------------



'--------------------------------------------------------
SD = Application.WorksheetFunction.StDev(DataSet)
'Calculates Standard Deviation of Data Set
'--------------------------------------------------------



'--------------------------------------------------------


For Each element In DataSet


If (Abs(element - mean) / SD) > LowConf Then

MsgBox "95% outlier: " & element


End If

If (Abs(element - mean) / SD) > HighConf Then

MsgBox "99% outlier: " & element


End If


Next element
'--------------------------------------------------------


End Sub

更新:我已经弄清楚了以下代码块。这个循环应该可以工作,但现在我只需要弄清楚如何从数据集中删除可疑对象。一旦删除,它将循环返回并重新计算平均值、SD 和可疑值。我意识到我没有关于何时 UBound 等于 LBound 的代码,但这是我之后要处理的问题。使用此代码,Suspect 始终显示值 1。我不确定为什么会这样,也不确定 DataSet.Remove(Suspect) 是否有效。

Dim i As Long

For i = 1 To k


'--------------------------------------------------------
mean = Application.WorksheetFunction.Average(DataSet)
'Calculates mean of Data Set
'--------------------------------------------------------



'--------------------------------------------------------
SD = Application.WorksheetFunction.StDev(DataSet)
'Calculates Standard Deviation of Data Set
'--------------------------------------------------------



'--------------------------------------------------------
Dim Suspect As Double
If (Abs(UBound(DataSet) - mean)) > (Abs(LBound(DataSet) - mean)) Then
Suspect = UBound(DataSet)
End If
If (Abs(UBound(DataSet) - mean)) < (Abs(LBound(DataSet) - mean)) Then
Suspect = LBound(DataSet)
End If
'Defines what the most outlying value is
'--------------------------------------------------------



'--------------------------------------------------------
Dim Retest As Boolean

If (Abs(Suspect - mean) / SD) > LowConf Then

MsgBox "95% outlier: " & Suspect
Retest = True

End If

If (Abs(Suspect - mean) / SD) > HighConf Then

MsgBox "99% outlier: " & Suspect
Retest = True

End If

If Retest = True Then
DataSet.Remove (Suspect)
End If
MsgBox Suspect
'--------------------------------------------------------
Next i

更新:我稍微更改了部分。我可以将其应用到最小值,但如何指定可疑的位置并将其用作删除和上移的范围?

                        Dim Retest As Boolean

If (Abs(Suspect - mean) / SD) > LowConf Then

MsgBox "95% outlier: " & Suspect
Retest = True

End If

If (Abs(Suspect - mean) / SD) > HighConf Then

MsgBox "99% outlier: " & Suspect
Retest = True

End If
Dim pos As Range

Set pos = Application.Match(Suspect, DataSet)
MsgBox pos
If Retest = True And Suspect = Application.WorksheetFunction.Max(DataSet) Then

Range(pos).Delete Shift:=xlUp

End If

最佳答案

这解决了有关如何清理代码的问题。包括空白在内,您有近 300 行代码(确定配置级别),可以压缩为大约十几行。首先,定义一个函数:

Function ConfLevels(n As Long) As Variant
Dim i As Long
Dim cutpoints As Variant, lowconfs As Variant, highconfs As Variant

cutpoints = Array(5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 300, 400, 500)
lowconfs = Array(1.72, 1.89, 2.02, 2.13, 2.21, 2.29, 2.36, 2.41, 2.46, 2.51, 2.55, 2.59, 2.62, 2.65, 2.68, 2.71, 2.73, 2.76, 2.78, 2.8, 2.82, 2.84, 2.86, 2.88, 2.89, 2.91, 2.98, 3.04, 3.09, 3.13, 3.2, 3.26, 3.31, 3.35, 3.38, 3.52, 3.61, 3.72, 3.8, 3.86)
highconfs = Array(1.76, 1.97, 2.14, 2.28, 2.39, 2.48, 2.56, 2.64, 2.7, 2.75, 2.81, 2.85, 2.9, 2.93, 2.97, 3#, 3.03, 3.06, 3.08, 3.11, 3.14, 3.16, 3.18, 3.2, 3.22, 3.24, 3.32, 3.38, 3.44, 3.48, 3.56, 3.62, 3.67, 3.72, 3.75, 3.89, 3.98, 4.09, 4.17, 4.32)
i = 0
Do While n > cutpoints(i)
i = i + 1
Loop
ConfLevels = Array(lowconfs(i), highconfs(i))
End Function

然后在主子中将所有代码替换为:

Dim levels As Variant 'in the declaration part, use a different name if you want

'if n > 500, handle error and exit sub
levels = ConfLevels(n)
LowConf = levels(0)
HignConf = levels(1)

您甚至可以跳过变量 LowConfHighConf 并使用 levels(0)levels(1) 直接。

如果您解释了所使用的启发式方法,则可以通过使用诸如 T.Inv.2T 等工作表函数来进一步缩短此时间。这些数字从何而来?

就代码的其余部分而言,合理的策略是编写一个递归函数,该函数输入 VBA 数组并返回删除了所有异常值的 VBA 数组(基本情况是返回数组不变)因为没有更多的异常值需要删除)。主子系统(实际上与电子表格和用户交互)可能相对较小。

关于vba - 分析数组中的数据,删除,然后重新分析VBA,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/44421289/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com