gpt4 book ai didi

f# - 给定真值表需要创建二叉树的帮助

转载 作者:行者123 更新时间:2023-12-04 13:45:23 25 4
gpt4 key购买 nike

首先,为了提供完整的公开信息,我想指出的是,这与机器学习类(class)中的作业有关。这个问题不是作业问题,而是我需要解决的问题,以解决创建ID3决策树算法这一更大的问题。

给定真值表时,我需要生成类似于以下内容的树

let learnedTree = Node(0,"A0", Node(2,"A2", Leaf(0), Leaf(1)), Node(1,"A1", Node(2,"A2", Leaf(0), Leaf(1)), Leaf(0)))

LearningTree的类型为BinaryTree,其定义如下:
type BinaryTree =
| Leaf of int
| Node of int * string * BinaryTree * BinaryTree

ID3算法考虑了各种方程式来确定在哪里拆分树,而我已经弄清了所有问题,只是在从真值表创建学习到的树时遇到了麻烦。例如,如果我有下表
A1 | A2 | A3 | Class
1 0 0 1
0 1 0 1
0 0 0 0
1 0 1 0
0 0 0 0
1 1 0 1
0 1 1 0

我决定拆分属性A1,最后得到以下结果:
              (A1 = 1)  A1   (A1 = 0)
A2 | A3 | Class A2 | A3 | Class
0 0 1 1 0 1
0 1 0 0 0 0
1 0 1 0 0 0
0 1 1

然后,我将拆分左侧并拆分右侧,然后继续递归模式,直到叶节点是纯净的,然后基于该拆分,最终得到一棵类似于以下内容的树。
let learnedTree = Node(0,"A0", Node(2,"A2", Leaf(0), Leaf(1)), Node(1,"A1", Node(2,"A2", Leaf(0), Leaf(1)), Leaf(0)))

到目前为止,这是我被“黑客入侵”的一种方式,但是我认为我可能会遥遥无期:
let rec createTree (listToSplit : list<list<float>>) index =
let leftSideSplit =
listToSplit |> List.choose (fun x -> if x.Item(index) = 1. then Some(x) else None)
let rightSideSplit =
listToSplit |> List.choose (fun x -> if x.Item(index) = 0. then Some(x) else None)
if leftSideSplit.Length > 0 then
let pureCheck = isListPure leftSideSplit
if pureCheck = 0 then
printfn "%s" "Pure left node class 0"
createTree leftSideSplit (index + 1)
else if pureCheck = 1 then
printfn "%s" "Pure left node class 1"
createTree leftSideSplit (index + 1)
else
printfn "%s - %A" "Recursing Left" leftSideSplit
createTree leftSideSplit (index + 1)
else printfn "%s" "Pure left node class 0"

我应该使用模式匹配吗?有任何提示/想法/帮助吗?谢谢一堆!

最佳答案

编辑:此后,我在博客上发布了ID3的实现:
http://blogs.msdn.com/chrsmith

嘿吉姆,我一直想写一篇博客文章,用F#实现ID3-感谢您给我执行程序。尽管此代码未完全(或正确)实现算法,但对于您入门来说应该足够了。

通常,您采用正确的方法-将每个分支表示为有区别的联合案例是好的。就像Brian所说的那样,List.partition绝对是一个方便的函数。正确执行此操作的诀窍在于确定要分割的最佳属性/值对-并需要通过熵等来计算信息增益。

type Attribute = string
type Value = string

type Record =
{
Weather : string
Temperature : string
PlayTennis : bool
}
override this.ToString() =
sprintf
"{Weather = %s, Temp = %s, PlayTennis = %b}"
this.Weather
this.Temperature
this.PlayTennis

type Decision = Attribute * Value

type DecisionTreeNode =
| Branch of Decision * DecisionTreeNode * DecisionTreeNode
| Leaf of Record list

// ------------------------------------

// Splits a record list into an optimal split and the left / right branches.
// (This is where you use the entropy function to maxamize information gain.)
// Record list -> Decision * Record list * Record list
let bestSplit data =
// Just group by weather, then by temperature
let uniqueWeathers =
List.fold
(fun acc item -> Set.add item.Weather acc)
Set.empty
data

let uniqueTemperatures =
List.fold
(fun acc item -> Set.add item.Temperature acc)
Set.empty
data

if uniqueWeathers.Count = 1 then
let bestSplit = ("Temperature", uniqueTemperatures.MinimumElement)
let left, right =
List.partition
(fun item -> item.Temperature = uniqueTemperatures.MinimumElement)
data
(bestSplit, left, right)
else
let bestSplit = ("Weather", uniqueWeathers.MinimumElement)
let left, right =
List.partition
(fun item -> item.Weather = uniqueWeathers.MinimumElement)
data
(bestSplit, left, right)

let rec determineBranch data =
if List.length data < 4 then
Leaf(data)
else
// Use the entropy function to break the dataset on
// the category / value that best splits the data
let bestDecision, leftBranch, rightBranch = bestSplit data
Branch(
bestDecision,
determineBranch leftBranch,
determineBranch rightBranch)

// ------------------------------------

let rec printID3Result indent branch =
let padding = new System.String(' ', indent)
match branch with
| Leaf(data) ->
data |> List.iter (fun item -> printfn "%s%s" padding <| item.ToString())
| Branch(decision, lhs, rhs) ->
printfn "%sBranch predicate [%A]" padding decision
printfn "%sWhere predicate is true:" padding
printID3Result (indent + 4) lhs
printfn "%sWhere predicate is false:" padding
printID3Result (indent + 4) rhs


// ------------------------------------

let dataset =
[
{ Weather = "windy"; Temperature = "hot"; PlayTennis = false }
{ Weather = "windy"; Temperature = "cool"; PlayTennis = false }
{ Weather = "nice"; Temperature = "cool"; PlayTennis = true }
{ Weather = "nice"; Temperature = "cold"; PlayTennis = true }
{ Weather = "humid"; Temperature = "hot"; PlayTennis = false }
]

printfn "Given input list:"
dataset |> List.iter (printfn "%A")

printfn "ID3 split resulted in:"
let id3Result = determineBranch dataset
printID3Result 0 id3Result

关于f# - 给定真值表需要创建二叉树的帮助,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/1418829/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com