
algorithm - Counting the maximum number of words in a sentence within a given string in Go

Reposted. Author: 数据小太阳. Updated: 2023-10-29 03:09:24

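I'm new to Go... I'm looking for ways to optimize and/or fix this algorithm, which counts the maximum number of words in a sentence within a given string. Sentences end with '?', '!' or '.', and the function should return an int >= 0.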

// MaxWordsInSentences returns the maximum number of words in any one sentence.
func MaxWordsInSentences(S string) (result int) {
    r, _ := regexp.Compile("[.||?||!]")
    count := strings.Count(S, ".") + strings.Count(S, "!") + strings.Count(S, "?") // total sentences

    for i := 0; i < count; i++ {
        sentence := r.Split(S, count)[i]
        splitSentence := strings.Split(sentence, " ")

        var R []string
        for _, str := range splitSentence {
            if str != "" {
                R = append(R, str)
            }
        }

        if len(R) > result {
            result = len(R)
        }
    }

    return
}

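Example

Sentence => "One two three four five six seven eight. One two? One two three four five six seven eight nine? One two three! One two three four."

It should return 9 as the result.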

Best answer

On the simple test case you provided, your algorithm appears to work. On real text, it performs poorly.


Consider my simple algorithm:

func maxSentenceWords(s string) int {
    maxWords, nWords := 0, 0
    inWord := false
    for _, r := range s {
        switch r {
        case '.', '?', '!':
            inWord = false
            if maxWords < nWords {
                maxWords = nWords
            }
            nWords = 0
        default:
            if unicode.IsSpace(r) {
                inWord = false
            } else if !inWord {
                inWord = true
                nWords++
            }
        }
        if maxWords < nWords {
            maxWords = nWords
        }
    }
    return maxWords
}

Playground: https://play.golang.org/p/OD8jNW1hyAa

It passes your simple test. A short benchmark (Lorem Ipsum) runs very fast, and a long benchmark (Shakespeare) runs fast.

$ go test words_test.go -run=PeterSO -v -bench=PeterSO -benchmem -timeout=5m
=== RUN TestPeterSO
--- PASS: TestPeterSO (0.00s)
BenchmarkPeterSOL-4 300000 4027 ns/op 0 B/op 0 allocs/op
BenchmarkPeterSOS-4 20 54084832 ns/op 0 B/op 0 allocs/op
$

Consider your complex algorithm:

func MaxWordsInSentences(S string) (result int) {
    r, _ := regexp.Compile("[.||?||!]")
    count := strings.Count(S, ".") + strings.Count(S, "!") + strings.Count(S, "?") // total sentences

    for i := 0; i < count; i++ {
        sentence := r.Split(S, count)[i]
        splitSentence := strings.Split(sentence, " ")

        var R []string
        for _, str := range splitSentence {
            if str != "" {
                R = append(R, str)
            }
        }

        if len(R) > result {
            result = len(R)
        }
    }

    return
}

Playground: https://play.golang.org/p/MCj-XxEid73

It passes your simple test. The short benchmark (Lorem Ipsum) runs slowly, and the long benchmark (Shakespeare) runs for a very long time (killed after 5 minutes).

$ go test words_test.go -run=Ljubon -v -bench=Ljubon -benchmem -timeout=5m
=== RUN TestLjubon
--- PASS: TestLjubon (0.00s)
BenchmarkLjubonL-4 20000 78623 ns/op 6984 B/op 62 allocs/op
*** Test killed with quit: ran too long (6m0s).
$

The test file, words_test.go:

package main

import (
    "fmt"
    "io/ioutil"
    "regexp"
    "strings"
    "testing"
    "unicode"
)

var sentences = "One two three four five six seven eight. One two? One two three four five six seven eight nine? One two three! One two three four."

var loremipsum = `
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
Excepteur sint occaecat cupidatat non proident,
sunt in culpa qui officia deserunt mollit anim id est laborum.
`

var shakespeare = func() string {
    // The Complete Works of William Shakespeare by William Shakespeare
    // http://www.gutenberg.org/files/100/100-0.txt
    data, err := ioutil.ReadFile(`/home/peter/shakespeare.100-0.txt`)
    if err != nil {
        panic(err)
    }
    return string(data)
}()

func maxSentenceWords(s string) int {
    maxWords, nWords := 0, 0
    inWord := false
    for _, r := range s {
        switch r {
        case '.', '?', '!':
            inWord = false
            if maxWords < nWords {
                maxWords = nWords
            }
            nWords = 0
        default:
            if unicode.IsSpace(r) {
                inWord = false
            } else if !inWord {
                inWord = true
                nWords++
            }
        }
        if maxWords < nWords {
            maxWords = nWords
        }
    }
    return maxWords
}

func TestPeterSO(t *testing.T) {
    want := 9
    got := maxSentenceWords(sentences)
    if got != want {
        t.Errorf("want %d; got %d", want, got)
    }
}

func BenchmarkPeterSOL(b *testing.B) {
    for N := 0; N < b.N; N++ {
        maxSentenceWords(loremipsum)
    }
}

func BenchmarkPeterSOS(b *testing.B) {
    for N := 0; N < b.N; N++ {
        maxSentenceWords(shakespeare)
    }
}

// MaxWordsInSentences returns the maximum number of words in any one sentence.
func MaxWordsInSentences(S string) (result int) {
    r, _ := regexp.Compile("[.||?||!]")
    count := strings.Count(S, ".") + strings.Count(S, "!") + strings.Count(S, "?") // total sentences

    for i := 0; i < count; i++ {
        sentence := r.Split(S, count)[i]
        splitSentence := strings.Split(sentence, " ")

        var R []string
        for _, str := range splitSentence {
            if str != "" {
                R = append(R, str)
            }
        }

        if len(R) > result {
            result = len(R)
        }
    }

    return
}

func TestLjubon(t *testing.T) {
    want := 9
    got := MaxWordsInSentences(sentences)
    if got != want {
        t.Errorf("want %d; got %d", want, got)
    }
}

func BenchmarkLjubonL(b *testing.B) {
    for N := 0; N < b.N; N++ {
        MaxWordsInSentences(loremipsum)
    }
}

func BenchmarkLjubonS(b *testing.B) {
    for N := 0; N < b.N; N++ {
        MaxWordsInSentences(shakespeare)
    }
}

func main() {
    s := "One two three four five six seven eight. One two? One two three four five six seven eight nine? One two three! One two three four."
    max := maxSentenceWords(s) // 9
    fmt.Println(max)
    s = "One two three! One two three four"
    max = maxSentenceWords(s) // 4
    fmt.Println(max)
    s = loremipsum
    max = maxSentenceWords(s)
    fmt.Println(max)
}

I call it the law of the instrument, and it may be formulated as follows: Give a small boy a hammer, and he will find that everything he encounters needs pounding.

Abraham Kaplan, The Conduct of Inquiry: Methodology for Behavioral Science, 1964, page 28.


Is Go's regexp package your hammer for pounding any and all text?

Regarding "algorithm - Counting the maximum number of words in a sentence within a given string in Go", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54389477/
