
Python: Find the longest word in a string

Translation · Author: bug小助手 · Updated: 2023-10-26 22:24:47



I'm preparing for an exam but I'm having difficulties with one past-paper question. Given a string containing a sentence, I want to find the longest word in that sentence and return that word and its length. Edit: I only needed to return the length but I appreciate your answers for the original question! It helps me learn more. Thank you.




For example: string = "Hello I like cookies". My program should then return "cookies" and the length 7.



Now the thing is that, for a full score, I am not allowed to use any function from the class String and I can only go through the string once. I am not allowed to use string.split() (otherwise there wouldn't be any problem), and the solution shouldn't have too many for and while statements. The string contains only letters and blanks, and words are separated by one single blank.



Any suggestions? I'm lost i.e. I don't have any code.




Thanks.




EDIT: I'm sorry, I misread the exam question. You only have to return the length of the longest word it seems, not the length + the word.




EDIT2: Okay, with your help I think I'm onto something...




def longestword(x):
    alist = []
    length = 0
    for letter in x:
        if letter != " ":
            length += 1
        else:
            alist.append(length)
            length = 0
    return alist


But it returns [5, 1, 4] for "Hello I like cookies" so it misses "cookies". Why? EDIT: Ok, I got it. It's because there's no more " " after the last letter in the sentence and therefore it doesn't append the length. I fixed it so now it returns [5, 1, 4, 7] and then I just take the maximum value.
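
The fix described above might look roughly like the sketch below. It assumes, as the exam question states, that the input holds only letters and single blanks; the name word_lengths is made up for illustration and is not from the original post.

```python
def word_lengths(sentence):
    """Return the length of every word in a sentence of letters and single blanks."""
    lengths = []
    length = 0
    for letter in sentence:
        if letter != " ":
            length += 1
        else:
            lengths.append(length)
            length = 0
    # No blank follows the last word, so its length is appended after the loop.
    lengths.append(length)
    return lengths


print(word_lengths("Hello I like cookies"))       # [5, 1, 4, 7]
print(max(word_lengths("Hello I like cookies")))  # 7
```

Appending once more after the loop is what catches the final word; an empty string yields [0] under this sketch.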




I suppose using lists but not .split() is okay? It just said that functions from "String" weren't allowed. Or are lists part of strings?


Comments

Technically, you can import the string module and call split from there because you wouldn't be using it from the string class. Maybe this is a test of knowing the documentation.


If you test only that the letter is not " ", it will fail with punctuation, e.g. "Hello, I like cookies!". You'd better test that the character is not a letter.

@FrancisColas my bad again. It says in the exam question that "The text contains only letters and blanks, and each word is separated by one single blank". Then it's fine?


Yes, in that case it works fine: your test is also faster than mine and simpler than using regular expressions.


Recommended answers

You can try to use regular expressions:




import re

string = "Hello I like cookies"
word_pattern = r"\w+"

regex = re.compile(word_pattern)
words_found = regex.findall(string)

if words_found:
    longest_word = max(words_found, key=len)
    print(longest_word)


Finding a max in one pass is easy:




current_max = 0
for v in values:
    if v > current_max:
        current_max = v


But in your case, you need to find the words. Remember this quote (attributed to J. Zawinski):




Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.




Besides using regular expressions, you can simply check which characters are letters. A first approach is to go through the string character by character and detect the start or end of words:



import string

current_word = ''
current_longest = ''
for c in mystring:
    if c in string.ascii_letters:
        current_word += c
    else:
        # A non-letter ends the current word
        if len(current_word) > len(current_longest):
            current_longest = current_word
        current_word = ''
else:
    # Runs once the loop finishes: the last word may have no separator after it
    if len(current_word) > len(current_longest):
        current_longest = current_word


A final way is to split words in a generator and find the max of what it yields (here I used the max function):




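import string

def split_words(mystring):
    current = []
    for c in mystring:
        if c in string.ascii_letters:
            current.append(c)
        elif current:
            yield ''.join(current)
            current = []  # start collecting the next word
    if current:  # the final word has no separator after it
        yield ''.join(current)

max(split_words(mystring), key=len)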


Just search for groups of non-whitespace characters, then find the maximum by length:




longest = len(max(re.findall(r'\S+',string), key = len))


For Python 3. If several words in the sentence have the same length, it returns the one that appears first.



def findMaximum(word):
    li = word.split()
    op = []
    for i in li:
        op.append(len(i))
    l = op.index(max(op))
    print(li[l])

findMaximum(input("Enter your word:"))


It's quite simple:




def long_word(s):
    n = max(s.split())
    return(n)


In [48]: long_word('a bb ccc dddd')
Out[48]: 'dddd'



I found an error in a previously provided solution; here's the correction:


import string

def longestWord(text):
    current_word = ''
    current_longest = ''
    for c in text:
        if c in string.ascii_letters:
            current_word += c
        else:
            if len(current_word) > len(current_longest):
                current_longest = current_word
            current_word = ''

    # Check the last word, which is not followed by a separator
    if len(current_word) > len(current_longest):
        current_longest = current_word
    return current_longest


I can imagine a few different alternatives. Regular expressions can probably do much of the word splitting you need. This could be a simple option if you understand regexes.



An alternative is to treat the string as a list, iterate over it while keeping track of your index, and look at each character to see if you're ending a word. Then you just need to keep the longest word (largest index difference) and you should find your answer.
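
One way the index-tracking idea could be sketched is shown below. It assumes words are separated by single blanks; the function name longest_word_by_index and its variable names are hypothetical, chosen only for this illustration.

```python
def longest_word_by_index(text):
    """Return (word, length) for the longest blank-separated word in text."""
    best_start, best_end = 0, 0   # slice bounds of the longest word so far
    start = 0                     # index where the current word begins
    for i, ch in enumerate(text):
        if ch == " ":
            # A blank ends the current word, which spans text[start:i].
            if i - start > best_end - best_start:
                best_start, best_end = start, i
            start = i + 1
    # The last word ends at the end of the string, not at a blank.
    if len(text) - start > best_end - best_start:
        best_start, best_end = start, len(text)
    word = text[best_start:best_end]
    return word, len(word)


print(longest_word_by_index("Hello I like cookies"))  # ('cookies', 7)
```

Only two pairs of indices are stored, and the word itself is sliced out once at the end. Because the comparison is strict, the first of several equally long words is kept.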



Regular expressions seem to be your best bet. First use re to split the sentence:



>>> import re
>>> string = "Hello I like cookies"
>>> string = re.findall(r'\S+',string)


\S+ matches each run of non-whitespace characters, and findall puts the matches in a list:



>>> string
['Hello', 'I', 'like', 'cookies']


Now you can find the length of the list element containing the longest word and then use list comprehension to retrieve the element itself:




>>> maxlen = max(len(word) for word in string)
>>> maxlen
7
>>> [word for word in string if len(word) == maxlen]
['cookies']


This method uses only one for loop, doesn't use any methods of the String class, and accesses each character strictly once. You may have to modify it depending on which characters count as part of a word.



s = "Hello I like cookies"
word = ''
maxLen = 0
maxWord = ''
for c in s + ' ':  # the appended blank flushes the final word
    if c == ' ':
        if len(word) > maxLen:
            maxLen = len(word)
            maxWord = word
        word = ''
    else:
        word += c

print("Longest word:", maxWord)
print("Length:", len(maxWord))


Given you are not allowed to use string.split() I guess using a regexp to do the exact same thing should be ruled out as well.




I do not want to solve your exercise for you, but here are a few pointers:





  • Suppose you have a list of numbers and you want to return the highest value. How would you do that? What information do you need to track?

  • Now, given your string, how would you build a list of all word lengths? What do you need to keep track of?

  • Now, you only have to intertwine both logics so computed word lengths are compared as you go through the string.
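
For readers who want something to check their own attempt against, here is one possible way the two pieces could be combined. It is only a sketch that assumes letters and single blanks, and the function name longest_word_length is hypothetical.

```python
def longest_word_length(sentence):
    """Return the length of the longest word, reading the string once."""
    longest = 0   # best length seen so far
    current = 0   # length of the word being read
    for ch in sentence:
        if ch != " ":
            current += 1
        else:
            if current > longest:
                longest = current
            current = 0
    # Compare once more for the final word, which has no trailing blank.
    if current > longest:
        longest = current
    return longest


print(longest_word_length("Hello I like cookies"))  # 7
```

No list of lengths is built: the running maximum is updated each time a word ends, so the string is traversed once with a single loop.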



My proposal ...



import re

def longer_word(sentence):
    word_list = re.findall(r"\w+", sentence)
    # Sort by length, longest first (Python 3 replacement for the cmp= argument)
    word_list.sort(key=len, reverse=True)
    longest = word_list[0]
    print("The longer word is '" + longest + "' with a size of", len(longest), "characters.")

longer_word("Hello I like cookies")


import re

def longest_word(sen):
    res = re.findall(r"\w+", sen)
    n = max(res, key=lambda x: len(x))
    return n

print(longest_word("Hey!! there, How is it going????"))


Output : there




Here I have used a regex for the problem. The variable "res" holds the list of all the words that re.findall finds in the string; no call to split() is needed.



findall is a function in the re module that returns all non-overlapping matches of a pattern in a string. The pattern \w+ matches each run of word characters (letters, digits, and underscores), so blanks and punctuation are left out.



Variable "n" holds the longest word in "res", which no longer contains non-word characters such as %, &, or !.

The lambda expression passed as key tells max() to compare the words by len() rather than alphabetically.



>>> # Import the regular expression module.
>>> import re
>>> # Initialize a sentence.
>>> sen = "fun&!! time zone"
>>> # res holds the list of all the words found.
>>> res = re.findall(r"\w+", sen)
>>> res
['fun', 'time', 'zone']
>>> max(res)
'zone'
>>> # Here we get "zone" instead of "time" because, without a key, max()
>>> # compares strings alphabetically and "zone" sorts after "time".
>>> n = max(res, key=lambda x: len(x))
>>> n
'time'


Here we get "time" because the key makes max() compare lengths: "time" and "zone" are both 4 characters long, and max() returns the first of the tied items.



In case you want to ignore punctuation as well, here is a more Pythonic way. It returns the longest string; you can call len on the result if you need the length.


from string import punctuation

def longest_word(text):
    return max("".join([s for s in text if s not in punctuation]).split(" "), key=len)

If two strings have the same length, the first one that appears is returned.


Sample inputs and outputs:


print(longest_word("test1 test2")) # -> test1
print(longest_word("t.est1: test2")) # -> test1
print(longest_word("hy: hello")) # -> hello


list1 = ['Happy', 'Independence', 'Day', 'Zeal']
listLen = []
for i in list1:
    listLen.append(len(i))
print(list1[listLen.index(max(listLen))])


Output - Independence



Comments

Thanks, I'll look at this some more. We haven't talked about regular expressions I think.


That first method fails to detect "cookies" because it's the last word in the string.


Indeed (I had read "I like cookies." with punctuation). I edited with a fix.


I like the generator/max approach, it is elegant and modular. Not sure this is the intended approach for the exercise, but it would deserve full score imo. You would have to tweak it a bit so the word is returned as well, though.


Thank you, very thorough answer.


The OP stated that using split is not allowed.


Okay, I'll just use regex then.


This returns the longest word's length, not the actual word.

@ChaseRoberts Ok no problem, remove the call to len


This does not answer the question: it returns the word that comes last in alphabetical order, not the longest one.

Thanks, your second paragraph inspired me and I think I solved it.


Mmt, regarding your code: it cannot find "cookies" because there is no space at the end of your string, so the last word is never stored in your list.
