ruby - Split a string into chunks (of different sizes) without breaking words

Reposted. Author: 太空宇宙. Updated: 2023-11-03 16:01:07

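I am trying to write a method that, given a string, returns three strings: title, description1 and description2.

Here is a related question I found: Split a string into chunks of specified size without breaking words. In my case, however, the chunks have different sizes.

The title can have at most 25 characters.

Description1 can have at most 35 characters.

Description2 can have at most 35 characters.

The question is:

How can I split a string so that it produces at most three entities (note: if the string fits entirely in the first entity, that is fine; I don't need to return all three), where the first entity has at most 25 characters and the other two have at most 35 characters each? The method should be smart enough to respect words (and possibly punctuation), so that it never returns text cut in the middle of a word.

This is what I have done so far: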

def split_text_to_entities(big_string)
  # Fixed-offset slices: these ignore word boundaries entirely.
  title = big_string[0..24]
  description1 = big_string[25..59]
  description2 = big_string[60..94]
  [title, description1, description2]
end

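The problem with this approach is that if the input is "Buy our new brand shoes from our store. Best discounts in town and 40% off for first purchase.", the result will be: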

title = "Buy our new brand shoes f"
description1 = "rom our store. Best discounts in to"
description2 = "wn and 40% off for first purchase."

Ideally, they should be:

title = "Buy our new brand shoes"
description1 = "from our store. Best discounts in"
description2 = "town and 40% off for first"

In other words, I want to split by character count while keeping words intact.
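
The requirements above can be written down as a small check, which is handy for testing any candidate solution. This is only a sketch: `valid_split?` is a hypothetical helper name, and it assumes that words are separated by whitespace and that the chunks should reproduce the leading words of the input in order.

```ruby
# Hypothetical helper: checks a proposed split against the rules in the question.
# limits holds the maximum size of each chunk, in order.
def valid_split?(text, chunks, limits = [25, 35, 35])
  return false if chunks.size > limits.size
  # Every chunk must respect its own size limit.
  return false unless chunks.zip(limits).all? { |chunk, max| chunk.size <= max }

  words = text.split
  taken = chunks.flat_map(&:split)
  # The chunks must reproduce the leading words of the text, unbroken and in order.
  taken == words.first(taken.size)
end

text = "Buy our new brand shoes from our store. Best discounts in " \
       "town and 40% off for first purchase."

naive = [text[0..24], text[25..59], text[60..94]]
ideal = ["Buy our new brand shoes",
         "from our store. Best discounts in",
         "town and 40% off for first"]

p valid_split?(text, naive) #=> false
p valid_split?(text, ideal) #=> true
```

The naive fixed-offset split fails only the last check: its chunks fit the size limits, but "f" and "rom" are not words of the input.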

Best answer

To cover all the bases, I would do the following.

Code

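def divide_text(str, max_chars)
  max_chars.map do |n|
    # Non-mutating lstrip, so the caller's string is left untouched.
    str = str.lstrip
    # Longest prefix of at most n characters that ends just before
    # whitespace or at the end of the string; '' if no such prefix exists.
    s = str[/^.{,#{n}}(?=\s|\z)/] || ''
    str = str[s.size..-1]
    s
  end
end

(?=\s|\z) is a (zero-width) positive lookahead that requires the match to end just before whitespace or at the end of the string, so a chunk never ends in the middle of a word. A lookahead on \b is not enough here: \b also matches between a space and the word that follows it, and between a letter and adjacent punctuation, so a chunk could keep a trailing space or lose its final period.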
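
To make the role of the lookahead concrete, the sketch below compares where two candidate lookaheads let a prefix end: one on a word boundary (`\b`) and one on whitespace or end of string (`\s|\z`). It is an illustration only; the variable names are made up for this example.

```ruby
text = "Buy our new brand shoes from our store."

# Lookahead on a word boundary: a boundary also sits between a space and
# the next word, so the prefix may keep a trailing space.
by_boundary = text[/^.{,25}(?=\b)/]

# Lookahead on whitespace or end of string: the prefix ends right after a word.
by_whitespace = text[/^.{,25}(?=\s|\z)/]

p by_boundary   #=> "Buy our new brand shoes "
p by_whitespace #=> "Buy our new brand shoes"

# A boundary also sits between a letter and a following period, and there is
# none after a final period, so trailing punctuation can be split off.
tail = "from our store."
p tail[/^.{,35}(?=\b)/]     #=> "from our store"
p tail[/^.{,35}(?=\s|\z)/]  #=> "from our store."
```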

Examples

max_nbr_chars = [25,35,35]

str = "Buy our new brand shoes from our store. Best discounts in " +
"town and 40% off for first purchase."
divide_text(str, max_nbr_chars)
#=> ["Buy our new brand shoes",
# "from our store. Best discounts in",
# "town and 40% off for first"]

str = "Buy our new brand shoes from our store."
divide_text(str, max_nbr_chars)
#=> ["Buy our new brand shoes", "from our store.", ""]

str = "Buy our new"
divide_text(str, max_nbr_chars)
#=> ["Buy our new", "", ""]

str = ""
divide_text(str, max_nbr_chars)
#=> ["", "", ""]

str = "Buyournewbrandshoesfromourstore."
divide_text(str, max_nbr_chars)
#=> ["", "Buyournewbrandshoesfromourstore.", ""]

str = "Buyournewbrandshoesfromourstoreandshoesfromourstore."
divide_text(str, max_nbr_chars)
#=> ["", "", ""]
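
The question asks for named results (title, description1, description2) and for at most three entities rather than always three. One way to get that shape is sketched below. It is an assumption-laden sketch: `LIMITS` is a hypothetical constant, the method name is borrowed from the question, and the chunker inside breaks only before whitespace or at the end of the string.

```ruby
# Hypothetical constant: field names and size limits taken from the question.
LIMITS = { title: 25, description1: 35, description2: 35 }.freeze

# Returns a hash with only the fields that received some text.
def split_text_to_entities(text, limits = LIMITS)
  rest = text
  limits.each_with_object({}) do |(name, max), result|
    rest = rest.lstrip
    # \A anchors at the very start of the string; the lookahead makes the
    # chunk end just before whitespace or at the end of the string.
    chunk = rest[/\A.{,#{max}}(?=\s|\z)/] || ''
    rest = rest[chunk.size..-1]
    result[name] = chunk unless chunk.empty?
  end
end

p split_text_to_entities("Buy our new brand shoes from our store.")
#=> {:title=>"Buy our new brand shoes", :description1=>"from our store."}
# (Ruby 3.4 and later print the same hash as {title: "...", description1: "..."}.)
```

With this shape, a field is simply absent when nothing fits in it, for example when the next word is longer than that field's limit.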

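Note that if ^ is omitted from the regular expression, the match is no longer anchored to the start of the string, so it can begin partway through a word: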

str = "Buyournewbrandshoesfromourstore."
divide_text(str, max_nbr_chars)
#=> ["ewbrandshoesfromourstore.", "rstore.", ""]
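
For comparison, the same splitting can be done without a regular expression by packing whole words into each chunk. This is a swapped-in technique, not the method from the answer: `divide_words` is a hypothetical name, and unlike the regex version it collapses runs of whitespace into single spaces.

```ruby
# Hypothetical alternative: greedy word packing instead of a regex.
def divide_words(text, limits)
  words = text.split
  limits.map do |limit|
    chunk = ''
    until words.empty?
      # Join with a single space unless this is the first word of the chunk.
      candidate = chunk.empty? ? words.first : "#{chunk} #{words.first}"
      break if candidate.size > limit
      chunk = candidate
      words.shift
    end
    chunk
  end
end

text = "Buy our new brand shoes from our store. Best discounts in " \
       "town and 40% off for first purchase."
p divide_words(text, [25, 35, 35])
#=> ["Buy our new brand shoes", "from our store. Best discounts in", "town and 40% off for first"]

p divide_words("Buyournewbrandshoesfromourstore.", [25, 35, 35])
#=> ["", "Buyournewbrandshoesfromourstore.", ""]
```

A word that is too long for a chunk is left for the next one, so a chunk can be empty while a later chunk is not.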

This question, "ruby - Split a string into chunks (of different sizes) without breaking words", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/24871475/
