gpt4 book ai didi

string - 处理字符串的字符限制

转载 作者:行者123 更新时间:2023-12-04 20:38:28 24 4
gpt4 key购买 nike

单元格可以包含大量字符。我不确定限制,但我正在测试 450 多个字符。在 VBA 中,我可以毫无问题地将该单元格的值插入字符串中,通过 debug.print 读取它,使用 Len(str) 等函数来查找字符数。

我的问题

我要使用的字符串是 HTML 字符串,我在其上应用格式,然后删除 HTML 标记。格式应用没有问题,使用我认为没有必要显示的宏(它很长)但是当需要删除 HTML 标记时,当字符串高于 255 个字符时我会遇到问题。

自己转载看看

这是一段代码的一部分,用于删除有关字体颜色的 HTML 标记,经过调整以使情况突出。要使用它,请选择一个包含 HTML 标记的单元格并运行代码。请注意 - 当长度大于 255 个字符时,它将运行一个无限循环,因此请使用 F8 单步执行并第一次查看 debug.prints。删除被简单地跳过,甚至没有出现任何错误。

Sub removeColorTags()
Dim i As Integer
Dim rng As Range
Dim str As String
Set rng = ActiveCell
i = InStr(rng.Value, "<font")
Do Until i = 0
Debug.Print Len(rng.Value)
str = rng.Value
Debug.Print str 'Displays correctly
rng.Characters(i, 20).Delete
i = InStr(rng.Value, "</font>")
rng.Characters(i, 7).Delete
i = InStr(rng.Value, "<font")
Loop
End Sub

这是您可以在单元格中解析的示例以尝试代码以查看它是否成功而没有问题。它将删除颜色标签,但保留尺寸标签。确保你得到整行(250 个字符)
<font color=#8DB4E2><size=09>Action ligne 3</font><font color=#8DB4E2><size=09>Action ligne 3</font><font color=#8DB4E2><size=09>Action ligne 3</font><font color=#8DB4E2><size=09>Action ligne 3</font><font color=#8DB4E2><size=09>Action ligne 3</font>

这是您可以在单元格中解析的示例,以尝试代码以查看它是否失败。确保你得到整行(450 个字符)
<font color=#8DB4E2><size=09>Action ligne 3</font><font color=#8DB4E2><size=09>Action ligne 3</font><font color=#8DB4E2><size=09>Action ligne 3</font><font color=#8DB4E2><size=09>Action ligne 3</font><font color=#8DB4E2><size=09>Action ligne 3</font><font color=#8DB4E2><size=09>Action ligne 3</font><font color=#8DB4E2><size=09>Action ligne 3</font><font color=#8DB4E2><size=09>Action ligne 3</font><font color=#8DB4E2><size=09>Action ligne 3</font>

我想要什么

我想要一种方法来删除长字符串上的 HTML 标签。在不引用单元格的情况下执行此操作(获取字符串中的值,使用 Replace 或其他方式删除标签)是行不通的,因为在放回值时,格式会丢失。这样做的重点是格式化单元格。

最佳答案

这就是我最终要做的。首先,让我们看一下输入、转换信息和输出的屏幕截图。我从带有文本的普通格式的 excel 单元格开始,然后将其转换为类似于(但不太像)HTML 的内容。这个问题是问我如何在不丢失格式的情况下从 HTMl 字符串(屏幕截图的中间部分)中删除子字符串(HTML 标记)。

enter image description here

这如何回答问题

我需要一种在不丢失超过 255 个字符的单元格格式的情况下删除子字符串的方法。这意味着不使用 characters.insertcharacters.delete ,因为正如蒂姆威廉姆斯指出的那样,它们在 255 个字符后会引起问题。因此,作为一种解决方法,我在要删除的子字符串之间分割输入字符串,记录它们的格式,将它们重新组合在一起,然后使用 characters(x,y).font 重新应用格式.

我将要展示的 sub 逐个字符地扫描 HTML 字符串,并将其记录在一个临时字符串中。当它遇到 HTML 标记时,它会停止记录临时字符串并将其连同与该临时字符串相关的格式记录在一个数组中。然后它读取标签并将“当前格式”更改为 HTML 标签所做的,并在新的临时字符串中再次开始录制。我承认可以通过调用函数来缩短 sub,但它可以工作。

Sub FromHTML(rngToConvert As Range)
Dim i As Integer, j As Integer, k As Integer
Dim strHTML As String, strTemp As String
Dim rng As Range
Dim arr()
Dim lengthFormatted As Integer
Dim optBold As Boolean, optIta As Boolean, optUnd As Boolean, optCol As String, optSize As Integer
Dim inStrTemp As Boolean
Dim nbChars As Integer

Set rng = rngToConvert.Offset(0, 2)
rng.Clear
strHTML = rngToConvert.Value

If InStr(strHTML, "<") = 0 Then Exit Sub


ReDim arr(6, 0)

inStrTemp = False
strTemp = ""
optBold = False
optIta = False
optUnd = False
optCol = "0,0,0"
optSize = "11"

For i = 1 To Len(strHTML)

If Not Mid(strHTML, i, 1) = "<" And Not Mid(strHTML, i, 4) = "[LF]" Then
'All WANTED characters go here
strTemp = strTemp & Mid(strHTML, i, 1)
inStrTemp = True

If Len(strTemp) > 200 Or i = Len(strHTML) Then
'Cuts them shorter than 200 chars
'In retrospect this isn't necessary but doesn't interfere
ReDim Preserve arr(6, j)
arr(0, j) = strTemp
arr(1, j) = optBold
arr(2, j) = optIta
arr(3, j) = optUnd
arr(4, j) = optCol
arr(5, j) = optSize
arr(6, j) = Len(strTemp)
strTemp = ""
j = j + 1
End If
ElseIf Mid(strHTML, i, 4) = "[LF]" Then
'[LF] is what I used to indicate that there was a line change in the original text
ReDim Preserve arr(6, j)
arr(0, j) = strTemp
arr(1, j) = optBold
arr(2, j) = optIta
arr(3, j) = optUnd
arr(4, j) = optCol
arr(5, j) = optSize
arr(6, j) = Len(strTemp)
strTemp = ""
j = j + 1

strTemp = vbLf
inStrTemp = True
i = i + 3

ReDim Preserve arr(6, j)
arr(0, j) = strTemp
arr(1, j) = optBold
arr(2, j) = optIta
arr(3, j) = optUnd
arr(4, j) = optCol
arr(5, j) = optSize
arr(6, j) = Len(strTemp)
strTemp = ""
j = j + 1
Else
If inStrTemp = True Then
'Records the temporary string and the formats it used
ReDim Preserve arr(6, j)
arr(0, j) = strTemp
arr(1, j) = optBold
arr(2, j) = optIta
arr(3, j) = optUnd
arr(4, j) = optCol
arr(5, j) = optSize
arr(6, j) = Len(strTemp)
strTemp = ""
j = j + 1
inStrTemp = False
End If

'If we get here we hit a HTML tag, so we read it and skip to after it
If Mid(strHTML, i, 3) = "<b>" Then
optBold = True
i = i + 2
ElseIf Mid(strHTML, i, 4) = "</b>" Then
optBold = False
i = i + 3
ElseIf Mid(strHTML, i, 3) = "<i>" Then
optIta = True
i = i + 2
ElseIf Mid(strHTML, i, 4) = "</i>" Then
optIta = False
i = i + 3
ElseIf Mid(strHTML, i, 3) = "<u>" Then
optUnd = True
i = i + 2
ElseIf Mid(strHTML, i, 4) = "</u>" Then
optUnd = False
i = i + 3
ElseIf Mid(strHTML, i, 11) Like "<c=???????>" Then
'optCol = RED, GREEN, BLUE
optCol = CInt("&H" & Mid(strHTML, i + 4, 2)) & "," & _
CInt("&H" & Mid(strHTML, i + 6, 2)) & "," & _
CInt("&H" & Mid(strHTML, i + 8, 2))
i = i + 10
ElseIf Mid(strHTML, i, 6) Like "<s=??>" Then
optSize = CInt(Mid(strHTML, i + 3, 2))
i = i + 5
End If
End If
Next

'Filling the cell with unformatted text
For i = 0 To UBound(arr, 2)
'This debug.print shows the tempString that was recorded and the associated formats
Debug.Print arr(0, i) & " Bold=" & arr(1, i) & " Italic=" & arr(2, i) & " Underline=" & arr(3, i) & " RGB=" & arr(4, i) & " Size =" & arr(5, i)
rng.Value = rng.Value + arr(0, i)
Next
'Applying formats according to the arrays
nbChars = 1
For i = 0 To UBound(arr, 2)
If arr(0, i) = vbLf Then
nbChars = nbChars + 1
Else
rng.Characters(nbChars, arr(6, i)).Font.Bold = arr(1, i)
rng.Characters(nbChars, arr(6, i)).Font.Italic = arr(2, i)
rng.Characters(nbChars, arr(6, i)).Font.Underline = arr(3, i)
rng.Characters(nbChars, arr(6, i)).Font.Color = RGB(Split(arr(4, i), ",")(0), Split(arr(4, i), ",")(1), Split(arr(4, i), ",")(2))
rng.Characters(nbChars, arr(6, i)).Font.Size = CInt(arr(5, i))
nbChars = nbChars + arr(6, i)
End If
Next
End Sub

我觉得这个 sub 很复杂,我想用它回答的原因是因为它可以帮助任何试图实现类似目标的人。当然,需要进行一些调整。这是我用来从格式化文本到类似 HTML 的文本的功能。这不是问题的一部分,但有助于理解标签。它基于我在网上找到的一个功能(虽然我不记得在哪里)。如果您想按原样使用两个潜艇,请务必删除 <html></html>此函数放置的 HTML 字符串的开头和结尾处的标签。
Function fnConvert2HTML(myCell As Range) As String
Dim bldTagOn, itlTagOn, ulnTagOn, colTagOn, sizTagOn As Boolean
Dim i, chrCount As Integer
Dim chrCol, chrLastCol, chrSiz, chrLastSiz, htmlTxt As String

bldTagOn = False
itlTagOn = False
ulnTagOn = False
colTagOn = False
sizTagOn = False
chrCol = "NONE"
htmlTxt = "<html>"
chrCount = myCell.Characters.Count

For i = 1 To chrCount
With myCell.Characters(i, 1)
'If (.Font.Color) Then
chrCol = fnGetCol(.Font.Color)
If chrCol <> chrLastCol Then
htmlTxt = htmlTxt & "<c=#" & chrCol & ">"
chrLastCol = chrCol
End If
'End If

If (.Font.Size) Then
chrSiz = .Font.Size
If Len(chrSiz) = 1 Then chrSiz = "0" & chrSiz
If Not chrLastSiz = chrSiz Then
htmlTxt = htmlTxt & "<s=" & chrSiz & ">"
End If
chrLastSiz = chrSiz
End If

If .Font.Bold = True Then
If Not bldTagOn Then
htmlTxt = htmlTxt & "<b>"
bldTagOn = True
End If
Else
If bldTagOn Then
htmlTxt = htmlTxt & "</b>"
bldTagOn = False
End If
End If

If .Font.Italic = True Then
If Not itlTagOn Then
htmlTxt = htmlTxt & "<i>"
itlTagOn = True
End If
Else
If itlTagOn Then
htmlTxt = htmlTxt & "</i>"
itlTagOn = False
End If
End If

If .Font.Underline > 0 Then
If Not ulnTagOn Then
htmlTxt = htmlTxt & "<u>"
ulnTagOn = True
End If
Else
If ulnTagOn Then
htmlTxt = htmlTxt & "</u>"
ulnTagOn = False
End If
End If

If (Asc(.Text) = 10) Then
htmlTxt = htmlTxt & "[LF]"
Else
htmlTxt = htmlTxt & .Text
End If
End With
Next

If bldTagOn Then
htmlTxt = htmlTxt & "</b>"
bldTagOn = False
End If
If itlTagOn Then
htmlTxt = htmlTxt & "</i>"
itlTagOn = False
End If
If ulnTagOn Then
htmlTxt = htmlTxt & "</u>"
ulnTagOn = False
End If
htmlTxt = htmlTxt & "</html>"
fnConvert2HTML = htmlTxt
End Function

关于string - 处理字符串的字符限制,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/38375431/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com