gpt4 book ai didi

vba - 如何以编程方式迭代 Word 文档中的下标、上标和方程

转载 作者:行者123 更新时间:2023-12-02 03:43:06 34 4
gpt4 key购买 nike

我有一些 Word 文档,每个文档都包含数百页的科学数据,其中包括:

  • 化学式(H2SO4 以及所有正确的下标和上标)
  • 科学数字(使用上标格式化的指数)
  • 大量数学方程。使用 Word 中的数学方程编辑器编写。

问题是,以 Word 形式存储这些数据对我们来说效率不高。所以我们想将所有这些信息存储在数据库(MySQL)中。我们希望将这些格式转换为 LaTex。

有没有办法使用 VBA 迭代所有下标、上标和方程?

迭代数学方程怎么样?

最佳答案

基于您的comment关于迈克尔的回答

No! I just want to replace content in the subscript with _{ subscriptcontent } and similarly superscript content with ^{ superscriptcontent }. That would be the Tex equivalent. Now, I'll just copy everything to a text file which will remove the formatting but leaves these characters. Problem solved. But for that I need to access the subscript & superscript objects of document

Sub sampler()
Selection.HomeKey wdStory
With Selection.find
.ClearFormatting
.Replacement.ClearFormatting
.Font.Superscript = True
.Replacement.Text = "^^{^&}"
.Execute Replace:=wdReplaceAll
.Font.Subscript = True
.Replacement.Text = "_{^&}"
.Execute Replace:=wdReplaceAll
End With
End Sub

编辑

或者,如果您还想将 OMaths 转换为 TeX/LaTeX,请执行以下操作:

  • 迭代 Omaths > 将每个文件转换为 MathML > [将 MathML 保存到磁盘] + [在描述 MathML 文件引用的文档中放置一些标记来代替 OMath] > 将 Word 文件转换为文本
  • 现在准备一个转换器,如 MathParser并将 MathML 文件转换为 LateX。
  • 解析文本文件 > 相应地搜索并替换 LaTeX 代码。

如需完全不同的想法,请访问 David Carlisle's blog ,您可能会感兴趣。

更新

模块

Option Explicit

'This module requires the following references:
'Microsoft Scripting Runtime
'MicroSoft XML, v6.0

Private fso As New Scripting.FileSystemObject
Private omml2mml$, mml2Tex$

Public Function ProcessFile(fpath$) As Boolean
'convPath set to my system at (may vary on your system):
omml2mml = "c:\program files\microsoft office\office14\omml2mml.xsl"
'download: http://prdownloads.sourceforge.net/xsltml/xsltml_2.0.zip
'unzip at «c:\xsltml_2.0»
mml2Tex = "c:\xsltml_2.0\mmltex.xsl"

Documents.Open fpath

'Superscript + Subscript
Selection.HomeKey wdStory
With Selection.find
.ClearFormatting
.Replacement.ClearFormatting

'to make sure no paragraph should contain any emphasis
.Text = "^p"
.Replacement.Text = "^&"
.Replacement.Font.Italic = False
.Replacement.Font.Bold = False
.Replacement.Font.Superscript = False
.Replacement.Font.Subscript = False
.Replacement.Font.SmallCaps = False
.Execute Replace:=wdReplaceAll


.Font.Italic = True
.Replacement.Text = "\textit{^&}"
.Execute Replace:=wdReplaceAll

.Font.Bold = True
.Replacement.Text = "\textbf{^&}"
.Execute Replace:=wdReplaceAll

.Font.SmallCaps = True
.Replacement.Text = "\textsc{^&}"
.Execute Replace:=wdReplaceAll


.Font.Superscript = True
.Replacement.Text = "^^{^&}"
.Execute Replace:=wdReplaceAll


.Font.Subscript = True
.Replacement.Text = "_{^&}"
.Execute Replace:=wdReplaceAll
End With

Dim dict As New Scripting.Dictionary
Dim om As OMath, t, counter&, key$
key = Replace(LCase(Dir(fpath)), " ", "_omath_")
counter = 0

For Each om In ActiveDocument.OMaths
DoEvents
counter = counter + 1
Dim tKey$, texCode$
tKey = "<" & key & "_" & counter & ">"
t = om.Range.WordOpenXML

texCode = TransformString(TransformString(CStr(t), omml2mml), mml2Tex)
om.Range.Select
Selection.Delete
Selection.Text = tKey

dict.Add tKey, texCode

Next om

Dim latexDoc$, oPath$
latexDoc = "\documentclass[10pt]{article}" & vbCrLf & _
"\usepackage[utf8]{inputenc} % set input encoding" & vbCrLf & _
"\usepackage{amsmath,amssymb}" & vbCrLf & _
"\begin{document}" & vbCrLf & _
"###" & vbCrLf & _
"\end{document}"

oPath = StrReverse(Mid(StrReverse(fpath), InStr(StrReverse(fpath), "."))) & "tex"
'ActiveDocument.SaveAs FileName:=oPath, FileFormat:=wdFormatText, Encoding:=1200
'ActiveDocument.SaveAs FileName:=oPath, FileFormat:=wdFormatText, Encoding:=65001
ActiveDocument.Close

Dim c$, i
c = fso.OpenTextFile(oPath).ReadAll()

counter = 0

For Each i In dict
counter = counter + 1
Dim findText$, replaceWith$
findText = CStr(i)
replaceWith = dict.item(i)
c = Replace(c, findText, replaceWith, 1, 1, vbTextCompare)
Next i

latexDoc = Replace(latexDoc, "###", c)

Dim ost As TextStream
Set ost = fso.CreateTextFile(oPath)
ost.Write latexDoc

ProcessFile = True


End Function

Private Function CreateDOM()
Dim dom As New DOMDocument60
With dom
.async = False
.validateOnParse = False
.resolveExternals = False
End With
Set CreateDOM = dom
End Function

Private Function TransformString(xmlString$, xslPath$) As String
Dim xml, xsl, out
Set xml = CreateDOM
xml.LoadXML xmlString
Set xsl = CreateDOM
xsl.Load xslPath
out = xml.transformNode(xsl)
TransformString = out
End Function

调用(从即时窗口):

?ProcessFile("c:\test.doc")

结果将在 c:\ 中创建为 test.tex

<小时/>

模块可能需要修复一些地方。如果是这样,请告诉我。

关于vba - 如何以编程方式迭代 Word 文档中的下标、上标和方程,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/11565839/

34 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com