gpt4 book ai didi

excel - 在 VBA 中处理大型分隔文本文件

转载 作者:行者123 更新时间:2023-12-04 21:41:16 26 4
gpt4 key购买 nike

使用 VBA,我需要将当前位于分隔文本文件中的数据(如数百列数万行)“反透视”为规范化形式(四列数百万行);也就是说,结果表将包含列,对于每个单元格:

  • 识别原始表/文件;
  • 识别原始表格中单元格的行;
  • 识别原始表格中单元格的列;
  • 包含该单元格的值。

  • 对于如何有效地完成这项任务的任何想法,我通常会心存感激。

    到目前为止,我已经考虑过使用 ADODB 构造一个 SELECT INTO ... UNION ...构建输出表的查询,但遗憾的是,默认文本文件提供程序限制为 255 列(有哪些不是吗?)。

    Sébastien Lorion 建立了一个了不起的 Fast CSV Reader ,我很想使用它,但我不知道如何在 VBA 中使用它-感谢任何想法(我认为它没有被编译为导​​出 COM 接口(interface),而且我没有重新编译的工具它)。为此,Microsoft 还提供了 TextFieldParser类,但我不知道是否/如何从 VBA 中使用它。

    另一种方法可能是让 Excel >=2007 打开源文件并从那里构造输出表,但直观地“感觉”好像它会产生相当大的浪费开销......

    最佳答案

    已编译但未测试

    Sub UnpivotFile(sPath As String)

    Const DELIM As String = ","
    Const QUOTE As String = """"

    Dim FSO As New FileSystemObject
    Dim arrHeader
    Dim arrContent
    Dim lb As Integer, ub As Integer
    Dim x As Integer
    Dim inData As Boolean
    Dim l As String, fName As String
    Dim fIn As Scripting.TextStream
    Dim fOut As Scripting.TextStream
    Dim tmp As String
    Dim lineNum As Long

    fName = FSO.GetFileName(sPath)

    Set fIn = FSO.OpenTextFile(sPath, ForReading)
    Set fOut = FSO.OpenTextFile(sPath & "_out", ForWriting)
    lineNum = 0

    Do While Not fIn.AtEndOfStream

    l = fIn.ReadLine
    lineNum = lineNum + 1
    arrContent = ParseLineToArray(l, DELIM, QUOTE)

    If Not inData Then
    arrHeader = arrContent
    lb = LBound(arrHeader)
    ub = UBound(arrHeader)
    inData = True
    Else
    For x = lb To ub
    fOut.WriteLine Join(Array(fName, lineNum, _
    QID(arrHeader(x), DELIM, QUOTE), _
    QID(arrContent(x), DELIM, QUOTE)), DELIM)

    Next x
    End If
    Loop
    fIn.Close
    fOut.Close
    End Sub

    'quote if delimiter found
    Function QID(s, d As String, q As String)
    QID = IIf(InStr(s, d) > -1, q & s & q, s)
    End Function


    'Split a string into an array based on a Delimiter and a Text Identifier
    Private Function ParseLineToArray(sInput As String, m_Delim As String, _
    m_TextIdentifier As String) As Variant
    'Dim vArr As Variant
    Dim sArr() As String
    Dim bInText As Boolean
    Dim i As Long, n As Long
    Dim sTemp As String, tmp As String

    If sInput = "" Or InStr(1, sInput, m_Delim) = 0 Then
    'zero length string, or delimiter not present
    'dump all input into single-element array (minus Text Identifier)
    ReDim sArr(0)
    sArr(0) = Replace(sInput, m_TextIdentifier, "")
    ParseLineToArray = sArr()
    Else
    If InStr(1, sInput, m_TextIdentifier) = 0 Then
    'no text identifier so just split and return
    sArr() = Split(sInput, m_Delim)
    ParseLineToArray = sArr()
    Else
    'found the text identifier, so do it the long way
    bInText = False
    sTemp = ""
    n = 0

    For i = 1 To Len(sInput)
    tmp = Mid(sInput, i, 1)
    If tmp = m_TextIdentifier Then
    'just toggle the flag - don't add to string
    bInText = Not bInText
    Else
    If tmp = m_Delim Then
    If Not bInText Then
    'delimiter not within quoted text, so add next array member
    ReDim Preserve sArr(n)
    sArr(n) = sTemp
    sTemp = ""
    n = n + 1
    Else
    sTemp = sTemp & tmp
    End If
    Else
    sTemp = sTemp & tmp
    End If 'character is a delimiter
    End If 'character is a quote marker
    Next i

    ReDim Preserve sArr(n)
    sArr(n) = sTemp

    ParseLineToArray = sArr()
    End If 'has any quoted text
    End If 'parseable

    End Function

    关于excel - 在 VBA 中处理大型分隔文本文件,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/9821013/

    26 4 0
    Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
    广告合作:1813099741@qq.com 6ren.com