
regex - CSV (with extra quotes inside field values) to array in ColdFusion

Reposted. Author: 行者123. Updated: 2023-12-01 23:39:51

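I'm using this post to convert a CSV file into an array. Everything works fine. But I received a file that contains extra quotes inside field values, for example:

"bash: "shortcut" is"

"bash: \"shortcut\" is"

So I tried replacing those quotes like this: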

<cffile action="read" file="#filePath#" variable="csvContent">
<cfset csvContent = reReplace(csvContent, '(?:[^,\r\n])"(?:[^,\r\n])', '&quot;', 'ALL')>

<!--- Then do the conversion --->
<cfset array = csvToArray(csv = csvContent)>

But the non-capturing groups don't work. What am I doing wrong?

Is there another way to do this?

Edit 1:

I also tried using cfhttp and got the following error:

<cfhttp name="csvToQuery" method="get" url="#url#" />

Detail : Verify the number of columns specified in the columns attribute and in the target file

Message : Incorrect number of columns in row.

StackTrace :
coldfusion.tagext.net.HttpTag$InvalidColumnsException: Incorrect number of columns in row.
    at coldfusion.tagext.net.HttpTag.connHelper(HttpTag.java:1149)
    at coldfusion.tagext.net.HttpTag.doEndTag(HttpTag.java:1219)
    at cfmfhttp2ecfm308364137.runPage(C:\inetpub\wwwroot\mfhttp.cfm:1)
    at coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:244)
    at coldfusion.tagext.lang.IncludeTag.doStartTag(IncludeTag.java:446)
    at coldfusion.filter.CfincludeFilter.invoke(CfincludeFilter.java:65)
    at coldfusion.filter.IpFilter.invoke(IpFilter.java:64)
    at coldfusion.filter.ApplicationFilter.invoke(ApplicationFilter.java:430)
    at coldfusion.filter.RequestMonitorFilter.invoke(RequestMonitorFilter.java:48)
    at coldfusion.filter.MonitoringFilter.invoke(MonitoringFilter.java:40)
    at coldfusion.filter.PathFilter.invoke(PathFilter.java:112)
    at coldfusion.filter.LicenseFilter.invoke(LicenseFilter.java:30)
    at coldfusion.filter.ExceptionFilter.invoke(ExceptionFilter.java:94)
    at coldfusion.filter.ClientScopePersistenceFilter.invoke(ClientScopePersistenceFilter.java:28)
    at coldfusion.filter.BrowserFilter.invoke(BrowserFilter.java:38)
    at coldfusion.filter.NoCacheFilter.invoke(NoCacheFilter.java:58)
    at coldfusion.filter.GlobalsFilter.invoke(GlobalsFilter.java:38)
    at coldfusion.filter.DatasourceFilter.invoke(DatasourceFilter.java:22)
    at coldfusion.filter.CachingFilter.invoke(CachingFilter.java:62)
    at coldfusion.CfmServlet.service(CfmServlet.java:219)
    at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:89)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
    at coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServletFilter.java:42)
    at coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:501)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:422)
    at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:199)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:314)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:722)

Best answer

Oh, you won't be able to fix this kind of input on your own that easily. A regex will only damage your data further.
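To make that concrete, here is a minimal sketch in plain Java (java.util.regex, not ColdFusion's reReplace) of what the pattern from the question does to one sample line. The class and method names are made up for illustration. A non-capturing group `(?:...)` still consumes the characters it matches, it only skips creating a numbered group, so the character on each side of the quote is replaced along with the quote. Lookarounds leave the neighbours alone, but as far as I know ColdFusion's built-in regex functions do not support lookbehind, and even this variant misreads any field that legitimately contains a quote next to a comma.

```java
class QuoteRegexDemo {

    // Same pattern as in the question: the neighbours are consumed and replaced too.
    static String withNonCapturing(String csv) {
        return csv.replaceAll("(?:[^,\\r\\n])\"(?:[^,\\r\\n])", "&quot;");
    }

    // Lookbehind/lookahead only assert the neighbours; just the quote is replaced.
    static String withLookarounds(String csv) {
        return csv.replaceAll("(?<=[^,\\r\\n])\"(?=[^,\\r\\n])", "&quot;");
    }

    public static void main(String[] args) {
        String line = "\"bash: \"shortcut\" is\",next";

        // prints: "bash:&quot;hortcu&quot;is",next
        System.out.println(withNonCapturing(line));

        // prints: "bash: &quot;shortcut&quot; is",next
        System.out.println(withLookarounds(line));
    }
}
```

The first result has lost the space, the `s` and the `t` around the inner quotes, which is the kind of damage described above.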

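Can you create a small script in Java to handle this? If so, use uniVocity-parsers to read your CSV input and write it back with proper quote escaping.

It is the only CSV parser I know of that can handle broken quote escaping. Try this example: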

import com.univocity.parsers.csv.*;

import java.io.*;
import java.util.*;

public class Test {

    public static void main(String... args) {
        // Configure the parser to tolerate quotes that were not escaped
        CsvParserSettings settings = new CsvParserSettings();
        settings.getFormat().setLineSeparator("\r\n");
        settings.setParseUnescapedQuotes(true); // THIS IS IMPORTANT FOR YOU
        CsvParser parser = new CsvParser(settings);

        String line1 = "something,\"a quoted value \"with unescaped quotes\" can be parsed\", something\r\n";
        System.out.println("Input line: " + line1);

        String line2 = "\"after the newline \r\n you will find \" more stuff\r\n";
        System.out.println("Input line: " + line2);

        List<String[]> allInputLines = parser.parseAll(new StringReader(line1 + line2));

        System.out.println("===============\nParsed input values\n===============");
        int count = 0;
        for (String[] line : allInputLines) {
            System.out.println("From line " + ++count + ":");
            for (String element : line) {
                System.out.println("\t" + element);
            }
            System.out.println();
        }

        // Let's write your output CSV
        StringWriter output = new StringWriter();
        CsvWriterSettings writerSettings = new CsvWriterSettings();
        writerSettings.getFormat().setLineSeparator("\r\n");
        writerSettings.getFormat().setQuoteEscape('\\'); // it seems you are using backslash as quote escape
        writerSettings.getFormat().setCharToEscapeQuoteEscaping('\\'); // when your quote escape character is not the same as the quote character, you might need to escape the escape character as well
        writerSettings.setQuoteAllFields(true); // let's force quotes on all fields so whatever is parsing your input file has more chance of doing it properly
        CsvWriter writer = new CsvWriter(output, writerSettings);

        for (String[] row : allInputLines) {
            writer.writeRow(row);
        }
        writer.close();

        System.out.println("===============\nNicely formatted output\n===============");
        System.out.println(output.toString());
    }
}

This code produces the following output (which your data import tool will probably be able to read):

Input line: something,"a quoted value "with unescaped quotes" can be parsed", something

Input line: "after the newline
you will find " more stuff

===============
Parsed input values
===============
From line 1:
something
a quoted value "with unescaped quotes" can be parsed
something

From line 2:
after the newline
you will find " more stuff


===============
Nicely formatted output
===============
"something","a quoted value \"with unescaped quotes\" can be parsed","something"

"after the newline
you will find \" more stuff"
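For intuition about how a parser can recover the values shown above, here is a rough sketch of one possible heuristic, written with the JDK only. It is not how uniVocity-parsers is implemented, and `LenientQuoteSplitter` and `splitLine` are hypothetical names. The assumption is that a quote inside a quoted field closes the field only when it is followed by a comma or by the end of the line; any other quote is kept as literal text. The sketch handles a single line, so quoted fields that span line breaks (as in the second input above) are out of scope, and a stray quote that happens to sit right before a comma would still be misread.

```java
import java.util.ArrayList;
import java.util.List;

class LenientQuoteSplitter {

    // Hypothetical helper: splits one CSV line, tolerating unescaped inner quotes.
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            boolean atEnd = i + 1 == line.length();

            if (inQuotes) {
                if (c == '\\' && !atEnd && line.charAt(i + 1) == '"') {
                    current.append('"');   // backslash-escaped quote
                    i++;
                } else if (c == '"' && (atEnd || line.charAt(i + 1) == ',')) {
                    inQuotes = false;      // treated as the closing quote
                } else {
                    current.append(c);     // ordinary text, including stray quotes
                }
            } else if (c == '"' && current.length() == 0) {
                inQuotes = true;           // opening quote at the start of a field
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Both sample lines from the question yield the same single value.
        System.out.println(splitLine("\"bash: \"shortcut\" is\""));
        System.out.println(splitLine("\"bash: \\\"shortcut\\\" is\""));
    }
}
```

A real parser has to deal with many more cases than this, which is why using a library is the safer route.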

Disclosure: I am the author of this library. It is open source and free (Apache V2.0 license).
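If it helps to see what the writer settings in the example amount to, the following is a hand-written sketch of the same escaping convention using only the JDK. It is an illustration, not the library's code, and `BackslashCsvQuoting`, `quoteField` and `toRow` are names invented here. It assumes every field is quoted, a backslash escapes a quote, and a backslash is doubled to escape itself, with no null handling.

```java
import java.util.StringJoiner;

class BackslashCsvQuoting {

    // Escape backslashes first, then quotes, then wrap the value in quotes.
    static String quoteField(String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    // Join the quoted fields with commas to form one output row.
    static String toRow(String... fields) {
        StringJoiner row = new StringJoiner(",");
        for (String field : fields) {
            row.add(quoteField(field));
        }
        return row.toString();
    }

    public static void main(String[] args) {
        // prints: "something","a quoted value \"with unescaped quotes\" can be parsed","something"
        System.out.println(toRow(
                "something",
                "a quoted value \"with unescaped quotes\" can be parsed",
                "something"));
    }
}
```

Escaping the backslash before the quote matters: doing it the other way round would double the backslashes that were just added for the quotes.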

ColdFusion 10+ example:

  1. Load the jar in Application.cfc:

    this.javaSettings = { loadPaths: ["C:\path\to\univocity-parsers-1.5.6.jar" ]};
  2. Use createObject to create instances of the parser classes:

    filePath = "c:\path\to\yourFile.csv";
    settings = createObject("java", "com.univocity.parsers.csv.CsvParserSettings").init();
    settings.getFormat().setLineSeparator(chr(13)& chr(10));
    settings.getFormat().setQuoteEscape("\");
    settings.setParseUnescapedQuotes(true); // THIS IS IMPORTANT FOR YOU
    parser = createObject("java", "com.univocity.parsers.csv.CsvParser").init(settings);
    reader = createObject("java", "java.io.StringReader").init(fileRead(filePath));
    arrayOfLines = parser.parseAll(reader);

    // display results
    counter = 1;
    for (line in arrayOfLines) {
        writeOutput("<br>From line " & (counter++) & ":");
        for (element in line) {
            writeOutput("<br>" & element);
        }
    }

Regarding "regex - CSV (with extra quotes inside field values) to array in ColdFusion", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/30711467/
