
java - replaceAll() method using parameters from a text file

Reposted. Author: 行者123. Updated: 2023-12-04 05:43:29

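I have a set of raw text in a database table, and I need to replace some of the words in it with a set of replacement words.
I put all the terms to be replaced, together with their substitutes, in a text file like this: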

min=admin
lelet=lambat
lemot=lambat
nii=nih
ntu=itu

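and so on.
I have already set up the File and Scanner variables to read the collection of terms and their substitutes.

I loop over the whole data set and store the raw text in a string
inside that same loop
I loop over the term collection, store each line in a string named "pattern", and split the pattern into two strings named "term" and "replacer"
inside this loop, I create a new string whose value is the data-set string modified by replaceAll(term,replacer)
end of the loop over the term collection
then I insert the new string into another table in the database
end of the loop over the data set

When I do it manually, like this:
replaceAll("min","admin")
it works, but that would mean hand-coding nearly 2000 terms.

Has anyone faced this kind of thing before?
I really need help right now :(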
package sentimenrepo;
import javax.swing.*;
import java.sql.*;
import java.io.*;
//import java.util.HashMap;
import java.util.Scanner;
//import java.util.Map;
/**
 *
 * @author herman
 */
public class synonimReplaceV2 extends SwingWorker {
    protected Object doInBackground() throws Exception {
        new skripsisentimen.sentimenttwitter().setVisible(true);

        Integer row = 0;
        File synonimV2 = new File("synV2/catatan_kata_sinonim.txt");
        String newTweet = "";
        DB db = new DB();
        Connection conn = db.dbConnect("jdbc:mysql://localhost:3306/tweet", "root", "");
        try{
            Statement select = conn.createStatement();
            select.executeQuery("select * from synonimtweet");
            ResultSet RS = select.getResultSet();
            Scanner scSynV2 = new Scanner(synonimV2);
            while(RS.next()){
                row++;

                String no = RS.getString("no");
                String tweet = " "+ RS.getString("tweet");
                String published = RS.getString("published");
                String label = RS.getString("label");
                clean2 cleanv2 = new clean2();

                newTweet = cleanv2.cleanTweet(tweet);
                try{
                    Statement insert = conn.createStatement();
                    insert.executeUpdate("INSERT INTO synonimtweet_v2(no,tweet,published,label) values('"
                            +no+"','"+newTweet+"','"+published+"','"+label+"')");
                    String current = skripsisentimen.sentimenttwitter.txtAreaResult.getText();
                    skripsisentimen.sentimenttwitter.txtAreaResult.setText(current+"\n"+row+"original : "+tweet+"\n"+newTweet+"\n______________________\n");
                    skripsisentimen.sentimenttwitter.lblStat.setText(row+" tweet read");
                    skripsisentimen.sentimenttwitter.txtAreaResult.setCaretPosition(skripsisentimen.sentimenttwitter.txtAreaResult.getText().length() - 1);

                }catch(Exception e){
                    skripsisentimen.sentimenttwitter.lblStat.setText(e.getMessage());

                }

            }
        }catch(Exception e){
            skripsisentimen.sentimenttwitter.lblStat.setText(e.getMessage());

        }
        return row;
    }
    class clean2{

        public clean2(){}

        public String cleanTweet(String tweet){
            File synonimV2 = new File("synV2/catatan_kata_sinonim.txt");

            String pattern = "";
            String term = "";
            String replacer = "";
            String newTweet="";
            try{
                Scanner scSynV2 = new Scanner(synonimV2);
                while(scSynV2.hasNext()){
                    pattern = scSynV2.next();
                    term = pattern.split("=")[0];
                    replacer = pattern.split("=")[1];
                    newTweet = tweet.replace(term, replacer);
                }
            }catch(Exception e){
                e.printStackTrace();
            }

            System.out.println(newTweet+"\n"+tweet);
            return newTweet;

        }
    }

}

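Update

I just realized that the code actually works, but only for the first row in the database, not for the second row and onward. Here is the latest code I have built: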
public class synonimReplaceV2 extends SwingWorker {

    protected Object doInBackground() throws Exception {
        new skripsisentimen.sentimenttwitter().setVisible(true);

        Integer row = 0;

        String newTweet = "";
        DB db = new DB();
        Connection conn = db.dbConnect("jdbc:mysql://localhost:3306/tweet", "root", "");
        try{
            Statement select = conn.createStatement();
            select.executeQuery("select * from synonimtweet limit 2,10");
            ResultSet RS = select.getResultSet();
            FileReader readSyn = new FileReader("synV2/catatan_kata_sinonim.txt");
            BufferedReader buffSyn = new BufferedReader(readSyn);
            while(RS.next()){
                row++;
                String no = RS.getString("no");
                String tweet = " "+ RS.getString("tweet");
                String published = RS.getString("published");
                String label = RS.getString("label");
                String pattern = "";
                while((pattern=buffSyn.readLine())!=null){
                    String patternTerm = pattern.split("=")[0];
                    String patternSubs = pattern.split("=")[1];
                    tweet = tweet.replaceAll("\\s"+patternTerm, patternSubs);
                }

                try{
                    Statement insert = conn.createStatement();
                    insert.executeUpdate("INSERT INTO synonimtweet_v2(no,tweet,published,label) values('"
                            +no+"','"+tweet+"','"+published+"','"+label+"')");
                    String current = skripsisentimen.sentimenttwitter.txtAreaResult.getText();
                    skripsisentimen.sentimenttwitter.txtAreaResult.setText(current+"\n"+row+"original : "+tweet+"\n"+newTweet+"\n______________________\n");
                    skripsisentimen.sentimenttwitter.lblStat.setText(row+" tweet read");
                    skripsisentimen.sentimenttwitter.txtAreaResult.setCaretPosition(skripsisentimen.sentimenttwitter.txtAreaResult.getText().length() - 1);

                }catch(Exception e){
                    skripsisentimen.sentimenttwitter.lblStat.setText(e.getMessage());
                }


            }
        }catch(Exception e){
            skripsisentimen.sentimenttwitter.lblStat.setText(e.getMessage());
            // System.out.println(e.getMessage());
        }
        Thread.sleep(100);
        return row;
    }
}

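Best answer

Opening the synonym file and iterating over its 2,000+ lines for every row in the ResultSet is rather wasteful.

Load your synonyms into an in-memory map, keyed on the unique misspelling, then look up the words of each row in the result set on that map and replace them as needed.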
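
As a rough illustration of that suggestion, here is a minimal sketch. The class name SynonymReplacer and its methods are hypothetical and are not part of the original code. It assumes the file holds one term=replacement pair per line, and that a term should be replaced only when it appears as a whole whitespace-separated word. That is an assumption: it differs from the substring matching done by replace()/replaceAll() in the question, it will not match a word with punctuation attached, and it collapses runs of whitespace into single spaces.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: reads the synonym file once, then reuses the map for every row.
class SynonymReplacer {

    // Parses "term=replacement" lines into a lookup map; malformed lines are skipped.
    static Map<String, String> parseSynonyms(List<String> lines) {
        Map<String, String> synonyms = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("=", 2);
            if (parts.length == 2 && !parts[0].isEmpty()) {
                synonyms.put(parts[0], parts[1]);
            }
        }
        return synonyms;
    }

    // Reads the whole file a single time; call this once, before the ResultSet loop.
    static Map<String, String> loadSynonyms(Path file) throws IOException {
        return parseSynonyms(Files.readAllLines(file, StandardCharsets.UTF_8));
    }

    // Replaces each whitespace-separated word that is a key in the map (assumed matching rule).
    static String replaceTerms(String tweet, Map<String, String> synonyms) {
        StringBuilder out = new StringBuilder();
        for (String word : tweet.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(synonyms.getOrDefault(word, word));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // In the real program the map would come from
        // loadSynonyms(Paths.get("synV2/catatan_kata_sinonim.txt")).
        Map<String, String> synonyms =
                parseSynonyms(Arrays.asList("min=admin", "lelet=lambat", "nii=nih"));
        System.out.println(replaceTerms("min koneksi lelet nii", synonyms));
        // prints "admin koneksi lambat nih"
    }
}
```

Inside doInBackground(), the map would be built once before while(RS.next()), and each row would then call replaceTerms(tweet, synonyms). Because nothing is read from the file inside the loop, the reader can no longer be exhausted after the first row.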

Regarding "java - replaceAll() method using parameters from a text file", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/10983981/
