
java - Extracting certain values (name, email, phone number) from a text file

Reposted. Author: 行者123. Updated: 2023-12-02 08:24:38

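I have a large batch of emails that I need to extract information from. I recently took over a website that stored all of its customers' contact information in emails. They want to start storing it in a database instead. I'm using Java to try to extract this information, and I'm a bit stuck.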

I can load the emails themselves, but I can't extract the information. Here is a sample email:

> ----------------------------------------------------------------------
> Name: Person's Name
> Phone:=20
> Email: test@testperson.com
> Street:=20
> City:=20
> State:=20
> Zip:=20
> Country:=20
> Arrival: 15 Nov 2010
> Departure: 22 Nov 2010
> Message: This is a message
> ----------------------------------------------------------------------
> Name: Second Person
> Phone:=555-5554
> Email: test@testpsdf.com
> Street:=1234 Main St.
> City:=20
> State:=20
> Zip:=23412
> Country:=20
> Arrival: 15 Nov 2010
> Departure: 22 Nov 2010
> Message: This is a message
> ----------------------------------------------------------------------

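I need to pull the values wherever the field is not just =20. I then need to get all of this information into a table or CSV file somehow so that it can be imported into a MySQL database.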

Edit:

This is actually closer to what the file looks like:

> ----------------------------------------------------------------------
> Name: Erin
> Phone: 401-
> Email: eri
> Street: 737
> City: Paw
> State:
> Zip: 02
> Country: USA
> Arrival: 17 Jul 2011
> Departure: 23 Jul 2011
> Message: I .=20
> ----------------------------------------------------------------------
>=20
> A representative will be in touch shortly.
> Thank You,
>
>=20
Begin forwarded message:

> From:
> Date: July 8, 2010 12:35:13 PM EDT
> To:
> Subject: Thank you for completing our contact form!
>=20
> Thank you for completing our contact form! We received the following =
information from you:
> ----------------------------------------------------------------------
> Name: Ludd
> Phone:=20
> Email: aedu
> Street: 25
> City: Signal
> State:
> Zip:
> Country: USA
> Arrival: 25 Nov 2010
> Departure: 30 Nov 2010
> Message: Not sure if
> ----------------------------------------------------------------------
>=20
> A representative will be in touch shortly.
> Thank You,
>
>=20
Begin forwarded message:

> From:
> Date: July 8, 2010 11:29:49 AM EDT
> To:
> Subject: Thank you for completing our contact form!
>=20
> Thank you for completing our contact form! We received the following =
information from you:
> ----------------------------------------------------------------------
> Name: Stephanie
> Phone: 41
> Email: sgor
> Street: 2-
> City:
> State: On
> Zip: 1J6
> Country:
> Arrival: 18 Aug 2010
> Departure: 21 Aug 2010
> Message:=20
> ----------------------------------------------------------------------
>=20
> A representative will be in touch shortly.
> Thank You,

>=20
Begin forwarded message:

> From:
> Date: July 8, 2010 11:16:36 AM EDT
> To:
> Subject: Thank you for completing our contact form!
>=20
> Thank you for completing our contact form! We received the following =
information from you:
> ----------------------------------------------------------------------
> Name: Stacey
> Phone: 001
> Email: staceymou
> Street: 60
> City: New York
> State: NY
> Zip: 0
> Country: USA
> Arrival: 10 Dec 2010
> Departure: 14 Dec 2010
> Message: Looking to reserve
> ----------------------------------------------------------------------

Best answer

Here is a way to extract all such headers into a Map<String, String>. It uses Google's Guava library to keep things simple:

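import java.io.File;
import java.io.IOException;
import java.util.Map;

import com.google.common.base.Charsets;
import com.google.common.base.Function;
import com.google.common.base.Splitter;
import com.google.common.collect.Iterables;
import com.google.common.collect.Lists;
import com.google.common.collect.Maps;
import com.google.common.io.Files;

public static Map<String, String> readValuesFromFile(final File f)
        throws IOException {

    // Split "Key: value" on the colon, trimming whitespace and dropping empty parts.
    final Splitter splitter =
        Splitter.on(':').trimResults().omitEmptyStrings();

    final Map<String, String> map = Maps.newHashMap();
    boolean inRecord = false;

    // Strip the leading "> " quote marker from each line before parsing.
    for (final String line :
        Lists.transform(
            Files.readLines(f, Charsets.UTF_8),
            new Function<String, String>() {
                @Override
                public String apply(final String input) {
                    return input != null && input.startsWith("> ")
                        ? input.substring(2)
                        : input;
                }
            })) {

        if (line.startsWith("---")) {
            // The sample opens with a separator, so the first one starts the
            // record and only the second one ends it.
            if (inRecord) {
                break;
            }
            inRecord = true;
            continue;
        }
        if (!inRecord) {
            continue; // ignore anything before the first separator
        }
        final String[] items =
            Iterables.toArray(splitter.split(line), String.class);
        // Keep only "Key: value" pairs whose value is not the "=20" placeholder.
        if (items.length == 2 && !items[1].startsWith("=20")) {
            map.put(items[0], items[1]);
        }
    }
    return map;
}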
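
The method above returns a single map, while the question involves many records per file and asks for CSV output. The following is a minimal sketch of one way to cover that using only the JDK; it is not part of the original answer. The class name `ContactFormParser`, the methods `parseRecords`, `cleanValue`, `toCsv` and `quote`, and the `FIELDS` list are all hypothetical names introduced for illustration. It assumes that every record sits between two `---` separator lines, that the field names are exactly the ones in the sample emails, and that `=20` (the quoted-printable encoding of a space) is the only encoding artifact to clean up. It does not decode other quoted-printable sequences or rejoin soft line breaks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ContactFormParser {

    // Field names copied from the sample emails (assumed to be the full set).
    static final List<String> FIELDS = Arrays.asList(
            "Name", "Phone", "Email", "Street", "City", "State",
            "Zip", "Country", "Arrival", "Departure", "Message");

    /** Returns one map per non-empty block of fields between "---" separators. */
    static List<Map<String, String>> parseRecords(List<String> lines) {
        List<Map<String, String>> records = new ArrayList<>();
        Map<String, String> current = null; // null until the first separator
        for (String raw : lines) {
            // Drop the ">" quote marker added when the mail was forwarded.
            String line = raw.startsWith(">") ? raw.substring(1).trim() : raw.trim();
            if (line.startsWith("---")) {
                if (current != null && !current.isEmpty()) {
                    records.add(current);
                }
                // A separator may also open the next record.
                current = new LinkedHashMap<>();
                continue;
            }
            if (current == null) {
                continue;
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String key = line.substring(0, colon).trim();
            if (!FIELDS.contains(key)) {
                continue; // e.g. "Date:" or "Subject:" mail headers
            }
            String value = cleanValue(line.substring(colon + 1));
            if (!value.isEmpty()) {
                current.put(key, value);
            }
        }
        // Keep a trailing record if the file ends without a closing separator.
        if (current != null && !current.isEmpty()) {
            records.add(current);
        }
        return records;
    }

    /** Replaces the "=20" artifact with a space and trims the result. */
    static String cleanValue(String value) {
        return value.replace("=20", " ").trim();
    }

    /** Renders the records as CSV with a header row and one row per record. */
    static String toCsv(List<Map<String, String>> records) {
        StringBuilder sb = new StringBuilder(String.join(",", FIELDS)).append('\n');
        for (Map<String, String> record : records) {
            List<String> cells = new ArrayList<>();
            for (String field : FIELDS) {
                cells.add(quote(record.getOrDefault(field, "")));
            }
            sb.append(String.join(",", cells)).append('\n');
        }
        return sb.toString();
    }

    /** Wraps a cell in double quotes, doubling any embedded quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }
}
```

To use it, the lines of a mail file could be read with `java.nio.file.Files.readAllLines` and passed to `parseRecords`, and the string returned by `toCsv` written to a file for import into MySQL. Splitting on the first colon only keeps values that themselves contain a colon, such as a time in the Message field.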

Regarding java - Extracting certain values (name, email, phone number) from a text file, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/4783780/
