java - Mask all SSNs with a partial mask in a file that contains multiple SSNs

Reposted. Author: 行者123. Updated: 2023-11-30 05:30:31

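Let me say up front that I am terrible with regular expressions. I want to find every instance of a Social Security number in a string and mask everything except the dashes (-) and the last 4 digits of the SSN.

Example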

String someStrWithSSN = "This is an SSN,123-31-4321, and here is another 987-65-8765";
Pattern formattedPattern = Pattern.compile("^\\d{9}|^\\d{3}-\\d{2}-\\d{4}$");
Matcher formattedMatcher = formattedPattern.matcher(someStrWithSSN);

while (formattedMatcher.find()) {
// Here is my first issue. not finding the pattern
}

// My next issue is that my String should end up looking like this:
// "This is an SSN,XXX-XX-4321, and here is another XXX-XX-8765"

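The expected result is that every SSN is found and replaced. The code above should produce the string "This is an SSN,XXX-XX-4321, and here is another XXX-XX-8765".

Best answer

You can simplify this by doing the following: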

String initial = "This is an SSN,123-31-4321, and here is another 987-65-8765";
String processed = initial.replaceAll("\\d{3}\\-\\d{2}(?=\\-\\d{4})","XXX-XX");
System.out.println(initial);
System.out.println(processed);

Output:

This is an SSN,123-31-4321, and here is another 987-65-8765
This is an SSN,XXX-XX-4321, and here is another XXX-XX-8765

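The regex \d{3}\-\d{2}(?=\-\d{4}) matches three digits, a dash, and two digits, but only where they are followed by a dash and four more digits. That trailing part sits inside a lookahead, so it is checked without being consumed and the replacement leaves it untouched. Using replaceAll with this regex produces the desired masking.

Edit:

If you also want this replacement to apply to 9 consecutive digits, you can do the following: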
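As a side note on the question's first issue: both alternatives in the original pattern start with ^, and the second also ends with $, so it can only match when the SSN is at the very start of the input (or is the entire input). The sketch below drops the anchors so that Matcher.find() can locate SSNs anywhere in the text. The class name SsnFindDemo and the helper findAll are illustrative names, not part of the original question or answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SsnFindDemo {
    // Same shape as the dashed alternative in the question, but without the
    // ^ and $ anchors, so it can match in the middle of a longer string.
    private static final Pattern UNANCHORED = Pattern.compile("\\d{3}-\\d{2}-\\d{4}");

    // Collects every dashed SSN-like substring found in the text (hypothetical helper).
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = UNANCHORED.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String s = "This is an SSN,123-31-4321, and here is another 987-65-8765";
        System.out.println(findAll(s)); // prints [123-31-4321, 987-65-8765]
    }
}
```

This only locates the matches. The replaceAll call above is still the shorter route to the masked string.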

String initial = "This is an SSN,123-31-4321, and here is another 987658765";
String processed = initial.replaceAll("\\d{3}\\-\\d{2}(?=\\-\\d{4})","XXX-XX")
.replaceAll("\\d{5}(?=\\d{4})","XXXXX");
System.out.println(initial);
System.out.println(processed);

Output:

This is an SSN,123-31-4321, and here is another 987658765
This is an SSN,XXX-XX-4321, and here is another XXXXX8765

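The regex \d{5}(?=\d{4}) matches 5 digits that are followed by 4 more digits (again through a lookahead, so the last four are not consumed). The second replaceAll call targets these sequences with the appropriate replacement.

Edit: Here is a more robust version of the previous regexes, together with a longer demonstration of how the new ones work: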
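An alternative to chaining two replaceAll calls is to handle both layouts in one pass with capture groups and a backreference. This is a sketch of a different technique, not the answer's method: group 1 captures the optional dash, \1 requires the second separator to be the same (a dash or nothing), and group 2 keeps the last four digits. The class name SinglePassMask and the method mask are made up for the example. Like the two-call version, it has no boundary checks, so it will also mask the first nine digits of a longer digit run.

```java
public class SinglePassMask {
    // (-?) captures an optional dash; \1 demands the same separator again,
    // so "123-31-4321" and "987658765" match but "123-314321" does not.
    private static final String SSN = "\\d{3}(-?)\\d{2}\\1(\\d{4})";

    // $1 re-inserts whatever separator was matched; $2 is the last four digits.
    static String mask(String text) {
        return text.replaceAll(SSN, "XXX$1XX$1$2");
    }

    public static void main(String[] args) {
        String initial = "This is an SSN,123-31-4321, and here is another 987658765";
        System.out.println(mask(initial));
        // prints: This is an SSN,XXX-XX-4321, and here is another XXXXX8765
    }
}
```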

// A Java string literal cannot span several lines, so the text is concatenated.
String initial = "123-45-6789 is a SSN that starts at the beginning of the string, "
    + "and still matches. This is an SSN, 123-31-4321, and here is another 987658765. These "
    + "have 10+ digits, so they don't match: 123-31-43214, and 98765876545. "
    + "This (123-31-4321-blah) has 9 digits, but is followed by a dash, so it doesn't match. "
    + "-123-31-4321 is preceded by a dash, so it doesn't match as well. :123-31-4321 is "
    + "preceded by a non-colon/digit, so it does match. Here's a 4-2-4 non-SSN that would've "
    + "tricked the initial regex: 1234-56-7890. Here's two SSNs in parentheses: (777777777) "
    + "(777-77-7777), and here's four invalid SSNs in parentheses: (7777777778) (777-77-77778) "
    + "(777-778-7777) (7778-77-7777). At the end of the string is a matching SSN: "
    + "998-76-4321";
String processed = initial.replaceAll("(?<=^|[^-\\d])\\d{3}\\-\\d{2}(?=\\-\\d{4}([^-\\d]|$))","XXX-XX")
.replaceAll("(?<=^|[^-\\d])\\d{5}(?=\\d{4}($|\\D))","XXXXX");
System.out.println(initial);
System.out.println(processed);

Output:

123-45-6789 is a SSN that starts at the beginning of the string, and still matches. This is an SSN, 123-31-4321, and here is another 987658765. These have 10+ digits, so they don't match: 123-31-43214, and 98765876545. This (123-31-4321-blah) has 9 digits, but is followed by a dash, so it doesn't match. -123-31-4321 is preceded by a dash, so it doesn't match as well. :123-31-4321 is preceded by a non-colon/digit, so it does match. Here's a 4-2-4 non-SSN that would've tricked the initial regex: 1234-56-7890. Here's two SSNs in parentheses: (777777777) (777-77-7777), and here's four invalid SSNs in parentheses: (7777777778) (777-77-77778) (777-778-7777) (7778-77-7777). At the end of the string is a matching SSN: 998-76-4321

XXX-XX-6789 is a SSN that starts at the beginning of the string, and still matches. This is an SSN, XXX-XX-4321, and here is another XXXXX8765. These have 10+ digits, so they don't match: 123-31-43214, and 98765876545. This (123-31-4321-blah) has 9 digits, but is followed by a dash, so it doesn't match. -123-31-4321 is preceded by a dash, so it doesn't match as well. :XXX-XX-4321 is preceded by a non-colon/digit, so it does match. Here's a 4-2-4 non-SSN that would've tricked the initial regex: 1234-56-7890. Here's two SSNs in parentheses: (XXXXX7777) (XXX-XX-7777), and here's four invalid SSNs in parentheses: (7777777778) (777-77-77778) (777-778-7777) (7778-77-7777). At the end of the string is a matching SSN: XXX-XX-4321
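If the masking runs often, the same boundary rules can be wrapped in a small helper with precompiled patterns. The sketch below is a reformulation, not the answer's exact regexes: it uses negative lookarounds, (?<![-\d]) and (?![-\d]), which should accept and reject the same inputs as the (?<=^|[^-\d]) and ([^-\d]|$) forms above. The class name SsnMasker and the method mask are illustrative.

```java
import java.util.regex.Pattern;

public class SsnMasker {
    // Dashed form: not preceded by a dash or digit, and the trailing
    // "-dddd" must not be followed by another dash or digit.
    private static final Pattern DASHED =
        Pattern.compile("(?<![-\\d])\\d{3}-\\d{2}(?=-\\d{4}(?![-\\d]))");

    // Nine-digit form: not preceded by a dash or digit, and the last
    // four digits must not be followed by a tenth digit.
    private static final Pattern PLAIN =
        Pattern.compile("(?<![-\\d])\\d{5}(?=\\d{4}(?!\\d))");

    // Masks dashed SSNs first, then plain nine-digit ones (hypothetical helper).
    static String mask(String text) {
        String dashedMasked = DASHED.matcher(text).replaceAll("XXX-XX");
        return PLAIN.matcher(dashedMasked).replaceAll("XXXXX");
    }

    public static void main(String[] args) {
        System.out.println(mask(":123-31-4321 and 998-76-4321"));
        // prints: :XXX-XX-4321 and XXX-XX-4321
        System.out.println(mask("(777777777) (7777777778) -123-31-4321"));
        // prints: (XXXXX7777) (7777777778) -123-31-4321
    }
}
```

Compiling the patterns once avoids recompiling the regex on every call, which String.replaceAll does internally.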

Regarding "java - Mask all SSNs with a partial mask in a file that contains multiple SSNs", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57663989/