java - Converting Hive functions to Java - translate and regexp_replace

Reposted. Author: 可可西里. Updated: 2023-11-01 14:50:17

1) How can I convert the Hive snippet below to Java MapReduce?

 translate(regexp_replace(colA,"(\\\\=)","\\\\equalto"),"\[\]\(\)\{\}\^\?\+\*\$","____________") 

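In the regexp_replace I replace every \= sequence, and in the outer translate I replace all the characters that would interfere with later regexp_replace parsing. (If I do not replace these characters, they throw exceptions later.)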

2) Do I have to use replaceChars(), and if so, how?

The sample string has this format:

tag1=573 tag2=ABC 0nuif6d Saturn 0i899 AA 0 (WORD) LOWER 0 (WORD2) HH 0 BB 0 CC 1 LL 0 D 0 FF 0 AB 0 UPPER 0 (ONCOLD) UPPER 1 PART: Sold \= 88vb JJ number\= 0 String "String_here" ANDND JUJFNG fill EXTRA SUNSET: empty tag3=/Informational tag4=/Value tag5=Value1/Value2 tag6=/AB/Acs Sy/Api Afg Hold Cones/HHH+11: 4.3.2-4.3.4 tag6=11123 tag7=Hello World tag8=a-dfdAds\=\= tag9=Value3 tag.9=Space separated words \= 88 , cold 87 Goal Run\=2, LOT OF SPACE SEPARATED GARBAGE WORDS tag.a=0( tag.b=02

Note: the tags are not hard-coded as "tag". They can be any English word, such as serial_number or website.address. For example, in serial_no=hello world website.address=\SO.com=/question the tags are serial_no and website.address.

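Best answer

Description

This expression will:

  • Assume tag names contain no whitespace
  • Assume each tag name is separated from its value by an = sign that has no whitespace on either side and is not preceded by a \
  • Assume each tag name is separated from the preceding string by whitespace, and likewise each value is separated from the next tag name by whitespace
  • Capture the tag name and the value
  • Avoid splitting the string at equals signs embedded in the value side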

(\S*?)(?<!\\)=(\S*.*?)(?=\S*(?<!\\)=|\Z)
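
To make the individual pieces of the expression easier to follow, here is a minimal sketch that compiles the same regex with Pattern.COMMENTS so each part carries an inline note. The class name AnnotatedTagPattern and the helper split are illustrative names introduced here, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnnotatedTagPattern {
    // Same expression as above. With Pattern.COMMENTS, whitespace in the
    // pattern is ignored and '#' starts a comment that runs to end of line.
    static final Pattern TAG = Pattern.compile(
        "(\\S*?)           # group 1: tag name, no whitespace, as short as possible\n" +
        "(?<!\\\\)=        # an equals sign not preceded by a backslash\n" +
        "(\\S*.*?)         # group 2: value, may contain spaces, as short as possible\n" +
        "(?=               # stop (without consuming) where the next tag begins:\n" +
        "  \\S*(?<!\\\\)=  #   non-space run ending in an unescaped equals sign\n" +
        "  |\\Z            #   or the end of the input\n" +
        ")",
        Pattern.COMMENTS | Pattern.DOTALL);

    // Hypothetical helper: returns every match as a {name, value} pair.
    static List<String[]> split(String input) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = TAG.matcher(input);
        while (m.find()) {
            pairs.add(new String[] { m.group(1), m.group(2) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        String sample = "serial_no=hello world website.address=\\SO.com=/question";
        for (String[] pair : split(sample)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

Note that a value keeps the whitespace that separates it from the next tag name, because the lazy part of group 2 has to consume it before the lookahead can succeed.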


You can then reassemble or further process the individual components of the string as needed.
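
As one possible form of that further processing, the sketch below mimics the question's translate(regexp_replace(...)) expression in plain Java. It rests on several assumptions: that the intent is to turn each \= into \equalto, that the characters to neutralise are [ ] ( ) { } ^ ? + * $, and that a character listed in the "from" string with no counterpart in the "to" string is dropped, as in PostgreSQL-style translate. The names HiveStyleCleanup, translate, and clean are hypothetical. If Apache Commons Lang is available, StringUtils.replaceChars(str, searchChars, replaceChars) performs a similar character-for-character mapping.

```java
class HiveStyleCleanup {
    // Assumed set of regex metacharacters taken from the question's Hive expression.
    static final String FROM = "[](){}^?+*$";
    // One underscore per character in FROM (String.repeat needs Java 11+).
    static final String TO = "_".repeat(FROM.length());

    // Character-for-character mapping in the spirit of a SQL translate().
    static String translate(String input, String from, String to) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            int idx = from.indexOf(c);
            if (idx < 0) {
                sb.append(c);               // not listed in 'from': keep as-is
            } else if (idx < to.length()) {
                sb.append(to.charAt(idx));  // listed: map to the character at the same index
            }
            // listed but 'to' is shorter: the character is dropped (assumed semantics)
        }
        return sb.toString();
    }

    // Inner step first (escaped equals sign), then the outer translate step.
    static String clean(String value) {
        // String.replace is a literal replacement, so no regex escaping is needed.
        String escaped = value.replace("\\=", "\\equalto");
        return translate(escaped, FROM, TO);
    }

    public static void main(String[] args) {
        System.out.println(clean("a-dfdAds\\=\\="));
        System.out.println(clean("0("));
    }
}
```

Using the literal String.replace for the inner step sidesteps the double escaping that the regex-based Hive call needs; the order of the two steps follows the nesting in the Hive expression.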

Example


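Sample text

This is the sample text you included in your comments. It is still unclear what defines a tag, or which equals sign separates a name from its value: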

serial_no=hello world website.address=\SO.com=/question
tag1=573 tag2=ABC 0nuif6d Saturn 0i899 AA 0 (WORD) LOWER 0 (WORD2) HH 0 BB 0 CC 1 LL 0 D 0 FF 0 AB 0 UPPER 0 (ONCOLD) UPPER 1 PART: Sold \= 88vb JJ number\= 0 String "String_here" ANDND JUJFNG fill EXTRA SUNSET: empty tag3=/Informational tag4=/Value tag5=Value1/Value2 tag6=/AB/Acs Sy/Api Afg Hold Cones/HHH+11: 4.3.2-4.3.4 tag6=11123 tag7=Hello World tag8=a-dfdAds\=\= tag9=Value3 tag.9=Space separated words \= 88 , cold 87 Goal Run\=2, LOT OF SPACE SEPARATED GARBAGE WORDS tag.a=0( tag.b=02

Sample code

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Module1 {
    public static void main(String[] args) {
        // Placeholder input: replace with the text to parse.
        String sourceString = "source string to match with pattern";
        // DOTALL lets a value (group 2) span line breaks;
        // CASE_INSENSITIVE and MULTILINE have no effect on this pattern.
        Pattern re = Pattern.compile(
            "(\\S*?)(?<!\\\\)=(\\S*.*?)(?=\\S*(?<!\\\\)=|\\Z)",
            Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);

        Matcher m = re.matcher(sourceString);
        int mIdx = 0;
        while (m.find()) {
            // Print group 0 (whole match), group 1 (name), and group 2 (value).
            for (int groupIdx = 0; groupIdx <= m.groupCount(); groupIdx++) {
                System.out.println("[" + mIdx + "][" + groupIdx + "] = " + m.group(groupIdx));
            }
            mIdx++;
        }
    }
}

Matches

Group 0 contains the whole matched substring, group 1 the name field, and group 2 the value field.

[0][0] = serial_no=hello world 
[0][1] = serial_no
[0][2] = hello world

[1][0] = website.address=\SO.com=/question
[1][1] = website.address
[1][2] = \SO.com=/question

[2][0] = tag1=573
[2][1] = tag1
[2][2] = 573

[3][0] = tag2=ABC 0nuif6d Saturn 0i899 AA 0 (WORD) LOWER 0 (WORD2) HH 0 BB 0 CC 1 LL 0 D 0 FF 0 AB 0 UPPER 0 (ONCOLD) UPPER 1 PART: Sold \= 88vb JJ number\= 0 String "String_here" ANDND JUJFNG fill EXTRA SUNSET: empty
[3][1] = tag2
[3][2] = ABC 0nuif6d Saturn 0i899 AA 0 (WORD) LOWER 0 (WORD2) HH 0 BB 0 CC 1 LL 0 D 0 FF 0 AB 0 UPPER 0 (ONCOLD) UPPER 1 PART: Sold \= 88vb JJ number\= 0 String "String_here" ANDND JUJFNG fill EXTRA SUNSET: empty

[4][0] = tag3=/Informational
[4][1] = tag3
[4][2] = /Informational

[5][0] = tag4=/Value
[5][1] = tag4
[5][2] = /Value

[6][0] = tag5=Value1/Value2
[6][1] = tag5
[6][2] = Value1/Value2

[7][0] = tag6=/AB/Acs Sy/Api Afg Hold Cones/HHH+11: 4.3.2-4.3.4
[7][1] = tag6
[7][2] = /AB/Acs Sy/Api Afg Hold Cones/HHH+11: 4.3.2-4.3.4

[8][0] = tag6=11123
[8][1] = tag6
[8][2] = 11123

[9][0] = tag7=Hello World
[9][1] = tag7
[9][2] = Hello World

[10][0] = tag8=a-dfdAds\=\=
[10][1] = tag8
[10][2] = a-dfdAds\=\=

[11][0] = tag9=Value3
[11][1] = tag9
[11][2] = Value3

[12][0] = tag.9=Space separated words \= 88 , cold 87 Goal Run\=2, LOT OF SPACE SEPARATED GARBAGE WORDS
[12][1] = tag.9
[12][2] = Space separated words \= 88 , cold 87 Goal Run\=2, LOT OF SPACE SEPARATED GARBAGE WORDS

[13][0] = tag.a=0(
[13][1] = tag.a
[13][2] = 0(

[14][0] = tag.b=02
[14][1] = tag.b
[14][2] = 02
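
The listing shows tag6 occurring twice, so a plain name-to-value map would silently lose one of the values. A minimal sketch of one way to handle that, assuming repeated tag names should all be kept and that surrounding whitespace in values is not significant, groups the values per name. The class name TagIndex and the method index are illustrative only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagIndex {
    // Same expression as in the sample code; only DOTALL is kept.
    static final Pattern TAG = Pattern.compile(
        "(\\S*?)(?<!\\\\)=(\\S*.*?)(?=\\S*(?<!\\\\)=|\\Z)", Pattern.DOTALL);

    // Groups values by tag name, preserving first-seen order of the names.
    static Map<String, List<String>> index(String input) {
        Map<String, List<String>> byName = new LinkedHashMap<>();
        Matcher m = TAG.matcher(input);
        while (m.find()) {
            String name = m.group(1);
            String value = m.group(2).trim();  // drop the separator whitespace
            byName.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
        return byName;
    }

    public static void main(String[] args) {
        String sample = "tag5=Value1/Value2 tag6=/AB/Acs Sy tag6=11123 tag7=Hello World";
        System.out.println(index(sample));
    }
}
```

A LinkedHashMap is used here so the tags come back in the order they appear in the input, which makes the result easier to compare against the match listing.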

Regarding java - Converting Hive functions to Java - translate and regexp_replace, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/17896339/
