
java - Why does my text parser get stuck in an infinite loop even though the loop is explicitly broken?

Reposted. Author: 行者123. Updated: 2023-12-01 21:11:15

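I have been working on a utility that parses text files in the format Paradox Interactive uses in its grand strategy games, for use with a visual modding tool I am also developing. I have written a rough, mostly implemented early version of the parser, and it works largely as intended. This is my second attempt at writing a text parser (the first, which ended up working fine, parsed a subset of XML).

I wrote the parser quickly on the 9th and spent the whole weekend trying to debug it, but all my efforts failed. I traced the problem to the third line of nextChar(). It threw an ArrayIndexOutOfBoundsException with a very small index (around -2 million). After I added a bounds check, the program just... keeps going. It reads all the information as needed, but it never exits the parsing loop.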

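The format basically looks like this:

car = {
    model_year = 1966
    model_name = "Chevy"
    components = {
        "engine", "frame", "muffler"
    }
}

I have not yet added the nested list support I am planning, though, so my test string is:

car = {
    model_year = 1966
    model_name = "Chevy"
}

For my own understanding, and for anyone else who reads my code, I have tried to comment generously wherever I thought it might be necessary, but I am happy to clarify anything.

My code: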

/**
 * Parses text files in the format used by Paradox Interactive in their computer games EUIV, CK2, and Stellaris.
 *
 * @author DJMethaneMan
 * @date 12/9/2016
 */
public class Parser
{
    private int pos, line, len, depth;
    public String text;
    private char[] script; //TODO: Initialize in the parse method

    public Parser()
    {
        pos = 0;
        line = 1;
        len = 0;
        depth = 0;
        text = "car = {\n" +
               " model_year = 1966 \n" +
               " model_name = \"Chevy\"\n" +
               "}\u0003";
        //text = "Hello World";
        //Car c = new Car();
        //parse(text, c);
    }

    //Standard entry point signature so the class can be launched directly.
    public static void main(String[] args)
    {
        Car c = new Car();
        Parser p = new Parser();
        p.parse(p.text, c);
        System.out.println("The model name is " + c.model_name);
        System.out.println("The model year is " + c.model_year);
    }

    //TODO: Work
    public void parse(String text, Parseable parsed)
    {
        char[] script = text.toCharArray();
        this.script = script;
        boolean next_char = false;
        PARSE_LOOP:while(true)
        {
            char c;
            if(next_char)
            {
                c = nextChar();
            }
            else
            {
                c = script[0];
                next_char = true;
            }

            switch(c)
            {
                case 'A': case 'a': case 'B': case 'b': case 'C': case 'c': case 'D': case 'd':
                case 'E': case 'e': case 'F': case 'f': case 'G': case 'g': case 'H': case 'h':
                case 'I': case 'i': case 'J': case 'j': case 'K': case 'k': case 'L': case 'l':
                case 'M': case 'm': case 'N': case 'n': case 'O': case 'o': case 'P': case 'p':
                case 'Q': case 'q': case 'R': case 'r': case 'S': case 's': case 'T': case 't':
                case 'U': case 'u': case 'V': case 'v': case 'W': case 'w': case 'X': case 'x':
                case 'Y': case 'y': case 'Z': case 'z':
                case '_'://TODO: HERE
                    if(depth > 0) //
                    {
                        parsed.parseRead(buildWordToken(true), this);//Let the class decide how to handle this information. Best solution since I do not know how to implement automatic deserialization.
                    }
                    continueUntilChar('=', false); //A value must be assigned because it is basically a key value pair with {} or a string or number as the value
                    skipWhitespace();//Skip any trailing whitespace straight to the next token.
                    break;
                case '{':
                    depth++;
                    break;
                case '}':
                    depth--;
                    break;
                case '\n':
                    line++;
                    break;
                case ' ':
                case '\t':
                    skipWhitespace();
                    break;
                case '\u0003': //End of Text Character... Not sure if it will work in a file...
                    break PARSE_LOOP;
            }
        }
    }

    //Returns a string from the next valid token
    public String parseString()
    {
        String retval = "";
        continueUntilChar('=', false);
        continueUntilChar('"', false);
        retval = buildWordToken(false);
        continueUntilChar('"', false); //Don't rewind because we want to skip over the quotation and not append it.
        return retval;
    }

    //Returns a double from the next valid token
    public double parseNumber()
    {
        double retval = 0;
        continueUntilChar('=', false); //False because we don't want to include the = in any parsing...
        skipWhitespace(); //In case we encounter whitespace.
        try
        {
            retval = Double.parseDouble(buildNumberToken(false));
        }
        catch(Exception e)
        {
            System.out.println("A token at line " + line + " is not a valid number but is being passed as such.");
        }
        return retval;
    }


    /**********************************Utility Methods for Parsing****************************************/

    protected void continueUntilChar(char target, boolean rewind)
    {
        while(true)
        {
            char c = nextChar();
            if(c == target)
            {
                break;
            }
        }
        if(rewind)
        {
            pos--;
        }
    }

    protected void skipWhitespace()
    {
        while(true)
        {
            char c = nextChar();
            if(!Character.isWhitespace(c))
            {
                break;
            }
        }
        pos--;//Rewind because by default parse increments pos by 1 one when fetching nextChar each iteration.
    }

    protected String buildNumberToken(boolean rewind)
    {
        StringBuilder token = new StringBuilder();
        String retval = "INVALID_NUMBER";
        char token_start = script[pos];
        System.out.println(token_start + " is a valid char for a word token."); //Print it.
        token.append(token_start);
        while(true)
        {
            char c = nextChar();
            if(Character.isDigit(c) || (c == '.' && (Character.isDigit(peek(1)) || Character.isDigit(rewind(1))))) //Makes sure things like 1... and ...1234 don't get parsed as numbers.
            {
                token.append(c);
                System.out.println(c + " is a valid char for a word token."); //Print it for debugging
            }
            else
            {
                break;
            }
        }
        return retval;
    }

    protected String buildWordToken(boolean rewind)
    {
        StringBuilder token = new StringBuilder(); //Used to build the token
        char token_start = script[pos]; //The char the parser first found would make this a valid token
        token.append(token_start); //Add said char since it is part of the token
        System.out.println(token_start + " is a valid char for a word token."); //Print it.
        while(true)
        {
            char c = nextChar();
            if(Character.isAlphabetic(c) || Character.isDigit(c) || c == '_')//Make sure it is a valid token for a word
            {
                System.out.println(c + " is a valid char for a word token."); //Print it for debugging
                token.append(c); //Add it to the token since its valid
            }
            else
            {
                if(rewind)//If leaving the method will make this skip over a valid token set this to true.
                {
                    //Rewind by 1 because the main loop in parse() will still check pos++ and we want to check the pos of the next char after the end of the token.
                    pos--;
                    break; //Leave the loop and return the token.
                }
                else //Otherwise
                {
                    break; //Just leave the loop and return the token.
                }
            }
        }
        return token.toString(); //Get the string value of the token and return it.
    }

    //Returns the next char in the script by amount but does not increment pos.
    protected char peek(int amount)
    {
        int lookahead = pos + amount; //pos + 1;
        char retval = '\u0003'; //End of text character
        if(lookahead < script.length)//Make sure lookahead is in bounds.
        {
            retval = script[lookahead]; //Return the char at the lookahead.
        }
        return retval; //Return it.
    }

    //Returns the previous char in the script by amount but does not decrement pos.
    //Basically see peek only this is the exact opposite.
    protected char rewind(int amount)
    {
        int lookbehind = pos - amount; //pos + 1;
        char retval = '\u0003';
        if(lookbehind > 0)
        {
            retval = script[lookbehind];
        }
        return retval;
    }

    //Returns the next character in the script.
    protected char nextChar()
    {
        char retval = '\u0003';
        pos++;
        if(pos < script.length && !(pos < 0))
        {
            retval = script[pos]; //It says this is causing an ArrayIndexOutOfBoundsException with the following message. Shows a very large (small?) negative number.
        }
        return retval;
    }
}

//TODO: Extend
interface Parseable
{
    public void parseRead(String token, Parser p);
    public void parseWrite(ParseWriter writer);
}


//TODO: Work on
class ParseWriter
{

}

class Car implements Parseable
{
    public String model_name;
    public int model_year;

    @Override
    public void parseRead(String token, Parser p)
    {
        if(token.equals("model_year"))
        {
            model_year = (int)p.parseNumber();
        }
        else if(token.equals("model_name"))
        {
            model_name = p.parseString();
        }
    }

    @Override
    public void parseWrite(ParseWriter writer)
    {
        //TODO: Implement along with the ParseWriter
    }
}

Best answer

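Using a labeled break such as break PARSE_LOOP; is often considered poor style, because it reads like a "goto". Note, however, that a labeled break in Java does not jump back to the label: it terminates the labeled statement, so break PARSE_LOOP; does leave the while(true) loop and, on its own, is unlikely to be the cause of the infinite loop. A more likely culprit is continueUntilChar(): it loops until it sees the target character and never checks for the end of the input. Once the input is exhausted, nextChar() keeps returning '\u0003' and pos keeps growing, so the helper never returns when no further target character exists. Without the bounds check, pos eventually wraps around to a large negative value, which would match the ArrayIndexOutOfBoundsException you saw. Replacing the label with a boolean flag makes the exit condition of the main loop easier to read, but the helper loops still need their own end-of-input check.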
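For reference, here is a minimal sketch of how a labeled break behaves in Java. The class and method names (LabeledBreakDemo, countUntilSentinel) are hypothetical and not part of the question's code; the sketch only assumes the same '\u0003' end-of-text sentinel. It shows that break with a label terminates the labeled loop rather than restarting it:

```java
// Minimal sketch (hypothetical names) of labeled-break semantics in Java.
final class LabeledBreakDemo {
    // Counts the characters that appear before the end-of-text sentinel '\u0003'.
    static int countUntilSentinel(String text) {
        int count = 0;
        int pos = 0;
        SCAN:
        while (true) {
            // Treat running off the end of the string like the sentinel.
            char c = pos < text.length() ? text.charAt(pos) : '\u0003';
            pos++;
            switch (c) {
                case '\u0003':
                    // Terminates the labeled while loop; execution resumes after it.
                    break SCAN;
                default:
                    count++;
                    break; // an unlabeled break leaves only the switch
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countUntilSentinel("car = {}\u0003")); // prints 8
    }
}
```

If the labeled break restarted the loop, countUntilSentinel would never return; because it exits the loop, the method returns as soon as the sentinel is read.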

Change the code to:

public void parse(String text, Parseable parsed)
{
    char[] script = text.toCharArray();
    this.script = script;
    boolean next_char = false;
    boolean parsing = true;

    while(parsing)
    {
        char c;
        if(next_char)
        {
            c = nextChar();
        }
        else
        {
            c = script[0];
            next_char = true;
        }

        switch(c)
        {
            case 'A': case 'a': case 'B': case 'b': case 'C': case 'c': case 'D': case 'd':
            case 'E': case 'e': case 'F': case 'f': case 'G': case 'g': case 'H': case 'h':
            case 'I': case 'i': case 'J': case 'j': case 'K': case 'k': case 'L': case 'l':
            case 'M': case 'm': case 'N': case 'n': case 'O': case 'o': case 'P': case 'p':
            case 'Q': case 'q': case 'R': case 'r': case 'S': case 's': case 'T': case 't':
            case 'U': case 'u': case 'V': case 'v': case 'W': case 'w': case 'X': case 'x':
            case 'Y': case 'y': case 'Z': case 'z':
            case '_'://TODO: HERE
                if(depth > 0) //
                {
                    parsed.parseRead(buildWordToken(true), this);//Let the class decide how to handle this information. Best solution since I do not know how to implement automatic deserialization.
                }
                continueUntilChar('=', false); //A value must be assigned because it is basically a key value pair with {} or a string or number as the value
                skipWhitespace();//Skip any trailing whitespace straight to the next token.
                break;
            case '{':
                depth++;
                break;
            case '}':
                depth--;
                break;
            case '\n':
                line++;
                break;
            case ' ':
            case '\t':
                skipWhitespace();
                break;
            case '\u0003': //End of Text Character... Not sure if it will work in a file...
                parsing = false;
                break;
        }
    }
}
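Independently of how the main loop is exited, the continueUntilChar() helper in the question loops until it sees its target and has no end-of-input check, so it cannot return once the input is exhausted. Below is a minimal sketch of one way to bound it. It is not the asker's full parser: the class name BoundedScanner, the END_OF_TEXT constant, the boolean return value, and the position() accessor are assumptions introduced for illustration, and it reuses the question's '\u0003' sentinel convention.

```java
// Minimal sketch (hypothetical class): a scanner whose helper loop stops at end of input.
final class BoundedScanner {
    static final char END_OF_TEXT = '\u0003'; // same sentinel the question uses

    private final char[] script;
    private int pos = -1; // nextChar() pre-increments, so start just before index 0

    BoundedScanner(String text) {
        this.script = text.toCharArray();
    }

    // Returns the next character, or END_OF_TEXT once the input is exhausted.
    // pos is clamped to script.length, so it can never overflow to a negative value.
    char nextChar() {
        if (pos + 1 >= script.length) {
            pos = script.length;
            return END_OF_TEXT;
        }
        return script[++pos];
    }

    // Advances to the next occurrence of target.
    // Returns false instead of looping forever if the input ends first.
    boolean continueUntilChar(char target) {
        while (true) {
            char c = nextChar();
            if (c == target) {
                return true;
            }
            if (c == END_OF_TEXT) {
                return false;
            }
        }
    }

    int position() {
        return pos;
    }

    public static void main(String[] args) {
        BoundedScanner s = new BoundedScanner("model_name = \"Chevy\"\n}");
        System.out.println(s.continueUntilChar('=')); // true: an '=' follows the key
        System.out.println(s.continueUntilChar('=')); // false: no second '=' before the end
    }
}
```

Returning a boolean lets the caller decide what a missing target means (for example, reporting a syntax error with the current line number) instead of spinning. The same end-of-input check would apply to any other helper that loops on nextChar().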

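Regarding "java - Why does my text parser get stuck in an infinite loop even though the loop is explicitly broken?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41107323/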
