
java - How does the newline character affect System.in.read() in Java

Reposted. Author: 行者123. Updated: 2023-12-02 09:32:40

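I am trying to write a lexer class that tokenizes the characters of an input stream, and I use System.in.read() to read the characters. The documentation says it returns -1 when the end of the stream is reached, but I cannot understand why this behavior differs for different inputs. For example, delete.txt contains: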

1. I have
2. bulldoz//er

The Lexer then produces the correct tokens:

[I=257, have=257, false=259, er=257, bulldoz=257, true=258]  

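But if I now insert some blank lines by pressing Enter, the code goes into an infinite loop. The code checks the input for newlines and spaces, so how is that check being bypassed?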

1. I have
2. bulldoz//er
3.

The complete code is:

package lexer;

import java.io.*;
import java.util.*;
import lexer.Token;
import lexer.Num;
import lexer.Tag;
import lexer.Word;

class Lexer{
public int line = 1;
private char null_init = ' ';

private char tab = '\t';
private char newline = '\n';
private char peek = null_init;
private char comment1 = '/';
private char comment2 = '*';
private Hashtable<String, Word> words = new Hashtable<>();

//no-args constructor
public Lexer(){
reserve(new Word(Tag.TRUE, "true"));
reserve(new Word(Tag.FALSE, "false"));
}

void reserve(Word word_obj){
words.put(word_obj.lexeme, word_obj);
}

char read_buf_char() throws IOException {
char x = (char)System.in.read();
return x;
}

/*tokenization done here*/
public Token scan()throws IOException{


for(; ; ){
// while exiting the loop, sometime the comment
// characters are read e.g. in bulldoz//er,
// which is lost if the buffer is read;
// so read the buffer i
peek = read_buf_char();
if(peek == null_init||peek == tab){
peek = read_buf_char();
System.out.println("space is read");
}else if(peek==newline){
peek = read_buf_char();
line +=1;
}
else{
break;
}
}

if(Character.isDigit(peek)){
int v = 0;
do{
v = 10*v+Character.digit(peek, 10);
peek = read_buf_char();
}while(Character.isDigit(peek));
return new Num(v);
}

if(Character.isLetter(peek)){
StringBuffer b = new StringBuffer(32);
do{
b.append(peek);
peek = read_buf_char();
}while(Character.isLetterOrDigit(peek));

String buffer_string = b.toString();
Word reserved_word = (Word)words.get(buffer_string);//returns null if not found

if(reserved_word != null){
return reserved_word;
}

reserved_word = new Word(Tag.ID, buffer_string);
// put key value pair in words hashtble
words.put(buffer_string, reserved_word);
return reserved_word;
}

// if character read is not a digit or a letter,
// then the character read is a new token

Token t = new Token(peek);
peek = ' ';
return t;

}

private char get_peek(){
return (char)this.peek;
}

private boolean reached_buf_end(){
// reached end of buffer
if(this.get_peek() == (char)-1){
return true;
}
return false;
}

public void run_test()throws IOException{
//loop checking variable
//a token object is initialized with dummy value
Token new_token = null;
// while end of stream has not been reached
while(this.get_peek() != (char)-1){
new_token = this.scan();

}

System.out.println(words.entrySet());
}


public static void main(String[] args)throws IOException{
Lexer tokenize = new Lexer();
tokenize.run_test();
}

}

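The get_peek function returns the value of peek, which holds the current character of the input buffer.
The check for reaching the end of the buffer is done in the run_test function.
The main processing is done in the scan() function.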

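I used the following command: cat delete.txt|java lexer/Lexer to feed the file as input to the compiled Java class. Please tell me how this code goes into an infinite loop for the input file with the added newlines.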

Best answer

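I am not sure how you are checking for the end of the stream (-1). At the end of scan() you assign a space to "peek", and I think that is what causes the trouble when you have blank lines: you never get to catch the -1.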
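
To make that concrete: read() returns the int -1 at end of stream, and casting it to char gives '\uffff'. When the file ends with a newline, the posted scan() reads that end-of-stream value at the top of its loop, falls through to new Token(peek), and then overwrites peek with ' ', so the test in run_test never sees (char)-1 and every later read() just returns -1 again. Below is a minimal sketch of one way to avoid this, not a drop-in replacement for the posted Lexer. The class EofAwareScanner and its methods are hypothetical names, tokens are returned as plain strings, and letters and digits are lumped together for brevity. It keeps peek as an int, never resets it after end of stream, and only reads a new character once the current one has been consumed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it is not part of the original Lexer.
class EofAwareScanner {
    private static final int EOF = -1;
    private final InputStream in;
    // Kept as an int so that -1 stays distinguishable from every real character.
    private int peek = ' ';

    EofAwareScanner(InputStream in) {
        this.in = in;
    }

    /** Returns the text of the next token, or null once the stream is exhausted. */
    String next() throws IOException {
        // Skip whitespace; a new character is read only after the current one was examined.
        while (peek == ' ' || peek == '\t' || peek == '\n' || peek == '\r') {
            peek = in.read();
        }
        if (peek == EOF) {
            return null; // peek is deliberately left at EOF so the caller can stop
        }
        StringBuilder sb = new StringBuilder();
        if (Character.isLetterOrDigit(peek)) {
            do {
                sb.append((char) peek);
                peek = in.read();
            } while (peek != EOF && Character.isLetterOrDigit(peek));
        } else {
            // Any other character is a single-character token.
            sb.append((char) peek);
            peek = in.read();
        }
        return sb.toString();
    }

    /** Convenience helper for the demo: tokenizes an in-memory string. */
    static List<String> tokenize(String text) {
        InputStream src = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        EofAwareScanner scanner = new EofAwareScanner(src);
        List<String> tokens = new ArrayList<>();
        try {
            for (String t = scanner.next(); t != null; t = scanner.next()) {
                tokens.add(t);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Trailing blank lines no longer cause an endless loop.
        System.out.println(tokenize("I have\nbulldoz//er\n\n\n"));
        // prints [I, have, bulldoz, /, /, er]
    }
}
```

To read from the pipe used in the question, the same scanner could be constructed with new EofAwareScanner(System.in). Note that this sketch reports both '/' characters, whereas the posted scan() overwrites peek at the start of each call and so drops one of them.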

Regarding java - how the newline character affects System.in.read() in Java, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57822658/
