I've written an LALR1 parser which seems to work. It can be parameterised for the type of symbols it receives. The only binding is Eq
.
我已经编写了一个LALR1解析器,该解析器似乎可以工作。它可以根据它接收到的符号类型进行参数化。唯一的结合是等式。
module LALR.Parser where
data Rule symbolT = Rule
{
lhs :: symbolT,
rhs :: Int
}
data Action symbolT = Accept
| Goto (State symbolT)
| Shift (State symbolT)
| Reduce (Rule symbolT)
data State symbolT = State
{
transitions :: [ (symbolT, Action symbolT) ]
}
data Parser symbolT = Parser
{
states :: [State symbolT],
input :: [symbolT]
}
data Result symbolT = UnexpectedSymbol symbolT
| Accepted
| Pending (Parser symbolT)
| EmptyInput
| EmptyStack
build :: State symbolT -> [symbolT] -> Parser symbolT
build state input = Parser
{
states = [state],
input = input
}
step :: Eq symbolT => Parser symbolT -> Result symbolT
step (Parser [] _) = EmptyStack
step (Parser _ []) = EmptyInput
step parser@(Parser states@(state : statesTail) input@(lookAhead : inputTail) ) = case (action) of
Nothing -> UnexpectedSymbol lookAhead
Just Accept -> Accepted
Just (Shift target) -> Pending (Parser (target : states) inputTail)
Just (Reduce rule@(Rule lhs rhs) ) -> Pending (Parser (drop rhs states) (lhs : input) )
Just (Goto target) -> Pending (Parser (target : states) inputTail )
where
action = lookup lookAhead (transitions state)
run :: Eq symbolT => Parser symbolT -> Result symbolT
run parser = case (step parser) of
Pending parser -> run parser
result -> result
However, now I want to transform this parser into a compiler, which not only accepts or rejects a given input, but actually converts it into something useful. Basically, I am thinking to add a handler which can handle the different actions taken. Now my question is the following:
然而,现在我想将这个解析器转换为编译器,它不仅接受或拒绝给定的输入,而且实际上将其转换为有用的东西。基本上,我正在考虑添加一个处理程序,可以处理采取的不同行动。现在我的问题是:
The compiler will require not only to receive mere Symbols, but Symbols with bells and whistles, e.g. a payload (whatever the lexer lexed), or a line number, a file name, or something else.
编译器不仅需要接收简单的符号,还需要接收带有花哨的符号,例如有效负载(无论Lexer词法是什么)、行号、文件名或其他东西。
In an object oriented language I would recur to inheritance and derive a class from Symbol. But what is the way to go in Haskell?
在面向对象的语言中,我会求助于继承并从符号派生一个类。但在哈斯克尔,该怎么走呢?
Should I define a class Token
which requires a function symbol :: Token -> Symbol
and then I would create some data structure that instances that class? Changing the bindings from Eq symbolT
to Token symbolT
and then calling symbol
when I need to lookup the symbol?
我是否应该定义一个需要函数Symbol::Token->符号的类令牌,然后创建一些实例化该类的数据结构?将绑定从等式符号T更改为令牌符号T,然后在需要查找符号时调用符号?
(I am trying to study the language limiting myself mostly to the Prelude.)
(我正在努力学习这门语言,把自己主要局限于前奏曲。)
UPDATE:
最新情况:
Your entire code is parameterised by symbolT. Why just not assume that symbolT is a symbol with a payload? Do you want different types of payload for different symbols?
Yes, I want different payloads. The Lexer produces a [Token]
.
是的,我想要不同的有效载荷。词法分析器生成一个[标记]。
data Token = Assignment | Name String | Number Int
lex :: String -> Maybe [Token]
I could derive Eq
so that it ignores the payload, e.g.
我可以推导出等式,这样它就忽略了有效载荷,例如
instance Eq Token where
(==) (Name _) (Name _) = True
-- etc
but then in the definition of the transitions
of the states I would need to give it a dummy token e.g. [ (Number 0, Accept), (Name "", Shift state3) ]
which looks kind of wrong.
但是,在状态转换的定义中,我需要给它一个虚拟标记,例如[(数字0,接受),(名称“”,移位状态3)],这看起来有点错误。
I was thinking about something like:
我在想一些类似的事情:
class Tkn a symbolT where
symbol :: a -> symbolT
and then when I lookup the transitions, I would call that method to extract the symbol.
然后,当我查找转换时,我会调用该方法来提取符号。
I am not sure if I am explaining myself or if any of this makes any sense.
我不确定我是否在解释自己,或者这一切是否有任何意义。
更多回答
You might look at regex-applicative for inspiration here. An RE s a
is a regex that parses symbols of type s
and produces payloads of type a
.
您可以在这里查看regex-Applicative以获得灵感。RE S a是一个正则表达式,它解析类型S的符号并生成类型a的有效负载。
Your entire code is parameterised by symbolT
. Why just not assume that symbolT
is a symbol with a payload? Do you want different types of payload for different symbols?
您的整个代码都由SYMBOL T进行了参数化。为什么不假设符号T是一个带有有效载荷的符号呢?您是否希望为不同的符号提供不同类型的有效载荷?
@n.m.couldbeanAI Thank you very much for you comment. I have updated my question in case you have the time to take another look.
@n.m.couldbeanAI非常感谢您的评论。我已经更新了我的问题,如果你有时间再看一看的话。
@DanielWagner Thank you very much. I will look into it.
@DanielWagner非常感谢。我会调查一下的。
If you want different types of payloads for different symbols, then perhaps classes are not what you need, because a signature like run :: Token symbolT => Parser symbolT -> Result symbolT
still works with the same type of symbolT. You can instantiate different parsers with different symbol payloads but that's probably not what you want.
如果您想为不同的符号使用不同类型的有效负载,那么类可能不是您所需要的,因为像Run::Token Symbol T=>Parser symbol T->Result Symbol T这样的签名仍然适用于相同类型的符号T。您可以用不同的符号负载实例化不同的解析器,但这可能不是您想要的。
我是一名优秀的程序员,十分优秀!