How do I parse infix instead of prefix in Haskell?

Problem description

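I need help with this program I'm writing in Haskell. I've written most of it, and this is basically what I want it to do:

  1. When I write

parse "a + b"

in the terminal, I want this as output:

Plus (Word "a") (Word "b")

  2. When I write

parse "a - 2 * b + c"

in the terminal, I want this as output:

Minus (Word "a") (Plus (Mult (Num 2) (Word "b")) (Word "c"))

My code so far: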

data Ast
    = Word String
    | Num Int
    | Mult Ast Ast
    | Plus Ast Ast
    | Minus Ast Ast
    deriving (Eq,Show)

tokenize :: [Char] -> [String]
tokenize [] = []
tokenize (' ' : s) = tokenize s
tokenize ('+' : s) = "+" : tokenize s
tokenize ('*' : s) = "*" : tokenize s
tokenize (c : s)
  | isDigit c =
    let (cs,s') = collectWhile isDigit s
     in (c : cs) : tokenize s'
  | isAlpha c =
    let (cs,s') = collectWhile isAlpha s
     in (c : cs) : tokenize s'
  | otherwise = error ("unexpected character " ++ show c)

collectWhile :: (Char -> Bool) -> String -> (String,String)
collectWhile p s = (takeWhile p s,dropWhile p s)

isDigit,isAlpha :: Char -> Bool
isDigit c = c `elem` ['0' .. '9']
isAlpha c = c `elem` ['a' .. 'z'] ++ ['A' .. 'Z']

parseU :: [String] -> (Ast,[String])
parseU ("+" : s0) =
  let (e1,s1) = parseU s0
      (e2,s2) = parseU s1
   in (Plus e1 e2,s2)
parseU ("*" : s0) =
  let (e1,s1) = parseU s0
      (e2,s2) = parseU s1
   in (Mult e1 e2,s2)
parseU (t : ts)
  | isNumToken t = (Num (read t),ts)
  | isWordToken t = (Word t,ts)
  | otherwise = error ("unrecognized token " ++ show t)
parseU [] = error "unexpected end of input"

isNumToken,isWordToken :: String -> Bool
isNumToken xs = takeWhile isDigit xs == xs
isWordToken xs = takeWhile isAlpha xs == xs

parse :: String -> Ast
parse s =
  case parseU (tokenize s) of
    (e,[]) -> e
    (_,t : _) -> error ("unexpected token " ++ show t)

inn :: Ast -> String
inn (Plus x y) = innP x ++ " + " ++ innP y
inn (Mult x y) = innP x ++ " * " ++ innP y
inn ast = innP ast

innP :: Ast -> String
innP (Num n) = show n
innP (Plus x y) = "(" ++ innP x ++ " + " ++ innP y ++ ")"
innP (Mult x y) = "(" ++ innP x ++ " * " ++ innP y ++ ")"
innP (Word w) = w

innfiks :: String -> String
innfiks s = inn (parse s)

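Right now I get an error when I enter the inputs shown above in the terminal, but when I write it like this:

parse "+ a b"

I get the correct output:

Plus (Word "a") (Word "b")

I know I have to change the code so that it accepts input to the parse function in this form:

value operator value

and not in this form:

operator value value

But I'm struggling to work out what needs to change and where.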

Solution

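To handle infix operators with precedence, one approach is to introduce a series of parsing functions, one per precedence level. If you have "factors" that can be multiplied to form "terms", and "terms" that can be added or subtracted to form "expressions", then you need a parser function for each of those levels. Parsing a "factor" (that is, a word or a number) is easy, because you've already written the code for it: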

parseFactor :: [String] -> (Ast,[String])
parseFactor (t : ts)
  | isNumToken t = (Num (read t),ts)
  | isWordToken t = (Word t,ts)
  | otherwise = error ("unrecognized token " ++ show t)
parseFactor [] = error "unexpected end of input"

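Parsing a term is trickier. You want to parse a factor, then optionally a * followed by another factor, and then treat the result as a term that may itself be multiplied by yet another factor, and so on. Here is one way to do it: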

parseTerm :: [String] -> (Ast,[String])
parseTerm ts
  =  let (f1,ts1) = parseFactor ts     -- parse first factor
     in  go f1 ts1
  where go acc ("*":ts2)                -- add a factor to an accumulating term
          = let (f2,ts3) = parseFactor ts2
            in go (Mult acc f2) ts3
        go acc rest = (acc,rest)       -- no more factors: return the term

If you like, try writing a similar parseExpr that parses terms separated by + characters (skip subtraction for now), and test it on something like:

parseExpr (tokenize "2 + 3 * 6 + 4 * 8 * 12 + 1")

As a spoiler, here is a version that handles both + and -. Note that your tokenizer doesn't handle the - character yet, so you'll have to fix that first.

parseExpr :: [String] -> (Ast,[String])
parseExpr ts
  =  let (f1,ts1) = parseTerm ts
     in  go f1 ts1
  where go acc (op:ts2)
          | op == "+" || op == "-"
          = let (f2,ts3) = parseTerm ts2
            in go ((astOp op) acc f2) ts3
        go acc rest = (acc,rest)
        astOp "+" = Plus
        astOp "-" = Minus

With that in place, you can point parse at the right parser:

parse :: String -> Ast
parse s =
  case parseExpr (tokenize s) of
    (e,[]) -> e
    (_,t : _) -> error ("unexpected token " ++ show t)

Your example should now work:

λ> parse "a - 2 * b + c"
Plus (Minus (Word "a") (Mult (Num 2) (Word "b"))) (Word "c")

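Note that this differs slightly from the output you asked for, but this is the correct order for left-associative operators (which matters for handling - properly). That is, you want:

5 - 4 + 1

to parse as:

(5 - 4) + 1  -- i.e., (Plus (Minus (Num 5) (Num 4)) (Num 1))

so that an evaluator computes the correct answer, 2. If it were parsed as:

5 - (4 + 1)  -- i.e., (Minus (Num 5) (Plus (Num 4) (Num 1)))

an evaluator would compute the wrong answer, 0.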
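
To make the difference concrete, here is a minimal sketch of such an evaluator. The question does not include one, so eval, its environment argument, and the names leftAssoc and rightAssoc are hypothetical and only serve to illustrate the point.

```haskell
data Ast
    = Word String
    | Num Int
    | Mult Ast Ast
    | Plus Ast Ast
    | Minus Ast Ast
    deriving (Eq,Show)

-- Hypothetical evaluator (not part of the original code).
-- Variables are looked up in an association list; this is an assumption.
eval :: [(String,Int)] -> Ast -> Int
eval env (Word w) =
  case lookup w env of
    Just v  -> v
    Nothing -> error ("unbound variable " ++ show w)
eval _   (Num n)     = n
eval env (Mult x y)  = eval env x * eval env y
eval env (Plus x y)  = eval env x + eval env y
eval env (Minus x y) = eval env x - eval env y

-- The two candidate parses of "5 - 4 + 1".
leftAssoc,rightAssoc :: Ast
leftAssoc  = Plus (Minus (Num 5) (Num 4)) (Num 1)   -- (5 - 4) + 1
rightAssoc = Minus (Num 5) (Plus (Num 4) (Num 1))   -- 5 - (4 + 1)
```

Under these assumptions, eval [] leftAssoc gives 2 and eval [] rightAssoc gives 0, so only the left-associative tree matches ordinary arithmetic.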

However, if you really do want to parse with right-associative operators, see below.

Complete modified code for left-associative operators:

data Ast
    = Word String
    | Num Int
    | Mult Ast Ast
    | Plus Ast Ast
    | Minus Ast Ast
    deriving (Eq,Show)

tokenize :: [Char] -> [String]
tokenize [] = []
tokenize (' ' : s) = tokenize s
tokenize ('-' : s) = "-" : tokenize s
tokenize ('+' : s) = "+" : tokenize s
tokenize ('*' : s) = "*" : tokenize s
tokenize (c : s)
  | isDigit c =
    let (cs,s') = collectWhile isDigit s
     in (c : cs) : tokenize s'
  | isAlpha c =
    let (cs,s') = collectWhile isAlpha s
     in (c : cs) : tokenize s'
  | otherwise = error ("unexpected character " ++ show c)

collectWhile :: (Char -> Bool) -> String -> (String,String)
collectWhile p s = (takeWhile p s,dropWhile p s)

isDigit,isAlpha :: Char -> Bool
isDigit c = c `elem` ['0' .. '9']
isAlpha c = c `elem` ['a' .. 'z'] ++ ['A' .. 'Z']

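parseFactor :: [String] -> (Ast,[String])
parseFactor (t : ts)
  | isNumToken t = (Num (read t),ts)
  | isWordToken t = (Word t,ts)
  | otherwise = error ("unrecognized token " ++ show t)
parseFactor [] = error "unexpected end of input"

parseTerm :: [String] -> (Ast,[String])
parseTerm ts
  =  let (f1,ts1) = parseFactor ts
     in  go f1 ts1
  where go acc ("*":ts2)
          = let (f2,ts3) = parseFactor ts2
            in go (Mult acc f2) ts3
        go acc rest = (acc,rest)

parseExpr :: [String] -> (Ast,[String])
parseExpr ts
  =  let (f1,ts1) = parseTerm ts
     in  go f1 ts1
  where go acc (op:ts2)
          | op == "+" || op == "-"
          = let (f2,ts3) = parseTerm ts2
            in go ((astOp op) acc f2) ts3
        go acc rest = (acc,rest)
        astOp "+" = Plus
        astOp "-" = Minus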

isNumToken,isWordToken :: String -> Bool
isNumToken xs = takeWhile isDigit xs == xs
isWordToken xs = takeWhile isAlpha xs == xs

parse :: String -> Ast
parse s =
  case parseExpr (tokenize s) of
    (e,[]) -> e
    (_,t : _) -> error ("unexpected token " ++ show t)

For right-associative operators, change the following definitions:

parseTerm :: [String] -> (Ast,[String])
parseTerm ts
  =  let (fct,ts1) = parseFactor ts
     in  case ts1 of
           "*":ts2 -> let (trm,rest) = parseTerm ts2
                      in  (Mult fct trm,rest)
           _       -> (fct,ts1)

parseExpr :: [String] -> (Ast,[String])
parseExpr ts
  =  let (trm,ts1) = parseTerm ts
     in  case ts1 of
           op:ts2 | op == "+" || op == "-"
                   -> let (expr,rest) = parseExpr ts2
                      in  (astOp op trm expr,rest)
           _       -> (trm,ts1)
  where astOp "+" = Plus
        astOp "-" = Minus
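
As a quick check, the right-associative definitions can be dropped into a compact, self-contained module. This is only a sketch: it swaps the hand-written character predicates for the ones in Data.Char, uses span in place of collectWhile, and parseRight is a hypothetical name chosen so it does not clash with parse.

```haskell
import Data.Char (isAlpha, isDigit, isSpace)

data Ast
    = Word String
    | Num Int
    | Mult Ast Ast
    | Plus Ast Ast
    | Minus Ast Ast
    deriving (Eq,Show)

-- Compact tokenizer; using Data.Char here is an assumption of this sketch.
tokenize :: String -> [String]
tokenize [] = []
tokenize (c : s)
  | isSpace c = tokenize s
  | c `elem` "+-*" = [c] : tokenize s
  | isDigit c = let (cs,s') = span isDigit s in (c : cs) : tokenize s'
  | isAlpha c = let (cs,s') = span isAlpha s in (c : cs) : tokenize s'
  | otherwise = error ("unexpected character " ++ show c)

parseFactor :: [String] -> (Ast,[String])
parseFactor (t : ts)
  | all isDigit t = (Num (read t),ts)
  | all isAlpha t = (Word t,ts)
  | otherwise = error ("unrecognized token " ++ show t)
parseFactor [] = error "unexpected end of input"

-- Right-associative: the right operand is itself a whole term.
parseTerm :: [String] -> (Ast,[String])
parseTerm ts =
  let (fct,ts1) = parseFactor ts
   in case ts1 of
        "*" : ts2 -> let (trm,rest) = parseTerm ts2
                      in (Mult fct trm,rest)
        _         -> (fct,ts1)

-- Right-associative: the right operand is itself a whole expression.
parseExpr :: [String] -> (Ast,[String])
parseExpr ts =
  let (trm,ts1) = parseTerm ts
   in case ts1 of
        op : ts2 | op == "+" || op == "-"
          -> let (expr,rest) = parseExpr ts2
              in (astOp op trm expr,rest)
        _ -> (trm,ts1)
  where
    astOp "+" = Plus
    astOp _   = Minus  -- only reached for "-" because of the guard above

-- Hypothetical entry point for the right-associative variant.
parseRight :: String -> Ast
parseRight s =
  case parseExpr (tokenize s) of
    (e,[])    -> e
    (_,t : _) -> error ("unexpected token " ++ show t)
```

With these definitions, parseRight "a - 2 * b + c" yields Minus (Word "a") (Plus (Mult (Num 2) (Word "b")) (Word "c")), which is the shape asked for in the question. The price is that 5 - 4 + 1 is then read as 5 - (4 + 1).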
