Problem description
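I am trying to read "words" from a stream. A word is either a number such as "1234", an ordinary alphabetic word such as "test", or a contracted form of an ordinary word such as "you're".
Of course there are billions of ways to do this. I am trying to use tagbody to implement something like a state machine that parses a word. My implementation works when given well-formed input, but fails on input that does not look like a word. I tried to skip input until the next whitespace or EOF is reached, but that gives me an infinite loop and I don't know why.
Can anybody explain?
Here is my code: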
(defun whitespace-or-nil-p (c)
  "Return true if C is a whitespace character or NIL (end of file)."
  (member c '(#\Space #\Tab #\Return #\Newline nil)))
(defun read-word (stream)
  (let ((c nil))
    (with-output-to-string (out)
      (tagbody
       read-initial
         (setf c (read-char stream nil))
         (cond
           ((whitespace-or-nil-p c) (peek-char t stream nil) (go read-initial))
           ((not (alphanumericp c)) (go skip-til-next))
           ((digit-char-p c) (go read-number))
           ((alpha-char-p c) (go read-simple-or-contracted-word))
           (t (return-from read-word)))
       skip-til-next
         (get-output-stream-string out)
         (loop until (whitespace-or-nil-p (peek-char nil stream nil))
               do (read-char stream nil))
         (go read-initial)
       read-number
         (write-char c out)
         (setf c (read-char stream nil))
         (cond
           ((whitespace-or-nil-p c)
            (return-from read-word (get-output-stream-string out)))
           ((not (digit-char-p c)) (go skip-til-next))
           (t (go read-number)))
       read-simple-or-contracted-word
         (write-char c out)
         (setf c (read-char stream nil))
         (cond
           ((whitespace-or-nil-p c)
            (return-from read-word (get-output-stream-string out)))
           ((and (char/= c #\') (not (alpha-char-p c))) (go skip-til-next))
           (t (go read-simple-or-contracted-word)))))))
Solution
Here is your code, modified to cut the infinite loop short so that it can be debugged. I added comments where I changed the code:
(defun read-word (stream)
  (let ((c nil)
        ;; how many times we allow the code to enter the dbg function
        (counter 10))
    (flet ((dbg (symbol &rest args)
             ;; each time it is called, we decrease counter; when it
             ;; reaches zero, we stop the state machine
             (print (list* symbol args))
             (when (<= (decf counter) 0)
               (return-from read-word :too-many-loops))))
      (with-output-to-string (out)
        (tagbody
         read-initial
           (setf c (read-char stream nil))
           (dbg 'read-initial c)
           (cond
             ((whitespace-or-nil-p c) (peek-char t stream nil) (go read-initial))
             ((not (alphanumericp c)) (go skip-til-next))
             ((digit-char-p c) (go read-number))
             ((alpha-char-p c) (go read-simple-or-contracted-word))
             (t (return-from read-word)))
         skip-til-next
           (dbg 'skip-til-next)
           (get-output-stream-string out)
           (loop until (whitespace-or-nil-p (peek-char nil stream nil))
                 do (read-char stream nil))
           (go read-initial)
         read-number
           (dbg 'read-number)
           (write-char c out)
           (setf c (read-char stream nil))
           (cond
             ((whitespace-or-nil-p c)
              (return-from read-word (get-output-stream-string out)))
             ((not (digit-char-p c)) (go skip-til-next))
             (t (go read-number)))
         read-simple-or-contracted-word
           (dbg 'read-simple-or-contracted-word)
           (write-char c out)
           (setf c (read-char stream nil))
           (cond
             ((whitespace-or-nil-p c)
              (return-from read-word (get-output-stream-string out)))
             ((and (char/= c #\') (not (alpha-char-p c))) (go skip-til-next))
             (t (go read-simple-or-contracted-word))))))))
Here is the simplest test case I could think of:
* (with-input-from-string (in "") (read-word in))
(READ-INITIAL NIL)
(READ-INITIAL NIL)
(READ-INITIAL NIL)
(READ-INITIAL NIL)
(READ-INITIAL NIL)
(READ-INITIAL NIL)
(READ-INITIAL NIL)
(READ-INITIAL NIL)
(READ-INITIAL NIL)
(READ-INITIAL NIL)
(READ-INITIAL NIL)
:TOO-MANY-LOOPS
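So you need to handle the case where read-char returns nil. Currently nil is treated as whitespace, and in that branch the call to peek-char cannot consume anything from the underlying stream, because the stream is already at end of file. Control then jumps back to read-initial, read-char returns nil again, and the cycle never ends. You could, for example, check the return value of peek-char to avoid going back to the read-initial label forever.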
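One possible way to apply this, as a minimal sketch rather than a definitive fix: test the result of peek-char before reading, and treat nil (end of file) separately from whitespace. The names read-word-fixed and whitespace-p are hypothetical, and returning NIL at end of file is an assumption about the desired behavior.

```lisp
;; Hypothetical helper: unlike WHITESPACE-OR-NIL-P, NIL is not whitespace here.
(defun whitespace-p (c)
  "Return true if the character C is whitespace."
  (member c '(#\Space #\Tab #\Return #\Newline)))

;; Hypothetical variant of READ-WORD; returning NIL at EOF is an assumption.
(defun read-word-fixed (stream)
  "Return the next word from STREAM as a string, or NIL at end of file."
  (let ((c nil)
        (out (make-string-output-stream)))
    (tagbody
     read-initial
       ;; PEEK-CHAR with T skips whitespace; a NIL result means end of file,
       ;; so we leave instead of looping back.
       (unless (peek-char t stream nil)
         (return-from read-word-fixed nil))
       (setf c (read-char stream))
       (cond ((digit-char-p c) (go read-number))
             ((alpha-char-p c) (go read-simple-or-contracted-word))
             (t (go skip-til-next)))
     skip-til-next
       ;; Discard the partial token; GET-OUTPUT-STREAM-STRING also clears OUT.
       (get-output-stream-string out)
       (loop for next = (peek-char nil stream nil)
             until (or (null next) (whitespace-p next))
             do (read-char stream))
       (go read-initial)
     read-number
       (write-char c out)
       (setf c (read-char stream nil))
       (cond ((or (null c) (whitespace-p c))
              (return-from read-word-fixed (get-output-stream-string out)))
             ((digit-char-p c) (go read-number))
             (t (go skip-til-next)))
     read-simple-or-contracted-word
       (write-char c out)
       (setf c (read-char stream nil))
       (cond ((or (null c) (whitespace-p c))
              (return-from read-word-fixed (get-output-stream-string out)))
             ((or (char= c #\') (alpha-char-p c))
              (go read-simple-or-contracted-word))
             (t (go skip-til-next))))))
```

With this sketch, the empty-string test case returns NIL immediately, and repeated calls on a longer input yield one word per call until NIL signals end of file.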
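I also have doubts about what (get-output-stream-string out) is supposed to do, especially when you call it without using its return value. I would, for example, accept a callback function and call it for each token read.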
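The callback idea could look roughly like the sketch below. Note that it swaps the tagbody state machine for a simpler approach: read one whitespace-delimited token, then validate it. The function name map-words and the local helper word-p are hypothetical, and the acceptance rules (all digits, or a letter followed by letters and apostrophes) are inferred from the question's code.

```lisp
;; Hypothetical function: calls FUNCTION once per valid word found in STREAM.
(defun map-words (function stream)
  "Call FUNCTION on each word read from STREAM, skipping malformed tokens."
  (flet ((word-p (token)
           ;; A word is all digits, or starts with a letter and contains
           ;; only letters and apostrophes.
           (and (plusp (length token))
                (or (every #'digit-char-p token)
                    (and (alpha-char-p (char token 0))
                         (every (lambda (ch)
                                  (or (alpha-char-p ch) (char= ch #\')))
                                token))))))
    ;; PEEK-CHAR with T skips whitespace and returns NIL at end of file,
    ;; which ends the loop.
    (loop while (peek-char t stream nil)
          do (let ((token
                     (with-output-to-string (out)
                       ;; Copy characters up to the next whitespace or EOF.
                       (loop for ch = (peek-char nil stream nil)
                             until (or (null ch)
                                       (member ch '(#\Space #\Tab
                                                    #\Return #\Newline)))
                             do (write-char (read-char stream) out)))))
               (when (word-p token)
                 (funcall function token))))))

;; Usage: collect the words of a string into a list.
(defun collect-words (string)
  (let ((words '()))
    (with-input-from-string (in string)
      (map-words (lambda (w) (push w words)) in))
    (nreverse words)))
```

Because each token gets its own fresh string stream, there is no need to call get-output-stream-string just to discard a rejected token.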