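Problem description
I am trying to split each line of a file into tokens and print them. The delimiter I use is ",". However, in some lines the last field is a quoted string that itself contains a ",". How can I get that whole string, including the ","?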
char *buff = (char*)malloc(sizeof(char)*256);
char *tmp;
while (fgets(buff,256,FILENAME) != NULL) {
    tmp = strtok(buff,",");
    printf("%s\n",tmp);
    tmp = strtok(NULL,tmp);
}
The lines in the input file look like this:
This,is,a
Code,in,"c,language"
rat,mouse,"rat,mouse"
This is the output I am trying to get:
This
is
a
Code
in
c,language
rat
mouse
rat,mouse
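Solution
A free Sunday is a good time for some simple coding, so here is a parser I wrote in a short time. It does not parse escape sequences and has almost no error handling, but it should get you started.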
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>

void parse(FILE *file) {
    int c;
    // keep track if we are in quotes or not
    bool inquote = false;
    // keep track if the field started or not
    bool infield = false;

    while ((c = fgetc(file)) != EOF) {
        if (!infield) {
            // ignore leading spaces before fields
            if (c == ' ') continue;
            infield = true;
        }
        switch (c) {
        case '"':
            // toggle quote state; the quote itself is not printed
            inquote = !inquote;
            continue;
        case ',':
            // if comma is in quotes, just print it
            if (inquote) break;
            // fallthrough
        case '\n':
            // comma or newline are field separators
            printf("\n");
            infield = false;
            continue;
        }
        // output the character
        printf("%c", c);
    }
}

int main() {
    parse(stdin);
    return 0;
}
Output for the sample input:
This
is
a
Code
in
c,language
rat
mouse
rat,mouse
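Since the question reads the file line by line with fgets, a line-based variant may fit that loop more directly. The following is only a minimal sketch and not part of the original answer: split_csv_line and print_csv_fields are hypothetical names, the 256-byte buffer is taken from the question, and the 64-field limit is an arbitrary assumption. It also assumes the common CSV convention that a doubled quote ("") inside a quoted field stands for one literal quote, which the parser above does not handle.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: split one line in place into fields.
 * - A comma outside double quotes ends a field.
 * - A comma inside double quotes is kept as part of the field.
 * - A doubled quote ("") inside quotes becomes one literal quote
 *   (assumed CSV convention, not something the question specifies).
 * - The first '\n', '\r' or '\0' ends the line, even inside quotes.
 * Stores at most max_fields pointers into fields[] and returns how
 * many were stored. An empty line yields one empty field.
 */
size_t split_csv_line(char *line, char *fields[], size_t max_fields)
{
    size_t count = 0;
    char *src = line;   /* read position */
    int at_end = 0;

    while (!at_end && count < max_fields) {
        char *dst = src;    /* write position, never ahead of src */
        int inquote = 0;

        fields[count++] = dst;
        for (;;) {
            char c = *src;
            if (c == '\0' || c == '\n' || c == '\r') {
                at_end = 1;
                break;
            }
            src++;
            if (c == '"') {
                if (inquote && *src == '"') {
                    *dst++ = '"';   /* "" inside quotes: literal quote */
                    src++;
                } else {
                    inquote = !inquote;  /* opening or closing quote */
                }
            } else if (c == ',' && !inquote) {
                break;              /* field separator */
            } else {
                *dst++ = c;
            }
        }
        *dst = '\0';    /* terminate the field inside the buffer */
    }
    return count;
}

/*
 * Hypothetical driver: print every field of every line on its own line.
 * Lines longer than 255 characters are read in pieces by fgets and
 * would be split incorrectly; this sketch does not handle that case.
 */
void print_csv_fields(FILE *in, FILE *out)
{
    char buff[256];
    char *fields[64];

    while (fgets(buff, sizeof buff, in) != NULL) {
        size_t n = split_csv_line(buff, fields, 64);
        for (size_t i = 0; i < n; i++)
            fprintf(out, "%s\n", fields[i]);
    }
}
```

Calling print_csv_fields(stdin, stdout) from main would play the same role as parse(stdin) above. Unlike the character-by-character parser, this version does not skip leading spaces before a field, and it modifies the buffer it is given, much as strtok does.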