Having previously (as a newbie) reported this as a bug in the R package, let me run with it. I think all of the following are fine:
    replace_number("123 0 boogie")
    [1] "one hundred twenty three boogie"
    replace_number("1;1 foo")
    [1] "one;one foo"
    replace_number("47 bar")
    [1] "forty seven bar"
    replace_number("0")
    [1] "zero"
These, on the other hand, lose the zero:

    replace_number("1;0 foo")
    [1] "one; foo"
    replace_number("00 bar")
    [1] "bar"
    replace_number("0x")
    [1] "x"
Solution
If you dig into the guts of replace_number:
    unlist(lapply(lapply(gsub(",([0-9])", "\\1", text.var), function(x) {
        if (!is.na(x) & length(unlist(strsplit(x, "([0-9])", perl = TRUE))) > 1) {
            num_sub(x, num.paste = num.paste)
        } else {
            x
        }
    }), function(x) mgsub(0:9, ones, x)))
you can see that the problem arises in qdap:::num_sub:
    qdap:::num_sub("101", num.paste = "combine")
    ## "onehundredone"
    qdap:::num_sub("0", num.paste = "combine")
    ## ""
Digging into that function, the problem arises in numb2word, which contains this internal code:
    ones <- c("", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine")
    names(ones) <- 0:9
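That lookup converts zero values to blanks. If I were facing this problem myself, I would fork the qdap repo, go to replace_number.R, and try to change it in a backward-compatible way, so that replace_number takes a logical argument blank_zeros = TRUE, passes it down to numb2word, and does the right thing, e.g.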
    ones <- c(if (blank_zeros) "" else "zero",
              "one", "two", "three", "four", "five", "six", "seven", "eight", "nine")
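To make the effect of such a flag concrete, here is a minimal, self-contained sketch in base R. It does not use qdap at all: `digit_words` and `spell_digits` are hypothetical helper names made up for illustration, and the per-digit lookup only mimics the named-vector idea from numb2word rather than reproducing its actual code.

```r
# Hypothetical stand-in for the digit lookup inside numb2word (not qdap's code).
# blank_zeros = TRUE keeps the current behaviour, so the change is backward compatible.
digit_words <- function(blank_zeros = TRUE) {
  ones <- c(if (blank_zeros) "" else "zero",
            "one", "two", "three", "four", "five",
            "six", "seven", "eight", "nine")
  names(ones) <- 0:9   # names become the strings "0" to "9"
  ones
}

# Spell out a string of digits one digit at a time via name lookup.
# Assumes x contains only the characters 0-9.
spell_digits <- function(x, blank_zeros = TRUE) {
  ones <- digit_words(blank_zeros)
  digits <- strsplit(x, "", fixed = TRUE)[[1]]
  paste(ones[digits], collapse = " ")
}

spell_digits("0")                       # "" (the zero vanishes, as it does today)
spell_digits("0", blank_zeros = FALSE)  # "zero"
```

Defaulting the flag to TRUE means existing callers see no change, while anyone who wants zeros spelled out can opt in.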
In the meantime, I have posted this on the qdap issues list.