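Problem description
I am trying to count the number of occurrences per line in a text file that contains a large number of codes (numbers):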
9045,9107,2376,9017
2387,4405,4499,7120
9107,3559,3488
9045,4499
I want to compare these lines against a similar set of numbers obtained from a text field, for example:
9107,2387,4499
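The only result I'm looking for is whether the input contains more than 2 of the numbers from any single line of the text file. So in this case it would be true, because:
9045,9107,9017 - false (1)
2387,4405,4499,7120 - true (3)
9107,2387,3559,3488 - false (2)
9045,4425,4490 - false (0)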
Scanner in = null;
try {
    in = new Scanner(new File("areas.txt"));
} catch (FileNotFoundException ex) {
    Logger.getLogger(NewJFrame.class.getName()).log(Level.SEVERE, null, ex);
}
List<String[]> lines = new ArrayList<>();
while (in.hasNextLine()) {
    String line = in.nextLine().trim();
    String[] splitted = line.split(",");
    lines.add(splitted);
}
String[][] result = new String[lines.size()][];
for (int i = 0; i < result.length; i++) {
    result[i] = lines.get(i);
}
System.out.println(Arrays.deepToString(result));
The result I get:
[[9045,9017],[2387,7120],[9107,3488],[9045,4499],[],[]]
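From here I'm a bit stuck on checking the codes line by line. Any suggestions or advice? Is a 2D array the best approach, or is there a simpler or better way?
Solution
The expected amount of input determines which kind of search algorithm you should use.
If you are not searching through thousands of lines, a simple algorithm will do just fine. When in doubt, favor simplicity over a complex, hard-to-understand algorithm.
While it is not an efficient algorithm, in most cases a simple nested for loop will do the trick.
A simple implementation looks like this: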
// Needs java.util.ArrayList, java.util.Arrays and java.util.List
final int FOUND_THRESHOLD = 2;
String[] comparedCodes = {"9107", "4405", "2387", "4499"};
String[][] allInputs = {
    {"9045", "9107", "2376", "9017"}, // Should not match: only 9107 is found
    {"2387", "4405", "4499", "7120"}, // Should match: 2387, 4405 and 4499 are found
    {"9107", "3559", "3488"},         // Should not match: only 9107 is found
    {"9045", "4499"},                 // Should not match: only 4499 is found
};
List<String[]> results = new ArrayList<>();
for (String[] input : allInputs) {
    int numFound = 0;
    // Count how many codes of this row appear in comparedCodes
    for (String code : input) {
        for (String c : comparedCodes) {
            if (code.equals(c)) {
                numFound++;
                break; // This code is matched; stop scanning comparedCodes
            }
        }
        if (numFound >= FOUND_THRESHOLD) {
            results.add(input);
            break; // Threshold reached; skip the rest of this row
        }
    }
}
for (String[] result : results) {
    System.out.println(Arrays.toString(result));
}
This gives us the output:
[2387, 4405, 4499, 7120]
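If the file does grow to many thousands of lines, one possible alternative to the inner loop is a HashSet lookup. The following is only a sketch: the class name SetBasedMatcher and the method findMatches are made up for illustration, and it keeps the same "at least 2" threshold as the nested loop version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SetBasedMatcher {

    // Hypothetical helper: returns the rows that contain at least
    // `threshold` of the compared codes.
    public static List<String[]> findMatches(String[][] allInputs,
                                             String[] comparedCodes,
                                             int threshold) {
        // Put the compared codes into a set once, so each lookup is a
        // hash lookup instead of a scan over the whole array
        Set<String> wanted = new HashSet<>(Arrays.asList(comparedCodes));
        List<String[]> results = new ArrayList<>();
        for (String[] input : allInputs) {
            int numFound = 0;
            for (String code : input) {
                if (wanted.contains(code)) {
                    numFound++;
                }
            }
            if (numFound >= threshold) {
                results.add(input);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        String[] comparedCodes = {"9107", "4405", "2387", "4499"};
        String[][] allInputs = {
            {"9045", "9107", "2376", "9017"},
            {"2387", "4405", "4499", "7120"},
            {"9107", "3559", "3488"},
            {"9045", "4499"},
        };
        for (String[] row : findMatches(allInputs, comparedCodes, 2)) {
            System.out.println(Arrays.toString(row)); // [2387, 4405, 4499, 7120]
        }
    }
}
```

For a handful of lines the nested loop is just as good; the set mainly pays off when both the file and the compared list are large.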
To expand on my comment, here is a rough outline of what you could do:
String textFieldContents = ... // get it
// build a set of the user input by splitting at commas;
// a stream is used so the elements can be trimmed before collecting them into a set
Set<String> userInput = Arrays.stream(textFieldContents.split(","))
    .map(String::trim)
    .collect(Collectors.toSet());
// stream the lines in the file; Files.lines() throws IOException and the
// stream should be closed, hence the try-with-resources
List<Boolean> matchResults;
try (Stream<String> fileLines = Files.lines(Path.of("areas.txt"))) {
    matchResults = fileLines
        // map each line to true/false
        .map(line -> {
            // split the line and stream the parts
            return Arrays.stream(line.split(","))
                // trim each part
                .map(String::trim)
                // select only those contained in the user input set
                .filter(part -> userInput.contains(part))
                // count matching elements and return whether there are more than 2 or not
                .count() > 2L;
        })
        // collect the results into a list; each element position corresponds to the zero-based line number
        .collect(Collectors.toList());
}
If you need to collect the matching lines rather than one flag per line, you can replace map() with filter() (with the same body) and change the result type to List<String>.
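As a rough illustration of the filter() variant, here is a self-contained sketch. The class name MatchingLines, the method matchingLines and its parameters are invented for this example, and it takes the lines as an in-memory list instead of reading areas.txt so that it can be run on its own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class MatchingLines {

    // Hypothetical helper: keeps only the lines that share MORE than
    // `minExclusive` codes with the comma-separated user input.
    public static List<String> matchingLines(List<String> lines,
                                             String textFieldContents,
                                             long minExclusive) {
        // Build the lookup set once from the user input
        Set<String> userInput = Arrays.stream(textFieldContents.split(","))
                .map(String::trim)
                .collect(Collectors.toSet());
        return lines.stream()
                // filter() keeps the line itself when the predicate is true
                .filter(line -> Arrays.stream(line.split(","))
                        .map(String::trim)
                        .filter(userInput::contains)
                        .count() > minExclusive)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "9045,9107,2376,9017",
                "2387,4405,4499,7120",
                "9107,3559,3488",
                "9045,4499");
        // Assumed input: four codes, three of which are on the second line
        System.out.println(matchingLines(lines, "9107,4405,2387,4499", 2));
        // prints [2387,4405,4499,7120]
    }
}
```

With the three-code input from the question (9107,2387,4499) no line has more than 2 matches, so the returned list would be empty.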