Problem description
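I have been reading the leveldb source code. I found that when a data block reaches 4KB (options.block_size), leveldb flushes the data block and calls FilterBlockBuilder::StartBlock() to generate a filter.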
void TableBuilder::Add(const Slice& key, const Slice& value) {
  // ...
  const size_t estimated_block_size = r->data_block.CurrentSizeEstimate();
  if (estimated_block_size >= r->options.block_size) {
    Flush();
  }
}

void TableBuilder::Flush() {
  // ...
  if (r->filter_block != nullptr) {
    r->filter_block->StartBlock(r->offset);
  }
}
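FilterBlockBuilder::StartBlock() calls GenerateFilter() up to block_offset / 2KB times (2KB being kFilterBase), depending on how many filters have already been generated. For example, when block_offset is 4KB and no filter exists yet, it calls GenerateFilter() 2 times.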
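However, after the first call to GenerateFilter(), keys_ and start_ are both empty. As a result, the second call only produces an empty filter: it appends the same filter offset to filter_offsets_ again.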
Does leveldb generate 2 bloom filters, each covering a 2KB part of the data block, or 1 bloom filter for the whole 4KB data block plus another empty filter?
void FilterBlockBuilder::StartBlock(uint64_t block_offset) {
  uint64_t filter_index = (block_offset / kFilterBase);
  assert(filter_index >= filter_offsets_.size());
  while (filter_index > filter_offsets_.size()) {
    GenerateFilter();
  }
}

void FilterBlockBuilder::GenerateFilter() {
  const size_t num_keys = start_.size();
  // ****** My Comment: will the second call in StartBlock always trigger this?
  if (num_keys == 0) {
    // Fast path if there are no keys for this filter
    filter_offsets_.push_back(result_.size());
    return;
  }

  // Make list of keys from flattened key structure
  start_.push_back(keys_.size());  // Simplify length computation
  tmp_keys_.resize(num_keys);
  for (size_t i = 0; i < num_keys; i++) {
    const char* base = keys_.data() + start_[i];
    size_t length = start_[i + 1] - start_[i];
    tmp_keys_[i] = Slice(base, length);
  }

  // Generate filter for current set of keys and append to result_.
  filter_offsets_.push_back(result_.size());
  policy_->CreateFilter(&tmp_keys_[0], static_cast<int>(num_keys), &result_);

  tmp_keys_.clear();
  keys_.clear();
  start_.clear();
}
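
To check the scenario from the question, the bookkeeping in StartBlock() and GenerateFilter() can be replayed in isolation. The following is only a minimal sketch, not leveldb code: FilterBuilderSketch and SimulateFirstFlush are hypothetical names, the "filter" is just the concatenated keys instead of a real bloom filter, and the next block is assumed to start at exactly offset 4096 (in a real table the offset also depends on compression and the block trailer).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for FilterBlockBuilder (hypothetical, not the real class).
// A "filter" here is just the concatenated keys, so its size is easy to inspect.
struct FilterBuilderSketch {
  static constexpr uint64_t kFilterBase = 2048;  // 2KB, as in the question

  std::vector<std::string> pending_keys;  // keys added since the last filter
  std::string result;                     // all filter bytes, back to back
  std::vector<size_t> filter_offsets;     // start of each filter in result

  void AddKey(const std::string& key) { pending_keys.push_back(key); }

  // Same loop structure as the StartBlock() quoted above.
  void StartBlock(uint64_t block_offset) {
    const uint64_t filter_index = block_offset / kFilterBase;
    assert(filter_index >= filter_offsets.size());
    while (filter_index > filter_offsets.size()) {
      GenerateFilter();
    }
  }

  // Records the offset, then consumes every pending key (possibly none).
  void GenerateFilter() {
    filter_offsets.push_back(result.size());
    for (const std::string& key : pending_keys) result += key;
    pending_keys.clear();
  }

  // Length in bytes of filter i.
  size_t FilterSize(size_t i) const {
    const size_t end =
        (i + 1 < filter_offsets.size()) ? filter_offsets[i + 1] : result.size();
    return end - filter_offsets[i];
  }
};

// Hypothetical helper: add two keys to the first data block, then flush with
// the next block starting at next_block_offset.
inline FilterBuilderSketch SimulateFirstFlush(uint64_t next_block_offset) {
  FilterBuilderSketch b;
  b.StartBlock(0);  // nothing to generate yet: filter_index is 0
  b.AddKey("apple");
  b.AddKey("banana");
  b.StartBlock(next_block_offset);
  return b;
}
```

Calling SimulateFirstFlush(4096) leaves two entries in filter_offsets. Filter 0 holds every key of the first data block, and filter 1 has size 0 because its offset equals the end of result. Under these assumptions the answer to the question is the second option: one filter for the whole data block plus one empty filter, not two filters that each cover 2KB. The empty entry appears to act as a placeholder so that a filter can later be located by dividing a data block's offset by kFilterBase.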