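Problem description
I have an assignment to write a C program that reads several sentences (a 140 MB file) and, given a second input (a number N), returns the Nth most common word. My idea is to build a hash table with linear probing: each new element is hashed with djb2 as a function of its probe position, and if there is a collision I rehash. After that I apply Quicksort on the occurrence counts and finally access the result by index. I am having trouble completing the linear-probing hash table in C. I am fairly sure it is finished, but every time I run it I get a heap buffer overflow in lldb. I have tried to find the cause but still cannot resolve it.
Am I running out of stack memory? The file is relatively small, so it should not take up that much memory. I ran AddressSanitizer and got a heap-buffer-overflow on insertion.
I do not think I am touching memory outside the allocated region, but I am not 100% sure.
Any idea what is going wrong? Here is the implementation of table.c; the struct definitions are shown further below.
Here is a more detailed message from AddressSanitizer: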
thread #1: tid = 0x148b44,0x0000000100166b20 libclang_rt.asan_osx_dynamic.dylib`__asan::AsanDie(),queue = 'com.apple.main-thread',stop reason = Heap buffer overflow
{
"access_size": 1,"access_type": 1,"address": 105690555220216,"description": "heap-buffer-overflow","instrumentation_class": "AddressSanitizer","pc": 4294981434,"stop_type": "fatal_error"
}
table.c:
#include "table.h"
#include "entities.h"
static inline entry_t* entryInit(const char* const value){
    unsigned int len = strlen(value);
    entry_t* entry = malloc(sizeof(entry));
    entry->value = malloc(sizeof(char*) * len);
    strncpy(entry->value,value,strlen(value));
    entry->exists = 1;
    entry->occurence = 1;
    return entry;
}
table_t* tableInit(const unsigned int size){
    table_t* table = malloc(sizeof(table_t));
    table->entries = malloc(size*sizeof(entry_t));
    table->seed = getPrime();
    table->size = size;
    table->usedEntries = 0U;
    return table;
}
//okay, there is definitely an issue here
table_t* tableResize(table_t* table,const unsigned int newSize){
    //most likely wont happen but if there is an overflow then we have a problem
    if(table->size > newSize) return NULL;
    //create a temp array of the realloced array, then do changes there
    entry_t* temp = calloc(newSize,sizeof(entry_t));
    table->size = newSize;
    //temp pointer to an entry
    entry_t *tptr = NULL;
    unsigned int pos = 0;
    unsigned int index = 0;
    while(pos != table->size){
        tptr = &table->entries[pos];
        if(tptr->exists == 1){
            index = hashString(table->seed,tptr->value,table->size,pos);
            temp[index] = *entryInit(tptr->value);
            temp[index].occurence = tptr->occurence;
            break;
        }
        else pos++;
    }
    table->entries = temp;
    //Todo: change table destroy to free the previous array from the table
    free(temp);
    return table;
}
//insert works fine, it is efficient enough to add something in the table
unsigned int tableInsert(table_t* table,const char* const value){
    //decide when to resize, might create a large enough array to bloat the memory?
    if(table->usedEntries >(unsigned int)(2*(table->size/3))) table = tableResize(table,table->size*2);
    entry_t* entry = NULL;
    unsigned int index;
    auto int position = 0;
    while(position != table->size){
        //calculate the hash of our string as a function of the current position on the table
        index = hashString(table->seed,value,table->size,position);
        entry = &table->entries[index];
        if(entry->exists == 0){
            *entry = *entryInit(value);
            table->usedEntries++;
            return index;
        } else if (entry->exists == 1 && strcmp(entry->value,value) == 0){
            entry->occurence++;
            return index;
        } else{
            position++;
        }
    }
}
//there might be an issue here
static inline void tableDestroy(const table_t* const table){
    entry_t* entry = NULL;
    for (auto int i = 0; i < table->size; ++i){
        entry =&table->entries[i];
        //printf("Value: %s Occurence: %d Exists: %d \n",entry->value,entry->occurence,entry->exists );
        if(&table->entries[i] !=NULL)free(&table->entries[i]);
    }
    free(table);
}
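The one-byte write that AddressSanitizer reports is consistent with entryInit above: malloc(sizeof(entry)) reserves only the size of a pointer (8 bytes on a typical 64-bit target), not of an entry_t, so the store to entry->exists would land just past the allocation. The string copy also has no room for, and never writes, the terminating '\0'. The following is a minimal sketch of one possible correction, not a verified fix for the whole program. The name entryInitInPlace is hypothetical, and the struct repeats the fields of entry_t from entities.h without the packed attribute.

```c
#include <stdlib.h>
#include <string.h>

/* Same fields as entry_t in entities.h; the packed attribute is omitted here. */
typedef struct {
    char *value;
    unsigned int exists : 1;
    unsigned int occurence;
} entry_t;

/* Hypothetical replacement for entryInit: fills a slot that already lives
 * inside the table's array, so no separate entry_t is allocated (or leaked).
 * Returns 0 on success, -1 if the string copy cannot be allocated. */
static int entryInitInPlace(entry_t *slot, const char *value) {
    size_t len = strlen(value);
    char *copy = malloc(len + 1);      /* +1 for the terminating '\0' */
    if (copy == NULL) return -1;
    memcpy(copy, value, len + 1);      /* copies the terminator too */
    slot->value = copy;
    slot->exists = 1;
    slot->occurence = 1;
    return 0;
}
```

With a helper like this, the line `*entry = *entryInit(value);` in tableInsert could become `entryInitInPlace(entry, value)`, which also avoids leaking the temporary entry that entryInit allocates.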
entities.h:
#pragma once
typedef struct __attribute__((packed)) __entry {
    char *value;
    unsigned int exists : 1;
    unsigned int occurence;
} entry_t;
typedef struct __table {
    int size;
    int usedEntries;
    entry_t *entries;
    unsigned int seed;
} table_t;
This is how I read and process the text from the file:
void readFromFile(const char* const fileName,table_t* table){
    FILE *fp = fopen(fileName,"r");
    if(!fp) fprintf(stderr,"error reading file. \n");
    char word[15];//long enough to hold the biggest word in the text?
    int position = 0;
    char ch;
    while((ch = fgetc(fp))!= EOF){
        //discard all the ascii chars that are not letters
        if(!(ch >= 65 && ch <= 90) && !(ch >= 97 && ch <= 122)){
            word[position]= '\0';
            if(word[0] == NULL)continue;
            tableInsert(table,word);
            position = 0;
            continue;
        }
        else word[position++] = ch;
    }
}
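The reader above has a few hazards of its own: word[15] is written without a bounds check, ch is a char although fgetc returns an int (so EOF cannot be told apart reliably), and a word that ends exactly at end of file is never inserted. A minimal sketch of a bounded word reader follows. The names nextWord and isLetter are hypothetical, and truncating over-long words is a design choice made here, not something the question specifies.

```c
#include <stdio.h>

/* ASCII letters only, matching the 65..90 / 97..122 ranges in the question. */
static int isLetter(int ch) {
    return (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
}

/* Hypothetical helper: reads the next run of letters from fp into buf
 * (at most cap-1 characters, always '\0'-terminated; cap must be >= 1).
 * Letters beyond the capacity are read and dropped, so an over-long word
 * is truncated rather than split. Returns 1 if a word was stored, 0 at
 * end of input. */
static int nextWord(FILE *fp, char *buf, size_t cap) {
    size_t len = 0;
    int ch;                                  /* int, so EOF stays distinguishable */
    while ((ch = fgetc(fp)) != EOF) {
        if (isLetter(ch)) {
            if (len + 1 < cap) buf[len++] = (char)ch;
        } else if (len > 0) {
            break;                           /* a separator ends the current word */
        }
    }
    buf[len] = '\0';
    return len > 0;
}
```

The loop in readFromFile could then shrink to `while (nextWord(fp, word, sizeof word)) tableInsert(table, word);`, after returning early when fopen fails.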
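Any suggestions about what is wrong with my code? I think the resize may be the problem, and I have not implemented deletion properly yet because I have had so many memory management issues.
Thanks!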
Solution
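Several things in the posted table.c look suspicious beyond the allocation in entryInit. tableInit uses malloc, so the exists flags start out uninitialized. tableResize sets table->size to the new size before walking the old array, stops after the first occupied entry, and frees temp immediately after storing it in table->entries. tableDestroy calls free on addresses inside the entries array rather than on the strings. The following is a minimal, self-contained sketch of the approach the question describes (djb2, linear probing, growth by reinsertion, then sorting by count); it is not the asker's code. All names (slot_t, wtable_t, wtAdd, wtNth, and so on) are hypothetical, the probe is a plain step-by-one scan rather than the position-dependent hashString, and the standard library qsort stands in for a hand-written Quicksort.

```c
#include <stdlib.h>
#include <string.h>

typedef struct { char *word; unsigned count; } slot_t;  /* word == NULL: empty slot */
typedef struct { slot_t *slots; size_t cap, used; } wtable_t;

/* djb2 string hash (hash * 33 + c), reduced to a slot index. */
static size_t djb2(const char *s, size_t cap) {
    unsigned long h = 5381;
    for (; *s; ++s) h = h * 33 + (unsigned char)*s;
    return (size_t)(h % cap);
}

/* Linear probing: returns the slot holding word, or the first empty slot on
 * its probe path. An empty slot always exists because wtAdd grows first. */
static slot_t *probe(slot_t *slots, size_t cap, const char *word) {
    size_t i = djb2(word, cap);
    while (slots[i].word != NULL && strcmp(slots[i].word, word) != 0)
        i = (i + 1) % cap;
    return &slots[i];
}

/* cap must be at least 1; calloc leaves every slot empty with count 0. */
static int wtInit(wtable_t *t, size_t cap) {
    t->slots = calloc(cap, sizeof *t->slots);
    t->cap = cap;
    t->used = 0;
    return t->slots ? 0 : -1;
}

/* Growing walks the OLD array with the OLD capacity, moves every occupied
 * slot into the new array, and only then frees the old array. */
static int wtGrow(wtable_t *t) {
    size_t newCap = t->cap * 2;
    slot_t *fresh = calloc(newCap, sizeof *fresh);
    if (fresh == NULL) return -1;
    for (size_t i = 0; i < t->cap; ++i)
        if (t->slots[i].word != NULL)
            *probe(fresh, newCap, t->slots[i].word) = t->slots[i];
    free(t->slots);
    t->slots = fresh;
    t->cap = newCap;
    return 0;
}

/* Counts one occurrence of word. Returns 0 on success, -1 on allocation failure. */
static int wtAdd(wtable_t *t, const char *word) {
    if ((t->used + 1) * 3 > t->cap * 2 && wtGrow(t) != 0) return -1;
    slot_t *s = probe(t->slots, t->cap, word);
    if (s->word == NULL) {
        size_t len = strlen(word) + 1;
        if ((s->word = malloc(len)) == NULL) return -1;
        memcpy(s->word, word, len);
        t->used++;
    }
    s->count++;
    return 0;
}

/* qsort comparator: higher counts first; empty slots (count 0) sort last. */
static int byCountDesc(const void *a, const void *b) {
    unsigned ca = ((const slot_t *)a)->count, cb = ((const slot_t *)b)->count;
    return (ca < cb) - (ca > cb);
}

/* Returns the n-th most frequent word (n starts at 1), or NULL if n is out
 * of range. Sorting in place destroys the hash layout, so call this only
 * after all insertions are done. */
static const char *wtNth(wtable_t *t, size_t n) {
    if (n == 0 || n > t->used) return NULL;
    qsort(t->slots, t->cap, sizeof *t->slots, byCountDesc);
    return t->slots[n - 1].word;
}

static void wtFree(wtable_t *t) {
    for (size_t i = 0; i < t->cap; ++i) free(t->slots[i].word);
    free(t->slots);
}
```

A typical use would be wtInit, one wtAdd per word read from the file, a single wtNth call for the requested N, and then wtFree. Words with equal counts come back in unspecified order, since qsort is not a stable sort.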