Whoosh search fails to find a keyword

Problem description

1. I am writing a very simple Whoosh project. First, I read a txt file and use the read() method to get its entire contents. Then I build an index for that content.

2. Here is the implementation code:

#import functions from whoosh
import whoosh
from whoosh.index import create_in
from whoosh.fields import *
from whoosh.qparser import QueryParser


schema = Schema(title=TEXT(stored=True),path=ID(stored=True),content=TEXT)
ix = create_in(".",schema)
writer = ix.writer()
i = 0
f = open("read.txt","r")
print(f.read())
writer.add_document(title=u"document "+str(i),path=u".",content=f.read()) #python iterator i starting from 0
writer.commit(optimize=True)
searcher = ix.searcher()


parser = QueryParser("content",ix.schema)
stringquery = parser.parse("Hello")
results = searcher.search(stringquery)
print ("search 1 result:")
print (results)
for r in results:
    print (r)

The contents of the txt file:

Hello this is the test 
I hope you are doing well
I think you can do it without problem 
This is so cool without funciton

'Hello' is supposed to be stored in the index, but when I search for hello, nothing is returned:

search 1 result:
<Top 0 Results for Term('content','hello') runtime=7.878600001731684e-05>

Solution

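Your first call to f.read() reads and prints all the text in the file. That leaves the file position at the end of the file, so the next call to f.read() has nothing left to read and returns an empty string, which means an empty document gets indexed. Store the text in a variable and reuse it: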

file_content = f.read()
print(file_content)
writer.add_document(title=u"document "+str(i),path=u".",content=file_content)
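
The same idea can be sketched using only the standard library. This is an illustration rather than the asker's actual code: `load_text` is a hypothetical helper, and a temporary file stands in for read.txt. The file is read exactly once inside a `with` block (which also closes it), and the resulting string is reused for both printing and indexing.

```python
import os
import tempfile


def load_text(path):
    """Read the whole file once and return its text (hypothetical helper)."""
    with open(path, "r") as f:
        return f.read()


# Create a small stand-in for read.txt so the sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "read.txt")
with open(sample_path, "w") as f:
    f.write("Hello this is the test\nI hope you are doing well\n")

file_content = load_text(sample_path)
print(file_content)               # printing a string does not consume it
content_for_index = file_content  # the same text is still available here
```

In the Whoosh code above, `content_for_index` is what would be passed as `content=` to writer.add_document().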

To demonstrate this further:

$ cat test.txt
Hello this is the test
I hope you are doing well
I think you can do it without problem
This is so cool without funciton
$ ipython
Python 3.9.0 (default, Dec  2 2020, 10:34:08)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.19.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: f = open("test.txt","r")

In [2]: print(f.read())
Hello this is the test
I hope you are doing well
I think you can do it without problem
This is so cool without funciton


In [3]: print(f.read())


In [4]: quit
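
If the file object really does need to be read twice, one option (not part of the original answer, shown here only as a sketch with a temporary file standing in for test.txt) is to rewind it with seek(0) before the second read:

```python
import os
import tempfile

# Create a small stand-in for test.txt so the sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "test.txt")
with open(path, "w") as f:
    f.write("Hello this is the test\n")

with open(path, "r") as f:
    first = f.read()   # reads everything; the position is now at end of file
    second = f.read()  # nothing left to read: returns the empty string
    f.seek(0)          # rewind to the beginning of the file
    third = f.read()   # the full text again

print(repr(first))
print(repr(second))
print(repr(third))
```

Storing the text in a variable, as in the solution above, is usually simpler than rewinding, since it avoids reading the file from disk a second time.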