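Problem description
I have code that builds a trie data structure when given a single string. When I try to pass a list of strings instead, it combines the words into one.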
class TrieNode:
    def __init__(self):
        self.end = False
        self.children = {}

    def all_words(self, prefix):
        if self.end:
            yield prefix
        for letter, child in self.children.items():
            yield from child.all_words(prefix + letter)

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, words):
        curr = self.root
        # the line I added to read the words from a list is below
        for word in words:
            for letter in word:
                node = curr.children.get(letter)
                if not node:
                    node = TrieNode()
                    curr.children[letter] = node
                curr = node
            curr.end = True

    def all_words_beginning_with_prefix(self, prefix):
        cur = self.root
        for c in prefix:
            cur = cur.children.get(c)
            if cur is None:
                return  # No words with given prefix
        yield from cur.all_words(prefix)

lst = ['foo', 'foob', 'foobar', 'foof']
trie = Trie()
trie.insert(lst)
The output I get is
['foo','foofoob','foofoobfoobar','foofoobfoobarfoof']
The output I want is
['foo','foof']
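This is the line I use to get the output (for reproducibility, in case you want to run the code). It returns all words that begin with a given prefix: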
print(list(trie.all_words_beginning_with_prefix('foo')))
How can I fix this?
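Solution
You never reset curr to the root after each insertion, so each new word is inserted starting from the node where the previous word ended. You want something like: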
def insert(self, words):
    curr = self.root
    for word in words:
        for letter in word:
            node = curr.children.get(letter)
            if not node:
                node = TrieNode()
                curr.children[letter] = node
            curr = node
        curr.end = True
        curr = self.root  # Reset back to the root
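To see the effect of the reset in isolation, here is a minimal sketch that uses plain nested dicts instead of the TrieNode class. The helper names (insert_without_reset, insert_with_reset, words_in) and the "$" end-of-word marker are hypothetical, introduced only for this illustration, and the sketch assumes no word contains "$".

```python
END = "$"  # hypothetical end-of-word marker; assumes no word contains "$"


def insert_without_reset(words):
    """Buggy variant: the cursor is never moved back to the root."""
    root = {}
    curr = root
    for word in words:
        for letter in word:
            # Descend into the child for this letter, creating it if missing.
            curr = curr.setdefault(letter, {})
        curr[END] = True  # mark the end of a word at the current node
    return root


def insert_with_reset(words):
    """Fixed variant: every word starts again from the root."""
    root = {}
    for word in words:
        curr = root
        for letter in word:
            curr = curr.setdefault(letter, {})
        curr[END] = True
    return root


def words_in(node, prefix=""):
    """Yield every word stored below `node`, in insertion order."""
    if node.get(END):
        yield prefix
    for letter, child in node.items():
        if letter != END:
            yield from words_in(child, prefix + letter)


print(list(words_in(insert_without_reset(["ab", "c"]))))  # ['ab', 'abc']
print(list(words_in(insert_with_reset(["ab", "c"]))))     # ['ab', 'c']
```

Without the reset, "c" is attached below the node where "ab" ended, which is the same concatenation seen in the question's output.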
I would break this up, though. I think your insert function is doing too much and shouldn't handle multiple strings. I would change it to:
def insert(self, word):
    curr = self.root
    for letter in word:
        node = curr.children.get(letter)
        if not node:
            node = TrieNode()
            curr.children[letter] = node
        curr = node
    curr.end = True

def insert_many(self, words):
    for word in words:
        self.insert(word)  # Just loop over self.insert
Now the problem cannot occur: each insert is an independent call that starts at the root, so there is no reset of curr to forget.
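Putting the pieces together, the following is a self-contained sketch that combines the TrieNode class from the question with the insert / insert_many split suggested above. It is one possible assembly of the code already shown, not the only way to structure it.

```python
class TrieNode:
    def __init__(self):
        self.end = False      # True if a word ends at this node
        self.children = {}    # maps a letter to the child TrieNode

    def all_words(self, prefix):
        # Yield every word stored at or below this node.
        if self.end:
            yield prefix
        for letter, child in self.children.items():
            yield from child.all_words(prefix + letter)


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        # Each call starts at the root, so no explicit reset is needed.
        curr = self.root
        for letter in word:
            node = curr.children.get(letter)
            if not node:
                node = TrieNode()
                curr.children[letter] = node
            curr = node
        curr.end = True

    def insert_many(self, words):
        for word in words:
            self.insert(word)

    def all_words_beginning_with_prefix(self, prefix):
        cur = self.root
        for c in prefix:
            cur = cur.children.get(c)
            if cur is None:
                return  # No words with the given prefix
        yield from cur.all_words(prefix)


trie = Trie()
trie.insert_many(['foo', 'foob', 'foobar', 'foof'])
print(list(trie.all_words_beginning_with_prefix('foo')))
# ['foo', 'foob', 'foobar', 'foof']
print(list(trie.all_words_beginning_with_prefix('foob')))
# ['foob', 'foobar']
```

Since all four words in the list begin with 'foo', the first query returns all four of them; a longer prefix such as 'foob' narrows the result.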