Build from substrings using a suffix tree

Problem description

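I have an exam coming up and I'm working hard to revise. I ran into this problem and don't know how to approach it. The problem is as follows: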

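Given a function build_from_substrings(S,T): if T can be built from a substring of S, the function should return a tuple containing the first and last indices of the substring used to build T. For example, "bcc" is built from indices (2,4) of "abbcc". If T cannot be built, the function returns False.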

build_from_substrings must run in O(N^2 + M), where:

  1. N is the number of characters in S
  2. M is the length of T

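I have managed to create a suffix trie that stores the suffixes of S. However, I'm stuck on the second part of the problem, the traversal and substring search. Could I get some guidance?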

Here is what I have tried.

class Node:
    def __init__(self,level = None,size = 27,data = None):
        self.link = [None] * size
        self.level = level
        self.data = data
        self.end = False

class Trie:
    def __init__(self):
        self.root = Node()

    def insert(self,key,data):
        level = 0
        current = self.root
        for char in key:
            index = ord(char) - 97 + 1
            if current.link[index] is not None:
                current = current.link[index]
            else:
                current.link[index] = Node(level=level)
                current = current.link[index]
            level += 1
        index = 0
        if current.link[index] is not None:
            current = current.link[index]
        else:
            current.link[index] = Node(level=level)
            current = current.link[index]
        current.data = data

    def search(self,key):
        current = self.root
        for char in key:
            index = ord(char) - 97 + 1
            if current.link[index] is not None:
                current = current.link[index]
            else:
                return False
        index = 0
        if current.link[index] is not None:
            current = current.link[index]
            return current.data
        else:
            return False

def build_from_substring(S,T):
    suffix_trie = Trie()
    length = len(S)
    for i in range(len(S)):
        list = [i,0]
        word = ""
        word += S[i]
        if i == length-1:
            list[1] = i
        for j in range(i+1,length):
            word += S[j]
            list[1] = length-1
        suffix_trie.insert(word,list)

Solution

I didn't fully understand your question, but I hope this helps. I started by rewriting your whole Trie class, which I think may be more useful.

class Trie:
    def __init__(self):
        self.root = {}
        self.end = '*'

    def insert(self,word):
        '''Traverses the string and inserts each character into the Trie'''
        current = self.root
        for char in word:
            if char not in current:
                current[char] = {}
            current = current[char]
        current[self.end] = self.end

    def search(self,word):
        '''Returns True if word is in the Trie and False otherwise. A word that is only a prefix of an inserted word does not count as a match.'''
        current = self.root
        for char in word:
            if char not in current:
                return False
            current = current[char]
        return self.end in current
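
To make the end-marker behaviour concrete, here is a small sketch. It repeats a compact copy of the class so it runs on its own; the second word 'world' is just an illustrative assumption and is not part of the question.

```python
class Trie:
    # Compact copy of the class above so this snippet is self-contained.
    def __init__(self):
        self.root = {}
        self.end = '*'

    def insert(self, word):
        current = self.root
        for char in word:
            # setdefault creates the child dict only if it is missing
            current = current.setdefault(char, {})
        current[self.end] = self.end

    def search(self, word):
        current = self.root
        for char in word:
            if char not in current:
                return False
            current = current[char]
        return self.end in current


trie = Trie()
trie.insert('word')
trie.insert('world')

# The Trie is just nested dicts; 'word' and 'world' share the path w -> o -> r.
print(trie.root)
print(trie.search('word'))    # True: the path ends with the end marker
print(trie.search('wor'))     # False: 'wor' is only a prefix, no end marker
print(trie.search('worlds'))  # False: 's' is not a child of the final 'd'
```

The end marker is what separates a complete word from a prefix that merely exists as a path in the dicts.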

Then I set up a few cases showing when a pair of indices is returned and when False is returned.

trie = Trie()
trie.insert('word')
build_from_substrings('have you heard the word in the worlds.',trie)
(19,22)
build_from_substrings('have these words today.',trie)
(11,14)
build_from_substrings('have you heard.',trie)
False

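Traverse the input string, and when a character matches in the Trie, record the index. If the characters keep matching until the word in the Trie is complete, record the end index and return. If no word is matched while traversing the string, return False.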

def build_from_substrings(string,trie):
    current = trie.root
    start = None
    for idx,char in enumerate(string):
        if char in current:
            if start is None:
                # record the starting index where they first match
                start = idx
            current = current[char]
        elif start is not None and trie.end in current:
            # a match is in progress and the Trie has an end symbol here, so we have reached a word
            return (start,idx-1)
        else:
            # char is not in current, so start over by resetting "current" and "start"
            current = trie.root
            start = None
            if char in current:
                # the mismatching character may itself begin a new match
                start = idx
                current = current[char]
    if start is None or trie.end not in current:
        # no match in progress, or the match never reached an end symbol: nothing to report
        return False
    # the match ran up to the last character of the string
    return (start,len(string)-1)
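
For the original signature build_from_substrings(S,T), a suffix-trie version might look like the sketch below. It assumes the task is to find T as one contiguous substring of S, which is my reading of the question rather than something it states outright. The nested-dict node layout and the START key are hypothetical names chosen for illustration. Building the trie takes O(N^2) dict operations and walking T takes O(M), assuming dict lookups are O(1) on average.

```python
def build_from_substrings(S, T):
    """Return (first, last) indices of an occurrence of T in S, or False."""
    if not T:
        return False  # assumption: an empty T has no index pair to report

    # Hypothetical bookkeeping key. It is longer than one character,
    # so it can never collide with a single-character child key.
    START = 'start'

    # Build a suffix trie of S: insert every suffix S[i:] character by character.
    # Each node remembers where the earliest occurrence of its path starts in S.
    root = {}
    for i in range(len(S)):
        current = root
        for j in range(i, len(S)):
            current = current.setdefault(S[j], {})
            current.setdefault(START, i)  # keeps the first (smallest) i seen

    # Walk T from the root; every substring of S is a path from the root.
    current = root
    for char in T:
        if char not in current:
            return False
        current = current[char]

    first = current[START]
    return (first, first + len(T) - 1)


print(build_from_substrings("abbcc", "bcc"))  # (2, 4), the example from the question
print(build_from_substrings("abbcc", "cb"))   # False
```

Storing the start index on every node, not only at the end of each suffix, is what lets the search stop as soon as T is consumed without walking on to a leaf.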
