为什么会出现 IndexError 以及如何使此代码更简洁

问题描述

所以我正在努力完成 CodeWars 中的编码挑战,并且我在 CodeWars 上收到了 IndexError。当我在 PyCharm 上运行代码时,输​​出很好。

Coding Prompt

Get-ADUser -Filter * -Properties Created | Where-Object { $_.Created -gt $startDate -and $_.Created -le $endDate }

Error message

样本测试:

def alphabet_position(sentence):
    alphabet_cap = ["A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"]
    alphabet_low = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]

    sentence_list = list(sentence)

    # remove any non-letters from sentence
    i = 0
    while i < len(sentence_list):
        if sentence_list[i].isalpha() == False:
            sentence_list.remove(sentence_list[i])
        i += 1

# I had to create a specific remove(" ") because for some reason the "'" would be turned into a " " instead of being removed?
    i = 0
    while i < len(sentence_list):
        if sentence_list[i] == " ":
            sentence_list.remove(" ")
        i += 1

    # finding position of alphabet
    alpha_index = []
    k = 0
    j = 0
    while k < len(sentence_list):
        if sentence_list[k] == alphabet_low[j] or sentence_list[k] == alphabet_cap[j]:
            alpha_index.append(str((alphabet_low.index(alphabet_low[j])) + 1))
            j = 0
            k += 1
        else:
            j += 1

    sentence_index = " ".join(alpha_index)

    return sentence_index

解决方法

我无法真正复制 IndexError,但我可以帮助您解决代码的紧凑性。我能想到的完成工作的最快方法是这样的:

import re # for the regular expression

def alphabet_position(sentence):

    # regular expression to filter out everything that is not a letter
    regex = re.compile('[^a-zA-Z]')

    # removing everything that doesn't match the regex
    sentence = regex.sub('',sentence)

    alpha_index = []
    for char in sentence.upper():
        # ord() returns the ASCII code,# we subtract by 64 to get the alphabetical index,# then we convert the integer to string
        # because the final result needs to be a string

        alpha_index.append(str(ord(char) - 64))
    
    return " ".join(alpha_index)

对正在发生的事情的解释:

  1. 我们不需要将我们需要的所有可能的字符存储到两个列表中。我们可以使用正则表达式来删除所有字母字符:[^a-zA-Z]
  2. 不需要对字符串迭代 3 次:正则表达式负责过滤,然后我们可以对每个字符使用一次迭代来找出它们在字母表中的索引。为此,首先我们使用 sentence.upper() 将所有内容转换为大写,然后检查当前字符的 ASCII 值并减去 64。这样做是因为 ASCII 值遵循字母顺序,并且第一个字符 A,值为 65。Take a look at the table

如果你不知道如何使用正则表达式,另一种解决问题的方法是

def alphabet_position(sentence):
    alpha_index = []
    for char in sentence.upper():
        if char.isalpha() and 65 <= ord(char) <= 90:    # 65: A ascii value,90: Z ascii value
            alpha_index.append(str(ord(char) - 64))
    
    return " ".join(alpha_index)

这种方法的一个缺点,正如卢克伍德沃德在他对您的问题的评论中指出的那样,如果我们有非标准字符,如重音字母,它们会被过滤掉。如果是这种情况,您可能需要扩展 regex 或 ASCII 过滤。

,

带有内联注释的代码:

alphabet_low = "abcdefghijklmnopqrstuvwxyz"

def alphabet_position(sentence):
    result = []  # initialize result
    for char in sentence.lower():  # iterate over lowercase string
        if char in alphabet_low:  # is character in alphabet?
            position = alphabet_low.find(char) + 1
            result.append(str(position))
    return " ".join(result)

sentence = "The sunset sets at twelve o' clock."
print(sentence)
print(alphabet_position(sentence))

输出:

The sunset sets at twelve o' clock.
20 8 5 19 21 14 19 5 20 19 5 20 19 1 20 20 23 5 12 22 5 15 3 12 15 3 11