Breaking a ciphertext using frequency analysis / cryptanalysis techniques

Question

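How would you write a program (preferably in Java or Python) to break a ciphertext produced by a random substitution, where the key cannot be found by trying shifts because the substitution alphabet is random?

This website (https://www.guballa.de/substitution-solver) already does it.

I have to do it using frequency analysis (https://en.wikipedia.org/wiki/Frequency_analysis).

The main problem I am facing is checking, after I substitute letters, whether the resulting words look like English words.

Please guide me on how to solve this problem.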

Thanks, Hakid

Answer

This may be a late answer, but this code can serve as a starting point for you.


from operator import itemgetter

# Approximate English letter frequencies in percent, most common first.
letterFrequency = [
    [12.00, 'E'], [9.10, 'T'], [8.12, 'A'], [7.68, 'O'], [7.31, 'I'],
    [6.95, 'N'], [6.28, 'S'], [6.02, 'R'], [5.92, 'H'], [4.32, 'D'],
    [3.98, 'L'], [2.88, 'U'], [2.71, 'C'], [2.61, 'M'], [2.30, 'F'],
    [2.11, 'Y'], [2.09, 'W'], [2.03, 'G'], [1.82, 'P'], [1.49, 'B'],
    [1.11, 'V'], [0.69, 'K'], [0.17, 'X'], [0.11, 'Q'], [0.10, 'J'],
    [0.07, 'Z'],
]


# Random substitution key used to produce a test ciphertext.
plain_to_cipher = {
    "a": "l", "b": "f", "c": "w", "d": "o", "e": "a", "f": "y", "g": "u",
    "h": "i", "i": "s", "j": "v", "k": "z", "l": "m", "m": "n", "n": "x",
    "o": "p", "p": "b", "q": "d", "r": "c", "s": "r", "t": "j", "u": "t",
    "v": "q", "w": "e", "x": "g", "y": "h", "z": "k",
}
cipher_to_plain = {v: k for k, v in plain_to_cipher.items()}
# Only used for membership tests, so the order of the letters does not matter.
alphabet = "qwertyuioplkjhgfdsazxcvbnm"


message = input("Enter message to encrypt: ")
message = message.lower()
ciphertext = ""


for c in message:
    if c not in alphabet:
        ciphertext += c
    else:
        ciphertext += plain_to_cipher[c]
print("\nRandom substitution Encryption is: \n\t{}".format(ciphertext))

# .......................................................................
# calculate letter frequency of ciphertext

letter_list = []
cipher_len = 0
for c in ciphertext:
    if c in alphabet:
        cipher_len += 1
        if c not in letter_list:
            letter_list.append(c)

letter_freq = []
for c in letter_list:
    letter_freq.append([round(ciphertext.count(c) / cipher_len * 100,2),c])

# ....................................................................................
# Now sort list and decrypt each instance of ciphertext according to letter frequency

letter_freq = sorted(letter_freq,key=itemgetter(0),reverse=True)
decrypted_plaintext = ciphertext

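# Map the i-th most frequent ciphertext letter to the i-th most frequent
# English letter. The ciphertext is lowercase and the guesses are uppercase,
# so a letter that has been replaced is never replaced a second time.
for (f, c), (_, guess) in zip(letter_freq, letterFrequency):
    print("Replacing {} of freq {} with {}.".format(c, f, guess))
    decrypted_plaintext = decrypted_plaintext.replace(c, guess)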
print("\nThe Plaintext after decryption using frequency analysis: \n\t{}".format(decrypted_plaintext))
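
The question also asks how to tell whether a candidate decryption looks like English. The program above does not do that. One possible way, sketched here as an assumption rather than as part of the original program, is to measure what fraction of the words in a candidate appear in a word list. The names `COMMON_WORDS` and `english_score` are hypothetical, and the ten-word list is only a placeholder for a real dictionary file.

```python
import re

# Tiny illustrative word list (an assumption). A real check would load a
# full dictionary file instead.
COMMON_WORDS = {"the", "and", "a", "i", "to", "of", "in", "is", "it", "that"}


def english_score(text, known_words=COMMON_WORDS):
    """Return the fraction of words in text that appear in known_words."""
    # Treat every run of letters as a word and ignore case, so the uppercase
    # output of the frequency-analysis program can be scored directly.
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in known_words)
    return hits / len(words)


if __name__ == "__main__":
    print(english_score("The cat and I"))  # 3 of 4 words are known: 0.75
```

A higher score suggests a better key guess, so the score could be used to compare two candidate mappings, for example after swapping a pair of letters whose frequencies are close.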

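Side note: in most cases this program correctly decrypts the most frequently used letters, such as e, t, a, o, but it fails to map the less frequently used letters, because the differences between their frequencies become small and the results become hard to predict. This problem can be reduced somewhat by analyzing the most common English bigrams (such as th) and using the results to make more accurate predictions. Another observation you can exploit is that the letter a is easy to break, which makes the letter i less painful: any ciphertext character that stands alone as a one-letter word probably corresponds to a (e.g. "a book") or i (e.g. "i went"), and since a has already been deduced, any other one-letter ciphertext word is probably i.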
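
The two hints in the side note can be sketched roughly as follows. This is a minimal illustration, not part of the original program: the helper names `bigram_counts` and `single_letter_words` are hypothetical, and the code assumes that spaces and punctuation are left unencrypted, as they are in the program above. An apostrophe splits a word, so contractions such as "i'm" would produce spurious one-letter words.

```python
import re
from collections import Counter


def bigram_counts(ciphertext):
    """Count adjacent letter pairs that occur inside words."""
    counts = Counter()
    for word in re.findall(r"[a-z]+", ciphertext.lower()):
        for i in range(len(word) - 1):
            counts[word[i:i + 2]] += 1
    return counts


def single_letter_words(ciphertext):
    """Count ciphertext letters that appear as one-letter words."""
    words = re.findall(r"[a-z]+", ciphertext.lower())
    return Counter(w for w in words if len(w) == 1)


if __name__ == "__main__":
    # "the cat and i saw a hat" encrypted with the plain_to_cipher key above.
    sample = "jia wlj lxo s rle l ilj"
    print(bigram_counts(sample).most_common(1))  # [('lj', 2)]
    print(sorted(single_letter_words(sample)))   # ['l', 's']
```

In the sample, the one-letter words are l and s, which are the ciphertext forms of a and i. On a longer ciphertext, the most frequent bigrams are candidates for common English pairs such as th, which can confirm or correct the single-letter guesses.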