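Output is unsorted and cannot be sorted on the second value. Is there a special way to sort on the second value?

Problem description

The output is not sorted, so I cannot sort it on the second column. Is there a special way to sort on the second value?

The program reads a text and counts how many times each word occurs in it.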

import string
with open("romeo.txt") as file:  # opens the file with text
    lst = []
    d = dict ()
    uniquewords = open('romeo_unique.txt','w')
    for line in file:
        words = line.split()
        for word in words:  # loops through all words
            word = word.translate(str.maketrans('','',string.punctuation)).upper()  #removes the punctuations
            if word not in d:
                d[word] =1
            else:
                d[word] = d[word] +1

            if word not in lst:
                lst.append(word)    # append only this unique word to the list
                uniquewords.write(str(word) + '\n') # write the unique word to the file
print(d)

Solution

Dictionary with default values

The code snippet:

d = dict()
...
if word not in d:
    d[word] =1
else:
    d[word] = d[word] +1

is so common in Python that a subclass of dict was created to get rid of it. It goes by the name defaultdict and can be found in the collections module.

So we can simplify your snippet to:

from collections import defaultdict

d = defaultdict(int)
...
d[word] = d[word] + 1

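No manual if/else test is needed; if word is not in the defaultdict, it is added automatically with an initial value of 0.

Counter

Counting occurrences is also something that is frequently useful; so much so that there is a subclass of dict called Counter in the collections module. It does all the hard work for you.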
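To make the default-value behaviour concrete, here is a minimal sketch. The helper name count_words and the sample word list are made up for illustration; only defaultdict itself comes from the standard library.

```python
from collections import defaultdict

def count_words(words):
    """Count occurrences of each word using defaultdict(int)."""
    counts = defaultdict(int)
    for word in words:
        # A missing key is created on first access with int() == 0,
        # so no explicit "if word not in counts" test is required.
        counts[word] += 1
    return counts

sample = ["ROMEO", "JULIET", "ROMEO"]  # hypothetical sample data
counts = count_words(sample)
print(dict(counts))
```

Note that merely reading a missing key, as in counts["X"], also inserts it with the value 0, which is worth keeping in mind if the dictionary is later iterated.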

from collections import Counter
import string

with open('romeo.txt') as input_file:
    counts = Counter(word.translate(str.maketrans('','',string.punctuation)).upper() for line in input_file for word in line.split())

with open('romeo_unique.txt','w') as output_file:
  for word in counts:
    output_file.write(word + '\n')

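As far as I know, a Counter is not guaranteed to be sorted by number of occurrences by default; however:

when I use them in the interactive Python interpreter, they are always printed in decreasing order of occurrences;
they provide a method, .most_common(), that is guaranteed to return the entries in decreasing order of occurrences.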
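As a small illustration of sorting by count with .most_common(), the sketch below uses a made-up word list rather than the contents of romeo.txt.

```python
from collections import Counter

# Hypothetical sample words standing in for the cleaned-up text.
counts = Counter(["BUT", "SOFT", "WHAT", "LIGHT", "BUT", "LIGHT", "BUT"])

# All (word, count) pairs, highest count first.
ranked = counts.most_common()
for word, n in ranked:
    print(word, n)

# Passing a number limits the result to that many entries.
top_two = counts.most_common(2)
```

Because the result is a plain list of (word, count) tuples, it can be written to a file or sliced like any other list.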

In Python, the standard dictionary is an unsorted data type, but you can take a look here, assuming that by sorting the output you mean d

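A few comments first:

You are not explicitly sorting by a given property (for example, with sorted). Dictionaries might be thought to have a "natural" order based on the alphanumeric value of the value part of each key-value pair, and they may happen to come out correctly ordered when iterated (for example, when printed), but it is better to sort a dictionary explicitly.
You check whether a word exists in the lst variable, which is slow, because checking a list means examining every entry until something is found (or not found). Checking for existence in a dictionary would be better.
I assume that by "second column" you mean the per-word information that records the order in which words first appear.

To sort by first occurrence, I would change the code to also record the index at which each word first appears, which can then be sorted on exactly.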
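The point about list membership can be sketched as follows. This version swaps in a set for the membership test (it uses the same hash-based lookup as a dictionary) and keeps a separate list only to preserve first-seen order; the function name unique_words is made up for illustration.

```python
def unique_words(words):
    """Return the distinct words in the order they were first seen."""
    seen = set()   # hash-based lookup: no scan over all earlier words
    ordered = []   # preserves first-seen order for output
    for word in words:
        if word not in seen:
            seen.add(word)
            ordered.append(word)
    return ordered

print(unique_words(["A", "B", "A", "C", "B"]))
```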

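Edit: fixed the code. The ordering produced by sorted was by key rather than by value. That is what I get for not testing code before posting an answer.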

import string
from operator import itemgetter

# The three-argument form of maketrans deletes every character in the third argument.
strip_punctuation = str.maketrans('', '', string.punctuation)

# Open both files in one with statement so the output file is closed as well.
with open("romeo.txt") as file, open('romeo_unique.txt', 'w') as uniquewords:
    first_occurence = {}  # word -> index of its first appearance in the text
    uniqueness = {}       # word -> number of occurrences
    word_index = 1

    for line in file:
        words = line.split()

        for word in words:  # loops through all words
            word = word.translate(strip_punctuation).upper()  # removes the punctuation

            if word not in uniqueness:
                uniqueness[word] = 1
            else:
                uniqueness[word] += 1

            if word not in first_occurence:
                first_occurence[word] = word_index
                uniquewords.write(word + '\n')  # write the unique word to the file

            word_index += 1

# Sort the (word, value) pairs on the value, i.e. the second element of each pair.
print(sorted(uniqueness.items(), key=itemgetter(1)))
print(sorted(first_occurence.items(), key=itemgetter(1)))
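The difference between sorting by key and sorting by value, which the edit note refers to, can be seen in a small sketch. The dictionary contents are made up for illustration.

```python
from operator import itemgetter

counts = {"ROMEO": 3, "AND": 5, "JULIET": 2}  # hypothetical word counts

# Without a key function, the (word, count) tuples compare on the word first.
by_key = sorted(counts.items())

# itemgetter(1) picks the second element of each tuple, i.e. the count.
by_value = sorted(counts.items(), key=itemgetter(1))

# reverse=True puts the most frequent words first.
by_value_desc = sorted(counts.items(), key=itemgetter(1), reverse=True)

print(by_key)
print(by_value)
print(by_value_desc)
```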
