Finding consecutive months in a list

Problem description

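Given an unordered list of months, I want to find the largest set that can be returned as an ordered list of distinct, consecutive months.

For example: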

consecutive_months(["December","January","February","April"])

Output:

"December","January","February"

And:

consecutive_months(["February","December","January"])

Output:

"December","January","February"

The following works, but I'm curious whether anyone has ideas for a more elegant way:

MONTHS = ["January","February","March","April","May","June","July","August","September","October","November","December"]

def consecutive_months(lst_of_months):
    # create two years of months to handle going from Dec to Jan        
    results = []
    for m in set(lst_of_months):
        results.append((m,MONTHS.index(m)))
        results.append((m,MONTHS.index(m)+12))            
    results = sorted(results,key=lambda x: x[1])
    
    # find the longest series of consecutive months
    this_series = []    
    longest_series = []
    for this_month in results:
        if len(this_series) > 0:
            last_month = this_series[-1]
            if last_month[1] + 1 == this_month[1]:
                this_series.append(this_month)
            else:
                this_series = [this_month]
        else:
            this_series = [this_month]           
        if len(this_series) > len(longest_series):
            longest_series = [m for (m,i) in this_series]

    return longest_series

Here is a pastebin with sample inputs and expected outputs.

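Solution

I noticed one issue with your code: when all 12 months appear in the input, the output lists every month twice, because the doubled index range 0 to 23 forms a single run of 24 entries. This is easy to fix by truncating the result: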

return longest_series[:12]

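I would go for a solution that converts the input into a kind of 12-bit "bitmap", where a 1 means the corresponding month is in the input and a 0 means it is not.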

If that bitmap is represented as a 12-character string, a regular expression can easily identify the runs of "1".
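As a rough illustration of that idea, here is a minimal sketch of the bitmap step on its own. The names `MONTH_NAMES` and `to_bits` are made up for this example and are assumptions, not part of any code in this thread.

```python
import re

# Hypothetical names, used only for this illustration.
MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

def to_bits(given_months):
    """Return a 12-character string with "1" for each month present."""
    present = set(given_months)
    return "".join("1" if name in present else "0" for name in MONTH_NAMES)

bits = to_bits(["December", "January", "February", "April"])
print(bits)  # 110100000001

# Each match of "1+" is one run of consecutive months within a single year.
runs = [match.span() for match in re.finditer("1+", bits)]
print(runs)  # [(0, 2), (3, 4), (11, 12)]
```

Note that the December-to-February run shows up as two separate matches here, (11, 12) and (0, 2), which is why the string has to be doubled (or otherwise wrapped around) before searching.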

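I would also do some preprocessing on the list of months, so that you have both the list and a dictionary version of it, and I would double the list so that you can slice across the 12-month boundary.

Here is the suggested code: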

import re

months = ["January","February","March","April","May","June","July","August","September","October","November","December"]

# Also create a dictionary to get a month's index
month_nums = { month: num for num,month in enumerate(months) }
# ... and double the months list,to ease slicing across the 12-boundary
months += months

def consecutive_months(given_months):
    # Deal with boundary case
    if not given_months:
        return []

    # Convert input to 12 bits in string format
    lst = ["0"] * 12
    for m in given_months:
        lst[month_nums[m]] = "1"
    bits = "".join(lst)
    
    # Identify the longest chunk of consecutive "1" in that doubled string
    _,start,end = max((j-i,i,j) 
        for i,j in (match.span(0)
            for match in re.finditer("1+",bits + bits)
        )
    )
 
    # Using the found span,extract the corresponding month names
    return months[start:end][:12]

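Below are two workable approaches from a friend who also looked into the problem. The first is efficient and uses the modulo operator, so the list does not need to be doubled.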

month_names = [
    'January','February','March','April','May','June','July','August','September','October','November','December'
]

# Looks like: {'January': 0,'February': 1...}
month_name_to_index = {
  value: index
  for index,value
  in enumerate(month_names)
}


def consecutive_months(list_of_months_by_name):
  if not list_of_months_by_name:
    # If the list is empty,return None.
    return None

  month_was_seen = [False] * 12  # Looks like: [False,False,...]
  
  for month_name in list_of_months_by_name: 
    month_was_seen[month_name_to_index[month_name]] = True

  # Seek to first missing month:
  for start_index in range(12):
    if not month_was_seen[start_index]:
      break

  # If there is no missing month,return the whole year.
  if month_was_seen[start_index]:
    return {"from": "January","to": "December","length": 12}

  # Starting from the first missing month,loop around the year
  # and keep track of the longest run using one boolean and four
  # integers.
  running = False
  longest_run_index = None
  longest_run_length = 0
  current_run_index = None
  current_run_length = None
  for offset in range(1,13):
    index = (start_index + offset) % 12
    if month_was_seen[index]:
      # Continue a run or begin a run.
      if running:
        current_run_length += 1
        continue
      running = True
      current_run_index = index 
      current_run_length = 1
      continue
    if running:
      # End the run.
      running = False
      if current_run_length > longest_run_length:
        longest_run_index = current_run_index 
        longest_run_length = current_run_length

  return {
    "from": month_names[longest_run_index],
    "to": month_names[(longest_run_index + longest_run_length - 1) % 12],
    "length": longest_run_length
  }
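This version returns a dictionary rather than the list of names asked for in the question. If a list is preferred, one way to expand the result is sketched below. `run_to_names` is a hypothetical helper, not part of the friend's code, and it assumes the `from` and `length` keys shown above, plus `None` for empty input.

```python
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

def run_to_names(run):
    """Expand a {"from", "to", "length"} result into an ordered list of names.

    Hypothetical helper: returns [] when run is None (the empty-input case).
    """
    if run is None:
        return []
    start = MONTH_NAMES.index(run["from"])
    # The modulo wraps the index from December (11) back to January (0).
    return [MONTH_NAMES[(start + offset) % 12] for offset in range(run["length"])]

print(run_to_names({"from": "December", "to": "February", "length": 3}))
# ['December', 'January', 'February']
```

The same modulo trick is what lets the function above walk around the year without doubling the list.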

The second is a clever one-liner:

MONTH_NAMES = [
    'January','February','March','April','May','June','July','August','September','October','November','December'
]

def consecutive_months(list_of_months_by_name):
  return max(
    (
      len(segment)-segment.index(":")-1,(MONTH_NAMES*2)[
          int(segment[:segment.index(":")])+1
          :
          int(segment[:segment.index(":")]) + len(segment) - segment.index(":")
      ]
    )
    for segment in 
    "".join([
      "x" if month_name in list_of_months_by_name else f",{index}:"
      for index,month_name in enumerate(MONTH_NAMES*2)
    ]).split(",")
    if ":" in segment
  )[1] if set(MONTH_NAMES) - set(list_of_months_by_name) else MONTH_NAMES
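The one-liner is dense, so here is a sketch of just its intermediate encoding step, with hypothetical names (`encode`, `present`) chosen for illustration. Each present month becomes `x` and each missing month becomes `,<index>:`, so splitting on commas leaves segments that read as "index of a missing month, then one `x` per present month that follows it".

```python
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

def encode(present):
    """Build the comma-separated string that the one-liner splits into segments."""
    return "".join(
        "x" if name in present else f",{index}:"
        for index, name in enumerate(MONTH_NAMES * 2)
    )

encoded = encode(["February", "December", "January"])

# Keep only segments that start at a missing month, as the one-liner does.
segments = [s for s in encoded.split(",") if ":" in s]

# The longest run is the segment with the most "x" characters after the colon.
longest = max(segments, key=lambda s: len(s) - s.index(":") - 1)
print(longest)  # 10:xxx
```

Here `10` is the index of November, the missing month just before the run, so the run starts at index 11 (December) and covers three months.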

Both algorithms return the expected results for the test data. Thanks, AV!