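Problem description
I apologize in advance, because I'm not sure how to ask this question properly. Every week I receive an earnings report with stock updates. It looks like this: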
Alexandria Real Estate Equities,Inc. (ARE),Beyond Meat,Inc. (BYND),Brown & Brown,Inc. (BRO),Canon Inc. (CAJ),Chegg,Inc. (CHGG),Cincinnati Financial Co. (CINF),Ecopetrol SA (EC),Hasbro,Inc. (HAS),HCA Healthcare,Inc. (HCA),HSBC Holdings plc (HSBC),NXP Semiconductors (NXPI),Otis Worldwide (OTIS),Packaging Co. of America (PKG),Petróleo Brasileiro S.A. - Petrobras (PBR),Principal Financial Group Inc (PFG),Principal Financial Group,Inc. (PFG),SAP SE (SAP),Twilio Inc (TWLO)
Since this message arrives by email, I would like to find a way to store it in a nicely formatted .txt file. If possible, I would like it to look like this:
[1] Alexandria real estate equities,inc. (ARE)
[2] Beyond meat,inc. (BYND)
[3] Brown & brown,inc. (BRO)
[4] Canon inc. (CAJ)
[5] Chegg,inc. (CHGG)
[6] Cincinnati financial co. (CINF)
[7] Ecopetrol sa (EC)
[8] Hasbro,inc. (HAS)
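and so on. Although I keep thinking about it and have come up with different options, I'm still stuck and don't know how to approach it. Any help would be greatly appreciated.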
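Solution
I would do it like this.
import re
# stocks_string holds the text of the email; \s* allows for an optional space after the comma
stocks = re.split(r'(?<=\)),\s*', stocks_string.replace('\n', ''))
for index, stock in enumerate(stocks):
    print(f'[{index+1}] {stock}')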
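I'm using a regular-expression split that keeps the closing parenthesis at the end of each stock name. The lookbehind (?<=\)) only checks that the comma is preceded by a ")", so the ")" itself is not consumed by the split.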
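The effect of the lookbehind can be checked on a short sample. This is a minimal sketch: the `sample` string is a shortened, made-up excerpt of the report, and `\s*` is used so that the split works with or without a space after the comma.

```python
import re

sample = "Beyond Meat,Inc. (BYND),Canon Inc. (CAJ), Chegg,Inc. (CHGG)"

# Split at a comma (plus optional whitespace) only when it directly follows ")".
with_lookbehind = re.split(r"(?<=\)),\s*", sample)
print(with_lookbehind)

# A plain split on "," also cuts inside names such as "Beyond Meat,Inc.".
plain_split = sample.split(",")
print(plain_split)
```

The first list has one entry per stock with the ")" intact, while the plain split breaks "Beyond Meat,Inc." and "Chegg,Inc." into two pieces each.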
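Looking at the text, we can see that the stocks are separated from each other by a comma. However, since the same comma character is also used in ",Inc.", we need to split the stock names on ")," instead.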
Let's split the text with Python's built-in split() method. It gives us a list of strings, like this:
text = "Alexandria Real Estate Equities,Inc. (ARE),Beyond Meat,Inc. (BYND),Brown & Brown,Inc. (BRO),Canon Inc. (CAJ),Chegg,Inc. (CHGG),Cincinnati Financial Co. (CINF),Ecopetrol SA (EC),Hasbro,Inc. (HAS),HCA Healthcare,Inc. (HCA),HSBC Holdings plc (HSBC),NXP Semiconductors (NXPI),Otis Worldwide (OTIS),Packaging Co. of America (PKG),Petróleo Brasileiro S.A. - Petrobras (PBR),Principal Financial Group Inc (PFG),Principal Financial Group,Inc. (PFG),SAP SE (SAP),Twilio Inc (TWLO)"
split_text = text.split("),")
Output:
['Alexandria Real Estate Equities,Inc. (ARE','Beyond Meat,Inc. (BYND','Brown & Brown,Inc. (BRO','Canon Inc. (CAJ','Chegg,Inc. (CHGG','Cincinnati Financial Co. (CINF','Ecopetrol SA (EC','Hasbro,Inc. (HAS','HCA Healthcare,Inc. (HCA','HSBC Holdings plc (HSBC','NXP Semiconductors (NXPI','Otis Worldwide (OTIS','Packaging Co. of America (PKG','Petróleo Brasileiro S.A. - Petrobras (PBR','Principal Financial Group Inc (PFG','Principal Financial Group,Inc. (PFG','SAP SE (SAP','Twilio Inc (TWLO)']
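Let's append the missing ) character to the end of each item. Only the last item still has its own ")", so we strip it first to avoid printing it twice.
To get the index of each stock, loop over the list with the enumerate() function.
Final code:
split_text = text.split("),")
for idx, stock_name in enumerate(split_text):
    # The last item still ends with ")", so strip it before re-adding one.
    print(f"[{idx+1}] {stock_name.rstrip(')')})")
Output: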
[1] Alexandria Real Estate Equities,Inc. (ARE)
[2] Beyond Meat,Inc. (BYND)
[3] Brown & Brown,Inc. (BRO)
[4] Canon Inc. (CAJ)
[5] Chegg,Inc. (CHGG)
[6] Cincinnati Financial Co. (CINF)
[7] Ecopetrol SA (EC)
[8] Hasbro,Inc. (HAS)
[9] HCA Healthcare,Inc. (HCA)
[10] HSBC Holdings plc (HSBC)
[11] NXP Semiconductors (NXPI)
[12] Otis Worldwide (OTIS)
[13] Packaging Co. of America (PKG)
[14] Petróleo Brasileiro S.A. - Petrobras (PBR)
[15] Principal Financial Group Inc (PFG)
[16] Principal Financial Group,Inc. (PFG)
[17] SAP SE (SAP)
[18] Twilio Inc (TWLO)
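Since the question asks for the list to end up in a .txt file, the same split can be written to disk instead of printed. The following is only a sketch under assumptions: the helper name `save_numbered_stocks` and the file name `earnings.txt` are made up for illustration, and the example writes into the system temp directory so it does not clutter the working directory.

```python
import tempfile
from pathlib import Path


def save_numbered_stocks(text, path):
    """Split on '),', restore the ')' and write one numbered stock per line."""
    # Remove line breaks, then drop the final ")" so every item lacks it equally.
    cleaned = text.replace("\n", "").strip().rstrip(")")
    names = cleaned.split("),")
    lines = [f"[{idx}] {name.strip()})" for idx, name in enumerate(names, start=1)]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


# Hypothetical output location and a shortened sample of the report.
out_path = Path(tempfile.gettempdir()) / "earnings.txt"
sample = "Canon Inc. (CAJ),Chegg,Inc. (CHGG),Petróleo Brasileiro S.A. - Petrobras (PBR)"
written = save_numbered_stocks(sample, out_path)
print(out_path.read_text(encoding="utf-8"))
```

Passing `encoding="utf-8"` explicitly keeps names such as "Petróleo Brasileiro" intact regardless of the platform's default encoding.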
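If the text file always follows the format "Full Name Here (SHORT)", then we can use a simple Python split, because we know there will always be a ")" at the end of each entry. This split returns a list of all the values you are looking for.
To produce the desired format, I would do: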
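# Split on ")" and re-append it; note that replace(",", "") also removes the commas inside the names.
res = [f"[{index+1}] {i})".replace(",", "").replace("\n", "") for index, i in enumerate(text.split(")"))]
res.pop()  # drop the bogus last entry produced by the empty string after the final ")"
output = "".join(f"{i}\n" for i in res)
print(output)
This prints the following (the commas inside the names are removed as well):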
[1] Alexandria Real Estate EquitiesInc. (ARE)
[2] Beyond MeatInc. (BYND)
[3] Brown & BrownInc. (BRO)
[4] Canon Inc. (CAJ)
[5] CheggInc. (CHGG)
[6] Cincinnati Financial Co. (CINF)
[7] Ecopetrol SA (EC)
[8] HasbroInc. (HAS)
[9] HCA HealthcareInc. (HCA)
[10] HSBC Holdings plc (HSBC)
[11] NXP Semiconductors (NXPI)
[12] Otis Worldwide (OTIS)
[13] Packaging Co. of America (PKG)
[14] Petróleo Brasileiro S.A. - Petrobras (PBR)
[15] Principal Financial Group Inc (PFG)
[16] Principal Financial GroupInc. (PFG)
[17] SAP SE (SAP)
[18] Twilio Inc (TWLO)
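If the commas inside names such as "Beyond Meat,Inc." should be kept, as in the format requested in the question, one possible variation of the split on ")" is to strip only the separator at the start of each piece. This is a sketch rather than the answer's own code; the function name `number_stocks` and the `sample` string are made up for illustration.

```python
def number_stocks(text):
    """Return numbered 'Name (TICKER)' lines, keeping commas inside names."""
    entries = []
    for chunk in text.split(")"):
        # Remove only the leading separator (comma, spaces, line breaks).
        name = chunk.lstrip(", \n")
        if name:  # skip the empty piece after the final ")"
            entries.append(f"{name})")
    return [f"[{i}] {entry}" for i, entry in enumerate(entries, start=1)]


sample = "Beyond Meat,Inc. (BYND),Canon Inc. (CAJ),\nChegg,Inc. (CHGG)"
numbered = number_stocks(sample)
print("\n".join(numbered))
```

Because empty pieces are skipped, there is no need to pop a trailing entry afterwards.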