Converting SMILES to a chemical name or IUPAC name using RDKit or other Python modules

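Problem description

Is it possible to convert SMILES to a chemical name or IUPAC name using RDKit or other Python modules?

I couldn't find anything very helpful in other posts.

Thank you very much!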

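Solution

As far as I know, this is not possible with RDKit, and I am not aware of any Python module that offers this functionality. If you can use a web service, you could use the NCI resolver.

Here is a simple implementation of a function that retrieves the IUPAC name for a SMILES string: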

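import requests
from urllib.parse import quote


CACTUS = "https://cactus.nci.nih.gov/chemical/structure/{0}/{1}"


def smiles_to_iupac(smiles):
    """Look up the IUPAC name for a SMILES string via the NCI resolver."""
    rep = "iupac_name"
    # Percent-encode characters such as '#' that would otherwise break the URL
    url = CACTUS.format(quote(smiles), rep)
    response = requests.get(url)
    response.raise_for_status()
    return response.text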


print(smiles_to_iupac('c1ccccc1'))
print(smiles_to_iupac('CC(=O)OC1=CC=CC=C1C(=O)O'))

[Out]:
BENZENE
2-acetyloxybenzoic acid

You can easily extend it to convert between many different formats, although the function is not very fast...
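One way such an extension might look, sketched under a few assumptions: the representation becomes a parameter, and the SMILES string is percent-encoded before it is placed in the URL. The helper names `cactus_url` and `resolve` are made up for this example, the sketch uses the standard library's `urllib` rather than `requests`, and the representation names mentioned in the comments (`iupac_name`, `stdinchikey`, `formula`) are the ones I believe the resolver accepts, so check them against the service's documentation.

```python
from urllib.parse import quote
from urllib.request import urlopen

CACTUS = "https://cactus.nci.nih.gov/chemical/structure/{0}/{1}"


def cactus_url(identifier, representation):
    """Build the resolver URL (hypothetical helper, no network access)."""
    # quote() percent-encodes characters such as '#', '=' and '\' that
    # occur in SMILES; '/' is left untouched by default.
    return CACTUS.format(quote(identifier), representation)


def resolve(identifier, representation="iupac_name", timeout=10):
    """Fetch one representation of a structure from the resolver.

    urlopen raises urllib.error.HTTPError if the service cannot
    resolve the structure (for example with a 404 response).
    """
    url = cactus_url(identifier, representation)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


# Usage (needs network access), assuming these representation names:
#   for rep in ("iupac_name", "stdinchikey", "formula"):
#       print(rep, resolve("c1ccccc1", rep))
```

Each call is still a separate HTTP request, so this does nothing about the speed; it only avoids writing one function per output format.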

Another solution is to use PubChem. You can use its API through the Python package pubchempy. Keep in mind that this may return multiple compounds.

import pubchempy


# Use the SMILES you provided
smiles = 'O=C(NCc1ccc(C(F)(F)F)cc1)[C@@H]1Cc2[nH]cnc2CN1Cc1ccc([N+](=O)[O-])cc1'
compounds = pubchempy.get_compounds(smiles,namespace='smiles')
match = compounds[0]
print(match.iupac_name)

[Out]:
(6S)-5-[(4-nitrophenyl)methyl]-N-[[4-(trifluoromethyl)phenyl]methyl]-3,4,6,7-tetrahydroimidazo[4,5-c]pyridine-6-carboxamide
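
Since a lookup can return several compounds, or a record without a name, it may help to inspect all matches instead of taking `compounds[0]` blindly. The following is only a minimal sketch: `named_matches` is a hypothetical helper, and it assumes nothing more than that each result exposes `cid` and `iupac_name` attributes, as `pubchempy.Compound` objects do.

```python
def named_matches(compounds):
    """Return (cid, iupac_name) pairs for results that have both values.

    Hypothetical helper: works on any objects with `cid` and `iupac_name`
    attributes, such as the list returned by pubchempy.get_compounds.
    """
    pairs = []
    for compound in compounds:
        cid = getattr(compound, "cid", None)
        name = getattr(compound, "iupac_name", None)
        # Skip records without a PubChem CID or without a name
        if cid is not None and name:
            pairs.append((cid, name))
    return pairs


# Usage with pubchempy (needs the package and network access):
#   import pubchempy
#   compounds = pubchempy.get_compounds('c1ccccc1', namespace='smiles')
#   for cid, name in named_matches(compounds):
#       print(cid, name)
```

An empty result from `named_matches` then signals that PubChem had no named record for the SMILES, which is easier to handle than an `IndexError` or a printed `None`.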

I recently managed this conversion using pubchempy. Here is the code I tried.


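import pubchempy as pcp

# Read one SMILES string per line from the input file
with open("inif.txt", "r") as infile:
    for i, line in enumerate(infile, start=1):
        smiles = line.strip()  # drop the trailing newline
        if not smiles:
            continue  # skip blank lines
        compounds = pcp.get_compounds(smiles, namespace='smiles')
        match = compounds[0]
        print(i, '$$$', 'the CID is', match.cid, 'The IUPAC name is', match.iupac_name, 'for the SMILES', smiles)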
