如何编写可以读取 .xyz 文件并计算原子之间距离的代码?

问题描述

我想制作一个可以加载 xyz 文件的 python 脚本。从xyz参数中,我需要找到原子之间的距离,原子之间的角度和二面角。我在这种矩阵形式中有一个xyz坐标:

 60  Buckminsterfullerene  (C60 Bucky Ball)
 1  C      0.56182991    1.03720708   -3.34153745     2     2     6    16
 2  C     -0.02867841   -0.20922332   -3.53733326     2     1     3    15
 3  C      0.68868170   -1.41687735   -3.17419275     2     2     4     7
 4  C      1.96798784   -1.33001731   -2.62971513     2     3     5     8
 5  C      2.58298268   -0.03190141   -2.42580025     2     4     6    12
 6  C      1.89418484    1.12766893   -2.77448204     2     1     5    11
 7  C     -0.26911535   -2.36054907   -2.62920302     2     3     9    21
 8  C      2.34254586   -2.18322716   -1.51766993     2     4    10    19
 9  C      0.09052931   -3.17978756   -1.56143494     2     7    10    43
10  C      1.42288437   -3.08932569   -0.99437947     2     8     9    36
11  C      1.93147043    2.28439304   -1.89955106     2     6    14    20
12  C      3.33762840   -0.08283139   -1.18772874     2     5    13    19
13  C      3.37342937    1.02783673   -0.34763393     2    12    14    32
14  C      2.65606922    2.23549084   -0.71077448     2    11    13    26
15  C     -1.42982841   -0.40652370   -3.21677674     2     2    17    21
16  C     -0.22432522    2.13802273   -2.81706602     2     1    18    20
17  C     -2.18468203    0.65046195   -2.71318765     2    15    18    27
18  C     -1.56968716    1.94857800   -2.50927274     2    16    17    22
19  C      3.18903027   -1.41242386   -0.62647337     2     8    12    34
20  C      0.62215914    2.90882602   -1.92586946     2    11    16    23
21  C     -1.57842655   -1.73611615   -2.65552140     2     7    15    29
22  C     -2.12435266    2.52208091   -1.29751971     2    18    24    28
23  C      0.08957811    3.45949446   -0.76236333     2    20    24    25
24  C     -1.31157175    3.26219407   -0.44180686     2    22    23    37
25  C      0.84422386    3.40856444    0.47570821     2    23    26    42
26  C      2.10140370    2.80899380    0.50097868     2    14    25    31
27  C     -3.11943529    0.42168510   -1.62746090     2    17    28    30
28  C     -3.08214969    1.57840920   -0.75252997     2    22    27    39
29  C     -2.47596183   -1.95578408   -1.61302381     2    21    30    44
30  C     -3.26211692   -0.85496848   -1.08855240     2    27    29    40
31  C      2.47596171    1.95578403    1.61302382     2    26    32    46
32  C      3.26211681    0.85496841    1.08855240     2    13    31    33
33  C      3.11943527   -0.42168513    1.62746090     2    32    34    57
34  C      3.08214970   -1.57840928    0.75252996     2    19    33    35
35  C      2.12435273   -2.52208091    1.29751964     2    34    36    56
36  C      1.31157182   -3.26219401    0.44180679     2    10    35    47
37  C     -1.42288432    3.08932572    0.99437953     2    24    38    42
38  C     -2.34254574    2.18322716    1.51766996     2    37    39    52
39  C     -3.18903022    1.41242378    0.62647331     2    28    38    41
40  C     -3.37342947   -1.02783679    0.34763388     2    30    41    48
41  C     -3.33762837    0.08283136    1.18772868     2    39    40    49
42  C     -0.09052935    3.17978761    1.56143497     2    25    37    45
43  C     -0.84422391   -3.40856443   -0.47570819     2     9    44    47
44  C     -2.10140382   -2.80899386   -0.50097868     2    29    43    48
45  C      0.26911532    2.36054910    2.62920311     2    42    46    54
46  C      1.57842653    1.73611617    2.65552148     2    31    45    60
47  C     -0.08957818   -3.45949441    0.76236329     2    36    43    53
48  C     -2.65606930   -2.23549092    0.71077442     2    40    44    50
49  C     -2.58298263    0.03190137    2.42580019     2    41    51    52
50  C     -1.93147042   -2.28439306    1.89955101     2    48    51    53
51  C     -1.89418482   -1.12766893    2.77448197     2    49    50    58
52  C     -1.96798776    1.33001737    2.62971509     2    38    49    54
53  C     -0.62215921   -2.90882598    1.92586939     2    47    50    55
54  C     -0.68868164    1.41687744    3.17419279     2    45    52    59
55  C      0.22432525   -2.13802261    2.81706605     2    53    56    58
56  C      1.56968719   -1.94857793    2.50927279     2    35    55    57
57  C      2.18468202   -0.65046198    2.71318771     2    33    56    60
58  C     -0.56182981   -1.03720706    3.34153744     2    51    55    59
59  C      0.02867849    0.20922332    3.53733334     2    54    58    60
60  C      1.42982839    0.40652370    3.21677684     2    46    57    59

我曾尝试使用 pyGSM,但它给出了一个错误 'nonetype' object has no attribute 'group',我不知道如何解决这个问题,因为我从未在 Python 中使用过组。

from pygsm.utilities.manage_xyz import read_xyz,xyz_to_np
from pygsm.coordinate_systems.slots import distance,Angle,Dihedral


bond = distance(0,1)
angle = Angle(0,1,2)
dihedral=Dihedral(0,2,3)

geom = read_xyz('bucky.xyz')
xyz = xyz_to_np(geom)

bond_value = bond.value(xyz)
angle_value = angle.value(xyz)
dihedral_value = dihedral.value(xyz)

print(bond_value,angle_value,dihedral_value)

    import numpy as np
try:
    from . import units 
except:
    import units

import re
#import openbabel as ob

# => XYZ File Utility <= #

def read_xyz(
    filename,scale=1.):

    lines = open(filename).readlines()
    lines = lines[2:]
    geom = []
    for line in lines:
        mobj = re.match(r'^\s*(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$',line)
        geom.append((
            scale*float(mobj.group(1),scale*float(mobj.group(2)),scale*float(mobj.group(3)),scale*float(mobj.group(4)),)))
    return geom

谁能帮我做这件事?

解决方法

我对python的了解并不多,但我认为你可以这样做:

filename = "filename.xyz"
xyz = open(filename,"r")
line = xyz.readline()
atom,x,y,z = line.split()
# Calculate what you need here

xyz.close()

或者使用 for 循环:

xyz = open(filename,"r")
for line in xyz:
    print(line)
    # Calculate what you need here
xyz.close()
,

按照@MateusJunges 的建议手动解析文件可能是个好主意,因为这些 xyz 文件有 no generally accepted format。然而,一个常见的问题可能是许多程序都期望这样的事情:

60
Buckminsterfullerene  (C60 Bucky Ball)
 C      0.56182991    1.03720708   -3.34153745     2     2     6    16
 C     -0.02867841   -0.20922332   -3.53733326     2     1     3    15
 C      0.68868170   -1.41687735   -3.17419275     2     2     4     7
 ...

所以第一行中的原子数,然后是注释行,然后是原子条目(第一列中没有索引)。

我不熟悉 PyGSM,但有很多软件包可以读取 xyz 文件。以MDAnalysis为例。

import MDAnalysis

u = MDAnalysis.Universe("structure.xyz")
atom_positions = u.coord.positions

这将为您提供 atom_positions 作为形状 (n_atoms,3) 的 NumPy 数组,您可以将其用于计算。