I have a PDB file that contains a trajectory:
REMARK GENERATED BY TRJCONV
TITLE Protein in water t= 400.00000
REMARK THIS IS A SIMULATION BOX
CRYST1 99.547 99.547 99.547 90.00 90.00 90.00 P 1 1
MODEL 1
ATOM 1 N PRO A 1 46.850 67.380 57.030 1.00 0.00
ATOM 2 H1 PRO A 1 46.230 66.770 56.500 1.00 0.00
ATOM 3 H2 PRO A 1 46.420 68.290 56.940 1.00 0.00
ATOM 4 CD PRO A 1 47.060 66.780 58.360 1.00 0.00
TER
ENDMDL
REMARK GENERATED BY TRJCONV
TITLE Protein in water t= 800.00000
REMARK THIS IS A SIMULATION BOX
MODEL 10
ATOM 1 N PRO A 1 46.850 67.380 57.030 1.00 0.00
ATOM 2 H1 PRO A 1 46.230 66.770 56.500 1.00 0.00
ATOM 3 H2 PRO A 1 46.420 68.290 56.940 1.00 0.00
ATOM 4 CD PRO A 1 47.060 66.780 58.360 1.00 0.00
TER
ENDMDL
REMARK GENERATED BY TRJCONV
TITLE Protein in water t= 1200.00000
REMARK THIS IS A SIMULATION BOX
MODEL 100
ATOM 1 N PRO A 1 46.850 67.380 57.030 1.00 0.00
ATOM 2 H1 PRO A 1 46.230 66.770 56.500 1.00 0.00
ATOM 3 H2 PRO A 1 46.420 68.290 56.940 1.00 0.00
ATOM 4 CD PRO A 1 47.060 66.780 58.360 1.00 0.00
TER
ENDMDL
I want to print the following block:
MODEL 1
[all info]
TER
ENDMDL
I want this for every model, with the file format preserved. I tried this:
awk '/MODEL 1/,/ENDMDL/' test.pdb
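But my file is too large to do this by hand for each model. I want to save each model, with its coordinate information up to ENDMDL, to its own file: model1, model2, and so on.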
Best answer
$ awk '/MODEL/{f="model" $2 ".pdb"} f{print > f} /ENDMDL/ {close(f);f=""}' file
$ cat model1.pdb
MODEL 1
ATOM 1 N PRO A 1 46.850 67.380 57.030 1.00 0.00
ATOM 2 H1 PRO A 1 46.230 66.770 56.500 1.00 0.00
ATOM 3 H2 PRO A 1 46.420 68.290 56.940 1.00 0.00
ATOM 4 CD PRO A 1 47.060 66.780 58.360 1.00 0.00
TER
ENDMDL
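To check the one-liner on something small before running it on a large trajectory, the following sketch builds a two-model test file and runs the same command on it. The file name `sample.pdb` and the shortened ATOM records are assumptions made for illustration, not part of the original trajectory. It is best run in an empty scratch directory, since it writes `model*.pdb` files into the current directory.

```shell
#!/bin/sh
# Hypothetical two-frame trajectory in the same layout as the trjconv output above.
cat > sample.pdb <<'EOF'
REMARK GENERATED BY TRJCONV
TITLE Protein in water t= 400.00000
MODEL 1
ATOM 1 N PRO A 1 46.850 67.380 57.030 1.00 0.00
TER
ENDMDL
REMARK GENERATED BY TRJCONV
TITLE Protein in water t= 800.00000
MODEL 10
ATOM 1 N PRO A 1 46.850 67.380 57.030 1.00 0.00
TER
ENDMDL
EOF

# Same command as in the answer: one output file per MODEL ... ENDMDL block.
# The file name comes from the second field of the MODEL line.
awk '/MODEL/{f="model" $2 ".pdb"} f{print > f} /ENDMDL/ {close(f);f=""}' sample.pdb

# List the files that were produced.
ls model*.pdb
```

The REMARK and TITLE lines that precede each MODEL record are not copied, because `f` is still empty when they are read.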
Explanation:
/MODEL/ {                  # at a MODEL line
    f="model" $2 ".pdb"    # use f as both flag and target filename
}
f {                        # while f is set
    print > f              # output the line to the file named in f
}
/ENDMDL/ {                 # at the ENDMDL line
    close(f)               # close the file
    f=""                   # unset f
}
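One caveat with regex patterns: `/MODEL/` matches any line that contains that text (for example a REMARK that mentions MODEL), and the `/MODEL 1/` pattern from the question also matches `MODEL 10` and `MODEL 100`. A possible variant, offered here as an assumption rather than as part of the original answer, compares the first field exactly. The file names `traj.pdb`, `frame*.pdb`, and `only_model1.pdb`, and the file contents, are hypothetical.

```shell
#!/bin/sh
# Hypothetical trajectory with a REMARK line that mentions MODEL, to show the difference.
cat > traj.pdb <<'EOF'
REMARK MODEL numbering follows the frame index
MODEL 1
ATOM 1 N PRO A 1 46.850 67.380 57.030 1.00 0.00
TER
ENDMDL
MODEL 10
ATOM 1 N PRO A 1 46.230 66.770 56.500 1.00 0.00
TER
ENDMDL
EOF

# Split on exact record names: the first field must be the whole word MODEL or ENDMDL.
awk '
$1 == "MODEL"  { f = "frame" $2 ".pdb" }   # start of a block: pick the output file
f              { print > f }               # inside a block: copy the line
$1 == "ENDMDL" { close(f); f = "" }        # end of a block: close the file and reset f
' traj.pdb

# Extract a single model by number; the numeric test $2 == 1 does not match MODEL 10.
awk '$1 == "MODEL" && $2 == 1, $1 == "ENDMDL"' traj.pdb > only_model1.pdb
```

Calling `close(f)` after each block keeps the number of simultaneously open files at one, which matters for trajectories with many frames.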
Regarding "linux - extracting each file from a pdb trajectory", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42855912/