数值原子轨道生成代码简称为SIAB(Systematically Improvable Atomic Basis)
环境部署
Method 1. 拉取Bohrium镜像
如果已经在ABACUS (ID: 717)项目中,建立新的容器节点:
img
之后选择镜像registry.dp.tech/dptech/prod-16047/apns:orbgen,选择c32_m64_cpu机器启动:
img
通过如下命令激活conda虚拟环境:
conda activate orbgen
Method 2:从零开始
为支持PyTorch的SWAT优化器优化Spillage函数,需要合理配置PyTorch,保证链接Intel-MKL库以充分提高性能。对于Bohrium用户,可以使用镜像:ubuntu:22.04-py3.10-intel2022,选择c32_m64_cpu机器。
创建conda虚拟环境
conda create -n pytorch # create virtual environment called "pytorch"
# EVERYTIME BEFORE ORBITAL GENERATION, DO THE FOLLOWING
source activate pytorch # activate virtual environment
conda虚拟环境安装PyTorch
# make sure you have activated pytorhc
conda install pytorch torchvision torchaudio cpuonly -c pytorch
pip3 install --user scipy numpy
pip3 install --user torch_optimizer
从仓库拉取ABACUS和SIAB
使用git命令拉取位于开发分支的仓库
git clone https://github.com/kirk0830/abacus_orbital_generation.git
使用pip进行安装
pip install -e .
记得也需要安装ABACUS,目前推荐安装最新版本:
git clone https://github.com/deepmodeling/abacus-develop.git
cd abacus-develop
cmake -B build
cmake --build build -j16
cmake --install build
输入文件准备
在下载的仓库目录中,共有SIAB_INPUT_old、SIAB_INPUT_new和SIAB_INPUT.json三个意义几乎相同的不同组织方式输入文件,其中SIAB_INPUT_old为旧版输入文件,SIAB_INPUT_new为新版输入文件,目前不开放使用,SIAB_INPUT.json为普适的版本输入文件。
BASIC -方法一:旧版输入文件(不推荐)
计算环境配置
#--------------------------------------------------------------------------------
#1. CMD & ENV
EXE_mpi mpirun -np 8
EXE_pw abacus
EXE_mpi:MPI并行的执行方式
EXE_pw:ABACUS的调用命令,如ABACUS所在目录并未在环境变量中,需要具体指定可执行文件位置
ABACUS pw计算参数设置
#--------------------------------------------------------------------------------
#2. Electronic calculatation
element Si # element name
Ecut 60 # cutoff energy (in Ry)
Rcut 6 7 8 9 10 # cutoff radius (in a.u.)
Pseudo_dir /root/abacus-develop/pseudopotentials/sg15_oncv_upf_2020-02-06/1.0
Pseudo_name Si_ONCV_PBE-1.0.upf
sigma 0.01 # energy range for gauss smearing (in Ry)
element:生成轨道所属元素
Ecut:平面波计算ecutwfc参数,平面波动能截断。随着ABACUS赝势轨道库(APNS)的上线,推荐使用APNS中推荐数值(对每种赝势,见https://kirk0830.github.io/ABACUS-Pseudopot-Nao-Square/pseudopotential/pseudopotential.html,单击元素即可跳转至`ecutwfc`收敛性测试结果界面)
参考结构定义
#--------------------------------------------------------------------------------
#3. Reference structure related parameters for PW calculation
#For the built-in structure types (including 'dimer', 'trimer' and 'tetramer'):
#STRU Name #STRU Type #nbands #MaxL #nspin #Bond Length list
STRU1 dimer 8 2 1 1.62 1.82 2.22 2.72 3.22
STRU2 trimer 10 2 1 1.9 2.1 2.6
接下来的部分定义用于拟合数值原子轨道的参考(平面波)波函数所属的几何构型(define reference structures whose wavefunctions are used as reference for fitting numerical atomic orbitals,以下简称为参考结构)。对于dimer或trimer,多个键长采样使得数值原子轨道具有描述非平衡几何结构的信息,对于增强轨道的**可迁移性(transferability)**具有重要意义。
第一列定义了分别名为STRU1和STRU2的两个参考结构,其结构类别分别为dimer和trimer。需要注意的是,过于特殊的几何构型可能对轨道质量具有负面影响。
对于不同的结构,可以通过指定nbands来设置平面波计算中待求得能带数量。
MaxL指定当前参考结构所生成数值原子轨道的最大角动量,例如对于dimer,如果期望以dimer为参考结构所生成的数值原子轨道包含最高角动量的轨道为d轨道,则应赋值为2
nspin指定当前参考结构中考虑的spin channel数量。对于部分原子,例如Co, Mn,目前以nspin = 2生成轨道。不同的nspin理论上对轨道的可适用范围应有影响,但该影响实际依赖于参考结构最终的自旋态(即若波函数对称性未破缺,则nspin为1或2并不应该具有差别)。
最后一列定义了参考结构的特征键长。对于dimer,其对应于两原子之间距离,对于trimer,其构型考虑为平面正三角形,特征键长对应于任意两原子间距离。
SIAB计算参数设置与轨道定义
#--------------------------------------------------------------------------------
#4. SIAB calculatation
max_steps 1000
#Orbital configure and reference target for each level
#LevelIndex #Ref STRU name #Ref Bands #InputOrb #OrbitalConf
Level1 STRU1 4 none 1s1p
Level2 STRU1 4 fix 2s2p1d
Level3 STRU2 6 fix 3s3p2d
max_steps指定了最小化Spillage函数的最大步数。
之后的三行则如同STRU1,定义了三个等级的需要生成的轨道,其中Level1以STRU1为参考结构,Level2以STRU1为参考结构,...。
Ref Bands为选取能级数量,即对于每个参考结构所得到的电子结构,可选定一定数量的态包含进数值原子轨道。
InputOrb则考虑了层级优化。若该参数指定为none,则每次优化所有的用于构造数值原子轨道参数,若指定为fix,则默认复制上一level的数值原子轨道,仅优化比上一级轨道多出来的参数,以此类推。
#--------------------------------------------------------------------------------
#5. Save Orbitals
#Index #LevelNum #OrbitalType
Save1 Level1 SZ
Save2 Level2 DZP
Save3 Level3 TZDP
最后如同STRU*和Level*,Save*创建了三个轨道保存任务,第一个任务将Level1轨道保存为SZ标记,...,以此类推。
因此输入文件整体如下:
#--------------------------------------------------------------------------------
#1. CMD & ENV
EXE_mpi mpirun -np 8
EXE_pw abacus
#--------------------------------------------------------------------------------
#2. Electronic calculatation
element Si # element name
Ecut 60 # cutoff energy (in Ry)
Rcut 6 7 8 9 10 # cutoff radius (in a.u.)
Pseudo_dir /root/abacus-develop/pseudopotentials/sg15_oncv_upf_2020-02-06/1.0
Pseudo_name Si_ONCV_PBE-1.0.upf
sigma 0.01 # energy range for gauss smearing (in Ry)
#--------------------------------------------------------------------------------
#3. Reference structure related parameters for PW calculation
#For the built-in structure types (including 'dimer', 'trimer' and 'tetramer'):
#STRU Name #STRU Type #nbands #MaxL #nspin #Bond Length list
STRU1 dimer 8 2 1 1.62 1.82 2.22 2.72 3.22
STRU2 trimer 10 2 1 1.9 2.1 2.6
#--------------------------------------------------------------------------------
#4. SIAB calculatation
max_steps 1000
#Orbital configure and reference target for each level
#LevelIndex #Ref STRU name #Ref Bands #InputOrb #OrbitalConf
Level1 STRU1 4 none 1s1p
Level2 STRU1 4 fix 2s2p1d
Level3 STRU2 6 fix 3s3p2d
#--------------------------------------------------------------------------------
#5. Save Orbitals
#Index #LevelNum #OrbitalType
Save1 Level1 SZ
Save2 Level2 DZP
Save3 Level3 TZDP
BASIC -方法二:json输入文件(推荐)
对于使用Bohrium镜像registry.dp.tech/dptech/prod-16047/apns:orbgen的用户可以参考/root/document/orbgen/目录下SIAB_INPUT.json。
实际上我们发现旧版输入文件具有如下冗余方面:
- 赝势中有元素信息,因此元素并不需要显式在输入文件中指定
- 赝势中有价电子布居信息,因此
OrbitalConf信息不需要显式指定,通过SZ、DZP和TZDP,结合赝势可以推断出OrbitalConf - 轨道的保存信息不需要额外声明
对于旧版本而言,ABACUS pw计算的设置不够灵活,尤其对于希望更换对角化方法、改变scf最大步数、更改mixing相关设置以提高收敛性等需求需要扩展。因此连同新版输入文件,我们对输入文件进行了许些改动:
计算环境配置
{
"environment": "",
"mpi_command": "mpirun -np 16",
"abacus_command": "/path/to/your/abacus",
此部分和原版相同,几乎无改动。
ABACUS pw计算参数设置
"pseudo_dir": "/path/to/dir/you/store/pseudopotential",
"pseudo_name": "Si_ONCV_PBE-1.0.upf",
"ecutwfc": 60,
"bessel_nao_rcut": [6, 7, 8, 9, 10],
"smearing_sigma": 0.01,
在这部分中,我们实际上支持了ABACUS INPUT中的所有参数。推荐ecutwfc的设置参考赝势轨道库测定值:https://kirk0830.github.io/ABACUS-Pseudopot-Nao-Square/pseudopotential/pseudopotential.html
SIAB计算参数设置
"optimizer": "pytorch.SWAT",
"max_steps": 1000,
"spill_coefs": [0.0, 1.0],
"spill_guess": "random",
"nthreads_rcut": 4,
"jY_type": "reduced"
对于现行最新版本,optimizer支持pytorch.SWAT和bfgs。由于前者的收敛限不明确因此一般设置为较大步数(~5000),后者的优化时间相对确定因而轨道生成时间较短。
spill_coefs参数用于调整"optimizer": "pytorch.SWAT"时Spillage函数中PSI与DPSI两项的权重(注意:程序实现中此项未经过归一化),默认值为[0.0, 1.0]。
spill_guess参数用于指定对于Spillage函数中球贝塞尔函数系数的初猜方法。
对于"optimizer": "pytorch.SWAT",目前支持random和identity。
对于"optimizer": "bfgs",支持random和atomic。atomic会对单原子进行一次pw计算,得到$$\langle jY|jY\rangle$$等矩阵元。注意:对于单原子的pw计算可能会以小概率无法收敛。
nthreads_rcut用于指定优化每个rcut系列轨道所使用线程数量。
对于"optimizer": "pytorch.SWAT",如果总线程数/nthreads_rcut>=2,则会以进程并行方式进行轨道生成,如果未指定/指定数量超过总线程数,则会自动切换至串行方式优化,每个轨道使用所有可用线程。
对于"optimizer": "bfgs",由于rcut间仍然是串行关系,因此nthreads_rcut直接指定scipy优化器的并行线程数。
jY_type仅对"optimizer": "bfgs"有效。在新版的轨道生成代码中,参考ONCV赝势赝波函数生成时使用的基函数,reduced(默认)将线性组合球贝塞尔函数,使得r = rcut处的一阶与二阶导数平滑纳入了Spillage函数。normalized不推荐使用。
参考结构定义
"reference_systems": [
{
"shape": "dimer",
"nbands": 8,
"nspin": 1,
"bond_lengths": [1.62, 1.82, 2.22, 2.72, 3.22]
},
{
"shape": "trimer",
"nbands": 10,
"nspin": 1,
"bond_lengths": [1.9, 2.1, 2.6]
}
],
和旧版输入文件相比,我们删除了STRU*等定义,只保留了必需信息。其中:
shape指定了提取轨道信息的参考结构,可以有如下选择:
dimer:原子二聚体
trimer:原子三聚体,正三角形
tetrahedron:正四面体
square:正方形
triangular_bipyramid:三角双锥
octahedron:正八面体
cube:立方体
,建议根据所需轨道的对称性和原子的电子组态进行选择。
nbands被指定为auto,则取值总电子数量,即占据和非占据能带数比为1(以RKS情况考虑)
bond_lengths被指定为
scan:首先进行一定范围内键长扫描,以Morse potential拟合,得到距离能量最低点最近的,和两侧与最低能量比高约1.0 - 1.5 eV(每原子)的两个点
default:对于dimer和trimer的情况,使用内置的键长数据,对于其他形状则无法使用这一参数
auto:对于dimer/trimer,使用default,对于其他形状,使用scan。
轨道定义
"orbitals": [
{
"zeta_notation": "Z",
"shape": "dimer",
"nbands_ref": 4,
"orb_ref": "none"
},
{
"zeta_notation": "DZP",
"shape": "dimer",
"nbands_ref": 4,
"orb_ref": "Z"
},
{
"zeta_notation": "TZDP",
"shape": "trimer",
"nbands_ref": 6,
"orb_ref": "DZP"
}
]
}
zeta_notation可以指定类似于SZ、DZP、TZDP、QZTP、8Z5P等参数,在最新版本中支持了如下格式:- 传统格式:
SZ:single zeta,如赝势中价电子有2个s shell和1个p shell,1个d shell,则SZ= 2s1p1d,DZP=4s2p2d1f,TZDP=6s3p3d2f,QZTP=8s4p4d3f,QZTPDP=8s4p3d3f2g - shell格式:任何以SsPpDdFf...格式指定的字符串,其中大写字母应当被替换为数字
- list格式:基于shell格式,直接以[S, P, D, F, ...] list赋值
- 传统格式:
shape则指定当前轨道的信息提取于上面“参考结构定义”中的哪个结构。如果所有轨道都未link到某种结构,该结构的lmaxmax会被指定为1。nbands_ref指定了参考能级数量,- 对于
"optimizer": "pytorch.SWAT",如果指定为auto,则对于当前版本仅包含所有占据态。 - 对于
"optimizer": "bfgs"可以指定为具体数字、"all"或者"occ+/-%d",其中%d代表任意数字
- 对于
orb_ref等同于旧版本中InputOrb参数,可以指定为前一个level的轨道。
因此输入文件整体如下:
{
"environment": "",
"mpi_command": "mpirun -np 8",
"abacus_command": "abacus",
"pseudo_dir": "/root/abacus-develop/pseudopotentials/sg15_oncv_upf_2020-02-06/1.0",
"pseudo_name": "Si_ONCV_PBE-1.0.upf",
"ecutwfc": 60,
"bessel_nao_rcut": [6, 7, 8, 9, 10],
"smearing_sigma": 0.01,
"optimizer": "pytorch.SWAT",
"max_steps": 1000,
"spill_coefs": [0.0, 1.0],
"spill_guess": "atomic",
"nthreads_rcut": 4,
"jY_type": "reduced"
"reference_systems": [
{
"shape": "dimer",
"nbands": 8,
"nspin": 1,
"bond_lengths": [1.62, 1.82, 2.22, 2.72, 3.22]
},
{
"shape": "trimer",
"nbands": 10,
"nspin": 1,
"bond_lengths": [1.9, 2.1, 2.6]
}
],
"orbitals": [
{
"zeta_notation": "Z",
"shape": "dimer",
"nbands_ref": 4,
"orb_ref": "none"
},
{
"zeta_notation": "DZP",
"shape": "dimer",
"nbands_ref": 4,
"orb_ref": "Z"
},
{
"zeta_notation": "TZDP",
"shape": "trimer",
"nbands_ref": 6,
"orb_ref": "DZP"
}
]
}
BASIC -方法三:新版输入文件(未充分支持)
因目前未开放使用,仅展示,其内容和json相符
# PROGRAM CONFIGURATION
mpi_command mpirun -np 8
abacus_command abacus
# ELECTRONIC STRUCTURE CALCULATION
pseudo_dir /root/abacus-develop/pseudopotentials/sg15_oncv_upf_2020-02-06/1.0
pesudo_name Si_ONCV_PBE-1.0.upf
ecutwfc 60
bessel_nao_rcut 6 7 8 9 10
smearing_sigma 0.01 # optional, default 0.015
# SIAB PARAMETERS
optimizer pytorch.SWAT # optimizers, can be pytorch.SWAT, SimulatedAnnealing, ...
spillage_coeff 0.5 0.5 # order of derivatives of wavefunction to include in Spillage, can be 0 or 1.
max_steps 1000
# REFERENCE SYSTEMS
# shape nbands nspin bond_lengths
dimer 8 1 1.62 1.82 2.22 2.72 3.22
trimer 10 1 1.9 2.1 2.6
# ORBITALS
# zeta_notation shape nbands_ref orb_ref
SZ dimer 4 none
DZP dimer 4 SZ
TZDP trimer 6 DZP
EXTEND -方法四:APNS(ABACUS赝势轨道库)-SIAB-ABACUS联用
对于使用Bohrium镜像registry.dp.tech/dptech/prod-16047/apns:orbgen构建的环境来讲,在熟练使用方法二的基础上,本方法自动化了大批量轨道的生成流程。准备APNS输入文件orbgen.json(可在镜像的/root/deepmodeling/ABACUS-Pseudopot-Nao-Square目录下找到示例文件:
{
"global": {
"mode": "orbgen",
"pseudo_dir": "./download/pseudopotentials",
"cache_dir": "./apns_cache",
"out_dir": "./output",
"siab_dir": "/root/deepmodeling/abacus_orbital_generation/SIAB"
},
"ppsets": [
{
"elements": ["Hf", "W", "Ta"],
"tags": ["sg15", "1.0", "sr"]
}
],
"strusets": [
[
{
"shape": "dimer",
"nbands": "auto",
"bond_lengths": "auto",
"nspin": 1
},
{
"shape": "trimer",
"nbands": "auto",
"bond_lengths": "auto",
"nspin": 1
}
]
],
"orbsets": [
[{"conf": "Z", "shape": "dimer", "dep": "none", "states": "occ"},
{"conf": "DZP", "shape": "dimer", "dep": "Z", "states": "all"},
{"conf": "TZDP", "shape": "trimer", "dep": "DZP", "states": "all"}]
],
"pwsets": [
{"smearing_sigma": 0.01}
],
"siabsets": [
{
"rcuts": [6, 7, 8, 9, 10],
"optimizer": "bfgs",
"max_steps": 5000,
"spill_coefs": [0.0, 1.0],
"spill_guess": "atomic",
"nthreads_rcut": 4,
"jY_type": "reduced"
}
],
"tasks": [
{"orb": 0, "pp": 0, "stru": 0, "pw": 0, "siab": 0}
]
}
程序将针对tasks key,依次执行每个value代表的任务(orb: 0映射到orbsets索引为0的设置,pp: 0映射到ppsets索引为0的设置,以此类推,stru->strusets, pw->pwsets, siab->siabsets)。注意:ppsets的tags给出了用于检索本地可用赝势文件的标签,如"sg15", "1.0", "sr",对于元素Hf,将获得"Hf_ONCV_PBE-1.0.upf"的赝势文件用于生成轨道。但如果只在tags中指定"sr",则会得到所有带有"sr"标签的赝势文件,相应地会生成所有赝势对应的轨道。
具体生成轨道使用的ecutwfc值为APNS内置数据库自动设置。运行命令:
python3 /root/deepmodeling/ABACUS-Pseudopot-Nao-Sqaure/main.py -i orbgen.json
可以在output目录下发现生成了轨道生成的工作文件夹,以及一个自动化串行脚本autorun.py。
使用命令:
nohup python3 autorun.py > log&
将开始批量生成轨道。
EXTEND -方法五:abacustest-SIAB-ABACUS联用
由@赵天琦 探索的使用方法:轨道测试工作流
程序启动与输出内容举例(以Si SG15-V1.0为例)
以如下命令启动轨道生成程序
python3 SIAB/SIAB_nouvelle.py -i SIAB_INPUT.json
首先将任务间串行地进行MPI并行ABACUS pw计算,工作目录的命名格式为[element]-[shape]-[bond_length],之后在作业目录生成一系列输出文件。
img
img
之后进行轨道的优化。
轨道优化(BFGS)
屏幕输出如下信息(可通过设置stdout重定向到文件来存储,且避免太多信息干扰其他工作):
...
ORBGEN: Optimizing orbitals for rcut = 6 au
ORBGEN: jy_jy, mo_jy and mo_mo matrices loaded from Si-dimer-1.82/orb_matrix_rcut6deriv0.dat and Si-dimer-1.82/orb_matrix_rcut6deriv1.dat
ORBGEN: jy_jy, mo_jy and mo_mo matrices loaded from Si-dimer-3.22/orb_matrix_rcut6deriv0.dat and Si-dimer-3.22/orb_matrix_rcut6deriv1.dat
ORBGEN: jy_jy, mo_jy and mo_mo matrices loaded from Si-dimer-1.62/orb_matrix_rcut6deriv0.dat and Si-dimer-1.62/orb_matrix_rcut6deriv1.dat
ORBGEN: jy_jy, mo_jy and mo_mo matrices loaded from Si-dimer-2.22/orb_matrix_rcut6deriv0.dat and Si-dimer-2.22/orb_matrix_rcut6deriv1.dat
ORBGEN: jy_jy, mo_jy and mo_mo matrices loaded from Si-dimer-4.22/orb_matrix_rcut6deriv0.dat and Si-dimer-4.22/orb_matrix_rcut6deriv1.dat
ORBGEN: Y*Y (jy_mo*mo_jy) eigval diagnosis:
l = 0: [9.97641561e-01 3.20387961e-09 3.27052074e-11]
ORBGEN: Y*Y (jy_mo*mo_jy) eigval diagnosis:
l = 1: [2.91184263e+00 1.98372807e-10 3.04279899e-12]
ORBGEN: Y*Y (jy_mo*mo_jy) eigval diagnosis:
l = 2: [8.30238311e-11 3.34364213e-13]
ORBGEN: optimization on level 1 (with # of zeta functions for each l: [1, 1]),
based on orbital (None)
ORBGEN: End optimization on level 1 orbital, merge with previous orbital shell(s).
ORBGEN: optimization on level 2 (with # of zeta functions for each l: [2, 2, 1]),
based on orbital ([1, 1])
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 26 M = 20
At X0 0 variables are exactly at the bounds
At iterate 0 f= 7.55744D-02 |proj g|= 6.61322D-01
At iterate 1 f= 5.01914D-02 |proj g|= 4.20864D-01
At iterate 2 f= 4.23787D-02 |proj g|= 1.39568D-01
At iterate 3 f= 3.72751D-02 |proj g|= 9.85885D-02
At iterate 4 f= 3.35225D-02 |proj g|= 1.47411D-01
...
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
26 34 39 35 0 0 5.393D-07 2.590D-02
F = 2.5899660929261149E-002
...
At iterate 76 f= 1.01617D-02 |proj g|= 8.31381D-07
* * *
Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value
* * *
N Tit Tnf Tnint Skip Nact Projg F
ORBGEN: End optimization on level 2 orbital, merge with previous orbital shell(s).
ORBGEN: optimization on level 3 (with # of zeta functions for each l: [3, 3, 2]),
based on orbital ([2, 2, 1])
38 76 81 77 0 0 8.314D-07 1.016D-02
F = 1.0161683391606767E-002
CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 38 M = 20
At X0 0 variables are exactly at the bounds
...
At iterate 60 f= 1.26812D-02 |proj g|= 1.18432D-06
ORBGEN: End optimization on level 3 orbital, merge with previous orbital shell(s).
orbital saved as Si_gga_6au_60Ry_1s1p.orb
orbital saved as Si_gga_6au_60Ry_2s2p1d.orb
orbital saved as Si_gga_6au_60Ry_3s3p2d.orb
...
ORBGEN: Optimizing orbitals for rcut = 10 au
ORBGEN: jy_jy, mo_jy and mo_mo matrices loaded from Si-dimer-1.82/orb_matrix_rcut10deriv0.dat and Si-dimer-1.82/orb_matrix_rcut10deriv1.dat
ORBGEN: jy_jy, mo_jy and mo_mo matrices loaded from Si-dimer-3.22/orb_matrix_rcut10deriv0.dat and Si-dimer-3.22/orb_matrix_rcut10deriv1.dat
ORBGEN: jy_jy, mo_jy and mo_mo matrices loaded from Si-dimer-1.62/orb_matrix_rcut10deriv0.dat and Si-dimer-1.62/orb_matrix_rcut10deriv1.dat
ORBGEN: jy_jy, mo_jy and mo_mo matrices loaded from Si-dimer-2.22/orb_matrix_rcut10deriv0.dat and Si-dimer-2.22/orb_matrix_rcut10deriv1.dat
ORBGEN: jy_jy, mo_jy and mo_mo matrices loaded from Si-dimer-4.22/orb_matrix_rcut10deriv0.dat and Si-dimer-4.22/orb_matrix_rcut10deriv1.dat
ORBGEN: Y*Y (jy_mo*mo_jy) eigval diagnosis:
l = 0: [9.99996849e-01 1.40425913e-08 4.00189674e-11]
ORBGEN: Y*Y (jy_mo*mo_jy) eigval diagnosis:
l = 1: [2.99913781e+00 4.76063982e-10 2.60327394e-11]
ORBGEN: Y*Y (jy_mo*mo_jy) eigval diagnosis:
l = 2: [1.40387006e-10 1.41805460e-12]
ORBGEN: optimization on level 1 (with # of zeta functions for each l: [1, 1]),
based on orbital (None)
...
ORBGEN: End optimization on level 3 orbital, merge with previous orbital shell(s).
orbital saved as Si_gga_10au_60Ry_1s1p.orb
orbital saved as Si_gga_10au_60Ry_2s2p1d.orb
orbital saved as Si_gga_10au_60Ry_3s3p2d.orb
====================================================================================
If SIAB package is used in your project, please cite the following paper:
Chen M, Guo G C, He L.
Systematically improvable optimized atomic basis sets for ab initio calculations[J].
Journal of Physics: Condensed Matter, 2010, 22(44): 445501.
Li P, Liu X, Chen M, et al.
Large-scale ab initio simulations based on systematically improvable atomic basis[J].
Computational Materials Science, 2016, 112: 503-517.
Lin P, Ren X, He L.
Strategy for constructing compact numerical atomic orbital basis sets by
incorporating the gradients of reference wavefunctions[J].
Physical Review B, 2021, 103(23): 235131.
If wannierization is used in your project, please cite the following paper:
Chen M, Guo G C, He L.
Electronic structure interpolation via atomic orbitals[J].
Journal of Physics: Condensed Matter, 2011, 23(32): 325501.
====================================================================================
TIME STATISTICS
---------------
initialize 0.00 s
run 13.93 s
finalize 0.00 s
total 13.93 s
轨道优化(串行,Pytorch.SWAT)
屏幕输出如下信息:
--------------------------------------------------
Module Spillage - find the most similar space to the target spanned planewave wavefunction:
SIAB.pytorch_swat starts, numerical atomic orbitals are optimized.
--------------------------------------------------
SEED INITIALIZATION: due to optimization method is local, random seed is somehow preferred. Present seed: 3333759634
WORKFLOW: use on-the-fly information pass from front-end to back-end.
Read file: Si-dimer-1.62/orb_matrix_rcut6deriv0.dat
atom symbol: Si
number of l for present structure: 3
number of l for present coefficients: 3
# ... OMIT SIMILAR INFORMATION
--------------------------------------------------------------------------------
INFORMATION CHECK - Please check every detail of the information below:
--------------------------------------------------------------------------------
PRINT INFO_KST INFORMATION
--------------------------
General Information:
All atom types: Si
Orbital configuration for each atom type:
Symbol, l: 0, 1, 2, 3, ...
Si: 1, 1, 0
Realspace cutoff radius (rcut), grid (dr), kinetic cutoff (ecutwfc) and maximal angularmomentum (lmax) for each atom type:
Atom Rcut dr ecutwfc lmax
Si 6.00 0.01 60.00 3
Optimizer Learning Rate: 0.03
Including additional kinetic term in Spillage: False
Gaussian smoothing for orbitals at rcut: True
Max steps for optimization: 1000
lmax for each atom type:
Si: 3
Structure specific information:
Number of reference structure: 5
Atom type for each reference structure:
Structure 0: Si
# ... OMIT SIMILAR INFORMATION
Number of atoms for each atom type for each reference structure:
Structure 0: Si: 2
# ... OMIT SIMILAR INFORMATION
Number of bands selected to learn for each reference structure:
Struectures: 0: 8 1: 8 2: 8 3: 8 4: 8
Spherical Bessel function:
Number of Spherical Bessel functions (Sphbes) for each atom type:
Si: 14
PRINT INFO_KST INFORMATION END.
PRINT INFO_STRU INFORMATION
--------------------------
Structure 0:
Number of atoms for each type:
Si: 2
Number of bands calculated for present structure: 8
Number of bands taken INFO consideration for learning: 4
Detailed weight information for each band:
Band 0: 5.0000e-02
Band 1: 5.0000e-02
Band 2: 5.0000e-02
Band 3: 5.0000e-02
Band 4: 0.0000e+00
Band 5: 0.0000e+00
Band 6: 0.0000e+00
Band 7: 0.0000e+00
# ... OMIT SIMILAR INFORMATION
PRINT INFO_STRU INFORMATION END.
PRINT INFO_ELEMENT INFORMATION
--------------------------
Element-wise information:
Element Si:
nsphbes: 14
Number of subshells: 3
Orbital configuration: 1s, 1p, 0d
rcut: 6
dr: 0.01
atomic index: 0
PRINT INFO_ELEMENT INFORMATION END.
PRINT INFO_OPT INFORMATION
--------------------------
Optimizer information:
Calculate kinetic term: False
Calculate smooth term: True
Optimizer learning rate: 0.03
Max steps: 1000
PRINT INFO_OPT INFORMATION END.
PRINT INFO_MAX INFORMATION
--------------------------
The data dimension information for each reference structure:
Structure 0:
Number of atom types: 1
Number of atoms: 2
Number of bands: 8
Number of Sphbes: 14
Number of subshells: 3
Maximal number of magnetic channels: 5
# ... OMIT SIMILAR INFORMATION
PRINT INFO_MAX INFORMATION END.
--------------------------------------------------------------------------------
DATA IMPORT - read_QSV
Reading OVERLAP_Q, OVERLAP_Sq and OVERLAP_V from ABACUS.
For PTG_dpsi formulation that kinetic term is included,
will read both orb_matrix*.dat of both order 0 and 1.
# ... OMIT SIMILAR INFORMATION
Optimization of the orbital starts.
torch_optimizer.SWATS (Improving Generalization Performance by Switching from Adam to SGD) optimizer is used.
Parameters are listed below
Learning rate: 0.03
Epsilon: 1e-20
Max steps: 1000
Optimization on Spillage function starts, check "Spillage.dat" for detailed trajectory.
------------------------------------------------------------
Step Spillage deltaSpill Time
------------------------------------------------------------
0 7.8619708181e+00 7.8619708181e+00 0.0060
100 5.2165701084e-02 -2.1325277866e-06 0.0047
200 5.2135961515e-02 -1.4288620981e-10 0.0052
300 5.2135960518e-02 -4.8155923693e-15 0.0080
400 5.2135960518e-02 1.3877787808e-16 0.0066
500 5.2135960518e-02 1.5265566589e-16 0.0064
600 5.2135960518e-02 9.7144514655e-17 0.0066
700 5.2135960518e-02 0.0000000000e+00 0.0050
800 5.2135960518e-02 0.0000000000e+00 0.0047
900 5.2135960518e-02 1.3877787808e-17 0.0099
...
---------------------------------
Optimization of the orbital ends.
Several files generated:
Spillage.dat: detailed trajectory of the optimization
ORBITAL_RESULTS.txt: optimized orbital coefficients
ORBITAL_*U.dat: numerical atomic orbital before renaming
ORBITAL_PLOTU.dat: for plot, the first column is the r, latter colomns are the orbitals
TOTAL TIME (PyTorch): 22.117316961288452
CHECKPOINT: handling on temporary files:
Spillage.dat : 0a9572548679359e972276e5cd4208cf.dat
ORBITAL_RESULTS.txt : 12f817955db736bea04d690d202342fe.txt
ORBITAL_PLOTU.dat : 2bbf4ae1ca9e333799f318eac0c6f676.dat
ORBITAL.dat : 198efe57ecb73421a525bfc7297cfee3.dat
CHECKPOINT: folder Si_1s1p/6au_60Ry created.
CHECKPOINT: folder 338ea4fc-dac4-39ac-a958-a25e58a043b5 created.
Orbital file Si_1s1p/6au_60Ry/Si_gga_60Ry_6au_1s1p.orb generated.
Report: quality of the orbital Si_1s1p/6au_60Ry/Si_gga_60Ry_6au_1s1p.orb is:
l = 0: 5.70212019e-01
l = 1: 9.23294999e-01
l = 2:
# ... OMIT SIMILAR INFORMATION
====================================================================================
If SIAB package is used in your project, please cite the following paper:
Chen M, Guo G C, He L.
Systematically improvable optimized atomic basis sets for ab initio calculations[J].
Journal of Physics: Condensed Matter, 2010, 22(44): 445501.
Li P, Liu X, Chen M, et al.
Large-scale ab initio simulations based on systematically improvable atomic basis[J].
Computational Materials Science, 2016, 112: 503-517.
Lin P, Ren X, He L.
Strategy for constructing compact numerical atomic orbital basis sets by
incorporating the gradients of reference wavefunctions[J].
Physical Review B, 2021, 103(23): 235131.
If wannierization is used in your project, please cite the following paper:
Chen M, Guo G C, He L.
Electronic structure interpolation via atomic orbitals[J].
Journal of Physics: Condensed Matter, 2011, 23(32): 325501.
====================================================================================
TIME STATISTICS
---------------
initialize 0.00 s
run 458.60 s
finalize 0.00 s
total 458.60 s
轨道优化(并行,Pytorch.SWAT)
和串行所不同地,并行时为了避免不同进程在屏幕上输出内容混合在一起,因此各进程输出到文件中,以log.[iproc].txt和err.[iproc].txt命名方式分别存储stdout和stderr内容。以nthreads_rcut: 4设置运行,在主进程上将屏幕输出如下内容:
Parallelization - RUNTIME
Number of threads for each rcut: 4
Number of rcuts that can be parallelized: 3
Total number of threads available: 12
----------------------------------
NOTE: for parallelized run, the stdout and stderr will be redirected to log.[iproc].txt and err.[iproc].txt respectively.
Finish level 0 orbital generation (in total 3).
Finish level 1 orbital generation (in total 3).
Finish level 2 orbital generation (in total 3).
All processes finish, see stdout and stderr in log.[iproc].txt and err.[iproc].txt respectively.
# REFERENCE INFORMATION OMITTED
TIME STATISTICS
---------------
initialize 0.00 s
run 185.57 s
finalize 0.00 s
total 185.57 s
生成如下文件在工作目录:
img
并行加速效率曲线
轨道生成任务的进程级并行和PyTorch内部的线程并行紧密相关。如果设置nthreads_rcut过小,则会同时以低效率并行大量rcut对应轨道系列,如果nthreads_rcut设置过大,则只会串行生成轨道。对于PyTorch的线程并行,或在并行效率上具有“并行收益明显、加速比增益平台期早”的特点,因此最理想的情况是在PyTorch接近线程加速平台时使用进程并行。以个人电脑(总线程数12)进行测试:
| Param: nthreads_rcut | nrcuts_toparallelize | TimeAVG (s) | Time1 (s) | Time2 (s) | Time3 (s) |
|---|---|---|---|---|---|
| 1 | 5 | 106.59 | 108.34 | 105.37 | 106.07 |
| 2 | 5 | 110.82 | 112.35 | 109.68 | 110.42 |
| 3 | 4 | 193.61 | 194.04 | 194.13 | 192.68 |
| 4 | 3 | 186.87 | 185.57 | 188.75 | 186.30 |
| 6 | 2 | 270.95 | 267.67 | 270.21 | 274.97 |
| 12 | 1 | 458.60 | 458.60 |
img
轨道质量诊断(简易)
请参阅@彭星亮 开发abacustest工作流:测试工作流使用:reuse已有测试
断点续算(Checkpoint & RESTART)
对于大型串行任务,必须保证有尽可能多的存档点,以方便任务能在意外中断时能从最近位置重启,继续之前中断的任务,而非每次必须重新开始。目前断点续算的检查节点为:
- 每次ABACUS pw计算结束后
img
- 对于
optimizer pytorch.SWAT:每次轨道优化产生输入文件后
img