Tuesday, July 30, 2019

After switching the head node over to an X11SDV-4C-TLN2F board (my original X10SDV-4C did not have 10G ports, and I wanted the PCI-E slot for a graphics card), I figured putting together a couple of compute nodes using X11SDV-12C-TLN2F models would be a good way to get a larger number of cores (all of the previous compute nodes have 8) and to test out AVX-512.
The one hitch to using the new X11SDV offerings is the lack of an M.2 slot, with a U.2 port + OCuLink in its place. Said U.2 port did not work with M.2 adapters, so I got a couple of Intel Optane 905P drives with U.2 connectors. The first board worked flawlessly, and I used it to set up both Optane drives. The second X11SDV would never see the Optane drive no matter what BIOS options I tried (EFI only and dual boot were both attempted as first measures), though the USB thumb drive with Ubuntu 18.04.2 server worked just fine. Even more maddening, the Ubuntu installer saw the U.2 drive just fine; only the BIOS was myopic when it came to detecting this boot drive.
After much fussing (and swearing), I finally remembered that the most recent version of the BIOS was 1.1a, but the second board was only blessed with 1.0b. Why these two boards, from the same vendor and bought at the same time, had different BIOS versions is beyond me. Long story short, updating the BIOS on the second board to 1.1a solved the boot issue.
Thursday, September 14, 2017
Cooling X10SDV (Xeon D-1540/1541)
This all began with the purchase of an X10SDV-TLN4F-O for Gaussian09 and MOLPRO calculations (using tmpfs for scratch, so IO should not be limiting). When running calculations on all 8 cores I noticed that the cores were rarely above 2100 MHz and the temperatures were in the 60–75 °C range. For comparison, my two X99 systems with water-cooled Core i7 5960X processors rarely get above 47 °C and always run at the maximum turbo speed of 3500 MHz.
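As an aside, the tmpfs scratch space is nothing exotic. A minimal sketch is below; the mount point and size are my assumptions here, so adjust both to taste (and to your installed RAM):
$ sudo mkdir -p /scratch
$ sudo mount -t tmpfs -o size=48G tmpfs /scratch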
Time for a couple of related notes. I changed the governor to performance using cpufreq and intel_pstate. Disabling intel_pstate was not helpful: the BIOS does not deal properly with turbo (so far as I can tell), since the maximum speed was 2100 MHz when I tried that. This led me to believe thermal throttling was causing the issues.
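For reference, here is a minimal sketch of the governor change through sysfs (this assumes the intel_pstate driver is loaded; cpupower frequency-set -g performance should do the same job):
$ for c in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor ; do echo performance | sudo tee $c ; done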
What to do about the temperatures? Would changing the heatsink compound be enough? Should I just change the fan? Do I need a complete HSF overhaul? Finally, how would I test and assess the performance of different solutions? The short answers were that I wanted to try everything, and that HPCC is a decent stand-in for Gaussian09 and MOLPRO because its DGEMM and Linpack sections pretty much represent the computations in those programs (and previous testing with AMD systems showed it produced similar temperatures). HPCC with N = 24000 provided 25 minutes of run time (most of that being in the Linpack section). Ambient temperatures fluctuated by 1–2 °C, and were measured using a calibrated thermistor accurate to better than 0.1 °C. The CPU temperatures were measured using the built-in temperature sensor, and all reported temperatures are differentials (CPU minus ambient). A bash script recorded the CPU temperatures every minute.
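The logging script was nothing fancy; a sketch along those lines is below (it assumes lm-sensors is installed and that the per-core readings carry labels like "Core 0", which varies by board):
$ while true ; do echo "$(date +%s) $(sensors | grep Core)" >>cpu_temps.log ; sleep 60 ; done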
As for the heatsink, Alpha Novatech produced some custom 70 mm x 70 mm x 40 mm aluminum heatsinks having the same footprint as the stock heatsink (so I could reuse the screws and springs). Next up came the fan ducts, which I designed using Onshape. The two basic geometries involve either a single fan blowing down or a push-pull arrangement (at 50° from horizontal). These were printed in HIPS on a Lulzbot Mini.
Enough details, now on to the results.
HSF | Max ΔT (°C) | Avg ΔT (°C) |
--- | --- | --- |
Stock | 42 | 38 |
Stock-AS5 | 39 | 37 |
60 mm Everflow | 23 | 20 |
60 mm Fractal | 32 | 31 |
60 mm Panaflo | 28 | 25 |
60 mm San Ace | 30 | 28 |
60 mm YS Tech | 28 | 25 |
60 mm Everflow | 24 | 22 |
60 mm Fractal | 32 | 31 |
60 mm San Ace | 56 | 53 |
60 mm YS Tech | 55 | 53 |
80 mm Akasa | 29 | 27 |
80 mm Noctua | 32 | 30 |
80 mm Sunon | 23 | 21 |
80 mm Vantec | 32 | 30 |
80 mm Sunon | 24 | 22 |
92 mm Everflow | 28 | 26 |
92 mm Gelid | 30 | 27 |
92 mm Noctua | 31 | 29 |
PP 60 mm Everflow/San Ace | 21 | 19 |
PP 60 mm Fractal/Noctua | 29 | 26 |
PP 60 mm Noctua/Fractal | 37 | 35 |
PP 60 mm Noctua/San Ace | 30 | 27 |
PP 60 mm San Ace/Everflow | 23 | 21 |
PP 60 mm San Ace/Fractal | 27 | 26 |
PP 60 mm San Ace/Noctua | 27 | 25 |
PP 70 mm Everflow15/Everflow15 | 23 | 20 |
PP 70 mm Everflow15/Everflow25 | 23 | 21 |
PP 70 mm Everflow25/Everflow15 | 23 | 21 |
PP 70 mm Everflow25/Everflow25 | 23 | 21 |
Ok, so that tells us a couple of things. The heatsink compound does reduce the temperature by a few degrees. The custom heatsink lowers the temperatures by approximately 10 °C, and the fan choice provides another 10 °C drop.
High static pressure fans reduce the temperature more than high flow rate fans (no real surprise here). Hence, the 60 mm San Ace and YS Tech, 80 mm Sunon, and 92 mm Everflow are the winners for single fan setups. The push-pull setup definitely lowers the temperatures drastically, especially the maxima. The Everflow 25 mm fans had the lowest noise output.
The losers were the Noctua and Gelid fans, though these were also the lowest noise models.
During all of the tests the processor frequency (as monitored by cat /proc/cpuinfo | grep MHz) remained near 2600 MHz. Returning to the Gaussian09 and MOLPRO calculations still showed throttling, though not as bad as before (on average the frequencies remained around 2400 MHz). Hmmmmm...perhaps it is not thermal throttling after all, but at least I feel much better about running these X10SDV units full throttle.
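If you want to watch the frequencies scroll by during a run instead of repeatedly cat-ing the file, something like this works (watch is a standard utility; the 1 second interval is just a convenient choice):
$ watch -n1 "grep MHz /proc/cpuinfo"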
Thursday, August 6, 2015
Getting useful XYZ geometries from G09 and transforming them (to angstroms)
I finally got to the point where optimizing geometries of many, many molecules led me to needing more automation to create input files for the subsequent single-point calculations. Methinks this is as good a time as any to learn python (and by learn I mean cobble together something functional, but ugly). Ok, with that truth out of the way, on to the goal of this exercise. I wanted to optimize geometries in G09, export the geometry (some of the geometry optimizations were in cartesian space and some used an input Z-matrix, so XYZ coordinates are the common output), and use the exported geometries to calculate single-point energies using MOLPRO.
1) First hurdle: getting useful XYZ coordinates from G09 (also known as: why is there no easy way to grep a final set of cartesian coordinates from the log file?). There are lots of punch options and IOPs, but none that dump a clean geometry to the log file. This means using "punch=coord" and dealing with the fort.7 file that is generated. Here is the line from my bash script that submits all of the geometry optimizations.
$ for i in $( ls *pbe1pbe-vtz*.com ) ; do echo ${i}; NAME=$(basename $i ".com" ) ; /cluster/software/g09/g09 <./${NAME}.com >./${NAME}.log ; mv fort.7 ${NAME}.xyz ; done
2) Ok, that really was not so terrible, and I will stop complaining now. Second hurdle: transforming the fort.7 file into something useful, because it writes the XYZ coordinates using the atomic number (instead of the associated element symbol) and the units are bohr. The basic parts of the python script that follows are reading and parsing the input file (this includes substituting "C" for "6", etc.), finding the center of mass, shifting all of the atoms so the center of mass is at the origin, converting the distances to angstroms, and writing a new XYZ file. HUGE NOTE: Any improvements would be greatly appreciated.
#!/usr/bin/env python
# convert a G09 fort.7 punch file (atomic number, x, y, z in bohr) into a
# standard XYZ file (element symbol, coordinates in angstroms) with the
# center of mass shifted to the origin
from sys import argv

bohr_per_ang = 1.88971616463207  # unused below, kept for reference
ang_per_bohr = 0.5291772109217

script, filename = argv

infile1 = open(filename, "r")
geom1 = []
count = 0
for temp_line in infile1:
    line = temp_line.strip().split()
    # map the atomic number to an element symbol (extend as needed)
    if int(line[0]) == 1:
        atom = "H"
    elif int(line[0]) == 5:
        atom = "B"
    elif int(line[0]) == 6:
        atom = "C"
    else:
        atom = line[0]  # fall back to the atomic number itself
    # fort.7 writes Fortran-style D exponents, so swap in E before float()
    parsed_line = [atom, int(line[0]), float(line[1].replace("D", "E")),
                   float(line[2].replace("D", "E")),
                   float(line[3].replace("D", "E"))]
    count += 1
    geom1.append(parsed_line)
infile1.close()

# calculate the center of mass in the x, y, and z directions
# (the atomic number stands in for the mass, which is fine for recentering)
mx = 0
my = 0
mz = 0
mass_total = 0
for i in range(0, count):
    mx = mx + geom1[i][1]*geom1[i][2]
    my = my + geom1[i][1]*geom1[i][3]
    mz = mz + geom1[i][1]*geom1[i][4]
    mass_total = mass_total + geom1[i][1]
com1 = [mx/mass_total, my/mass_total, mz/mass_total]

# shift all atoms so the center of mass is at 0,0,0
geom1_shifted = [[0 for j in range(0, 3)] for i in range(0, count)]
for i in range(0, count):
    geom1_shifted[i][0] = geom1[i][2] - com1[0]
    geom1_shifted[i][1] = geom1[i][3] - com1[1]
    geom1_shifted[i][2] = geom1[i][4] - com1[2]

# overwrite the input file in XYZ format: atom count, blank comment line,
# then the shifted coordinates converted to angstroms
outfile1 = open(filename, "w")
outfile1.write("%d\n" % (count))
outfile1.write("\n")
for i in range(0, count):
    outfile1.write("%s %14.10f %14.10f %14.10f\n" % (geom1[i][0],
                   geom1_shifted[i][0]*ang_per_bohr,
                   geom1_shifted[i][1]*ang_per_bohr,
                   geom1_shifted[i][2]*ang_per_bohr))
outfile1.close()
And the bash line:
$ for i in $( ls c6h7-int*ub3lyp-6311ppg*.xyz ) ; do echo $i ; python bohr_to_ang.py ./${i} ; done
3) Now that we have that out of the way, all we need to do is create a MOLPRO input file. The top part of my template is below, along with the bash line I use to create all of the input files for a given set of geometries (usually belonging to a particular method/basis set).
***,template
memory,800,M
gthresh,oneint=1.d-14,twoint=1.d-14,zero=1.d-14
angstrom
symmetry,nosym
geomtyp=xyz
geom={
}
basis=6-31G*;
{multi;canon,3100.2;
occ,22;closed,21}
basis=6-311G**;
{multi;canon,3101.2;
occ,22;closed,21}
basis=aug-cc-pvtz;
{multi;canon,3102.2;
occ,22;closed,21}
basis={
default,vtz-f12
...
$ for i in $( ls geom-method-tests/c6h7-int*ub3lyp-6311ppg*.xyz ) ; do outname="$( basename $i "-a.xyz" )-uccsdtf12-vtzf12-ad.inp" ; echo $outname; cp c6h7-template-uccsdtf12-vtzf12-ad.inp $outname ; tail -n13 $i >tmpfile ; sed -i -e '/geom={/r tmpfile' $outname ; done
This last bash line takes the bulk of the coordinate filename and concatenates the single-point method and basis set onto the end when making the MOLPRO input file name. I then use sed to put the geometry into the newly minted template just after geom={. Note that tail -n13 grabs the last 13 lines, so it assumes 13 atoms per geometry (C6H7 here); adjust for other molecules. Voila!
Thursday, September 11, 2014
Installing 7zip on CentOS 7.0
CentOS does not come with 7zip installed, and it is not even readily available on the installation media or in the default repositories. Looks like the next step is to tell yum about another repository and install 7zip from there. Here we go:
# wget http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm
# rpm -ivh rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm
# yum install p7zip
That's it (other than typing "y" when asked if you really want to install 7zip), and now you should have access to the binary, 7za.
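As a quick sanity check, the two 7za subcommands I reach for are a (add/create an archive) and x (extract with full paths); the archive name below is just an example:
$ 7za a backup.7z somedir/
$ 7za x backup.7z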
Wednesday, September 10, 2014
Inconsistent booting on Intel 5960X/Asus X99-deluxe
I just put together a 5960X node because 8 cores and AVX2 should do very well at electronic structure calculations (or anything using matrix-matrix and matrix-vector multiplication). I ran into a problem after it had run flawlessly for a day with half of the RAM. I added in the rest of the RAM (G.Skill Ripjaws DDR4 2400, 8 x 8 GB) and two more SSDs, and then it would no longer POST. After removing the drives and the RAM it was fine, but various combinations of the new hardware did not work, and the Q-code would stop at 00 (not used) or AD (not even listed). It turned out that I had routinely bumped the video card while putting in the RAM (this is not in a case, so nudging the card can lead to poor contact), and it was this improper seating of the video card in its PCI-E slot that was causing the boot process to hang before getting to POST.
Wednesday, April 16, 2014
segmented basis sets
Some days a generally contracted basis set won't do, such as when you are trying to calculate CADPAC energy gradients using MCSCF in MOLPRO (namely, you want to use state-averaged calculations). In that case I suffer through using the Pople basis sets for a bit until I feel they are not cutting it, at which point I supply an external basis set from the library below, developed in Japan.
http://sapporo.center.ims.ac.jp/sapporo/Welcome.do
Here is an example for C, using their 2012 DZP basis set.
basis={
!******************************************************************************
! Element : C
! Basis : Sapporo-DZP-2012 = gtf non-relativistic ([4s3p1d]{5211/131/2})
! Term : 3P
! Valence configuration : 1s(2)2s(2)2p(2)
! SCF energy : -37.67927012 a.u.
! Valence Correlation energy : -0.11425744 a.u.
! Reference
! Authors : T. Noro, M. Sekiya, T. Koga,
! Journal : Theoret. Chem. Acc. 131, 1124 (2012)
!******************************************************************************
s,C,2.804431e+003,4.212028e+002,9.581016e+001,2.692019e+001,8.564436e+000,5.147134e+000,4.760360e-001,2.873528e+000,1.497890e-001
c,1.5,2.7210000e-003,2.0759000e-002,1.0106300e-001,3.3470600e-001,6.4691300e-001
c,6.7,-1.5724400e-001,1.0550390e+000
c,8.8,1.0000000e+000
c,9.9,1.0000000e+000
p,C,1.208140e+001,9.439681e+000,2.000405e+000,5.449160e-001,1.514610e-001
c,1.1,1.0000000e+000
c,2.4,5.6939000e-002,3.1332600e-001,7.6035500e-001
c,5.5,1.0000000e+000
d,C,1.251463e+000,3.377940e-001
c,1.2,3.5720487e-001,7.7365173e-001
}
Saturday, February 15, 2014
follow up to MOLPRO ROHF convergence difficulties (optimizing a geometry using procedures)
The end result of the last post was that I found level shifts of -0.3, 0.3 worked for my system. A brief tangent for those of you who have perused the manual and seen its suggestion of -1.0, -0.5: I did try that one as well, and it does not work in this particular case. Ok, back to the question at hand: how do I optimize a geometry when ROHF won't even converge for a structure that is a local minimum? One way around this is to use MCSCF with one open orbital and run the converged orbitals through ROHF using only one iteration; however, that is a story for another day, and in this case I will use the level shifts listed above. My basic thought was that I really only want to use the level shifts if they are needed, and otherwise use the default values. These two ideas lead to using a procedure and boolean logic within said procedure.
basis=aug-cc-pvdz
proc autoshift
rhf
if(status.lt.0) then
status,rhf,clear !make sure calc does not abort when stock rohf does not converge
{rhf,shifta=-0.3,shiftb=0.3; !shifts found using grid search
start,atden} !start with atomic densities again, otherwise rohf will use same "bad" orbitals
endif
uccsd(t)
endproc
{optg,proc=autoshift,root=1,energy=1.d-11,step=1.d-5,gradient=1.d-5,displace=symm,numhess=2,hessproc=b3lyphess}
proc b3lyphess
rhf
if(status.lt.0) then
status,rhf,clear !same as for autoshift
{rhf,shifta=-0.3,shiftb=0.3;
start,atden}
endif
ks,b3lyp
endproc
Here I have made two procedures, one for the optimization using UCCSD(T) and one for the Hessian (MOLPRO does far better with frequent numerical Hessians when optimizing open-shell doublet molecules) using B3LYP. I will only walk through the first one because they are almost identical. Beginning with the if statement, I have it check whether the stock ROHF calculation finished properly. If it did not, then I run ROHF with the level shifts I found in the previous post, taking care to start with the atomic densities and not the orbitals from the previous ROHF calculation. I could put in several of these if statements with various values for the level shifts to be on the safe side, but I wanted to keep this example on the simple side. If this shifted ROHF calculation fails then I still want MOLPRO to error out, so I do not have a second status,rhf,clear statement after it inside the if structure. After that I run the UCCSD(T) calculation. The only other bit to include is telling optg that I want to use a procedure, which is accomplished with proc=autoshift. This is currently running, but it has made it past the initial calculation, which is progress.