Tuesday, July 30, 2019

X11SDV U.2 boot challenges

After switching over the head node to an X11SDV-4C-TLN2F board (my original X10SDV-4C did not have 10G ports, and I wanted the PCI-E slot for a graphics card), I figured putting together a couple compute nodes using X11SDV-12C-TLN2F models would be a good idea for having a larger number of cores (all of the previous compute nodes have 8) and testing out AVX-512.

The one hitch to using the new X11SDV offerings is the lack of an M.2 slot, with a U.2 port + Oculink in its place. Said U.2 port did not work with M.2 adapters, so I got a couple Intel Optane 905P drives having U.2 connectors. The first board worked flawlessly, and I used it to set up both Optane drives. The second X11SDV would never see the Optane drive no matter what BIOS options I tried (EFI only and dual were both attempted as first measures), though the USB thumb drive with Ubuntu 18.04.2 server worked just fine. Even more maddening, the Ubuntu installer saw the U.2 drive just fine; only the BIOS was myopic when it came to detecting this boot drive.

After much fussing (and swearing), I finally remembered the most recent version of the BIOS was 1.1a, but the second board was only blessed with 1.0b. Why these two boards, from the same vendor and bought at the same time, had different BIOS versions is beyond me. Long story short, updating the BIOS on the second board to 1.1a solved the boot issue.

Thursday, September 14, 2017

Cooling X10SDV (Xeon D-1540/1541)

This all began with the purchase of an X10SDV-TLN4F-O for Gaussian09 and MOLPRO calculations (using tmpfs, so IO should not be limiting). When running calculations on all 8 cores I noticed that the cores were rarely above 2100 MHz and the temperatures were in the 60 - 75 °C range. For comparison, my two X99 systems with water-cooled Core i7 5960X processors rarely get above 47 °C and always run at the maximum turbo speed of 3500 MHz.

Time for a couple related notes. I changed the governor to performance using cpufreq and intel_pstate. Disabling pstate was not helpful: the maximum speed was 2100 MHz when I tried that, which suggests the BIOS does not deal properly with turbo (so far as I can tell). This led me to believe thermal throttling was causing the issues.

What to do about the temperatures? Would changing the heatsink compound be enough? Should I just change the fan? Do I need a complete HSF overhaul? Finally, how would I test and assess performance of different solutions? The short answers were that I wanted to try everything, and that HPCC is a decent stand-in for Gaussian09 and MOLPRO because its DGEMM and Linpack sections pretty much represent the computations in those programs (and previous testing with AMD systems showed it produced similar temperatures). HPCC with N = 24000 provided 25 minutes of run time (most of that being in the Linpack section). Ambient temperatures fluctuated 1 - 2 °C, and were measured using a calibrated thermistor accurate to better than 0.1 °C. The CPU temperatures were measured using the built-in temperature sensor, and all reported temperatures are differentials (CPU minus ambient). A bash script recorded the CPU temperatures every minute.
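As a minimal sketch of how a log of once-per-minute readings can be reduced to the Max and Avg differentials reported in the table, something like the following would do. The function name and the sample readings are made up for illustration; they are not taken from the bash script or from measured data.

```python
def summarize_differentials(cpu_temps, ambient_temps):
    """Return (max, average) of CPU-minus-ambient temperature differences in °C."""
    if len(cpu_temps) != len(ambient_temps) or not cpu_temps:
        raise ValueError("need equal-length, non-empty lists of readings")
    diffs = [cpu - amb for cpu, amb in zip(cpu_temps, ambient_temps)]
    return max(diffs), sum(diffs) / len(diffs)


# Hypothetical readings, one per minute (°C); NOT measured data
cpu = [58.0, 61.0, 63.0, 62.0]
ambient = [22.0, 22.5, 23.0, 22.5]
peak, mean = summarize_differentials(cpu, ambient)
print("max %.1f  avg %.1f" % (peak, mean))
```

Pairing each CPU reading with the ambient reading from the same minute keeps the 1 - 2 °C ambient drift out of the comparison between fans.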

As for the heatsink, Alpha Nova Tech produced some custom 70 mm x 70 mm x 40 mm aluminum heatsinks having the same footprint as the stock heatsink (so I could reuse the screws and springs). Next up came the fan ducts, which I designed using OnShape. The two basic geometries involve one fan blowing down or a push-pull arrangement (at 50° from horizontal). These were printed using HIPS on a Lulzbot Mini.

Enough details, now on to the results.

HSF                               Max (°C)   Avg (°C)
Stock                                42         38
Stock-AS5                            39         37

60 mm Everflow                       23         20
60 mm Fractal                        32         31
60 mm Panaflo                        28         25
60 mm San Ace                        30         28
60 mm YS Tech                        28         25

60 mm Everflow                       24         22
60 mm Fractal                        32         31
60 mm San Ace                        56         53
60 mm YS Tech                        55         53

80 mm Akasa                          29         27
80 mm Noctua                         32         30
80 mm Sunon                          23         21
80 mm Vantec                         32         30

80 mm Sunon                          24         22
92 mm Everflow                       28         26
92 mm Gelid                          30         27
92 mm Noctua                         31         29

PP 60 mm Everflow/San Ace            21         19
PP 60 mm Fractal/Noctua              29         26
PP 60 mm Noctua/Fractal              37         35
PP 60 mm Noctua/San Ace              30         27
PP 60 mm San Ace/Everflow            23         21
PP 60 mm San Ace/Fractal             27         26
PP 60 mm San Ace/Noctua              27         25

PP 70 mm Everflow15/Everflow15       23         20
PP 70 mm Everflow15/Everflow25       23         21
PP 70 mm Everflow25/Everflow15       23         21
PP 70 mm Everflow25/Everflow25       23         21


Ok, so that tells us a couple things. The heatsink compound does reduce the temperature by a few degrees. The custom heatsink lowers the temperatures by approximately 10 °C, and the fan choice provides another 10 °C drop.

High static pressure fans reduce the temperature more than high flow rate fans (no real surprise here). Hence, the 60 mm San Ace and YS Tech, 80 mm Sunon, and 92 mm Everflow are the winners for single fan setups. The push-pull setup definitely lowers the temperatures drastically, especially the maxima. Of these, the Everflow 25 mm fans have the lowest noise output.

The losers were the Noctua and Gelid fans, though these were also the lowest noise models.

During all of the tests the processor frequency (as monitored by cat /proc/cpuinfo | grep MHz) remained near 2600 MHz. Returning to the Gaussian09 and MOLPRO calculations still showed throttling, though not as bad as before (on average the frequencies remain around 2400 MHz). Hmmmmm...perhaps it is not thermal throttling after all, but at least I feel much better about running these X10SDV units full throttle.
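For logging rather than eyeballing the grep output, the same check can be sketched in Python. This assumes the x86 Linux layout of /proc/cpuinfo, with one "cpu MHz" line per logical core; the function names and the sample text are illustrative only.

```python
import re


def core_frequencies(cpuinfo_text):
    """Extract every 'cpu MHz' value from the text of /proc/cpuinfo."""
    matches = re.findall(r"^cpu MHz\s*:\s*([0-9.]+)", cpuinfo_text, re.MULTILINE)
    return [float(m) for m in matches]


def mean_frequency(cpuinfo_text):
    """Average the per-core frequencies (MHz)."""
    freqs = core_frequencies(cpuinfo_text)
    if not freqs:
        raise ValueError("no 'cpu MHz' lines found")
    return sum(freqs) / len(freqs)


# Made-up two-core sample; on a real node pass open("/proc/cpuinfo").read()
sample = "processor\t: 0\ncpu MHz\t\t: 2600.000\nprocessor\t: 1\ncpu MHz\t\t: 2400.000\n"
print(mean_frequency(sample))
```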

Thursday, August 6, 2015

Getting useful XYZ geometries from G09 and transforming them (to angstroms)

I finally got to the point when optimizing geometries of many, many molecules led me to needing more automation to create input files for the subsequent single-point calculations. Methinks this is as good of a time as any to learn python (and by learn I mean cobble together something functional, but ugly). Ok, with that truth out of the way, on to the goal of this exercise. I wanted to optimize geometries in G09, export the geometry (some of the geometry optimizations were in cartesian space and some used an input Z-matrix, so XYZ coordinates are the common output), and use the exported geometries to calculate single-point energies using MOLPRO.

1) First hurdle, getting useful XYZ coordinates from G09 (also known as, why is there no easy way to grep a final set of cartesian coordinates from the log file?). There are lots of punch options and IOPs, but none that dump the geometry to the log file. This means using "punch=coord" and dealing with the fort.7 file that is generated. Here is the line from my bash script that submits all of the geometry optimizations.

$ for i in $( ls *pbe1pbe-vtz*.com ) ; do echo ${i}; NAME=$(basename $i ".com" ) ; /cluster/software/g09/g09 <./${NAME}.com >./${NAME}.log ; mv fort.7 ${NAME}.xyz ; done

2) Ok, that really was not so terrible, and I will stop complaining now. Second hurdle, transforming the fort.7 file into something useful because it writes the XYZ coordinates using the atomic number (instead of the associated element abbreviation) and the units are bohr. The basic parts of the python script that follow are reading and parsing the input file (this includes substituting "C" for "6", etc), finding the center of mass, shifting all of the atoms so the center of mass is at the origin, converting the distances to angstroms, and writing a new XYZ file. HUGE NOTE: Any improvements would be greatly appreciated.

#!/usr/bin/env python

from sys import argv

ang_per_bohr = 0.5291772109217

#atomic number -> element abbreviation (add more elements as needed)
symbols = { 1 : "H", 5 : "B", 6 : "C" }

script, filename = argv

#read and parse the fort.7 file: atomic number, then x, y, z in bohr
geom1 = []

infile1 = open(filename,"r")
for temp_line in infile1 :
        line = temp_line.split()
        if not line :
                continue        #skip blank lines
        number = int(line[0])
        #fort.7 uses Fortran "D" exponents, which float() does not understand
        xyz = [ float(field.replace("D","E")) for field in line[1:4] ]
        geom1.append([ symbols[number], number, xyz[0], xyz[1], xyz[2] ])

infile1.close()

count = len(geom1)

#calculate the center of mass in x, y, and z directions
#NOTE: the atomic number is used as the weight, as a stand-in for the atomic mass
mx = 0.0
my = 0.0
mz = 0.0
mass_total = 0.0

for i in range(0,count) :
        mx = mx + geom1[i][1]*geom1[i][2]
        my = my + geom1[i][1]*geom1[i][3]
        mz = mz + geom1[i][1]*geom1[i][4]
        mass_total = mass_total + geom1[i][1]

com1 = [mx/mass_total, my/mass_total, mz/mass_total]

#shift all atoms so the center of mass is at 0,0,0
geom1_shifted=[[0 for j in range(0,3)] for i in range(0,count)]

for i in range(0, count) :
        geom1_shifted[i][0] = geom1[i][2] - com1[0]
        geom1_shifted[i][1] = geom1[i][3] - com1[1]
        geom1_shifted[i][2] = geom1[i][4] - com1[2]

#overwrite the input file with an XYZ file holding the shifted coordinates in angstroms
outfile1 = open(filename, "w")
outfile1.write("%d\n" % (count) )
outfile1.write("\n")
for i in range(0, count) :
        outfile1.write( "%s %14.10f %14.10f %14.10f\n" % (geom1[i][0]  ,  geom1_shifted[i][0]*ang_per_bohr  ,  geom1_shifted[i][1]*ang_per_bohr  ,  geom1_shifted[i][2]*ang_per_bohr ))

outfile1.close()

And the bash line:

$ for i in $( ls c6h7-int*ub3lyp-6311ppg*.xyz ) ; do echo $i ; python bohr_to_ang.py ./${i} ; done
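One possible improvement, sketched here rather than tested against real G09 output, is to weight the center of mass by actual atomic masses and to keep the parsing in small functions. The names below are illustrative. The masses are rounded standard atomic weights, and the input is assumed to be lines of "atomic number, x, y, z" with Fortran D exponents.

```python
ANG_PER_BOHR = 0.5291772109217  # same conversion factor as the script above

# atomic number -> (symbol, rounded standard atomic weight); extend as needed
ELEMENTS = {1: ("H", 1.008), 5: ("B", 10.81), 6: ("C", 12.011)}


def parse_fort7(lines):
    """Parse 'Z x y z' lines into (symbol, mass, [x, y, z]) with coordinates in bohr."""
    atoms = []
    for raw in lines:
        fields = raw.split()
        if len(fields) < 4:
            continue  # skip blank or short lines
        symbol, mass = ELEMENTS[int(fields[0])]
        xyz = [float(f.replace("D", "E")) for f in fields[1:4]]
        atoms.append((symbol, mass, xyz))
    return atoms


def centered_angstrom(atoms):
    """Shift the center of mass to the origin and convert bohr to angstroms."""
    total = sum(mass for _, mass, _ in atoms)
    com = [sum(mass * xyz[k] for _, mass, xyz in atoms) / total for k in range(3)]
    return [(sym, [(xyz[k] - com[k]) * ANG_PER_BOHR for k in range(3)])
            for sym, _, xyz in atoms]


# Made-up two-atom input (two H atoms 2 bohr apart along z), not a real fort.7
demo = ["1 0.0D+00 0.0D+00 0.0D+00", "1 0.0D+00 0.0D+00 2.0D+00"]
centered = centered_angstrom(parse_fort7(demo))
for sym, (x, y, z) in centered:
    print("%s %14.10f %14.10f %14.10f" % (sym, x, y, z))
```

An element missing from ELEMENTS raises a KeyError instead of silently reusing the previous atom's label.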

3) Now that we have that out of the way all we need to do is create a MOLPRO input file. The top part of my template is below, along with the bash line I use to create all of the input files for a given set of geometries (usually belonging to a particular method/basis set).

***,template
memory,800,M
gthresh,oneint=1.d-14,twoint=1.d-14,zero=1.d-14

angstrom
symmetry,nosym
geomtyp=xyz
geom={
}
    basis=6-31G*;
 {multi;canon,3100.2;
 occ,22;closed,21}

    basis=6-311G**;
 {multi;canon,3101.2;
 occ,22;closed,21}

    basis=aug-cc-pvtz;
 {multi;canon,3102.2;
 occ,22;closed,21}

basis={
default,vtz-f12
...

$ for i in $( ls geom-method-tests/c6h7-int*ub3lyp-6311ppg*.xyz ) ; do outname="$( basename $i "-a.xyz" )-uccsdtf12-vtzf12-ad.inp" ; echo $outname; cp c6h7-template-uccsdtf12-vtzf12-ad.inp $outname ; tail -n13 $i >tmpfile ; sed -i -e '/geom={/r tmpfile' $outname ; done

This last bash line uses the bulk of the coordinate filename and concatenates the single-point method and basis set onto the end when making the MOLPRO input file name. I then use sed to put the geometry into the newly minted template just after geom={. Voila!
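The sed step can also be sketched in Python. The function below is a hypothetical stand-in for the tail/sed pair, not a drop-in replacement: it skips the two XYZ header lines instead of hard-coding the atom count, and it inserts only after the first line that is exactly geom={.

```python
def insert_geometry(template_text, xyz_text):
    """Place the atom lines of an XYZ file right after the 'geom={' line of a template."""
    # an XYZ file starts with the atom count and a comment line; skip both
    atom_lines = [ln for ln in xyz_text.splitlines()[2:] if ln.strip()]
    out = []
    inserted = False
    for line in template_text.splitlines():
        out.append(line)
        if not inserted and line.strip() == "geom={":
            out.extend(atom_lines)
            inserted = True
    if not inserted:
        raise ValueError("template has no 'geom={' line")
    return "\n".join(out) + "\n"


# Made-up miniature template and geometry, for illustration only
template = "geomtyp=xyz\ngeom={\n}\n"
xyz = "2\n\nH 0.0 0.0 -0.37\nH 0.0 0.0 0.37\n"
print(insert_geometry(template, xyz))
```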

Thursday, September 11, 2014

Installing 7zip on CentOS 7.0

CentOS does not come with 7zip installed, and it is not even readily available on the installation media or default repository. Looks like the next step is to tell yum about another repository and install 7zip from there. Here we go:

# wget http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm

# rpm -ivh rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm

# yum install p7zip

That's it (other than typing "y" when asked if you really want to install 7zip), and now you should have access to the binary, 7za.

Wednesday, September 10, 2014

Inconsistent booting on Intel 5960X/Asus X99-deluxe

I just put together a 5960X node because 8 cores and AVX2 should do very well at electronic structure calculations (or anything using matrix-matrix and matrix-vector multiplication). I ran into a problem after it had run flawlessly for a day with half of the RAM. I added in the rest of the RAM (G.Skill Ripjaws DDR4 2400, 8 x 8 GB) and two more SSDs, and then it would no longer POST. After removing the drives and the RAM it was fine, but various combinations of the new hardware did not work, and the Qcode would stop at 00 (not used) or Ad (not even listed). I found out that I had routinely bumped the video card while putting in the RAM (this is not in a case, so nudging the card can lead to poor contact), and it was this improper seating of the vid card in its PCI-E slot that was causing the boot process to hang before getting to POST.

Wednesday, April 16, 2014

segmented basis sets

Some days a generally contracted basis set won't do, such as when you are trying to calculate CADPAC energy gradients using MCSCF in MOLPRO (namely, you want to use state-averaged calculations). In that case I suffer through using the Pople basis sets for a bit until I feel they are not cutting it, in which case I supply an external basis set using the below library developed in Japan.

http://sapporo.center.ims.ac.jp/sapporo/Welcome.do

Here is an example for C, using their 2012 DZP basis set.

basis={
!******************************************************************************
! Element : C
! Basis : Sapporo-DZP-2012 = gtf non-relativistic ([4s3p1d]{5211/131/2})
! Term : 3P   Valence configuration : 1s(2)2s(2)2p(2)
! SCF energy : -37.67927012 a.u.   
! Valence Correlation energy : -0.11425744 a.u.
! Reference
! Authors : T. Noro, M. Sekiya, T. Koga,  
! Journal : Theoret. Chem. Acc. 131, 1124 (2012)
!******************************************************************************
s,C,2.804431e+003,4.212028e+002,9.581016e+001,2.692019e+001,8.564436e+000,5.147134e+000,4.760360e-001,2.873528e+000,1.497890e-001
c,1.5,2.7210000e-003,2.0759000e-002,1.0106300e-001,3.3470600e-001,6.4691300e-001
c,6.7,-1.5724400e-001,1.0550390e+000
c,8.8,1.0000000e+000
c,9.9,1.0000000e+000
p,C,1.208140e+001,9.439681e+000,2.000405e+000,5.449160e-001,1.514610e-001
c,1.1,1.0000000e+000
c,2.4,5.6939000e-002,3.1332600e-001,7.6035500e-001
c,5.5,1.0000000e+000
d,C,1.251463e+000,3.377940e-001
c,1.2,3.5720487e-001,7.7365173e-001
}
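A quick sanity check after pasting in a basis set is to confirm that the c, lines reproduce the contraction pattern from the header, {5211/131/2} for the carbon set above. The sketch below assumes the layout shown here: a shell line starting with s, p, d, and so on, followed by c,first.last,coefficients lines. The function name is made up.

```python
def contraction_pattern(basis_lines):
    """Return e.g. '131/2' from MOLPRO-style shell lines and 'c,i.j,...' lines."""
    shells = []
    for raw in basis_lines:
        fields = raw.strip().split(",")
        tag = fields[0].lower()
        if tag in ("s", "p", "d", "f", "g", "h", "i"):
            shells.append([])  # a new angular momentum shell starts here
        elif tag == "c" and shells:
            first, last = (int(n) for n in fields[1].split("."))
            ncoef = len(fields) - 2
            if ncoef != last - first + 1:
                raise ValueError("coefficient count does not match range: " + raw)
            shells[-1].append(ncoef)
    return "/".join("".join(str(n) for n in shell) for shell in shells)


# The p and d shells of the carbon basis listed above
carbon_pd = [
    "p,C,1.208140e+001,9.439681e+000,2.000405e+000,5.449160e-001,1.514610e-001",
    "c,1.1,1.0000000e+000",
    "c,2.4,5.6939000e-002,3.1332600e-001,7.6035500e-001",
    "c,5.5,1.0000000e+000",
    "d,C,1.251463e+000,3.377940e-001",
    "c,1.2,3.5720487e-001,7.7365173e-001",
]
print(contraction_pattern(carbon_pd))
```

The check also catches a dropped or duplicated coefficient, since each c, line must hold exactly last - first + 1 values.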

Saturday, February 15, 2014

follow up to MOLPRO ROHF convergence difficulties (optimizing a geometry using procedures)

The end result of the last post was that I found level shifts of -0.3, 0.3 worked for my system. A brief tangent for those of you who have perused the manual and seen their suggestion of -1.0,-0.5: I did try that one as well and it does not work in this particular case. Ok, back to the question at hand, how do I optimize a geometry when ROHF won't even converge for a structure that is a local minimum? One way around this is to use MCSCF with one open orbital and run the converged orbitals through ROHF using only one iteration. However, that is a story for another day; in this case I will use the level shifts I listed above. My basic thought was that I really only want to use the level shifts if they are needed, otherwise I will use the default values. These two ideas lead to using a procedure and boolean logic within said procedure.

basis=aug-cc-pvdz

proc autoshift
  rhf
  if(status.lt.0) then
    status,rhf,clear                    !make sure calc does not abort when stock rohf does not converge
    {rhf,shifta=-0.3,shiftb=0.3;        !shifts found using grid search
    start,atden}                        !start with atomic densities again, otherwise rohf will use same "bad" orbitals
  endif
  uccsd(t)
endproc

{optg,proc=autoshift,root=1,energy=1.d-11,step=1.d-5,gradient=1.d-5,displace=symm,numhess=2,hessproc=b3lyphess}

proc b3lyphess
  rhf
  if(status.lt.0) then
    status,rhf,clear                           !same as for autoshift
    {rhf,shifta=-0.3,shiftb=0.3;                                                   
    start,atden}
  endif
  ks,b3lyp
endproc

Here I have made two procedures, one for the optimization using CCSD(T) and one for the Hessian using B3LYP (MOLPRO does far better with frequent numerical Hessians when optimizing open-shell doublet molecules). I will talk about the first one because they are almost identical. Beginning with the if statement, I have it check whether the stock ROHF calculation finished properly. If it did not, then I clear the error status and run ROHF with the level shifts I found in the previous post, taking care to start with the atomic densities and not the orbitals from the previous ROHF calculation. I could put in several of these if statements with various values for the level shifts to be on the safe side, but I wanted to keep this example on the simple side. If this second ROHF calculation fails then I still want MOLPRO to error out, so I do not have a status,rhf,clear statement after it. After that I run the UCCSD(T) calculation. The only other bit to include is telling optg that I want to use a procedure, which is accomplished with proc=autoshift. This is currently running, but it made it past the initial calculation, which is progress.
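The "several if statements" variant is easier to see outside of MOLPRO syntax. The Python below only sketches the control flow: run_rohf is a hypothetical callable standing in for an ROHF run (it returns True on convergence), and nothing here talks to MOLPRO.

```python
def converge_with_fallback(run_rohf, shift_candidates):
    """Try default ROHF first, then each (shifta, shiftb) pair until one converges."""
    if run_rohf(None):
        return "default"  # stock settings were enough
    for shifts in shift_candidates:
        if run_rohf(shifts):
            return shifts
    # mirrors letting MOLPRO error out when nothing works
    raise RuntimeError("ROHF did not converge with any level shift")


def fake_rohf(shifts):
    """Pretend solver for illustration: converges only with shifts of -0.3, 0.3."""
    return shifts == (-0.3, 0.3)


print(converge_with_fallback(fake_rohf, [(-1.0, -0.5), (-0.3, 0.3)]))
```

The default run comes first so the level shifts are only used when they are needed, which is the same idea as the if block in autoshift.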