Extending QE to read Avogadro Files (cml files)

A large part of computational material science is simply visualizing the input and output of our calculations. I use a variety of tools to do this including VESTA and Avogadro. While both have their strengths and weaknesses (VESTA is great for making publication-quality images, for example), Avogadro is very simple to use and quickly visualize your structures.

However, saving files in Avogadro does not allow an easy way to rerun them in QE. Avogadro saves their files in a ‘.cml’ format which neatly saves the crystallographic information including the a,b, and c side lengths as well as the alpha,beta, and gamma angles. Further, it stores the positions in fractional coordinates as shown below.

<molecule>
 <crystal>
  <scalar title="a" units="units:angstrom">8.672686</scalar>
  <scalar title="b" units="units:angstrom">5.000515</scalar>
  <scalar title="c" units="units:angstrom">19.225313</scalar>
  <scalar title="alpha" units="units:degree">89.995590</scalar>
  <scalar title="beta" units="units:degree">100.097768</scalar>
  <scalar title="gamma" units="units:degree">90.000692</scalar>
  <symmetry spaceGroup="P 1">
   <transform3>1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1</transform3>
  </symmetry>
 </crystal>
 <atomArray>
  <atom id="a1" elementType="P" xFract="0.609549" yFract="0.390867" zFract="0.487471"/>
  <atom id="a2" elementType="H" xFract="0.722219" yFract="0.231824" zFract="0.594411"/>
  <atom id="a3" elementType="H" xFract="0.327128" yFract="0.699623" zFract="0.200812"/>
  <atom id="a4" elementType="N" xFract="0.362637" yFract="0.878370" zFract="0.225001"/>
  <atom id="a5" elementType="O" xFract="0.240903" yFract="0.716682" zFract="0.464116"/>
  <atom id="a6" elementType="Sn" xFract="0.238695" yFract="0.380652" zFract="0.399504"/>
  <atom id="a7" elementType="P" xFract="0.108839" yFract="0.884258" zFract="0.487300"/>
  <atom id="a8" elementType="H" xFract="0.930945" yFract="0.314658" zFract="0.204357"/>
  <atom id="a9" elementType="H" xFract="0.249175" yFract="0.873678" zFract="0.598925"/>
  <atom id="a10" elementType="N" xFract="0.638140" yFract="0.364276" zFract="0.574848"/>
  <atom id="a11" elementType="N" xFract="0.135566" yFract="0.858096" zFract="0.574436"/>
  <atom id="a12" elementType="H" xFract="0.537312" yFract="0.339741" zFract="0.594010"/>
  <atom id="a13" elementType="O" xFract="0.743379" yFract="0.228629" zFract="0.463162"/>
  <atom id="a14" elementType="H" xFract="0.840210" yFract="0.605429" zFract="0.210810"/>
  <atom id="a15" elementType="O" xFract="0.735419" yFract="0.559324" zFract="0.333506"/>
  <atom id="a16" elementType="N" xFract="0.842522" yFract="0.408594" zFract="0.223959"/>
  <atom id="a17" elementType="O" xFract="0.942587" yFract="0.771165" zFract="0.461903"/>
  <atom id="a18" elementType="O" xFract="0.119302" yFract="0.178496" zFract="0.465925"/>
  <atom id="a19" elementType="P" xFract="0.372946" yFract="0.880004" zFract="0.312433"/>
  <atom id="a20" elementType="P" xFract="0.867696" yFract="0.386086" zFract="0.312113"/>
  <atom id="a21" elementType="Sn" xFract="0.739585" yFract="0.888994" zFract="0.400354"/>
  <atom id="a22" elementType="O" xFract="0.355212" yFract="0.586271" zFract="0.331764"/>
  <atom id="a23" elementType="O" xFract="0.532312" yFract="1.011308" zFract="0.342344"/>
  <atom id="a24" elementType="O" xFract="0.852226" yFract="0.088899" zFract="0.328542"/>
  <atom id="a25" elementType="O" xFract="0.620067" yFract="0.686831" zFract="0.468153"/>
  <atom id="a26" elementType="H" xFract="0.063394" yFract="0.979772" zFract="0.597245"/>
  <atom id="a27" elementType="O" xFract="0.444219" yFract="0.273920" zFract="0.461720"/>
  <atom id="a28" elementType="O" xFract="0.237602" yFract="0.052403" zFract="0.331723"/>
  <atom id="a29" elementType="H" xFract="0.461875" yFract="0.948117" zFract="0.210290"/>
  <atom id="a30" elementType="O" xFract="0.031252" yFract="0.502811" zFract="0.341730"/>
 </atomArray>
 <bondArray>
  <bond atomRefs2="a3 a4" order="1"/>
  <bond atomRefs2="a8 a16" order="1"/>
  <bond atomRefs2="a29 a4" order="1"/>
  <bond atomRefs2="a14 a16" order="1"/>
  <bond atomRefs2="a16 a20" order="1"/>
  <bond atomRefs2="a4 a19" order="1"/>
  <bond atomRefs2="a20 a24" order="1"/>
  <bond atomRefs2="a20 a15" order="1"/>
  <bond atomRefs2="a19 a22" order="1"/>
  <bond atomRefs2="a19 a23" order="1"/>
  <bond atomRefs2="a22 a6" order="1"/>
  <bond atomRefs2="a15 a21" order="1"/>
  <bond atomRefs2="a30 a6" order="1"/>
  <bond atomRefs2="a23 a21" order="1"/>
  <bond atomRefs2="a6 a27" order="1"/>
  <bond atomRefs2="a6 a18" order="1"/>
  <bond atomRefs2="a21 a17" order="1"/>
  <bond atomRefs2="a21 a25" order="1"/>
  <bond atomRefs2="a27 a1" order="1"/>
  <bond atomRefs2="a13 a1" order="2"/>
  <bond atomRefs2="a5 a7" order="1"/>
  <bond atomRefs2="a25 a1" order="1"/>
  <bond atomRefs2="a7 a11" order="1"/>
  <bond atomRefs2="a1 a10" order="1"/>
  <bond atomRefs2="a11 a26" order="1"/>
  <bond atomRefs2="a11 a9" order="1"/>
  <bond atomRefs2="a10 a12" order="1"/>
  <bond atomRefs2="a10 a2" order="1"/>
 </bondArray>
</molecule>

Now having this file, if I wanted to run it again in QE I would have to open it in Avogadro and copy/past the information into a QE input file. However, I have my QE.Struct python module that can directly print the atoms and crystal to use in the ibrav = 0 setting. This means a quick extension of the QE module will allow us to take a cml file and directly print the QE positions. The code to do this is below:

import QE
from lxml import etree as ET
import numpy as np

class cml(QE.Struct):

  def __init__(self,fname=''):
    QE.Struct.__init__(self)
    if fname != '':
      self.CML_Process(fname)
  
  def CML_Process(self,fname):
    root = ET.parse(fname)
    crystal = root.findall('.//crystal')[0]
    crystal_ = {}
    atom_array = root.findall('.//atomArray')[0].findall('.//atom')
    scalar = crystal.findall('.//scalar')
    for i in scalar:
      crystal_[i.attrib['title']] = float(i.text)
    for i in ['alpha','beta','gamma']:
      self.angles[i] = float(crystal_[i])
    for i in ['a','b','c']:
      self.norms[i] = float(crystal_[i])
    conversion = self.From_Crystal()
    for i in atom_array:
      id_ = i.attrib['elementType']
      pos = np.array(np.dot(conversion,np.array([float(i.attrib[jj]) for jj in 'xFract yFract zFract'.split()])))
      self.atoms[self.atomindex.key(id_)] = pos
    conv = np.pi/180
    omega = self.norms['a']*self.norms['b']*self.norms['c']*np.sqrt(1 - np.cos(self.angles['alpha']*conv)**2 - np.cos(self.angles['beta']*conv)**2 - np.cos(self.angles['gamma']*conv)**2 + 2*np.cos(self.angles['alpha']*conv)*np.cos(self.angles['beta']*conv)*np.cos(self.angles['gamma']*np.pi*180))
    conversion = np.linalg.inv(conversion)
    self.lattice['a'] = np.array([self.norms['a'],0,0])
    self.lattice['b'] = np.array([self.norms['b']*np.cos(self.angles['gamma']*conv),self.norms['b']*np.sin(self.angles['gamma']*conv),0])
    self.lattice['c'] = np.array([self.norms['c']*np.cos(self.angles['beta']*conv),self.norms['c']*(np.cos(self.angles['alpha']*conv) - np.cos(self.angles['beta']*conv)*np.cos(self.angles['gamma']*conv))/np.sin(self.angles['gamma']*conv),omega/(self.norms['a']*self.norms['b']*np.sin(self.angles['gamma']*conv))])

Essentially this subclass cml extends QE.Struct (note the QE.Struct.__init__(self) call) by adding in the CML_Process. Now the following will allow me to directly print the QE structure from Avogadro. For information on the crystallographic transformations that I am doing above, see https://en.wikipedia.org/wiki/Fractional_coordinates. Now, we can convert files as:

x = cml('fname.cml')
print(x)

And the output:

ATOMIC_POSITIONS {angstrom}
P 6.4731789658 1.74663600088 5.90752288181
N 3.59670104869 1.48269061764 10.8804430241
H 4.25995797158 0.808829877278 11.2507219619
H 7.38493976475 1.45298214385 3.86796978517
H 2.65766277167 1.34870684685 11.2431320292
O 0.524817095724 3.31017940814 8.78456164856
O 4.88587883073 0.870225450808 8.7665048011
O 2.29621596285 1.09755384491 8.73921132728
Sn 0.723478537496 1.66794558634 7.56161717511
O 6.28365252733 0.250862937336 6.21848299378
O 6.61775826243 3.5839270461 8.74267506217
O 3.46256015676 4.85524660047 6.47972052895
N 6.55199587569 1.91115500245 4.23898689605
P 3.64326407671 1.66716867493 9.22661371593
O 1.96231559562 2.73607969182 6.27946744084
H 3.2968060678 4.61710582931 3.98026672012
O 0.942491438371 0.0664889197862 6.2786914128
N 2.38656326857 4.25966267051 4.25870936466
N -0.76060652308 3.95228781169 10.8726448887
P 2.1812590996 4.21629172619 5.91357968599
H 2.16015055594 3.3800952034 3.80087175139
H 0.142141597579 4.01576939675 11.3361607559
P -0.698689931473 4.13447857683 9.2233771112
O -0.880875543806 2.31286168207 6.46809903593
Sn 5.06464439548 4.20941630124 7.57770556121
O 5.25386131501 2.60030389274 6.31243916858
O 3.79958152412 3.15852919467 8.86097201876
O -0.535855364872 0.61790582418 8.81880151967
H -1.46342223968 4.54728440707 11.3043625339
H 6.57625463583 2.90318280142 3.99010902691
CELL_PARAMETERS {angstrom}
8.67269 0.0 0.0
-6e-05 5.00051 0.0
-3.37074 0.00144 18.92727

If you would like to implement this yourself, you can find my QE.py file here: http://levilentz.com/Codes/QE.py. Happy automating!

Leave a Reply

Your email address will not be published. Required fields are marked *