Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support of non-standard basis set for Gaussian #221

Open
FanwangM opened this issue Mar 12, 2021 · 10 comments
Open

Support of non-standard basis set for Gaussian #221

FanwangM opened this issue Mar 12, 2021 · 10 comments
Milestone

Comments

@FanwangM
Copy link
Contributor

FanwangM commented Mar 12, 2021

Even the latest version of Gaussian will require the user to input exactly a non-standard basis set for the atoms in a given molecule, but give all the basic set for all the atoms will fail the Gaussian job. In this setting, we have to deal with the non-standard basis set properly.

It seems that we will need to add a dependency of bais set exchange. Although I know it's a pain to add a dependency, bais_set_exchange is not a giant one (~30 M).

I have a naive idea that might work. We have two ways of generating input files for Gaussian:

  1. For the standard basis set, we can just use it as it is. For a complete list of basis sets, we can refer https://gaussian.com/basissets/.
  2. For a non-standard basis set, we can first determine the atom numbers or the atom labels with iodata. Then we set up another Template file along with basis_set_exchange to build an input file for Gaussian jobs.

A possible, or even better solution, is to operate on the Template file where we can append gen keyword to the list of basis sets mentioned above. Then add none one more extra empty line at the end of the Template file. If it's the standard basis, we can just leave it as it is. Otherwise, we can substitute that line with the non-standard basis set.

#188

@PaulWAyers @tovrstra @FarnazH

@FanwangM FanwangM mentioned this issue Mar 12, 2021
@FanwangM
Copy link
Contributor Author

This code snippet shows how to build a bais set for a given molecule, simple and fast

from iodata import load_one
import basis_set_exchange as bse

mol = load_one("sample.sdf")

# get basis set information
selected_basis_set = bse.get_basis("def2-svpd",
                                   fmt="gaussian94",
                                   elements=mol.atnums.tolist(),
                                   header=False)

https://molssi-bse.github.io/basis_set_exchange/

@tovrstra
Copy link
Member

Thanks for keeping track of this. I believe for Gaussian, when using the gen keyword, the basis set for each element should only be listed once. We may needs something like np.unique(mol.natoms).tolist().

@FanwangM
Copy link
Contributor Author

Thanks for keeping track of this. I believe for Gaussian, when using the gen keyword, the basis set for each element should only be listed once. We may needs something like np.unique(mol.natoms).tolist().

Nice catch! We should use a list without duplicated elements (sort of set).

Good to know np.unique() instead of list(set(my_list)). Thanks! @tovrstra

@tovrstra
Copy link
Member

tovrstra commented Apr 1, 2021

I did some testing and extended the example to work out a specific case. This is just to get a better idea of how to proceed.

Python script:

from iodata import load_one, write_input
import basis_set_exchange as bse

mol = load_one("formamide.sdf")

basis_set = bse.get_basis(
    "def2-svpd",
    fmt="gaussian94",
    elements=mol.atnums.tolist(),
    header=False,
)

template = """\
#n {lot}/Gen {run_type}

{title}

{charge} {spinmult}
{geometry}

{basis_set}
"""

write_input(mol, "gaussian.com", "gaussian", template, basis_set=basis_set)

The SDF file formamide.sdf:

713
  -OEChem-03292115373D

  6  5  0     0  0  0  0  0  0999 v2000
    1.1280    0.2091    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
   -1.1878    0.1791    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
    0.0598   -0.3882    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.3085    1.1864    0.0001 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.0305   -0.3861   -0.0001 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.0014   -1.4883   -0.0001 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  3  2  0  0  0  0
  2  3  1  0  0  0  0
  2  4  1  0  0  0  0
  2  5  1  0  0  0  0
  3  6  1  0  0  0  0
M  END
$$$$

This results in a valid Gaussian input file. The bse Python package already knows that it should not repeat basis functions for the same element, which is convenient. @fwmeng88 Does this also work for you?

I'm not sure yet if we should build this into IOData, or just make sure it remains possible for users to use this approach in their own scripts. (We should at least add it as example to the documentation.) Implementing this inside IOData has the obvious advantage of convenience, but comes with some minor drawbacks as well: an extra dependency & (very minor) another option to control if built-in or bse basis sets should be used. To facilitate this decision: did I miss any pros or cons?

@PaulWAyers
Copy link
Member

At least from my point of view, adding the increasingly standard dependency on the basis-set-exchange would be outweighed by increased convenience.

@tovrstra
Copy link
Member

tovrstra commented Apr 1, 2021

Good. It think it would become a much-loved feature. Should it be the default behavior?

By the way, something similar is possible with ORCA, see https://sites.google.com/site/orcainputlibrary/basis-sets (see newgto command).

@wilhadams
Copy link
Collaborator

I'd be in favour of using BSE to do this, especially since it already prepares the formatted lines, so we could just put it right into the template.

If we really wanted to be able to generate the non-standard basis from MolecularBasis, it wouldn't be terribly hard to add a load_one for their BSE JSON format (it's similar to QCSchema's basis schema), but then we'd need to build a converter for every format for which we want to support this feature, which seems like a lot of work without much benefit.

@tovrstra
Copy link
Member

tovrstra commented Apr 2, 2021

@wilhadams I hadn't thought yet of the second possibility you mention. It might be useful in some corner cases, e.g. when you have a MolecularBasis instance from a wavefunction file for which the name is not known or available, but I would indeed not worry about it for now.

@FanwangM
Copy link
Contributor Author

FanwangM commented Apr 2, 2021

Adding the basis_set_exchange would be a very nice feature to build Gaussian input files. I have tested the code you proposed and it worked perfectly on my end. @tovrstra

Speaking of putting an example in the documentation, maybe we need to put more than one? Such as one example for regular geometry optimization and another to show how to handle multi-step jobs (https://gaussian.com/input/). This can show how useful IOData can be for building Gaussian input files.

@tovrstra tovrstra added this to the 1.0.0 milestone Apr 2, 2021
@tovrstra
Copy link
Member

tovrstra commented Apr 2, 2021

I've added it on the list for 1.0, to fix in this in relatively short term. A few related issues need to be handled before this one: #253 to avoid merge conflicts and #254 and #210 to provide a better infrastructure and place for more elaborate examples in the documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants