Skip to content
Oliver Beckstein edited this page Aug 16, 2019 · 14 revisions

Frequently (or at least sometimes) asked questions about MDAnalysis.

Feel free to add to this file.

Overview

  1. Installation
  2. Reading and Writing files
  3. Working with groups of atoms
  4. Working with trajectories

Installation

How can I install MDAnalysis?

Follow the Installation Quick Start. We recommend the conda installation.

If you want to run the tests and do the examples in the docs and in the tutorial then also install MDAnalysisTests, as described in the Quick Start.

How do I use the "development version"?

Sometimes on the mailing list we tell you that a certain feature or fix is "in the development version". You can install the development version by following the instructions under DevelopmentBranch.

Reading and Writing files

How can I convert a DCD file to a TRR file?

Load your DCD trajectory and write out a new TRR:

import MDAnalysis as mda

u = mda.Universe(PSF, DCD)
with mda.Writer(TRR, n_atoms=u.atoms.n_atoms):
  for ts in u.trajectory:
     W.write(u.atoms)

You can then load the new trajectory with

u2 = mda.Universe(PSF, TRR)

Note that you can write out a, say, GRO file with

u.atoms.write(GRO)

which can then be used with the TRR

u3 = mda.Universe(GRO, TRR)

In general, conversion works for all supported coordinate formats.

How can I write out a trajectory with a subset of atoms?

If you want to write, say, an XYZ trajectory of only the water oxygen atoms, given a topology TPR and a trajectory XTC

import MDAnalysis as mda

u = mda.Universe(TPR, XTC)
w = u.select_atoms("resname SOL and name OW")
with mda.Writer("ow.xyz", n_atoms=w.n_atoms) as W:
    for ts in u.trajectory:
        W.write(w)

Note that you probably also need to write a minimal "topology" file for further analysis/visualization. A simple "GRO" or "PDB" file might suffice: w.write("ow.gro").

See the Table of Coordinate Formats for all the formats that can be written.

How can I read a LAMMPS trajectory with Lennard-Jones (unit=LJ) units?

Source: user mailinglist

I do simulation in LAMMPS with LJ units. I read the MDAnalysis manual and realized that the lj unit is not automatically detected by it. How can load the dump files (LAMMPSDUMP) with lj unit system? Can I redefine MDAnalysis (basic) units in LJ and use them in my analysis?

My understanding (based on the 0.19.2 release docs) is that we don’t support LJ units directly in the sense that they cannot automatically converted to the MDAnalysis base units (Å and ps) but you can load your raw values into a trajectory. Just be aware that MDAnalysis keeps thinking that these values are in Å and ps.

At the moment, there's no LJ unit feature implemented that would convert LJ units to real units in MDA but the following might be a workaround:

Raw data

If you are ok with just directly using the numbers “as they are” you can fake units of “1” by pretending that the information in the file is exactly in the MDAnalysis units (namely Å and ps). LAMMPSD.DATAReader has Å as its length unit so that should then give you all your lengths in LJ sigmas and you can multiply manually to get Å.

For trajectories you can choose various units (but not “LJ”). But you can get the lengths and times as “raw” data. This needs a LAMPPS DCD at the moment

u = mda.Universe("lj.data", "lj.dcd", lengthunit="Å", timeunit="ps")

Converting to real units

You can probably trick MDAnalysis into converting to true Å and ps: if you know your LJ sigma and your LJ time unit then

u = mda.Universe(“lj.data”, “lj.dcd”, lengthunit=Å”, timeunit=ps”, dt=LJ_timeunit)

will convert times appropriately. (I think you have to provide dt on Universe creation but you can try if u.trajectory.dt = LJ_timeunit works.)

You can then create a trajectory transformation that converts all positions on the fly to Å, something like

def convert_LJ_to_MDA(ts, factor=sigma):
    ts.positions *= sigma
    return ts	

u.trajectory.add_transformations(convert_LJ_to_MDA)

I haven’t tried it, but something along those lines might work. Let us know how you get on.

Working with groups of atoms

How do I add a new chain ID to a PDB file?

Based on user list: Add chain IDs to protein files.

For MDAnalysis 0.19.2:

Load the PDB file:

U = mda.Universe("protein.pdb")

Then create a new segment ('T' in this example) and assign the segment to the atoms of interest:

newseg = U.add_Segment(segid='T')
U.select_atoms('bynum 1201:2400').residues.segments = newseg

When you write out the atoms as a PDB file, MDAnalysis will set the ChainID to the segment identifier (or the first letter if you used more than one letter)

U.atoms.write("relabeled.pdb")

Working with trajectories

Why do the positions change?

A fundamental concept in MDAnalysis is that at any one time, only one time frame of the trajectory is being accessed. Think of the trajectory as a tape and MDAnalysis provides a read-head that sits at a specific frame. The data from this specific frame are currently available.

The Timestep data structure holds this information: u.trajectory.ts. You also see the Timestep when you iterate through a trajectory:

for ts in u.trajectory:
   print(ts.frame, ts.time) 

This prints the frame index (starting with 0) and the time for all frames.

Many properties of atoms change with time, namely positions (and velocities and forces if they were recorded). Thus u.atoms.positions changes with time: it corresponds to the quantity X(t) (all coordinates as a function of time, at time step t).

Adding velocities to a trajectory

If you try to add or change coordinates or velocities by assigning ts.positions = new_x then you will find that your changes are not persistent: when you read the same frame again, the coordinates are again the ones from the trajectory.

This is because of how MDAnalysis only ever loads a single frame of data into memory at any one time (see Why do the positions change?). Loading the next frame of data overwrites the old frame, so whenever you load a new timestep, any changes are discarded. In the case of you adding velocities however, because the trajectory has no velocities, these don't get discarded/overwritten, but instead the stale velocities from the final frame persist. (Stale data should not happen, that's a bug, see issue #2309, but that's peripheral to solving the problem.)

There are different approaches to solve this problem:

MemoryReader

If your trajectory is small enough that you could fit it into memory you could use the MemoryReader. This stores the trajectory in memory, so any changes you make are persistent. Ie if you modify the coordinates (or velocities) then move between frames and return your changes will still be present. To do this:

# make sure that the Universe thinks that there are velocities
u.trajectory.ts.has_velocities = True
# this sucks all data into memory, so the trajectory file isn't read after this command
u.transfer_to_memory()

# you can then set all velocities at once, ie give it a [nsteps, natoms, 3] array
u.trajectory.velocity_array = something

Write a new trajectory

Alternatively if your data is too large to fit into memory, you would have to save it to file then reload this data, so

with mda.Writer('with_vels.trr', n_atoms=len(u.atoms)) as w:   
    for its,ts in enumerate(u.trajectory):
        ts.velocities = velocities[its,:,:]
        w.write(ts)
u.load_new('with_vels.trr')

This has created a TRR file with the existing position data and your velocity data

Clone this wiki locally