MDAnalysis.analysis user interface

Max Linke edited this page Jul 1, 2017 · 3 revisions

All analysis tools should have a common philosophy and share a common set of options, as discussed in Issue #719. It makes for a better overall user experience if users are able to use different analysis tools "out of the box" once they have a basic understanding of how analysis works in MDAnalysis.

On the developer side, it promotes code re-use and modularization, which in turn improve test coverage and code reliability.

Analysis classes in MDAnalysis.analysis are supposed to follow the "Bauhaus model":

  1. Use MDAnalysis.analysis.base.AnalysisBase as base class.
    • __init__ has to accept an AtomGroup object that defines the atoms on which the analysis is performed.
    • Follow the example in the MDAnalysis.analysis.base.AnalysisBase docs when starting a new analysis.
    • _single_frame() must be defined. It is called after the trajectory is moved to a new frame. It computes a quantity from a single trajectory frame.
    • _prepare() may be defined. It is called before iteration on the trajectory has begun. It should be used to set up data structures or values that are used in _single_frame().
    • _conclude() may be defined. It is called after iteration through the trajectory is complete. It should be used to perform data reduction (e.g., normalisation and averaging of results).
    • How the results of the computation are stored is not prescribed, but they must be made available as one or more attributes of the instance for further processing by the user. These can be either simple attributes with expressive names (e.g., RMSD.rmsd for the RMSD timeseries; see also pca.PCA) or a dictionary named results with multiple keys.
  2. There should be a separate function outside the analysis class that takes as input the minimal data structures needed (e.g., just coordinate arrays), performs the core numerical analysis, and returns the calculated data as a numpy array or another simple data structure. This function should be used inside _single_frame(). Making this function available outside the class allows users to write their own analysis code, e.g., using parallel approaches.
  3. A plot() function is optional. If it is present, it must
    • take an optional ax keyword argument for a matplotlib Axes instance and plot into the user-provided Axes; if ax=None, it should use pyplot.gca to get the current axes.
    • return the Axes that was plotted into.
  4. A method to save results is optional, but if present, it must be named save(). This method should only be used if you have to write custom data formats such as xvg so that the data can be handled by other tools. It should not be used to write a simple numpy array. For a numpy array, users should choose their own preferred method (np.save, np.savetxt, ...).
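
The guidelines above can be illustrated with a minimal sketch. To keep it runnable without a trajectory file, FrameLoopBase below is a hypothetical stand-in that only mimics the hook order described in point 1 (_prepare(), then _single_frame() once per frame, then _conclude()); a real analysis would instead subclass MDAnalysis.analysis.base.AnalysisBase and accept an AtomGroup in __init__. The names radius_of_gyration, RadiusOfGyration, rgyr, and mean_rgyr are likewise assumptions chosen for illustration and are not part of MDAnalysis.

```python
import numpy as np


def radius_of_gyration(positions, masses):
    """Core numerical routine (guideline 2): plain arrays in, a float out."""
    positions = np.asarray(positions, dtype=float)
    masses = np.asarray(masses, dtype=float)
    # Mass-weighted center, then mass-weighted mean squared distance from it.
    center = np.average(positions, axis=0, weights=masses)
    sq_dist = np.sum((positions - center) ** 2, axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))


class FrameLoopBase:
    """Hypothetical stand-in for AnalysisBase; it only mimics the hook order."""

    def __init__(self, frames):
        self._frames = frames  # list of (n_atoms, 3) coordinate arrays

    def run(self):
        self.n_frames = len(self._frames)
        self._prepare()
        for index, positions in enumerate(self._frames):
            self._frame_index = index
            self._positions = positions
            self._single_frame()
        self._conclude()
        return self

    def _prepare(self):
        pass  # optional hook

    def _conclude(self):
        pass  # optional hook


class RadiusOfGyration(FrameLoopBase):
    """Illustrative analysis class following guidelines 1 to 3."""

    def __init__(self, frames, masses):
        # A real class would take an AtomGroup here instead of raw arrays.
        super().__init__(frames)
        self._masses = masses

    def _prepare(self):
        # Set up the data structure that _single_frame() fills in.
        self.rgyr = np.zeros(self.n_frames)

    def _single_frame(self):
        # Delegate the numerics to the standalone function.
        self.rgyr[self._frame_index] = radius_of_gyration(
            self._positions, self._masses
        )

    def _conclude(self):
        # Data reduction after the last frame.
        self.mean_rgyr = self.rgyr.mean()

    def plot(self, ax=None):
        if ax is None:
            # Import lazily so the analysis itself does not need matplotlib.
            import matplotlib.pyplot as plt
            ax = plt.gca()
        ax.plot(np.arange(self.n_frames), self.rgyr)
        ax.set_xlabel("frame")
        ax.set_ylabel("radius of gyration")
        return ax  # always return the Axes that was plotted into


# Two frames of a two-particle system with equal masses.
frames = [
    np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]),
    np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]]),
]
rg = RadiusOfGyration(frames, masses=[1.0, 1.0]).run()
print(rg.rgyr)  # [1. 2.]
```

Because radius_of_gyration() lives outside the class, it can be called directly on coordinate arrays, for example from a user's own parallel loop. The results stay plain numpy arrays, so no save() method is needed here; np.save or np.savetxt can be applied to rg.rgyr directly.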