Simulation Methods

Description
 
Our vision is that simulation codes will be modular and compatible within an open-source or Standard Reference Simulation (SRS) Framework*, efficient and robust, and easy to use. Furthermore, tools will be available for the complete workflow, from problem setup, through convergence, to analysis, for a broad range of problems. In addition, simulation and relevant experimental results will be stored in a shared database accessible to program developers and researchers for benchmarking, developing code, and learning both modeling protocols and additional applications. (*SRS can include commercial and proprietary code linked by standard input/output formats.)
 

Goals
 
Create and maintain a list of publicly-available simulation codes. Make it available on the web site. Invite developers to correct or augment the information. Add a web form for developers to submit the survey for other codes.
 
Use a forum, email lists, and/or a wiki to initiate and foster discussion about simulation methods within the broader community, seeded with content from the steering committee.
 
Educate ourselves regarding other previous and on-going efforts that are similar to ours. (completed)
 
Obtain buy-in to strategic plan by key stakeholders. (completed)
 
Identify commercially and publicly available resources; provide links and test cases.
  • Delineate categories for programs (MD simulation driver, etc)
  • Collect info on public codes via survey of developers (e.g., approximate size of user-base, unique features of your code, major challenges, future directions, etc.)
  • Identify resources and populate summary Wiki lists with resource, link, and test-cases location info
  • Publicize summary of available resources

Identify and begin building the framework for software interoperability.

  • Identify gaps in workflow tools
  • Identify standards for input and output, and identify which software modules are currently available to enable interoperability between codes/steps
  • Publicize standardized links and gaps
  • Identify properties and classes of molecules and systems and what methods are available to address them. Categorize them according to routine and non-routine.
 
Establish the necessary characteristics of a graphical user interface (GUI) for setting up, launching, and monitoring a simulation as well as for the analysis of the end results.
 
Identify a problem-oriented simulation language (GUI or text line editing) for tying simulation tasks together to solve a problem.
 
Determine how to establish error bars for calculations.
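
One common way to attach an error bar to a simulation average is block averaging, which accounts for the correlation between successive samples. The following is a minimal sketch, not a recommended standard: the function name `block_average`, the block count, and the synthetic AR(1) series standing in for a real observable are all assumptions made for illustration.

```python
import math
import random


def block_average(samples, n_blocks=10):
    """Return (mean, standard error) estimated from contiguous blocks.

    Samples left over after dividing into equal blocks are ignored.
    """
    block_len = len(samples) // n_blocks
    if n_blocks < 2 or block_len < 1:
        raise ValueError("need at least 2 blocks with at least 1 sample each")
    block_means = [
        sum(samples[i * block_len:(i + 1) * block_len]) / block_len
        for i in range(n_blocks)
    ]
    mean = sum(block_means) / n_blocks
    # Sample variance of the block means, then standard error of their mean
    var = sum((b - mean) ** 2 for b in block_means) / (n_blocks - 1)
    return mean, math.sqrt(var / n_blocks)


# Synthetic correlated series (hypothetical stand-in for an MD/MC observable)
rng = random.Random(0)
x, series = 0.0, []
for _ in range(20000):
    x = 0.9 * x + rng.gauss(0.0, 1.0)
    series.append(x)

mean, err = block_average(series, n_blocks=20)
print(f"mean = {mean:.3f} +/- {err:.3f}")
```

The estimate is only meaningful when each block is much longer than the correlation time of the data, so in practice the block length is increased until the reported error stops growing.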
 
Delineate all steps and criteria required to predict properties to a specific level of accuracy within established error bars for a basic set of simulation tasks which are readily amenable to code modularization.
 
Develop a series of Standard Benchmark Reference Simulation examples with model protocols that illustrate techniques and let both expert developers and novice users test and develop their codes. In so doing, enable the accurate, systematic comparison of results from different codes via a well-defined protocol. Include a set of coordinate files for a variety of specific systems, ranging from very simple (Lennard-Jones) to more complex (proteins), along with a complete listing of the numerical value of each contribution to the potential energy for a given force field (non-bonded, angles, bonds, torsions, electrostatics), for use in validating methods and codes that calculate potential energies and forces. Also consider quantum-chemical-based methods, including criteria for establishing when they have improved enough to reproduce non-bonded interactions adequately for fluid simulations.
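
As an illustration of what the simplest reference entry might look like, the sketch below lists every pair contribution to the Lennard-Jones energy of a small fixed configuration. It assumes reduced units, no cutoff, and no periodic boundaries; the function names and the three-atom test configuration are hypothetical and not part of any agreed benchmark.

```python
import itertools
import math


def lj_pair_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones 12-6 energy of one pair at separation r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_energy_breakdown(coords, epsilon=1.0, sigma=1.0):
    """Return ({(i, j): pair energy}, total energy) for all unique pairs."""
    pairs = {}
    for (i, a), (j, b) in itertools.combinations(enumerate(coords), 2):
        pairs[(i, j)] = lj_pair_energy(math.dist(a, b), epsilon, sigma)
    return pairs, sum(pairs.values())


# Hypothetical test configuration: equilateral triangle whose side is the
# position of the pair-potential minimum, r_min = 2^(1/6) sigma
r_min = 2.0 ** (1.0 / 6.0)
coords = [
    (0.0, 0.0, 0.0),
    (r_min, 0.0, 0.0),
    (0.5 * r_min, 0.5 * math.sqrt(3.0) * r_min, 0.0),
]
pairs, total = lj_energy_breakdown(coords)
for (i, j), e in sorted(pairs.items()):
    print(f"pair {i}-{j}: {e:.6f}")
print(f"total: {total:.6f}")
```

A code under test would read the same coordinates and be expected to reproduce each listed contribution to within a stated tolerance.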
 
Establish a repository of short, explanatory articles about methods and algorithms. Each article should focus on a particular algorithm, contain a "pseudo-code" section which describes its steps in plain terms, and highlight the key papers from the literature which provide further information.
 
Establish a repository for simulation codes and simulation-related subroutines (analysis routines, property calculation routines, etc.). Establish a set of methods for code validation (e.g., to ensure microscopic reversibility in MC and energy conservation in MD). Establish curatorship protocols for accepting and storing routines. Educate simulation users on the benefits of sharing codes and subroutines, and encourage them to adopt standards that facilitate straightforward integration.
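
A validation check for energy conservation in MD could be as simple as the sketch below, which integrates a one-dimensional harmonic oscillator with velocity Verlet and records the largest relative deviation of the total energy. The test system, time step, and function names are assumptions chosen for illustration; the sketch covers only the MD check and does not address microscopic reversibility in MC.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance position x and velocity v by n_steps of velocity Verlet."""
    f = force(x)
    for _ in range(n_steps):
        v += 0.5 * dt * f / mass   # first half kick
        x += dt * v                # drift
        f = force(x)
        v += 0.5 * dt * f / mass   # second half kick
    return x, v


def max_energy_drift(x0, v0, k=1.0, mass=1.0, dt=0.01, n_steps=10000):
    """Largest relative deviation of total energy for a harmonic oscillator."""
    def force(x):
        return -k * x

    def energy(x, v):
        return 0.5 * mass * v * v + 0.5 * k * x * x

    e0 = energy(x0, v0)
    x, v, worst = x0, v0, 0.0
    for _ in range(n_steps):
        x, v = velocity_verlet(x, v, force, mass, dt, 1)
        worst = max(worst, abs(energy(x, v) - e0) / abs(e0))
    return worst


print(f"max relative energy deviation: {max_energy_drift(1.0, 0.0):.2e}")
```

For a symplectic integrator such as velocity Verlet the deviation should stay bounded and shrink roughly as the square of the time step; a steady drift would point to a bug in the forces or the integrator.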
 
Establish a database of simulation and related experimental results. Delineate and develop standards for storing data. Evaluate and recommend the use of centralized databases, distributed databases, or a combination of both. Establish curatorship protocols for accepting and integrating data. Educate simulation users on the benefits of sharing data, and encourage them to adopt standards that facilitate automated data capture and integration.
 
Develop a primer on writing good molecular simulation routines: a tutorial with guidance on topics such as the best way to parametrize molecular variables, subdivide tasks, improve performance, and enhance portability from one problem to another.
 
Offer a periodic challenge to test methods and stimulate development of new methods.
 

Other Resources

 
Major Code/Work Flow Efforts
 
CML and Workflow
 
Software Packages
  • LAMMPS: Large-scale Atomic/Molecular Massively Parallel Simulator, administered by Steve Plimpton at Sandia
  • MCCCS Towhee: Monte Carlo for Complex Chemical Systems, administered by Marcus G. Martin at Sandia
  • GAMGI: General Atomistic Modelling Graphic Interface
  • J. Comp. Chem. special issue "...on the methods and applications of molecular simulation in the area of biological systems [with]...manuscripts from the leading authors of large molecular simulation programs- Amber, BOSS/MCPRO, GROMOS, GROMACS, IMPACT, and NAMD- [containing]... an updated description of their programs, the critical algorithms, and force field approaches that form the basis for the methods, and representative applications that illustrate the scope of systems to which these techniques can be applied."
  • MDX: "...a collection of C libraries to enable the development of methods for molecular dynamics of biomolecules...MDX provides a modular approach to developing molecular dynamics software. The goal is to develop reusable modules with well-defined interfaces towards the implementation of a sequential molecular dynamics program and related tools. The priority is to create MD codes that are easy to understand and modify, enabling straightforward design and testing of new methods. Flexible software libraries will provide tools for taking care of common tasks, allowing more rapid development of revolutionary techniques." This is related to NAMD.
  • "MESHI: a new library of Java classes for molecular modeling." N. Kalisman, A. Levi, T. Maximova, D. Reshef, S. Zafriri-Lynn, Y. Gleyzer and C. Keasar. Bioinformatics, 21, 3931-3932 (2005). http://dx.doi.org/10.1093/bioinformatics/bti630 ; http://www.cs.bgu.ac.il/~meshi/
  • ABINIT: ABINIT is a package whose main program allows one to find the total energy, charge density and electronic structure of systems made of electrons and nuclei (molecules and periodic solids) within Density Functional Theory (DFT), using pseudopotentials and a planewave basis. ABINIT also includes options to optimize the geometry according to the DFT forces and stresses, or to perform molecular dynamics simulations using these forces, or to generate dynamical matrices, Born effective charges, and dielectric tensors. Excited states can be computed within the Time-Dependent Density Functional Theory (for molecules), or within Many-Body Perturbation Theory (the GW approximation).
  • CAMPOS: The CAMPOS project consists of several atomistic simulation tools and an environment for setting up atomistic calculations and visualizations, written in Python.
  • Ghemical: Ghemical is a computational chemistry software package released under the GNU GPL. It means that full source code of the package is available, and users are free to study and modify the package. Ghemical is written in C++. It has a graphical user interface (which is based on GTK2), and it supports both quantum-mechanics (semi-empirical and ab initio) models and molecular mechanics models (there is an experimental Tripos 5.2-like force field for organic molecules). Also a tool for reduced protein models [1] is included. Geometry optimization, molecular dynamics and a large set of visualization tools are currently available. Ghemical relies on external code to provide the quantum-mechanical calculations. Semi-empirical methods MNDO, MINDO/3, AM1 and PM3 are provided by the MOPAC7 package (Public Domain). The MPQC package (GNU GPL) is used to provide ab initio methods: the methods based on Hartree-Fock theory are currently supported with basis sets ranging from STO-3G to 6-31G**. Ghemical also uses the OpenBabel package for importing and exporting many different file formats (as well as for other tasks).
  • GROMACS: popular MD software for running large-scale simulations.
  • Moscito: software for MD simulations of molecular aggregates. Standard molecular mechanics force-fields such as AMBER, OPLS, CHARMM and GROMOS can be employed. Simulations can be carried out in different ensembles such as NVE, NVT or NPT using the weak coupling scheme. (Smooth Particle Mesh) Ewald summation is used for long range electrostatic interactions.
  • MPQC: MPQC is the Massively Parallel Quantum Chemistry Program. It computes properties of atoms and molecules from first principles using the time independent Schrödinger equation. It runs on a wide range of architectures ranging from individual workstations to symmetric multiprocessors to massively parallel computers. Its design is object oriented, using the C++ programming language.
  • Octopus: octopus is a program aimed at the ab initio virtual experimentation on a hopefully ever increasing range of systems types.
  • PyQuante: PyQuante is an open-source suite of programs for developing quantum chemistry methods. The program is written in the Python programming language, but has many "rate-determining" modules also written in C for speed.
 
Results Repositories
 
Validation
 

Team Notes 

Stakeholders

  • Industry and other USERS
  • Academics (primarily code/algorithm developers)
  • Commercial software developers ??
Gaps in Work Flow (Where simulation doesn't work) 
 
opportunities 
  • long time / large length scales relative to fundamental unit of the simulation
    • low-deformation rate rheology
    • phase transformations near phase coexistence
  • inhomogeneous systems and interfacial properties
    • truncation of nonbonded interactions
    • long range corrections
    • electrostatics in 2D
  • glassy systems
  • complex interaction potentials (e.g., many-body, polarizable)
  • rare events
  • phase equilibria involving crystalline phases (a biggie).
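
To make the truncation and long-range-correction items concrete, the sketch below evaluates the standard analytic tail corrections for a Lennard-Jones fluid truncated at a cutoff. These expressions assume a homogeneous fluid with g(r) = 1 beyond the cutoff, which is exactly the assumption that fails for inhomogeneous and interfacial systems. The function names and the reduced-unit defaults are illustrative choices.

```python
import math


def lj_energy_tail(n_particles, density, r_cut, epsilon=1.0, sigma=1.0):
    """Total energy tail correction, assuming g(r) = 1 for r > r_cut."""
    sr3 = (sigma / r_cut) ** 3
    prefactor = (8.0 / 3.0) * math.pi * n_particles * density * epsilon * sigma ** 3
    return prefactor * (sr3 ** 3 / 3.0 - sr3)


def lj_pressure_tail(density, r_cut, epsilon=1.0, sigma=1.0):
    """Pressure tail correction, assuming g(r) = 1 for r > r_cut."""
    sr3 = (sigma / r_cut) ** 3
    prefactor = (16.0 / 3.0) * math.pi * density ** 2 * epsilon * sigma ** 3
    return prefactor * (2.0 * sr3 ** 3 / 3.0 - sr3)


# Reduced units, cutoff of 2.5 sigma, unit density
print(f"energy tail per particle: {lj_energy_tail(1, 1.0, 2.5):.4f}")
print(f"pressure tail:            {lj_pressure_tail(1.0, 2.5):.4f}")
```

Because the correction scales with the local density, applying a single bulk value across a liquid-vapor interface misrepresents both phases, which is one reason interfacial properties remain sensitive to the cutoff.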
 
 
Why would developers use framework/standards and/or contribute to a code repository?
 
Somehow manage to populate it with something(s) that aren't generally available but would be generally useful. Someone who benefits from it will be more likely to be motivated to then contribute to it.
 
 
Adun
  • webpage
  • journal article
  • attempts to produce code that allows rapid development while remaining easy to maintain, by shifting from an algorithm-centric development methodology to one that takes program structure into account using object-oriented techniques
  • aims to provide a platform that enables rapid and easy implementation of yet-to-be-determined functionality without burdening developers with a convoluted, hard-to-understand program
  • includes an XML-based template for describing force fields coupled with a flat file to XML conversion tool to allow rapid implementation of force fields
  • intended to eventually include all levels of simulation (atomistic, meso, and macro)...initially focusing on MD
  • implemented in Objective-C
  • "Adun is the result of our ambition to provide a highly scalable, easy to develop open-source platform for computer simulations that also allows rapid implementation of new functionality and protocols. This stands in contrast to many current simulation packages that cannot keep pace with the rate of change in the state of the art, a situation that has lead to a proliferation of lab-centric MD programs that implement a few related protocols that subsequently remain unknown or unused outside of a small user core."
 
 
FSAtom
 
organized in 2002, as an outcome of the CECAM workshop "Open Source Software for Microscopic Simulations"
 
Purposes:
  • To spread the use of the "Free Software" concept in the community of Atomic-scale Simulation software developers,
  • To improve the awareness of modern software engineering concepts,
  • To constitute the natural place for interactions between different groups of developers in this field.
Planned Activities:
  • maintain a Web site (hosted by CECAM) with mailing lists and with links to the relevant software projects (also to proprietary software, for information) ;
  • through workgroups, organize the collaboration between developers : file exchange, code testing, definition of objects, exchange of development tools, exchange of expertise ... ;
  • organize workshops and tutorials on related subjects, and on modern software engineering concepts ;
  • maintain a contact with the Free Software foundation, and spread relevant information from it or about it;
  • maintain a contact with relevant funding agencies or institutions, and ease (or foster) the writing of relevant proposals.
WorkGroups:
  • TestingDFT coordinated by Gilles Zerah
  • TestingMD coordinated by David van der Spoel
  • PseudoPotentials coordinated by Karsten Jacobsen
  • FileFormats coordinated by Mark Tuckerman
  • InterfacesAndMiddleware coordinated by Konrad Hinsen
  • MolecularMechanicsOpenStandards coordinated by Konrad Hinsen
Identified the need for accurate comparison of the results from different codes on a systematic basis via a well-defined protocol
 
Emphasized the importance of the community adopting modern software development concepts including:
  • libraries for handling files in view of sharing and comparison of routines and codes;
  • reusability of sources;
  • self-documentation of codes;
  • combination of extension and compute-intensive languages (e.g. Python+C);
  • high-level graphics libraries;
  • installation and maintenance tools (autoconf, automake, CVS);
  • self-testing of code.
Identified a list of relevant topics:
  • parallelism (MPI, OpenMP, ...)
  • exchange of data (NetCDF, HDF, XML)
  • graphics high-level libraries (VTK),
  • techniques for optimisation of codes
  • use of standard libraries (BLAS , LAPACK),
  • extension languages (Python, Scheme, Tcl for user interface),
  • installation tools (autoconf, automake)
  • maintenance tools (CVS, bug tracking)
  • good coding practice (especially for big projects),
  • tools for documentation (Robodoc, Src2tex, TexInfo, Docbook)
 
 
MolecularMechanicsOpenStandards
 
  • a workgroup of FSAtom
  • StructuralData notes
  • ForcefieldData notes
  • General discussion about a common format for representing configurations, trajectories, force fields, etc.
  • In general, most force fields require as input the atom types (possibly forcefield-specific) and partial charges for each atom plus the bond structure. Starting with that information, all force field terms can be identified by algorithmic rules (though the rules can be quite complicated, poorly documented, and very different from one force field to another).
  • Creating an XML representation of force fields can have a number of benefits:
    • it allows the formulation to be used interchangeably. It might be possible to search for programs that use a particular functional form. (Note that TeX doesn't normally support symbolic algebra, but MathML can.)
    • it supports documentation; the equations could, for example, be systematically typeset.
    • it might allow simulation programs to validate the correctness of an implementation on test sets.
  • CML info:
    • CmlCore
    • CmlAtom
  • In Feb 2005, Konrad Hinsen released a version of the Molecular Modelling Toolkit with support for reading and writing files for chemical systems in an xml format.
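
The idea of an XML force-field description that any program can read can be sketched as follows. The element and attribute names (`forcefield`, `atomtype`, `bond`, `form`, `k`, `r0`) are a hypothetical schema invented for this example, not CML or the MMTK format, and the parameter values are toy numbers. The harmonic form k(r - r0)^2, without a factor of 1/2, is likewise an assumed convention.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema and toy parameters, for illustration only
FORCEFIELD_XML = """
<forcefield name="toy">
  <atomtype name="A" mass="12.0" charge="-0.1"/>
  <atomtype name="B" mass="1.0" charge="0.1"/>
  <bond types="A B" form="harmonic" k="100.0" r0="1.0"/>
</forcefield>
"""


def load_forcefield(xml_text):
    """Parse atom types and bond terms from the hypothetical XML schema."""
    root = ET.fromstring(xml_text)
    atom_types = {
        el.get("name"): {
            "mass": float(el.get("mass")),
            "charge": float(el.get("charge")),
        }
        for el in root.findall("atomtype")
    }
    bonds = {}
    for el in root.findall("bond"):
        # Sort the type pair so that A-B and B-A map to the same term
        key = tuple(sorted(el.get("types").split()))
        bonds[key] = {
            "form": el.get("form"),
            "k": float(el.get("k")),
            "r0": float(el.get("r0")),
        }
    return atom_types, bonds


def bond_energy(bonds, type_a, type_b, r):
    """Evaluate a bond term; only the assumed form k * (r - r0)**2 is handled."""
    term = bonds[tuple(sorted((type_a, type_b)))]
    if term["form"] != "harmonic":
        raise NotImplementedError(term["form"])
    return term["k"] * (r - term["r0"]) ** 2


atom_types, bonds = load_forcefield(FORCEFIELD_XML)
print(bond_energy(bonds, "B", "A", 1.1))
```

Recording the functional form explicitly in the file (here the `form` attribute) is what would let different programs confirm that they interpret the same parameters in the same way.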
 
 
eMinerals
The eMinerals project is intended to provide a "grid computing" environment to address problems of environmental interest using a wide variety of computational tools such as quantum mechanics, classical molecular dynamics and Monte Carlo. The emphasis is on the use of existing codes and on ways to manage the system inputs/outputs so that it is transparent to the user. This is a non-trivial task as both the grid and the users are at several distinct sites in the UK.
Computer security issues are not discussed. Note that computer security concerns have restricted grid computing at NIST to a limited set of machines on a network with carefully limited access. Only with strong limitations can security issues be resolved.
 
 
The Molecular Modeling Toolkit (MMTK)
[E-mail for Konrad Hinsen, developer: hinsen[at]llb.saclay.cea[dot]fr]
 
Modular/Efficient/Robust/Easy
 
The Molecular Modeling Toolkit lives up to its name since it is a toolkit (or library) of open source Python modules for creating Python-scripted molecular simulation applications. At first glance, the MMTK appears to be only for life science applications, but the object-oriented nature of the Python language makes this modeling environment easy to use outside of the life science area. Also, if new simulation methods or force fields need to be created and tested, then additional Python modules can be written to interact with the MMTK. The only (minor) reservation that I have is that Python executes relatively slowly compared with C and Fortran. Time-critical computations could be written in C since there is an interface with Python. Overall, the MMTK offers a good start in providing a set of modules within the Python environment for creating simulation applications.
 
Workflow
 
From what I can tell, the MMTK does have modules for problem set up, through convergence, to analysis but not really for a broad range of problems. However, the modularity of the Python language would allow the development of the necessary modules to cover many different types of problems. There is no GUI for workflow; one would have to be built on top of MMTK.
 
Database
 
Although there is no database for simulation and relevant experimental results, MMTK does have a user-configurable database which defines atoms, functional groups, molecules, and proteins. This type of database is also important, mainly for setting up the input for a simulation.
 
 
Relevant Feedback
  • On the topic of GUIs, Marcus Martin (email received 30 Sept 05): "While your desire to create a GUI for multiple other packages is admirable, my experience with folks who have on and off be working on a GUI for Towhee is that a truly useful GUI takes an incredible amount of effort. I have watched many GUI projects start, sputter, and then fail...I have found little actual interest in using a GUI from my user base. It only truly becomes useful once it has quite complicated features like the ability to draw a molecule and have a force field automatically assigned based on that drawing. Not something that is a simple code project."

 
