The SOMO (SOlution MOdeller) module of UltraScan initially contained only a bead modelling utility originally developed by the Rocco and Byron labs, respectively at the Istituto Nazionale per la Ricerca sul Cancro (IST, Genova, Italy) and at the University of Glasgow (Glasgow, Scotland, UK). The original code was mainly written by B. Spotorno, G. Tassara, N. Rai and M. Nollmann. The SoMo bead modeling utility in SOMO is based on a reduced representation of a biomacromolecule, starting from its atomic coordinates (PDB format), as a set of non-overlapping beads of different radii, from which the hydrodynamic properties in the rigid-body frame can be calculated using the García de la Torre-Bloomfield "supermatrix inversion" (SMI) approach (García de la Torre and Bloomfield, Q. Rev. Biophys. 14:81-139, 1981). The reduced representation is obtained by grouping atoms together and substituting each group with an appropriately positioned bead of the same volume. Importantly, the volume of the water of hydration theoretically bound to each group of atoms can then be added to each bead. The overlaps between the beads are then removed in sequential steps, while preserving as much as possible the original surface envelope of the bead model. The method has been fully validated and reported in the literature (Rai et al., Structure 13:723-734, 2005; Brookes et al., Eur. Biophys. J., 39:423-435, 2010; Brookes et al., Macromol. Biosci. 10:746-753, 2010). Among the main advantages of this method over shell-modelling and grid-based procedures are a better treatment of the hydration water and the preservation of a direct correspondence between beads and original residues. For instance, the latter feature could be used to include flexibility effects in the computations.
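The idea of replacing a group of atoms with a single bead of the same volume can be illustrated with a minimal sketch. This is not the US-SOMO code: the function name, the tuple layout, and the choice of the centre of mass as the bead position are assumptions made for illustration only (SoMo applies its own per-residue positioning rules).

```python
import math

def bead_from_atoms(atoms):
    """Replace a group of atoms with one bead of equal total volume.

    atoms: list of (x, y, z, mass, volume) tuples (A, A, A, Da, A^3).
    Returns (center, radius). Placing the bead at the centre of mass is an
    assumption of this sketch, not necessarily the rule used by SoMo.
    """
    total_mass = sum(a[3] for a in atoms)
    total_volume = sum(a[4] for a in atoms)
    # Mass-weighted average of the atomic coordinates.
    center = tuple(sum(a[i] * a[3] for a in atoms) / total_mass for i in range(3))
    # Radius of the sphere whose volume equals the summed atomic volumes.
    radius = (3.0 * total_volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return center, radius
```

Any hydration volume assigned to the group would simply be added to `total_volume` before the radius is computed.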
Furthermore, identifying the beads that are buried, and thus not in contact with the solvent, and excluding them from the hydrodynamic computations greatly widens the size range of structures that can be analysed with this method without loss of precision: to date, structures from 5K to 450K have been successfully studied.
Subsequently, we have also improved the original AtoB grid method (Byron, Biophys. J. 72:408-415, 1997), which was already included within US-SOMO, by adding the theoretical hydration, accessible surface area screening, and a better preservation of the original surface. The possibility of changing the grid size in the improved AtoB could be very useful to study large structures and complexes.
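The effect of the grid size on the number of beads can be seen in the following simplified sketch of a grid-based reduction. It only bins atoms into cubic cells and builds one bead per occupied cell; the hydration, accessible surface area screening, and surface preservation of the improved AtoB are not reproduced, and the function name, data layout, and centre-of-mass placement are assumptions.

```python
import math
from collections import defaultdict

def grid_beads(atoms, grid_size=5.0):
    """Group atoms into cubic cells of edge grid_size (A), one bead per cell.

    atoms: list of (x, y, z, mass, volume) tuples.
    Returns a list of (center, radius, mass) tuples, ordered by cell index.
    """
    cells = defaultdict(list)
    for atom in atoms:
        # Integer cell index along each axis (floor handles negative coordinates).
        key = tuple(math.floor(c / grid_size) for c in atom[:3])
        cells[key].append(atom)
    beads = []
    for key in sorted(cells):
        members = cells[key]
        mass = sum(a[3] for a in members)
        volume = sum(a[4] for a in members)
        center = tuple(sum(a[i] * a[3] for a in members) / mass for i in range(3))
        radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
        beads.append((center, radius, mass))
    return beads
```

With this scheme, a larger `grid_size` yields fewer and larger beads, which is why coarser grids help with large structures and complexes.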
Later on, an alternative method of calculating the hydrodynamics, ZENO, was added to US-SOMO. ZENO is based on the analogy that exists between certain hydrodynamic and electrostatic properties, and at the time was far more computationally intensive (see Douglas, Some Applications of Fractional Calculus to Polymer Science, Adv. Chem. Phys. 102:121-191, 1997; Douglas et al., Hydrodynamic friction and the capacitance of arbitrarily shaped objects, Phys. Rev. E 49:5319-5331, 1994; Mansfield et al., Intrinsic Viscosity and the Electric Polarizability of Arbitrarily Shaped Objects, Phys. Rev. E, 64:61401-61416, 2001; http://www.stevens.edu/zeno/). Then, in May 2014 an interface and an analysis module for the boundary elements method BEST [S.R. Aragon, A precise boundary element method for macromolecular transport properties. J. Comp. Chem., 25, 1191-1205 (2004); S.R. Aragon and D.K. Hahn, Precise boundary element computation of protein transport properties: Diffusion tensors, specific volume and hydration, Biophysical Journal, 91:1591-1603 (2006)] were implemented within US-SOMO.
More recently, a comprehensive study was conducted to compare various hydrodynamic modeling approaches (Rocco and Byron, Computing translational diffusion and sedimentation coefficients: an evaluation of experimental data and programs, Eur. Biophys. J. 44:417-431, 2015; see also Rocco and Byron, Hydrodynamic Modeling and Its Application in AUC, Methods Enzymol. 562:81-108, 2015). The methods tested were SoMo with computations using either the SMI or ZENO approaches, AtoB with 5 and 2 Å grid sizes and SMI computations, BEST, all under the US-SOMO implementation, and, externally, HYDROPRO (Ortega, A., D. Amorós, and J. García de la Torre. 2011. Prediction of hydrodynamic and other solution properties of rigid proteins from atomic- and residue-level models. Biophys. J. 101:892-898). The results indicate that, on average, BEST and HYDROPRO tend to underestimate the translational frictional properties (~-3% and ~-4%, respectively), while SoMo using either the SMI or ZENO approaches overestimates them by a slightly smaller margin (~+2%). The best results using the SMI approach were obtained by AtoB with a 5 Å grid size, ~+0.5%. However, a combination of SoMo bead models without overlap reduction and ZENO computations performed even better, with ~0% average discrepancy and all results within ±4%, not far from the average experimental error of ~±3%. For these reasons, starting from the May 2015 US-SOMO release, this combination was directly offered among the bead modeling hydrodynamic computation options.
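The sign convention of the percentages quoted above (negative for underestimation, positive for overestimation) corresponds to a signed percent discrepancy between computed and experimental values. A minimal sketch follows, with hypothetical function names; any numbers passed to it in an example are placeholders, not data from the cited study.

```python
def percent_discrepancies(computed, experimental):
    """Signed percent difference of each computed value from its experimental value."""
    return [100.0 * (c - e) / e for c, e in zip(computed, experimental)]

def average(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)
```

An average close to 0% can still hide a wide spread, which is why the range of the individual discrepancies (for example, all within ±4%) is reported alongside it.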
Very recently, a new implementation of the ZENO method was produced at NIST (Juba et al., J. Res. Natl. Inst. Stand. Technol. 122:1-2, 2017). It was completely rewritten in C++ instead of Fortran, has greatly improved serial performance, and can utilize the multi-core capabilities of modern processors, thus shortening computational times by a factor of ~100. This new ZENO code was integrated into US-SOMO, but it could be distributed only with Linux system executables. This was due to the underlying US-SOMO architecture, which relied on the now-obsolete Qt3 libraries. A major effort was then undertaken to recode US-SOMO under the up-to-date Qt5 framework. This has allowed us to introduce a new bead model generating option, which we call "van der Waals (vdW) with overlaps". The vdW with overlaps method generates a bead model where each atom present in the PDB file (except, by default, H2O molecules) is represented by a bead whose radius is equal to the atom's van der Waals radius as listed in the somo.residue table (see below). If water molecules are associated with any particular atom in the somo.residue table, their volume is calculated and added to that derived from the atom's van der Waals radius, and a final bead radius is then recomputed. No overlap removal is performed, and the ZENO method is then used to compute the hydrodynamic parameters. While this direct method, as implemented so far, matches the experimental parameters of the test proteins slightly less well (Rocco and Brookes, Eur. Biophys. J., submitted), it opens up some interesting possibilities, such as using structures explicitly hydrated by Molecular Dynamics simulations. The vdW with overlaps method is under active development.
These new features in US-SOMO are available from the January 2018 release.
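The radius recomputation described for the vdW with overlaps method (atomic volume plus the volume of the associated water molecules, converted back to a radius) can be sketched as follows. The function name and the per-molecule water volume used as default are assumptions for illustration; the values actually applied are those defined in the US-SOMO tables and options.

```python
import math

def hydrated_radius(vdw_radius, n_waters, water_volume=24.041):
    """Radius (A) of a bead whose volume is the atom's vdW volume plus hydration.

    vdw_radius: van der Waals radius of the atom (A).
    n_waters: number of water molecules associated with the atom.
    water_volume: volume per water molecule (A^3); an illustrative default.
    """
    atom_volume = 4.0 / 3.0 * math.pi * vdw_radius ** 3
    total_volume = atom_volume + n_waters * water_volume
    return (3.0 * total_volume / (4.0 * math.pi)) ** (1.0 / 3.0)
```

For an atom with no associated water, the function returns the van der Waals radius unchanged.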
US-SOMO also includes a fully functional Small-Angle X-ray or Neutron Scattering (SAXS/SANS) simulator module,
which works on either the original atomic structure, or on a bead model, and has enhanced experimental data processing
capabilities. In the modeling area, several methods are offered for the computation of SAXS and SANS I(q) vs. q
curves. Some of these methods require explicit hydration of the PDB structure(s), which at present must be provided externally.
The pairwise-distance distribution function P(r) vs. r computation is fully operational for both SAXS and SANS,
and includes a graphical mapping utility to visualize which residues in a structure are contributing to specified distance ranges.
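As an illustration of what a pairwise-distance distribution is, the sketch below builds an unweighted histogram of all inter-point distances. It is a simplification: an actual SAXS or SANS P(r) weights each pair by the scattering contrast of the two atoms or beads, and the function name and binning scheme are assumptions rather than the US-SOMO implementation.

```python
import math

def pair_distance_histogram(coords, bin_width=1.0):
    """Unweighted histogram of all pairwise distances.

    coords: list of (x, y, z) tuples (A). Bin k counts distances in
    [k * bin_width, (k + 1) * bin_width).
    """
    counts = {}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            k = int(d // bin_width)
            counts[k] = counts.get(k, 0) + 1
    n_bins = max(counts) + 1 if counts else 0
    return [counts.get(k, 0) for k in range(n_bins)]
```

Keeping track of which pairs fall into each bin is what makes it possible to map a distance range back to the contributing residues.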
In the experimental data processing area, a novel HPLC-SAXS data
processing utility has been implemented, which starts with the transformation of a time series of I(q) vs.
q frames into a series of time chromatograms I(t) vs. t for each q value. A check of the baselines,
potentially revealing capillary fouling due to the accumulation of material on its walls, can then be performed, and corrections
applied. For overlapping or non-baseline-resolved peaks, Singular Value Decomposition (SVD) can be applied to the original
or baseline-corrected data, the latter after automatic back-generation of the I(q) vs. q frames, to identify how
many components are present in the data. Global Gaussian analysis/decomposition can then be performed on the I(t) vs.
t for each q value dataset, followed by back-generation of the I(q) vs. q frames for each Gaussian
peak. Several improvements are present in this area from the June 2015 release,
like an integral baseline evaluation/subtraction procedure, with immediate testing of the results in the I(q) vs. q
space, the possibility of peak decomposition using non-symmetrical Gaussian functions, an improved treatment of concentration
detector data, and a tool to evaluate the data-associated errors, when necessary, from the baseline fluctuations.
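The first step of the HPLC-SAXS processing, turning a time series of I(q) vs. q frames into I(t) vs. t chromatograms for each q value, and the corresponding back-generation of frames, amount to a transposition of the data matrix. A minimal sketch, assuming all frames share the same q grid and using hypothetical function names:

```python
def frames_to_chromatograms(frames):
    """frames[t][k] is I at time index t and q index k.

    Returns chromatograms[k][t], i.e. one I(t) vs. t series per q value.
    """
    n_q = len(frames[0])
    if any(len(frame) != n_q for frame in frames):
        raise ValueError("all frames must share the same q grid")
    return [list(column) for column in zip(*frames)]

def chromatograms_to_frames(chromatograms):
    """Back-generate the I(q) vs. q frames from the per-q chromatograms."""
    return [list(row) for row in zip(*chromatograms)]
```

Baseline correction and Gaussian decomposition operate on the chromatograms, after which the second function restores the frame representation.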
The Guinier analysis of experimental I(q) vs. q curves offers the determination of the overall z-average square radius of gyration <Rg²>z and of the w-average molecular weight <M>w from the global Guinier analysis, of the z-average square cross-section radius of gyration <Rc²>z and of the w/z-average mass per unit length <M/L>w for rod-like macromolecules, and of the z-average square transverse radius of gyration <Rt²>z and of the w/z-average mass per unit area <M/A>w for disk-like macromolecules.
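For the global case, the Guinier approximation ln I(q) ≈ ln I(0) - q²Rg²/3 means that a straight-line fit of ln I(q) against q² gives the square radius of gyration from the slope and I(0) from the intercept. The sketch below is an unweighted least-squares fit with a hypothetical function name; it does not select the valid low-q range or propagate errors, and it is not the US-SOMO fitting routine.

```python
import math

def guinier_fit(q, intensity):
    """Fit ln I = ln I0 - (Rg^2 / 3) q^2 by unweighted least squares.

    q: scattering vector magnitudes; intensity: positive I(q) values.
    Returns (rg_squared, i0).
    """
    x = [qi * qi for qi in q]
    y = [math.log(i) for i in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -3.0 * slope, math.exp(intercept)
```

Obtaining a molecular weight from I(0) additionally requires intensities on a suitable scale and the sample concentration, which are outside this sketch.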
The batch operations module includes supercomputing access, with an interface to Discrete Molecular Dynamics (DMD) programs (Dokholyan, NV, Buldyrev, SV, Stanley, HE, and EI Shakhnovich. Discrete molecular dynamics studies of the folding of a protein-like model. (1998) Folding & Design 3:577-587; Ding F, Dokholyan NV. Emergence of protein fold families through rational design. Public Library of Science Comput Biol. (2006) 2(7):e85). Starting from the May 2014 release, you will also find the implementation on a supercomputer cluster of the boundary-elements hydrodynamic computations BEST [S.R. Aragon, A precise boundary element method for macromolecular transport properties. J. Comp. Chem., 25, 1191-1205 (2004); S.R. Aragon and D.K. Hahn, Precise boundary element computation of protein transport properties: Diffusion tensors, specific volume and hydration, Biophysical Journal, 91:1591-1603 (2006)], and the related interfaces in US-SOMO to set up the analysis parameters and analyze the computation results.
Other features include a model classifier in which calculated parameters can be compared and ranked against experimental data, and a PDB editor.
The program main window contains an upper bar from which all the options governing its operations can be controlled, and a main panel for program execution. However, because the program is highly sophisticated, properly setting all the available options can be non-trivial for the general user. Therefore, the US-SOMO module is distributed with pre-defined default options that should allow the direct conversion of a PDB-formatted biomacromolecular structure file into a bead model, and the computation of its hydrodynamic properties, without the need to access the advanced options menus. In particular, the SoMo approach is based on properly defining the atoms and residues found in PDB files, and the rules allowing their conversion into beads. The US-SOMO distribution includes the definition of all the standard amino acids, nucleotides, carbohydrates, and common prosthetic groups and co-factors, but this list is by no means exhaustive, and the need to code for "new" residues is not a remote possibility. As this operation can be demanding, notwithstanding the user-friendly GUIs governing it, the pre-defined set of options includes approximate methods to deal with missing atoms within coded residues and/or with residues that are not yet coded. Starting from the May 2015 release, the default option is to generate a single bead for each non-coded residue using average parameters. When non-coded residues are found, a pop-up panel will alert the user and present as options (i) to continue with the approximate method; (ii) to skip non-coded residues (not recommended); or (iii) to halt operations and then take proper action, like coding for the new residue. For coded residues with missing atoms, since this is most often due to a lack of crystallographic data, the default option is now to use the complete residue's bead(s), appropriately positioned (again, a pop-up panel will warn of such instances and present the alternative options of skipping (not recommended) or halting operations).
Obviously, there's no cure for completely missing residues, which will have to be built in the original structure for reliable results, since the structure should contain all residues and atoms that are present in the "real" macromolecule studied in solution. Therefore, for best performance all residues should be properly coded in the US-SOMO tables (see below).
These functions control the execution of the US-SOMO program, whose progress is recorded in the right-side main window (in the picture above, the messages during the model building and hydrodynamic computation phases starting from the 1AKI.pdb RNase A structure are shown in a reduced font size). They are divided into three subpanels controlling operations that deal with the primary PDB file (PDB Functions:), operations relating to the generation of bead models (Bead Model Functions:), and the computation of the hydrodynamic parameters (Hydrodynamic Calculations:).
$ULTRASCAN/bin for 32-bit machines, and
$ULTRASCAN/bin64 for 64-bit platforms. You can get a copy of RasMol from http://www.bernstein-plus-sons.com/software/rasmol/ (recommended, there it's under active development), or from http://www.umass.edu/microbio/rasmol/, or from http://openrasmol.org/#Software.
Besides visualizing the structure, the HEADER and TITLE fields of the PDB file
will be displayed in the progress window, followed by the residues list in both three- and one-letter codes, and by the
partial specific volume (vbar), molecular weight, molecular volume computed both from vbar and from the individual atomic
volumes, and the average electron density of each chain and of the whole structure.
If problems are encountered with the selected PDB file, like the presence of non-coded residues or missing atoms within coded residues, they will be reported in the progress window either as warnings or errors.
Bead Model Functions:
Alternatively, you can load one or more previously generated bead models by clicking on either the Batch Mode/Cluster Operation (see here) or the Load Bead Model File buttons from the menu. In these cases, if the model(s) was (were) generated/saved in the US-SOMO format, the various settings/parameters used in model generation will be displayed in the right-side main window. Note that you can decrease the number of beads used, and thus the resolution of the model, by applying a grid procedure on a previously generated bead model with the Grid Existing Bead Model option (see above). This could be useful when large structures are analyzed, although using the improved AtoB routine on the original PDB file while increasing the grid size (Build AtoB (Grid) Bead Model) seems to produce much better results. By selecting different file type extensions, other types of bead models can also be loaded, like the old BEAMS-format models, or DAMMIN/DAMMIF-generated models. In this case, a pop-up panel appears requesting the partial specific volume and molecular weight of the model. The SAXS/SANS Functions button present in this subpanel allows you to perform SAXS- or SANS-related simulations directly on the currently loaded bead model (see here).
The Select Parameters to be Saved button will open a pop-up window (see here) where characterizing/computed parameters can be selected for saving in a comma-separated file for easy import into spreadsheets. Selecting the Save parameters to file checkbox will generate such a file, with extension .csv.
Finally, by pressing the Model classifier button, you will access a tool for selecting a best matching model among a series of models, by comparing their calculated hydrodynamic parameters with user-provided experimental values (see here).
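The kind of comparison performed by a model classifier can be illustrated by ranking models according to how closely their computed parameters match the experimental ones. The scoring rule below (sum of squared relative differences) and all names are assumptions chosen for illustration; the criteria actually used by the US-SOMO Model classifier are described in its own help page.

```python
def rank_models(models, experimental):
    """Order model names from best to worst match with the experimental values.

    models: dict mapping model name -> dict of computed parameter values.
    experimental: dict mapping parameter name -> experimental value.
    The score is the sum of squared relative differences (an assumed criterion).
    """
    def score(params):
        return sum(((params[name] - value) / value) ** 2
                   for name, value in experimental.items())
    return sorted(models, key=lambda name: score(models[name]))
```

Using relative rather than absolute differences keeps parameters with different units and magnitudes on a comparable footing.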
The black bar at the bottom of the progress window will instead report the detailed advancement of some of the steps in the various phases, like the current slice and atoms (or beads) involved in the ASA routine, the iterations in the supermatrix inversion, or the ZENO % progress in the hydrodynamic computations. For small structures (or for a low number of MC iterations in the ZENO method), these numbers will barely flash by in the box, but for large structures they will allow a more in-depth monitoring of the various stages.
Operations can be halted at any moment by clicking on the Stop button. To avoid inadvertently losing data, the Close button will not immediately close US-SOMO, but confirmation will be required in a pop-up window.
Five pull-down menus are presently available to access the various US-SOMO options:
From this pull-down menu, you can call four different sub-menus controlling
the four tables containing the definitions of the atoms and residues found in
PDB files, and their SAXS coefficients.
In more detail, you can define/edit the hybridizations, atoms and residues that need to be interpreted as beads in the bead model generation. These parameters are collected in different tables that are used as the components from which the bead sizes and positions are calculated. PDB structures can then be converted to bead models based on the bead parameters defined here. For SAXS simulations you also need the atomic scattering factor coefficients (five exponentials plus a constant) and the associated excluded volumes.
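A "five exponentials plus a constant" parametrization of an atomic scattering factor can be evaluated as in the sketch below. The functional form, f(q) = c + Σ a_i exp(-b_i (q/4π)²), follows a convention commonly used for tabulated coefficients; whether the US-SOMO tables use exactly this convention for q is an assumption here, and the function name is hypothetical.

```python
import math

def scattering_factor(q, a, b, c):
    """Evaluate c + sum_i a[i] * exp(-b[i] * (q / (4 * pi))**2) with five terms.

    q: scattering vector magnitude (1/A); a, b: five coefficients each; c: constant.
    The q / (4 * pi) argument is an assumed convention of this sketch.
    """
    if len(a) != 5 or len(b) != 5:
        raise ValueError("expected five (a, b) coefficient pairs")
    s2 = (q / (4.0 * math.pi)) ** 2
    return c + sum(ai * math.exp(-bi * s2) for ai, bi in zip(a, b))
```

At q = 0 the exponentials all equal one, so the value reduces to the sum of the a coefficients plus c.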
From this pull-down menu, you can access various panels where you can set all
the available options for different steps in the program. These options are saved
in a system wide config file
$ULTRASCAN/etc/somo.config
Every time you close the SOMO program, the currently defined options will be saved in
$HOME/ultrascan/etc/somo.config
from which they will be reloaded upon startup.
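The relationship between the two configuration files can be pictured with a small lookup sketch: prefer the per-user file when it exists, otherwise fall back to the system-wide one. This precedence is an assumption inferred from the description above, not a statement of how US-SOMO resolves its configuration, and the function name is hypothetical.

```python
import os

def somo_config_path(env=None):
    """Return the first existing somo.config path, or None if none is found.

    Looks at $HOME/ultrascan/etc/somo.config first, then at
    $ULTRASCAN/etc/somo.config (assumed precedence).
    """
    env = os.environ if env is None else env
    candidates = []
    home = env.get("HOME")
    if home:
        candidates.append(os.path.join(home, "ultrascan", "etc", "somo.config"))
    root = env.get("ULTRASCAN")
    if root:
        candidates.append(os.path.join(root, "etc", "somo.config"))
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

Passing a dictionary as `env` makes the lookup easy to try out without touching the real environment variables.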
From this pull-down menu, it will be possible in the future to access two options panels controlling Brownian dynamics simulations:
From this pull-down menu, you can access two panels controlling the options for parsing the PDB file and for the model(s) visualization by RasMol.
This document is part of the UltraScan Software Documentation
Copyright © notice.
The latest version of this document can always be found at:
Last modified on January 23, 2018.