Path Similarity Analysis: A Method for Quantifying Macromolecular Pathways
Diverse classes of proteins function through large-scale conformational changes and various sophisticated computational algorithms have been proposed to enhance sampling of these macromolecular transition paths. Because such paths are curves in a high-dimensional space, it has been difficult to quantitatively compare multiple paths, a necessary prerequisite to, for instance, assess the quality of different algorithms. We introduce a method named Path Similarity Analysis (PSA) that enables us to quantify the similarity between two arbitrary paths and extract the atomic-scale determinants responsible for their differences. PSA utilizes the full information available in 3N-dimensional configuration space trajectories by employing the Hausdorff or Fréchet metrics (adopted from computational geometry) to quantify the degree of similarity between piecewise-linear curves. It thus completely avoids relying on projections into low dimensional spaces, as used in traditional approaches.
To elucidate the principles of PSA, we quantified the effect of path roughness induced by thermal fluctuations using a toy model system. Using, as an example, the closed-to-open transitions of the enzyme adenylate kinase (AdK) in its substrate-free form, we compared a range of protein transition path-generating algorithms. Molecular dynamics-based dynamic importance sampling (DIMS) MD and targeted MD (TMD) and the purely geometric FRODA (Framework Rigidity Optimized Dynamics Algorithm) were tested along with seven other methods publicly available on servers, including several based on the popular elastic network model (ENM). PSA with clustering revealed that paths produced by a given method are more similar to each other than to those from another method and, for instance, that the ENM-based methods produced relatively similar paths. PSA applied to ensembles of DIMS MD and FRODA trajectories of the conformational transition of diphtheria toxin, a particularly challenging example, showed that the geometry-based FRODA occasionally sampled the pathway space of force field-based DIMS MD. For the AdK transition, the new concept of a Hausdorff-pair map enabled us to extract the molecular structural determinants responsible for differences in pathways, namely a set of conserved salt bridges whose charge-charge interactions are fully modelled in DIMS MD but not in FRODA. PSA has the potential to enhance our understanding of transition path sampling methods, validate them, and to provide a new approach to analyzing conformational transitions.