Data-Driven Chemistry
Like most scientists, chemists are drowning in data from laboratory experiments and from calculations. We are developing machine-learning tools to automate the analysis of quantum-chemical calculations. Another area in need of automation is the development of quantitative structure-property relationships, particularly for flexible molecules.
Collaborators
Matt Sigman (Utah); Tom Rovis (Columbia); Steven Fletcher (Oxford)
Key Papers
DBSTEP. DBSTEP is a Python package for obtaining DFT-Based Steric Parameters from three-dimensional chemical structures. It can parse the outputs of most computational chemistry programs and other common molecular structure file formats. Steric properties can be obtained either exactly or on a Cartesian grid; the grid approach allows featurization of a molecular isodensity surface (DBSTEP can process wavefunction files) rather than relying on classical atomic radii. Currently, the traditional Sterimol parameters (L, Bmin, Bmax) and percent buried volume are implemented, as well as our novel steric parameter vectors Sterimol2vec and vol2vec. The package can be run from the command line or called from a Python script as part of a computational workflow that collects steric parameters.
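To illustrate what a grid-based steric measurement involves, the sketch below estimates percent buried volume by counting Cartesian grid points. It is not DBSTEP's code or API: the function name, the 3.5 Å sphere radius, the grid spacing, and the unscaled Bondi-type radii are all assumptions chosen for illustration, and it uses classical atomic radii rather than an isodensity surface.

```python
import math

# Assumed Bondi-type van der Waals radii in angstrom (illustrative, unscaled).
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}


def percent_buried_volume(atoms, center, sphere_radius=3.5, spacing=0.1):
    """Estimate %Vbur on a Cartesian grid (hypothetical helper, not DBSTEP).

    atoms  : list of (element, x, y, z) tuples, coordinates in angstrom
    center : (x, y, z) of the sphere center, e.g. a metal atom position
    """
    cx, cy, cz = center
    # Precompute squared radii so the inner loop avoids square roots.
    spheres = [(x, y, z, VDW_RADII[el] ** 2) for el, x, y, z in atoms]
    r2 = sphere_radius ** 2
    n = int(math.ceil(sphere_radius / spacing))

    in_sphere = 0  # grid points inside the measurement sphere
    buried = 0     # of those, points that fall inside at least one atom
    for i in range(-n, n + 1):
        dx = i * spacing
        for j in range(-n, n + 1):
            dy = j * spacing
            for k in range(-n, n + 1):
                dz = k * spacing
                if dx * dx + dy * dy + dz * dz > r2:
                    continue
                in_sphere += 1
                px, py, pz = cx + dx, cy + dy, cz + dz
                if any((px - ax) ** 2 + (py - ay) ** 2 + (pz - az) ** 2 <= ar2
                       for ax, ay, az, ar2 in spheres):
                    buried += 1
    return 100.0 * buried / in_sphere


if __name__ == "__main__":
    # Made-up coordinates: one carbon sitting 2.0 angstrom from the center.
    demo_atoms = [("C", 2.0, 0.0, 0.0)]
    print(percent_buried_volume(demo_atoms, (0.0, 0.0, 0.0), spacing=0.25))
```

A finer grid spacing reduces the discretization error at cubic cost in the number of points, which is the usual trade-off for grid-based measurements of this kind.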
[GitHub]
Importance of Engineered and Learned Molecular Representations in Predicting Organic Reactivity, Selectivity, and Chemical Properties. Gallegos, L. C.; Luchini, G.; St. John, P. C.; Kim, S.; Paton, R. S. Acc. Chem. Res. 2021, 54, 827–836
GoodVibes. A Python program to compute quasi-harmonic thermochemical data and potential energy surface diagrams from frequency calculations at a given temperature and concentration, corrected for the effects of vibrational scaling factors. All partition functions (electronic, translational, rotational and vibrational) are recomputed and can be corrected to any temperature or concentration. The first public version of GoodVibes was released in 2016, and it has undergone several revisions since, during which time it has been used by many groups around the world. The program is described in the publication GoodVibes: automated thermochemistry for heterogeneous computational chemistry data.
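As a rough illustration of what a quasi-harmonic correction does, the sketch below evaluates the harmonic-oscillator vibrational entropy after raising all frequencies below a cutoff to the cutoff value, which is one common quasi-harmonic treatment. It is not taken from GoodVibes: the function name, the 100 cm⁻¹ default cutoff, and the choice of this particular treatment are assumptions, and the text above does not say which scheme GoodVibes applies by default.

```python
import math

# Physical constants (CODATA 2018 exact values where applicable).
PLANCK = 6.62607015e-34       # J s
LIGHT_SPEED = 2.99792458e10   # cm/s, so wavenumbers in cm^-1 can be used directly
BOLTZMANN = 1.380649e-23      # J/K
GAS_CONSTANT = 8.314462618    # J/(mol K)


def quasi_harmonic_entropy(frequencies_cm, temperature=298.15,
                           cutoff_cm=100.0, scale=1.0):
    """Vibrational entropy in J/(mol K) with a low-frequency cutoff.

    Hypothetical helper for illustration. Each frequency is multiplied by a
    vibrational scaling factor, then raised to `cutoff_cm` if it falls below
    it, and finally inserted into the harmonic-oscillator entropy expression.
    """
    total = 0.0
    for freq in frequencies_cm:
        if freq <= 0.0:
            # Imaginary modes are often printed as negative numbers; skip them.
            continue
        nu = max(freq * scale, cutoff_cm)
        x = PLANCK * LIGHT_SPEED * nu / (BOLTZMANN * temperature)
        # S = R * [ x / (e^x - 1) - ln(1 - e^-x) ]
        total += GAS_CONSTANT * (x / math.expm1(x) - math.log1p(-math.exp(-x)))
    return total


if __name__ == "__main__":
    # Made-up frequencies in cm^-1; the 35 cm^-1 mode is treated as 100 cm^-1.
    print(quasi_harmonic_entropy([35.0, 250.0, 1600.0, 3000.0]))
```

Because the harmonic entropy diverges as a frequency approaches zero, capping low frequencies keeps soft modes from dominating the computed entropy. Changing `temperature` re-evaluates the same expression at other conditions.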
[Zenodo] [GitHub]
wSterimol. A program that generates Boltzmann-weighted Sterimol steric parameters for conformationally flexible substituents and integrates with PyMOL. It contains an automated computational workflow that computes multidimensional Sterimol parameters: for flexible molecules or substituents, it generates and optimizes a conformational ensemble and produces Boltzmann-weighted Sterimol parameters. It has been developed as a PyMOL plugin and can be run from within the graphical user interface. The wSterimol code is described in more detail in Conformational Effects on Physical-Organic Descriptors – the Case of Sterimol Steric Parameters.
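The Boltzmann-weighting step can be sketched in a few lines. This is a generic illustration rather than wSterimol's code: the function names, the use of relative energies in kcal/mol, and the example numbers are all assumptions.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872041  # kcal/(mol K)


def boltzmann_weights(energies_kcal, temperature=298.15):
    """Return normalized Boltzmann populations for a set of conformers.

    Energies are shifted by the minimum first, which leaves the weights
    unchanged but avoids overflow for large absolute energies.
    """
    e_min = min(energies_kcal)
    factors = [math.exp(-(e - e_min) / (GAS_CONSTANT_KCAL * temperature))
               for e in energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]


def boltzmann_average(values, energies_kcal, temperature=298.15):
    """Population-weighted average of a per-conformer descriptor."""
    weights = boltzmann_weights(energies_kcal, temperature)
    return sum(w * v for w, v in zip(weights, values))


if __name__ == "__main__":
    # Made-up ensemble: relative energies (kcal/mol) and a Sterimol-like
    # descriptor value (angstrom) for each of three conformers.
    energies = [0.0, 0.5, 1.2]
    descriptor = [1.9, 2.4, 3.1]
    print(boltzmann_weights(energies))
    print(boltzmann_average(descriptor, energies))
```

The weighted value always lies between the smallest and largest per-conformer values, and it approaches the lowest-energy conformer's value as the energy gaps grow or the temperature drops.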
[Zenodo] [GitHub]
Effects of substituents X and Y on the NMR chemical shifts of 2-(4-X phenyl)-5-Y pyrimidines. Yuan, H.; Chen, P.-W.; Li, M.-Y.; Zhang, Y.; Peng, Z.-W.; Liu, W.; Paton, R. S.; Cao, C. J. Mol. Struct. 2020, 1204, 127489
GoodVibes: automated thermochemistry for heterogeneous computational chemistry data. Luchini, G.; Alegre-Requena, J. V.; Funes-Ardoiz, I.; Paton, R. S. F1000Research 2020, 9, 291
Prediction of homolytic bond dissociation enthalpies for organic molecules at near chemical accuracy with sub-second computational cost. St. John, P.; Guan, Y.; Kim, Y.; Kim, S.; Paton, R. S. Nat. Commun. 2020, 11, 2328
Quantum chemical calculations for over 200,000 organic radical species and 40,000 associated closed-shell molecules. St. John, P.; Guan, Y.; Kim, Y.; Etz, B. D.; Kim, S.; Paton, R. S. Scientific Data 2020, 7, 244
Selective Halogenation Using Designed Phosphine Reagents. Levy, J. N.; Alegre-Requena, J. V.; Liu, R.; Paton, R. S.; McNally, A. J. Am. Chem. Soc. 2020, 142, 11295–11305
Conformational Effects on Physical-Organic Descriptors – the Case of Sterimol Steric Parameters. Brethomé, A. V.; Fletcher, S. P.; Paton, R. S. ACS Catal. 2019, 9, 2313–2323
Data-Mining the Diaryl(thio)urea Conformational Landscape: Understanding the Contrasting Behavior of Ureas and Thioureas with Quantum Chemistry. Luchini, G.; Ascough, D. M. H.; Alegre-Requena, J. V.; Gouverneur, V.; Paton, R. S. Tetrahedron (invited contribution) 2019, 75, 697–702
Frontier Molecular Orbital Effects Control the Hole-Catalyzed Racemization of Atropisomeric Biaryls. Tan, J.; Paton, R. S. Chem. Sci. 2019, 10, 2285–2289
Hydrogen-Bond Dependent Conformational Switching: a Computational Challenge from Experimental Thermochemistry. Luccarelli, J.; Paton, R. S. J. Org. Chem. 2019, 84, 613–621
Retooling Asymmetric Conjugate Additions for Sterically Demanding Substrates with an Iterative Data-Driven Approach. Brethomé, A. V.; Paton, R. S.; Fletcher, S. P. ACS Catal. 2019, 9, 7179–7187
Structure Determination of a Chloroenyne from Laurencia majuscula Using Computational Methods and Total Synthesis. Shepherd, E. D.; Dyson, B. S.; Hak, W. E.; Nguyen, Q. N. N.; Lee, M.; Kim, M. J.; Sohn, T.-I.; Kim, D.; Burton, J. W.; Paton, R. S. J. Org. Chem. 2019, 84, 4971–4991
Synthesis, Characterization, and Reactivity of Complex Tricyclic Oxonium Ions, Proposed Intermediates in Natural Product Biosynthesis. Chan, H. S. S.; Nguyen, Q. N. N.; Paton, R. S.; Burton, J. W. J. Am. Chem. Soc. 2019, 141, 15951–15962
Asymmetric Total Syntheses and Structure Confirmation of Chlorofucins and Bromofucins. Kim, B.; Sohn, T.; Kim, D.; Paton, R. S. Chem. Eur. J. 2018, 24, 2634–2642
Cation–Pi Interactions in Protein-Ligand Binding: Theory and Data-Mining Reveal Different Roles for Lysine and Arginine. Kumar, K.; Woo, S. M.; Siu, T.; Cortopassi, W. A.; Duarte, F.; Paton, R. S. Chem. Sci. 2018, 9, 2655–2665
C−H Cyanation of 6-Ring N-Containing Heteroaromatics. Elbert, B. L.; Farley, Gorman, T. W.; Johnson, T. C.; Genicot, C.; Lallemand, B.; Pasau, P.; Flasz, J.; Castro, J. L.; MacCoss, M.; Paton, R. S.; Schofield, C. J.; Smith, M. D.; Willis, M. C.; Dixon, D. J. Chem. Eur. J. 2017, 23, 14733–14737
Correlating Reactivity and Selectivity to Cyclopentadienyl Ligand Properties in Rh(III)-Catalyzed C−H Activation Reactions: An Experimental and Computational Study. Piou, T.; Romanov-Michailidis, F.; Romanova-Michaelides, M.; Jackson, K. E.; Semakul, N.; Taggart, T. D.; Newell, B. S.; Rithner, C. D.; Paton, R. S.; Rovis, T. J. Am. Chem. Soc. 2017, 139, 1296–1310
Enantioselective Conjugate Addition Catalyzed by a Copper-Phosphoramidite Complex: Computational and Experimental Exploration of Asymmetric Induction. Ardkhean, R.; Roth, P. M. C.; Maksymowicz, R. M.; Curran, A.; Peng, Q.; Paton, R. S.; Fletcher, S. P. ACS Catal. 2017, 7, 6729–6737
Cation–Pi Interactions in CREBBP Bromodomain Inhibition: An Electrostatic Model for Small-Molecule Binding Affinity and Selectivity. Cortopassi, W. A.; Kumar, K.; Paton, R. S. Org. Biomol. Chem. 2016, 14, 10926–10938
Development of a True Transition State Force Field (TTSFF) from Quantum Mechanical Calculations. Madarász, A.; Berta, D.; Paton, R. S. J. Chem. Theory Comput. 2016, 12, 1833–1844
Computational Ligand Design in Enantio- and Diastereoselective Ynamide [5+2] Cycloisomerizations. Straker, R.; Peng, Q.; Mekareeya, A.; Paton, R. S.; Anderson, E. A. Nat. Commun. 2016, 7, 10109
Ligand Bite Angle-Dependent Palladium-Catalyzed Cyclization of Propargylic Carbonates to 2-Alkynyl Azacycles or Cyclic Dienamides. Daniels, D. S. B.; Jones, A. S.; Thompson, A. L.; Paton, R. S.; Anderson, E. A. Angew. Chem. Int. Ed. 2014, 53, 1915–1920
Structure Reassignment of Laurefurenynes A and B by Computation and Total Synthesis. Shepherd, D. J.; Broadwith, P. A.; Dyson, B. S.; Paton, R. S.; Burton, J. W. Chem. Eur. J. 2013, 19, 12644–12648
An Efficient Computational Model to Predict the Synthetic Utility of Heterocyclic Arynes. Goetz, A. E.; Bronner, S. M.; Cisneros, J.; Melamed, J.; Paton, R. S.; Houk, K. N.; Garg, N. K. Angew. Chem. Int. Ed. 2012, 51, 2758–2762
Experimental Diels-Alder Reactivities of Cycloalkenones and Cyclic Dienes Explained Through Transition State Distortion Energies. Paton, R. S.; Kim, S.; Ross, A. G.; Danishefsky, S. J.; Houk, K. N. Angew. Chem. Int. Ed. 2011, 50, 10366–10368
Hydrogen Bonding and Pi-Stacking: How Reliable Are Force Fields? A Critical Evaluation of Force Field Descriptions of Non-Bonded Interactions. Paton, R. S.; Goodman, J. M. J. Chem. Inf. Model. 2009, 49, 944–955
Stereostructure Assignment of Flexible Five-Membered Rings by GIAO 13C NMR Calculations: Prediction of the Stereochemistry of Elatenyne. Smith, S. G.; Paton, R. S.; Burton, J. W.; Goodman, J. M. J. Org. Chem. 2008, 73, 4053–4062
Exploration of the Accessible Chemical Space of Acyclic Alkanes. Paton, R. S.; Goodman, J. M. J. Chem. Inf. Model. 2007, 47, 2124–2132