Prediction of in vivo Protein Aggregation
Most of the algorithms described so far have been developed, and in some cases parameterized, on the basis of properties of amyloid-like aggregates derived from the in vitro experimental characterization of aggregation reactions for a number of model proteins. AGGRESCAN is the sole exception, since it was developed from a scale of intrinsic aggregation propensity, derived experimentally in vivo, for the naturally occurring proteinogenic amino acids.
Since the complex cellular environment strongly influences polypeptide deposition (indeed, cells possess an intricate machinery to control this phenomenon), the question arises as to whether the previously described prediction tools are able to predict protein aggregation in vivo. To address this question, Chiti and co-workers evaluated the ability of different publicly available algorithms to predict the depositional properties of polypeptides inside the cell, employing several datasets of proteins whose tendency to aggregate had been experimentally determined in vivo (Belli et al. 2011). In general terms, the predictors are substantially accurate in forecasting protein aggregation in vivo, with phenomenological approaches performing globally better than structure-based methods. This difference can be rationalized by considering that the constraints influencing the course of protein aggregation in the crowded cellular environment differ significantly from those in a test tube, where the controlled environment and the absence of interference from other molecular components allow for more reproducible aggregation kinetics that render highly ordered aggregated structures. Phenomenological methods are therefore expected, in principle, to capture the complexity of environmental conditions in vivo better than approaches based exclusively on properties of the fine structure of late assembly products. Unsurprisingly, AGGRESCAN (relying on its in vivo derived aggregation scale) is the algorithm yielding the best global performance across the different datasets analyzed. Interestingly, considering that the proteins in the testing datasets employed by Chiti and co-workers belong to different organisms, the good performance of AGGRESCAN provides yet another piece of evidence for the suitability of E. coli as a model organism for the analysis of protein aggregation.
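To make the idea of comparing predictors against in vivo data concrete, the following minimal Python sketch ranks two predictors by the area under the ROC curve (AUC). This is an illustration only and not necessarily the performance measure used in the cited study. The predictor names, the scores and the labels are hypothetical toy values, not real data or real algorithm outputs.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen aggregating protein
    (label 1) receives a higher score than a randomly chosen soluble
    protein (label 0); ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one protein of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy dataset: 1 = aggregates in vivo, 0 = remains soluble.
in_vivo_labels = [1, 1, 1, 0, 0, 0]

# Hypothetical per-protein aggregation scores from two unnamed predictors.
predictor_scores = {
    "predictor_A": [0.9, 0.8, 0.4, 0.5, 0.2, 0.1],
    "predictor_B": [0.6, 0.3, 0.7, 0.5, 0.4, 0.2],
}

# AUC per predictor; a higher value means better separation of the classes.
auc_by_predictor = {
    name: roc_auc(scores, in_vivo_labels)
    for name, scores in predictor_scores.items()
}

for name, auc in sorted(auc_by_predictor.items(), key=lambda kv: -kv[1]):
    print(f"{name}: AUC = {auc:.3f}")
```

A rank-based measure such as AUC is convenient in this setting because different algorithms report scores on different scales, and only the ordering of the proteins matters for the comparison.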