Intrinsic Determinants of Protein Aggregation
The sequence and mutational analysis of amyloidogenic proteins and peptides, particularly of those behaving as IDPs under native conditions (which allows the forces promoting aggregation to be disentangled from those favouring folding into a three-dimensional structure), has led to the identification of a series of properties, of both single amino acids and amino acid combinations, that determine the ability of a polypeptide to aggregate. Together, these properties define the intrinsic determinants of protein aggregation. Among them, hydrophobicity has been found to be a major force driving aggregation: substitutions of polar or charged residues by non-polar amino acids increase the rate of aggregation, whereas the inverse changes tend to decrease the extent of aggregation or even disrupt it (Hilbich et al. 1992; Esler et al. 1996; Wurth et al. 2002; Buell et al. 2009). Nonetheless, hydrophobicity alone has been judged insufficient to account for the impact of mutations on the propensity to aggregate (Chiti et al. 2003; Rousseau et al. 2006a). The tendency of amino acids to adopt a particular secondary structure is another important determinant of protein aggregation. Consistent with the finding that the core of amyloid-like aggregates is enriched in cross-β conformation, both an enrichment in residues with a high propensity to form β-sheet structure (Chiti et al. 2002a) and the pre-existence of β-strands in the native state (Pallares et al. 2004) enhance the aggregation propensity of polypeptides. Conversely, amino acids with a low tendency to adopt β-sheet secondary structure, such as Pro (which induces a bend in the polypeptide backbone) and Gly (owing to the entropic cost of fixing it in secondary structure elements), tend to disfavour aggregation (Wood et al. 1995; Steward et al. 2002; Parrini et al. 2005). Furthermore, a variety of negative design strategies have been identified in all-β proteins that protect the peripheral strands flanking β-sheets (Richardson and Richardson 2002); these edge strands are at a higher risk of establishing non-functional intermolecular contacts because they remain free to form hydrogen bonds with neighbouring molecules. The net charge of a polypeptide also influences the propensity to aggregate (Chiti et al. 2002a, 2003), since it defines the extent of repulsion between individual molecules and thus affects the chances of establishing the intermolecular contacts required for a protein to self-assemble and aggregate.
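The single-residue determinants discussed above (hydrophobicity, content of β-sheet-disfavouring residues and net charge) can be illustrated with a minimal Python sketch. It is not a published aggregation predictor. The Kyte-Doolittle hydropathy scale, the crude charge count (+1 for Lys/Arg, -1 for Asp/Glu, ignoring His and the termini) and the function names are all assumptions introduced here for illustration; the text above does not prescribe any particular scale or formula.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
# Assumption: this scale is an illustrative choice, not one named in the text.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _check(seq):
    """Reject empty sequences; only the 20 standard one-letter codes are supported."""
    if not seq:
        raise ValueError("sequence must not be empty")


def mean_hydropathy(seq):
    """Average hydropathy per residue (higher means more hydrophobic)."""
    _check(seq)
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def net_charge(seq):
    """Crude net charge near neutral pH: +1 per Lys/Arg, -1 per Asp/Glu.

    Assumption: His and the chain termini are ignored for simplicity.
    """
    _check(seq)
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


def pro_gly_fraction(seq):
    """Fraction of residues that are Pro or Gly (both disfavour beta-sheet)."""
    _check(seq)
    return sum(seq.count(aa) for aa in "PG") / len(seq)


def describe(seq):
    """Collect the three descriptors for one sequence."""
    return {
        "mean_hydropathy": mean_hydropathy(seq),
        "net_charge": net_charge(seq),
        "pro_gly_fraction": pro_gly_fraction(seq),
    }
```

For a hypothetical peptide such as `ASKTEV`, comparing `describe("ASKTEV")` with `describe("ASLTEV")` shows that the Lys-to-Leu substitution raises the mean hydropathy and lowers the net charge by one unit. The mutational studies cited above associate both types of change with a higher propensity to aggregate.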
Although the above-mentioned factors emerge from the physico-chemical properties of individual amino acids, the way these properties are arranged along the primary sequence has a strong impact on the tendency of a polypeptide to aggregate. For example, the combinatorial design of polypeptide secondary structures has revealed that the alternation of hydrophobic and hydrophilic residues along the sequence facilitates assembly into amyloid-like structures (West et al. 1999), likely because this pattern favours the formation of amphiphilic β-sheets. Interestingly, the statistical analysis of natural protein sequences revealed that this pattern is underrepresented relative to other amino acid combinations, being less frequent than would be expected by chance (Broome and Hecht 2000). Similarly, continuous stretches of three or more hydrophobic residues are also underrepresented in natural protein sequences (Schwartz et al. 2001), which is consistent with hydrophobicity being a major force driving deleterious aggregation.
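The two sequence patterns described above can be located with a short Python sketch that reduces a sequence to a binary hydrophobic/polar string and then scans it. The residue classification (`HYDROPHOBIC`), the minimum run lengths and the function names are assumptions made for illustration only; the cited studies may use different residue classes and thresholds.

```python
import re

# Assumption: an illustrative binary classification of residues.
# Anything not listed here is treated as polar ("P").
HYDROPHOBIC = set("AVILMFWC")


def to_binary(seq):
    """Map each residue to 'H' (hydrophobic) or 'P' (polar)."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in seq)


def alternating_runs(seq, min_len=5):
    """Return (start, pattern) for each maximal strictly alternating H/P run.

    A run ends wherever two neighbouring residues fall in the same class.
    Runs shorter than min_len (a hypothetical threshold) are discarded.
    """
    pattern = to_binary(seq)
    runs = []
    start = 0
    for i in range(1, len(pattern) + 1):
        # Close the current run at the end of the sequence or at a repeat.
        if i == len(pattern) or pattern[i] == pattern[i - 1]:
            if i - start >= min_len:
                runs.append((start, pattern[start:i]))
            start = i
    return runs


def hydrophobic_stretches(seq, min_len=3):
    """Return (start, pattern) for each run of min_len or more hydrophobic residues."""
    pattern = to_binary(seq)
    regex = re.compile("H{%d,}" % min_len)
    return [(m.start(), m.group()) for m in regex.finditer(pattern)]
```

With the hypothetical input `VKVKVKV`, `alternating_runs` reports a single run covering the whole sequence, while `hydrophobic_stretches("KKVILKK")` reports the three-residue stretch `VIL` starting at index 2. Counting such hits across a set of natural sequences and comparing them with counts from shuffled sequences would be one way to examine the underrepresentation noted above, although that comparison is not implemented here.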