Cell-penetrating peptides (CPPs) are short peptides that can cross the cell membrane. This class of molecules, especially CPPs with targeting functions, holds promise for efficient drug delivery to target cells.
Research on CPPs is therefore of biomedical significance. In this study, CPPs with different transmembrane activities were examined at the sequence level in order to identify the factors affecting their transmembrane activity, to characterize the sequence differences between CPPs of different activities and non-cell-penetrating peptides (NonCPPs), and to introduce a method for analyzing biological sequences.
CPP and NonCPP sequences were obtained from the CPPsite database and from the literature. Peptides with high, medium, and low transmembrane activity (HCPPs, MCPPs, and LCPPs, respectively) were extracted from the CPP sequences to construct the data sets. Based on these data sets, the following studies were conducted:
1. The amino acid composition and secondary structure composition of CPPs with different activities and of NonCPPs were compared by ANOVA. Electrostatic and hydrophobic interactions of amino acids were found to play an important role in the transmembrane activity of CPPs, and helical structure and random coil content also affected this activity.
2. The physicochemical properties and lengths of CPPs with different activities and of NonCPPs were plotted on a two-dimensional plane. Under certain properties the peptides could be clustered: HCPPs, MCPPs, LCPPs, and NonCPPs separated into three clusters, revealing their differences.
3. The concept of the physicochemical centroid of a biological sequence is introduced: the residues composing a sequence are treated as point particles, and the sequence is abstracted as a particle system. This method was applied to CPPs by projecting peptides with different activities into three-dimensional space using PCA. Most CPPs clustered together, while some LCPPs clustered with NonCPPs.
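The particle-system idea in item 3 can be illustrated with a minimal sketch. The abstract does not give the exact definition of the physicochemical centroid, so the sketch assumes a center-of-mass analogue: each residue is a point at its sequence position, weighted by one property value. The function name `property_centroid` and the choice of approximate average residue mass as the weighting property are illustrative assumptions, not the authors' implementation.

```python
# Approximate average residue masses in daltons (amino acid minus one water).
# Used here only as one example property; any per-residue scale can be passed.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}


def property_centroid(sequence, scale):
    """Return the property-weighted mean position of a sequence.

    Assumed definition (hypothetical, center-of-mass analogue):
        centroid = sum(w_i * i) / sum(w_i)
    where i is the 1-based residue position and w_i is the property
    value of residue i taken from `scale`.
    """
    weights = [scale[residue] for residue in sequence.upper()]
    total = sum(weights)
    if total == 0:
        # Covers empty sequences and scales whose values cancel out.
        raise ValueError("total property weight is zero; centroid undefined")
    return sum(w * pos for pos, w in enumerate(weights, start=1)) / total


if __name__ == "__main__":
    # Example arginine-rich peptides, chosen only for illustration.
    for peptide in ("RRRRRRRR", "GRKKRRQRRRPPQ"):
        print(peptide, round(property_centroid(peptide, RESIDUE_MASS), 3))
```

Computing one such centroid per property scale gives a fixed-length numeric vector for each peptide, which is the kind of input a PCA projection would take. Scales with negative values (for example, hydropathy indices) would need shifting or a different centroid definition, since the weights in this sketch are assumed to be non-negative.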
This study has implications for the design of CPPs and for understanding the sequence differences between CPPs with different activities. In addition, the physicochemical centroid method introduced here can be applied to other biological sequence analysis problems, and the centroids can serve as input features for biological classification and pattern recognition tasks.
Post time: Jun-15-2023