Analysis of large sets of biological sequence data from related strains or organisms is complicated by superficial redundancy in the set, which may contain many members that are identical except at one or two positions. A new method, based on deriving physicochemical property (PCP)-consensus sequences, was therefore tested for its ability to generate reference sequences and to distinguish functionally significant changes from background variability. The PCP-consensus program was used to automatically derive consensus sequences from alignments of proteins from flaviviruses (from the Flavitrack database) and human enteroviruses, using a five-dimensional set of eigenvectors that summarize over 200 different scalar values for the PCPs of the amino acids. A PCP-consensus version of a Dengue virus envelope protein was produced recombinantly and tested by ELISA for its ability to bind antibodies against Dengue virus strains. PCP-consensus sequences of the flavivirus family could be used to classify its members into five discrete groups and to distinguish areas of the envelope proteins that correlate with host specificity and disease type. A multivalent Dengue virus antigen was designed and shown to bind antibodies against all four DENV types. A consensus enteroviral VPg protein had the same distinctive high pKa as wild-type proteins and was recognized by two different polymerases. The process for deriving PCP-consensus sequences for any group of aligned similar sequences has been validated for sequences with up to 50% diversity. Ongoing projects have shown that the method identifies residues that significantly alter PCPs at a given position and might thus cause changes in function or immunogenicity. Other potential applications include deriving target proteins for drug design and diagnostic kits.
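The idea of a property-based consensus can be illustrated with a minimal Python sketch. It is not the PCP-consensus program itself; it rests on assumptions the abstract does not state: that each alignment column is summarized by the mean descriptor vector of its residues, and that the consensus residue is the amino acid observed in that column whose vector lies closest to that mean. The function name `pcp_consensus` and the table `TOY_DESCRIPTORS` are hypothetical, and the two-dimensional values are placeholders, not the five-dimensional eigenvector descriptors used in the study.

```python
from math import dist

# Placeholder descriptors for illustration only. These are NOT the
# five-dimensional PCP eigenvector values used in the study.
TOY_DESCRIPTORS = {
    "A": (0.0, 0.0),
    "V": (1.0, 0.0),
    "L": (2.0, 0.0),
    "D": (0.0, 3.0),
    "E": (1.0, 3.0),
}


def pcp_consensus(alignment, descriptors, gap="-"):
    """Return a property-based consensus for equal-length aligned sequences.

    Assumed rule: per column, average the descriptor vectors of all
    non-gap residues, then pick the observed residue nearest to that mean.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*alignment):
        residues = [r for r in column if r != gap]
        if not residues:
            # A column made only of gaps stays a gap in the consensus.
            consensus.append(gap)
            continue
        vectors = [descriptors[r] for r in residues]
        mean = tuple(sum(axis) / len(vectors) for axis in zip(*vectors))
        # Sorting the candidates makes tie-breaking deterministic.
        candidates = sorted(set(residues))
        consensus.append(min(candidates, key=lambda r: dist(descriptors[r], mean)))
    return "".join(consensus)


if __name__ == "__main__":
    print(pcp_consensus(["AVD", "AVE", "ALE"], TOY_DESCRIPTORS))
```

Restricting the candidates to residues that actually occur in the column is a design choice of this sketch: it keeps the consensus from introducing an amino acid that no aligned sequence carries at that position.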
ASJC Scopus subject areas
- Structural Biology
- Molecular Biology
- Computer Science Applications
- Applied Mathematics