How a Computer Algorithm Can Be Used to Predict the Hydrophilicity of Proteins


Section 1 – Predicting the hydrophilicity of the protein

1. How a computer algorithm can be used to predict the characteristic

Computer algorithms predict the characteristics of a protein from amino acid scales. Databases hold scales, such as hydrophobicity scales, that have been derived from data obtained in experimental studies carried out on proteins. These studies are carried out with the purpose of predicting sections that are highly hydrophobic. Apart from hydrophobicity, algorithms also use many other scales based on the physical and chemical properties of amino acids.

The tool used in this assignment was ProtScale. It contains 50 scales that have been obtained from the literature, and each of these scales has a value for each amino acid. To obtain the data, the algorithm scans the entered protein sequence with a window of a specified size. At every position, it calculates the mean scale value of the amino acids within that window. These values are then used to generate the plot seen in figure 1 (Gasteiger E. 2005).

2. Reasons for choosing the algorithm and brief explanation of the algorithm

Within ProtScale, the algorithm that was used to find the hydrophilicity of the protein was the Hopp-Woods algorithm. It was chosen because it gives the regions of maximum and minimum hydrophilicity. The algorithm has scale values for all 20 amino acids. They are as follows.

Amino acid    Scale value
Ala           -0.500
Arg            3.000
Asn            0.200
Asp            3.000
Cys           -1.000
Gln            0.200
Glu            3.000
Gly            0.000
His           -0.500
Ile           -1.800
Leu           -1.800
Lys            3.000
Met           -1.300
Phe           -2.500
Pro            0.000
Ser            0.300
Thr           -0.400
Trp           -3.400
Tyr           -2.300
Val           -1.500

Table 1 – Scale values for each amino acid on the Hopp-Woods scale (Hopp T.P.).

Based on these values, and using the method described in part 1, the algorithm generates results that give the regions of maximum and minimum hydrophilicity. The scale gives the non-polar amino acids negative values.
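The window method from part 1, combined with the values in Table 1, can be sketched in a few lines of Python. This is only an illustration and not ProtScale's own code: the function names `window_profile` and `most_hydrophilic_window` are made up for this example, the windows are unweighted, and each window is reported by the position of its first residue, which is an assumption about how to label the results.

```python
# Hopp-Woods values from Table 1, keyed by one-letter amino acid code.
HOPP_WOODS = {
    "A": -0.5, "R": 3.0, "N": 0.2, "D": 3.0, "C": -1.0,
    "Q": 0.2, "E": 3.0, "G": 0.0, "H": -0.5, "I": -1.8,
    "L": -1.8, "K": 3.0, "M": -1.3, "F": -2.5, "P": 0.0,
    "S": 0.3, "T": -0.4, "W": -3.4, "Y": -2.3, "V": -1.5,
}


def clean(sequence):
    """Remove whitespace (pasted sequences often contain spaces) and upper-case."""
    return "".join(sequence.split()).upper()


def window_profile(sequence, scale=HOPP_WOODS, window=6):
    """Return the mean scale value of every full window along the sequence."""
    seq = clean(sequence)
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    values = [scale[residue] for residue in seq]
    # One mean per window start position; partial windows at the ends are skipped.
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


def most_hydrophilic_window(sequence, scale=HOPP_WOODS, window=6):
    """Return (1-based start position, window residues, mean value) of the highest window."""
    seq = clean(sequence)
    profile = window_profile(seq, scale, window)
    start = max(range(len(profile)), key=lambda i: profile[i])
    return start + 1, seq[start:start + window], profile[start]


# Example on a short fragment (the first 12 residues of the assignment sequence).
position, residues, mean_value = most_hydrophilic_window("MPRAPRCRAVRS", window=6)
print(position, residues, mean_value)
```

For this 12-residue fragment there are seven windows of size 6, and the highest mean (1.25) belongs to the window RAPRCR starting at position 3. Plotting the list returned by `window_profile` against position would give a profile of the same kind as the ProtScale plot, although the exact values may differ if the tool applies edge weighting or reports each value at the window centre.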
On this scale, the region with the highest value can therefore be considered the most hydrophilic region. Also, according to Bowen (1998), when a window of 6 is used, the region of maximal hydrophilicity is likely to be an antigenic site.

The protein sequence that was used in this assignment is as follows.

MPRAPRCRAVRSLLRSHYREVLPLATFVRRLGPQGWRLVQRGDPAAFRA
LVAQCLVCVPWDARPPPAAPSFRQVSCLKELVARVLQRLCERGAKNVLA
FGFALLDGARGGPPEAFTTSVRSYLPNTVTDALRGSGAWGLLLRRVGDD
VLVHLLARCALFVLVAPSCAYQVCGPPLYQLGAATQARPPPHASGPRRR
LGCERAWNHSVREAGVPLGLPAPGARRRGGSASRSLPLPKRPRRGAAPE
PERTPVGQGSWAHPGRTRGPSDRGFCVVSPARPAEEATSLEGALSGTRH
SHPSVGRQHHAGPPSTSRPPRPWDTPCPPVYAETKHFLYSSGDKEQLRP
SFLLSSLRPSLTGARRLVETIFLGSRPWMPGTPRRLPRLPQRYWQMRPL
FLELLGNHAQCPYGVLLKTHCPLRAAVTPAAGVCAREKPQGSVAAPEEE
DTDPRRLVQLLRQHSSPWQVYGFVRACLRRLVPPGLWGSRHNERRFLRN
TKKFISLGKHAKLSLQELTWKMSVRGCAWLRRSPGVGCVPAAEHRLREE
ILAKFLHWLMSVYVVELLRSFFYVTETTFQKNRLFFYRKSVWSKLQSIG
IRQHLKRVQLRELSEAEVRQHREARPALLTSRLRFIPKPDGLRPIVNMD
YVVGARTFRREKRAERLTSRVKALFSVLNYERARRPGLLGASVLGLDDI
HRAWRTFVLRVRAQDPPPELYFVKVDVTGAYDTIPQDRLTEVIASIIKP
QNTYCVRRYAVVQKAAHGHVRKAFKSHVSTLTDLQPYMRQFVAHLQETS
PLRDAVVIEQSSSLNEASSGLFDVFLRFMCHHAVRIRGKSYVQCQGIPQ
GSILSTLLCSLCYGDMENKLFAGIRRDGLLLRLVDDFLLVTPHLTHAKT
FLRTLVRGVPEYGCVVNLRKTVVNFPVEDEALGGTAFVQMPAHGLFPWC
GLLLDTRTLEVQSDYSSYARTSIRASLTFNRGFKAGRNMRRKLFGVLRL
KCHSLFLDLQVNSLQTVCTNIYKILLLQAYRFHACVLQLPFHQQVWKNP
TFFLRVISDTASLCYSILKAKNAGMSLGAKGAAGPLPSEAVQWLCHQAF
LLKLTRHRVTYVPLLGSLRTAQTQLSRKLPGTTLTALEAAANPALPSDF
KTILD

3. Results interpretation

The sequence was 1132 amino acids in length and the result that was generated is as follows.
