Study of Protein Secondary Structure Prediction Using Support Vector Machine
Md. Nazrul Islam Mondal, Md. Al Mamun, Saju Saha
Prediction of protein secondary structure is an important problem in bioinformatics, because the tertiary structure of a protein can be determined from the folds found in its secondary structure. Knowing the tertiary structure of a protein helps us to find its function. Moreover, knowing the function of a protein helps in creating antibodies for it and in understanding how they work in the human body. Protein secondary structure prediction depends mostly on the information stored in the primary amino acid sequence. The Support Vector Machine (SVM) has shown strong predictive ability in a number of application areas, including secondary structure prediction. The objective of this paper is to predict protein secondary structure using an SVM. We introduce six binary classifiers, a relatively new technique, to distinguish between the classes helix (H), strand (E), and coil (C). We predict secondary structure using a Gaussian kernel with the kernel parameter fixed at γ = 0.1 and the cost parameter C varied within the range [0.1, 5]. The goal of our approach is to propose a time-efficient method for checking accuracy using different tests. Our results show prediction accuracies in the range 62-72%. More specifically, our results for H/~H, E/~E, C/~C, H/E, E/C and H/C are 66.25%, 72.28%, 62.58%, 65.33%, 68.56% and 70.85%, respectively. On OAtest, the highest accuracy for the One-Against-All approach is about 72%, while the highest accuracy for the One-Against-One approach is about 70%. We argue that our approach is simple and has better time complexity than Tsilo's work.
Secondary Structure; Support Vector Machine; Binary Classifiers; OAtest.
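As an illustration of the setup described in the abstract, the following is a minimal sketch of how the six binary problems (three One-Against-All, three One-Against-One) might be derived from a sequence of H/E/C labels, together with the Gaussian kernel at γ = 0.1. The function names, the toy label string, and the sample points in COST_GRID are assumptions made for this example only; the abstract states the range [0.1, 5] but not the values tried, and this is not the authors' implementation.

```python
import math

GAMMA = 0.1  # kernel parameter fixed in the abstract
# Assumed sample points inside the stated range [0.1, 5]; the abstract
# does not list the actual values of C that were tried.
COST_GRID = [0.1, 0.5, 1.0, 2.0, 5.0]

def gaussian_kernel(x, y, gamma=GAMMA):
    """Gaussian (RBF) kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# The six binary tasks named in the abstract.
ONE_VS_ALL = ["H", "E", "C"]                       # H/~H, E/~E, C/~C
ONE_VS_ONE = [("H", "E"), ("E", "C"), ("H", "C")]  # H/E, E/C, H/C

def one_vs_all_labels(states, positive):
    """+1 for residues of the positive class, -1 for all others."""
    return [1 if s == positive else -1 for s in states]

def one_vs_one_subset(states, positive, negative):
    """Keep only residues of the two classes; return (indices, labels)."""
    kept = [(i, 1 if s == positive else -1)
            for i, s in enumerate(states) if s in (positive, negative)]
    indices = [i for i, _ in kept]
    labels = [y for _, y in kept]
    return indices, labels

def build_tasks(states):
    """Map each task name to (residue indices, +1/-1 labels)."""
    tasks = {}
    for cls in ONE_VS_ALL:
        tasks[f"{cls}/~{cls}"] = (list(range(len(states))),
                                  one_vs_all_labels(states, cls))
    for pos, neg in ONE_VS_ONE:
        tasks[f"{pos}/{neg}"] = one_vs_one_subset(states, pos, neg)
    return tasks

if __name__ == "__main__":
    toy_states = "CHHHEECC"  # made-up labels, one per residue
    for name, (idx, y) in build_tasks(toy_states).items():
        print(name, idx, y)
```

Each task would then be handed to an SVM trainer once per value of C, with the accuracy of each of the six classifiers recorded separately, which is how the per-classifier figures above are reported.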
Reyaz-Ahmed, Anjum B. "Protein Secondary Structure Prediction Using Support Vector Machines, Neural Networks and Genetic Algorithms." Computer Science Theses, Paper 43 (2007).
Rost, B. and Sander, C. "Improved prediction of protein secondary structure by use of sequence profiles and neural networks." Proc. Natl. Acad. Sci. USA 90, 7558-7562 (1993).
Kabsch, W. and Sander, C. "Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features." Biopolymers 22, 2577-2637 (1983).
Salamov, A. A. and Solovyev, V. V. "Protein secondary structure prediction using local alignments." J. Mol. Biol. 268, 31-36 (1997).
Hua, S. and Sun, Z. "A Novel Method of Protein Secondary Structure Prediction with High Segment Overlap Measure: Support Vector Machine Approach." J. Mol. Biol. 308, 397-407 (2001).
Tsilo, Lipontseng Cecilia. Thesis submitted to Rhodes University in partial fulfillment of the requirements for the degree of Master of Science. Department of Statistics, February 2008.
Nguyen, M. N. and Rajapakse, J. C. "Multi-Class Support Vector Machines for Protein Secondary Structure Prediction." Genome Informatics 14, 218-227 (2003).
Vapnik, V. and Cortes, C. "Support vector networks." Machine Learning 20, 273-297 (1995).
Voet, D., Voet, G. and Pratt, W. C. "Fundamentals of Biochemistry: Life at the Molecular Level." New York: Wiley (2006).
[Md. Nazrul Islam Mondal, Md. Al Mamun, Saju Saha (2015), Study of Protein Secondary Structure Prediction Using Support Vector Machine, International Journal of Innovative Research in Computer Science & Technology (IJIRCST), Vol-3, Issue-6, Page No-5-9], (ISSN 2347 - 5552). www.ijircst.org
Md. Nazrul Islam Mondal
Dept. of CSE, RUET, Rajshahi, Bangladesh, Mobile No. +8801912744327 (e-mail: firstname.lastname@example.org)