Presently, I am coding a neural network for protein secondary structure prediction. Initially, I thought of predicting contact maps, but I couldn’t find contact map data for proteins. Moreover, parsing the PDB files and then building contact maps from them seemed like a daunting task to me.
Therefore, I chose secondary structure prediction. What I am going to do is take an amino acid sequence as input and predict the secondary structure for each residue. Current methods achieve around 75-80% accuracy, so I will try to reach at least 75%.
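To make the "sequence in, one prediction per residue" idea concrete, here is a rough sketch of one common way to turn a sequence into fixed-size network inputs: a sliding window with one-hot encoding. The function name `encode_windows`, the window of 11 residues, and the zero-padding at the sequence ends are all assumptions for illustration; the window size is picked only because 11 × 20 = 220 lines up with the input count mentioned later in the thread, and the actual encoding used is not stated.

```python
# Hypothetical helper: one-hot sliding-window encoding of a protein sequence.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode_windows(sequence, window=11):
    """Return one feature row per residue, each of length window * 20.

    Positions that fall outside the sequence, and non-standard residue
    letters, are left as all zeros (an assumption, not the only option).
    """
    half = window // 2
    rows = []
    for center in range(len(sequence)):
        row = [0.0] * (window * len(AMINO_ACIDS))
        for slot in range(window):
            pos = center - half + slot
            if 0 <= pos < len(sequence):
                idx = AA_INDEX.get(sequence[pos])
                if idx is not None:
                    row[slot * len(AMINO_ACIDS) + idx] = 1.0
        rows.append(row)
    return rows


features = encode_windows("ACDEFGHIKLMNP")
print(len(features), len(features[0]))
```

Each row would then be paired with the secondary structure label of the residue at the center of its window.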
My process will have three stages. The first two will each involve a neural network, and the third will assign a confidence to each prediction based on statistical information.
Any ideas, comments or suggestions?
Hey,
I am currently looking into several machine learning techniques for protein structure prediction. I have coded the program in Python and am training it. The problem is that Python is really, really slow.
I have 220 inputs and around 18000 samples (testing and training combined). It just crawls under this load.
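For what it's worth, a frequent cause of this kind of slowdown is looping over samples and weights in pure Python. Below is a minimal sketch of a batched forward pass written as matrix operations, assuming NumPy is available. Only the 220 inputs and 18000 samples come from the post; the hidden layer size, the three output classes (helix, strand, coil), the random placeholder data, and all names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_inputs = 18000, 220  # sizes quoted in the post
n_hidden, n_classes = 30, 3       # assumptions: hidden units; helix/strand/coil

# Placeholder data and randomly initialised weights, for illustration only.
X = rng.random((n_samples, n_inputs), dtype=np.float32)
W1 = (0.1 * rng.standard_normal((n_inputs, n_hidden))).astype(np.float32)
b1 = np.zeros(n_hidden, dtype=np.float32)
W2 = (0.1 * rng.standard_normal((n_hidden, n_classes))).astype(np.float32)
b2 = np.zeros(n_classes, dtype=np.float32)


def forward(X):
    """Process every sample at once instead of one at a time."""
    hidden = np.tanh(X @ W1 + b1)
    logits = hidden @ W2 + b2
    # Subtract the row maximum before exponentiating for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)  # softmax over the classes


probs = forward(X)
print(probs.shape)
```

The whole data set goes through the network in two matrix multiplications, so the heavy arithmetic runs in compiled code rather than in the Python interpreter.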
However, for other ML techniques, I plan to use ready-to-use software such as Weka or Yale. Another option is the Orange library for Python. But again, Python is damn slow.
My inspiration: when I looked on the internet at the ANNs currently used for prediction, I was fascinated by their real-world applications. I wondered whether these black boxes can really learn anything about protein structure. Hence, I tried to code a neural network myself.