Abstract: A new method is described for using a recurrent neural network with bias units to predict contact maps in proteins. The main inputs to the neural network include residue pairs, residue classification according to hydrophobic, polar, acidic, and basic character, secondary structure information, and the sequence separation between the two residues. In our work, a dataset composed of 53 globulin proteins of known 3D structure was used. An average predictive accuracy of 0.29 was obtained. Our results demonstrate the viability of the approach for predicting contact maps.
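The pairwise inputs listed in this abstract can be illustrated with a small encoding sketch. It is an illustration only, not the authors' implementation: the four-class grouping of amino acids is one common textbook convention assumed here, the function names (`residue_class`, `pair_features`, `all_pair_features`) are hypothetical, and the secondary structure input and the network itself are omitted.

```python
# Assumed grouping of the 20 standard amino acids into the four classes
# named in the abstract. The paper's exact assignment is not given here.
RESIDUE_CLASSES = {
    "hydrophobic": "AVLIMFWPG",
    "polar": "STCYNQ",
    "acidic": "DE",
    "basic": "KRH",
}
CLASS_NAMES = list(RESIDUE_CLASSES)


def residue_class(aa):
    """Return the class name for a one-letter amino acid code."""
    if len(aa) == 1:
        for name, letters in RESIDUE_CLASSES.items():
            if aa in letters:
                return name
    raise ValueError(f"unknown residue: {aa!r}")


def one_hot_class(aa):
    """One-hot vector over the four classes, in CLASS_NAMES order."""
    cls = residue_class(aa)
    return [1.0 if name == cls else 0.0 for name in CLASS_NAMES]


def pair_features(sequence, i, j):
    """Feature vector for the residue pair (i, j) with i < j.

    Layout: class one-hot of residue i, class one-hot of residue j,
    then the sequence separation j - i.
    """
    if not 0 <= i < j < len(sequence):
        raise ValueError("expected 0 <= i < j < len(sequence)")
    separation = float(j - i)
    return one_hot_class(sequence[i]) + one_hot_class(sequence[j]) + [separation]


def all_pair_features(sequence, min_separation=1):
    """Map every pair (i, j) with j - i >= min_separation to its features."""
    return {
        (i, j): pair_features(sequence, i, j)
        for i in range(len(sequence))
        for j in range(i + min_separation, len(sequence))
    }
```

Each pair of a sequence of length n yields one 9-element vector, so a predictor would be queried once per cell in the upper triangle of the n x n contact map.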
Abstract: Detecting the boundaries of protein domains is an important and challenging task in both experimental and computational structural biology. In this paper, a promising method for detecting the domain structure of a protein from sequence information alone is presented. The method is based on analyzing multiple sequence alignments derived from a database search. Multiple measures are defined to quantify the domain information content of each position along the sequence. They are then combined into a single predictor using a support vector machine (SVM). More importantly, domain detection is first treated as an imbalanced data learning problem. A novel undersampling method is proposed, based on distance-based maximal entropy in the feature space of the SVM. The overall precision is about 80%. Simulation results demonstrate that the method can help not only in predicting the complete 3D structure of a protein but also in machine learning systems on general imbalanced datasets.
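To make the imbalanced-data framing concrete, here is a minimal sketch of undersampling before training a classifier. It uses plain random undersampling as a stand-in, not the distance-based maximal entropy method proposed in the paper. The function name `random_undersample` and the toy labels (1 for a boundary position, 0 otherwise) are assumptions for illustration.

```python
import random


def random_undersample(features, labels, seed=0):
    """Balance classes by randomly dropping samples from larger classes.

    This is ordinary random undersampling, used here only as a simple
    stand-in for the paper's entropy-based selection.
    """
    if not labels:
        return [], []
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Every class is reduced to the size of the smallest class.
    target = min(len(indices) for indices in by_class.values())

    kept = []
    for indices in by_class.values():
        if len(indices) > target:
            kept.extend(rng.sample(indices, target))
        else:
            kept.extend(indices)

    kept.sort()  # preserve the original sample order
    return [features[i] for i in kept], [labels[i] for i in kept]


# Toy data: 3 boundary positions (label 1) among 23 sequence positions.
toy_labels = [1] * 3 + [0] * 20
toy_features = [[float(i)] for i in range(23)]
balanced_x, balanced_y = random_undersample(toy_features, toy_labels)
```

The balanced arrays could then be passed to any SVM implementation. Domain boundary positions are rare relative to non-boundary positions, which is why some form of rebalancing is needed before training.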