BMC Bioinformatics
Software
Open Access (BioMed Central)

MetaMQAP: A meta-server for the quality assessment of protein models

Marcin Pawlowski*1, Michal J Gajda1, Ryszard Matlak1 and Janusz M Bujnicki*1,2

Address: 1Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Trojdena 4, PL-02-109 Warsaw, Poland and 2Laboratory of Bioinformatics, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Umultowska 89, PL-61-614 Poznan, Poland

Email: Marcin Pawlowski* - marcinp@genesilico.pl; Michal J Gajda - mgajda@genesilico.pl; Ryszard Matlak - rym@post.pl; Janusz M Bujnicki* - iamb@genesilico.pl
* Corresponding authors

Published: 29 September 2008
BMC Bioinformatics 2008, 9:403 doi:10.1186/1471-2105-9-403
Received: 18 March 2008
Accepted: 29 September 2008

This article is available from: http://www.biomedcentral.com/1471-2105/9/403

© 2008 Pawlowski et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background: Computational models of protein structure are usually inaccurate and exhibit significant deviations from the true structure. The utility of models depends on the degree of these deviations. A number of predictive methods have been developed to discriminate between the globally incorrect and approximately correct models. However, only a few methods predict correctness of different parts of computational models.
Several Model Quality Assessment Programs (MQAPs) have been developed to detect local inaccuracies in unrefined crystallographic models, but it is not known if they are useful for computational models, which usually exhibit different and much more severe errors.

Results: The ability to identify local errors in models was tested for eight MQAPs (VERIFY3D, PROSA, BALA, ANOLEA, PROVE, TUNE, REFINER, PROQRES) on 8251 models from the CASP-5 and CASP-6 experiments, by calculating the Spearman's rank correlation coefficients between per-residue scores of these methods and local deviations between C-alpha atoms in the models vs. experimental structures. As a reference, we calculated the value of correlation between the local deviations and trivial features that can be calculated for each residue directly from the models, i.e. solvent accessibility, depth in the structure, and the number of local and non-local neighbours. We found that absolute correlations of scores returned by the MQAPs and local deviations were poor for all methods. In addition, scores of PROQRES and several other MQAPs strongly correlate with 'trivial' features. Therefore, we developed MetaMQAP, a meta-predictor based on a multivariate regression model, which uses scores of the above-mentioned methods, but in which trivial parameters are controlled. MetaMQAP predicts the absolute deviation (in Ångströms) of individual C-alpha atoms between the model and the unknown true structure as well as global deviations (expressed as root mean square deviation and GDT_TS scores). Local model accuracy predicted by MetaMQAP shows an impressive correlation coefficient of 0.7 with true deviations from native structures, a significant improvement over all constituent primary MQAP scores.
The global MetaMQAP score correlates with model GDT_TS at the level of 0.89.

Conclusion: Finally, we compared our method with the MQAPs that scored best in the 7th edition of CASP, using CASP7 server models (not included in the MetaMQAP training set) as the test data. In our benchmark, MetaMQAP is outperformed only by PCONS6 and method QA_556, both of which require comparison of multiple alternative models and score each of them depending on its similarity to other models. MetaMQAP is, however, the best among methods capable of evaluating just single models.

We implemented MetaMQAP as a web server available for free use by all academic users at the URL https://genesilico.pl/toolkit/

Background

Evaluation of model accuracy is an essential step in protein structure prediction. The existing methods for quality assessment of protein models (MQAPs) are usually based either on a physical effective energy, which can be obtained from fundamental analysis of particle forces, or on an empirical pseudo energy derived from known protein structures (review: [1]). So far, most of the development of MQAPs was focused on the global evaluation of protein structure, and most of the existing methods were optimized to discriminate between globally correct and incorrect 'decoy' structures rather than to detect correct and incorrect fragments [2,3]. Even for MQAPs that are capable of generating independent evaluations for each amino acid in the protein structure, it is usually recommended that a score is averaged over a long stretch of residues (e.g. 21 amino acids in the case of VERIFY3D [4]). Systematic assessment experiments, e.g. Critical Assessment of techniques for protein Structure Prediction (CASP) and LiveBench, demonstrated that models with a correct fold can be confidently recognized, especially by the fold-recognition meta-servers [5,6].
However, comparative models, especially those based on remotely related templates, often exhibit local inaccuracies that are difficult to identify by a global evaluation, in particular misthreadings of short regions (5–10 residues) corresponding to shifted alignments within individual secondary structure elements [7,8].

In CASP5, we proposed that inaccuracies due to local alignment shifts can be identified and corrected by identification of variable conformations in alternative homology models, comparison of their VERIFY3D scores averaged over only 5 neighbouring residues, and construction of hybrid models comprising the best-scoring fragments [9]. Our method (termed the "FRankenstein's monster approach") turned out to consistently produce very accurate models, especially if regions with initially poor scores were systematically varied to generate additional models for evaluation [10]. However, detailed inspection of cases where we failed to identify the most native-like local conformation based on the VERIFY3D score revealed a considerable variation of scores even among models with similar structural features. Therefore, we decided to carry out a systematic evaluation of the capability of VERIFY3D and several other popular MQAPs, including the recently published method PROQRES [11], to identify the best method for prediction of local accuracy of protein models. However, as the work progressed, we realized that none of the MQAPs we analyzed was sufficiently accurate and robust and that they exhibited very different strengths and weaknesses. This in turn prompted us to develop a new "meta-predictor" specifically optimized to detect local errors.

Implementation

Preparation of protein models for the local quality assessment

Training data

We downloaded all models generated within the framework of the Critical Assessment of techniques for protein Structure Prediction (CASP) rounds 5 and 6, for cases classified as 'template-based modeling', i.e.
'comparative modeling' and 'fold recognition' [12,13]. In these cases a large fraction of models have a correct fold and exhibit a widely varying degree of global and local similarity to the native structure, with some completely wrong models (of incorrect folds). To create the model database we used only models that covered at least 90% of the residues of the target sequence and did not exhibit any internal deletions (i.e. missing residues were allowed only at the termini). If the CASP target was a multidomain protein, it was split into individual domains, which were then regarded as separate models. Ultimately, we collected 8251 models for 84 CASP5&6 targets. These models were then superimposed onto their experimentally solved counterparts using LGA [14], which is routinely used in CASP assessment. For our datasets, the average root mean square deviation (RMSD) on C-α atoms between the models and the experimental structures is 2.00 Å and the average GDT_TS score is 59.

Many CASP models are 'non-physical' in the sense that they often exhibit steric clashes, non-standard bond lengths and angles, improper stereochemistry, or they lack parts of residues (e.g. residues may be reduced to just C-α atoms). Thus, we 'idealized' our CASP5&6 model dataset to minimize the most severe local errors by simply running MODELLER [15] with default options, using the original model as a template to derive spatial restraints to build a refined full-atom model. We want to emphasize that such a procedure can lead to false positives in the case of bad regions of a model and false negatives in the case of excellent refined models.

The average RMS deviation between the idealized models and their original counterparts is 0.33 Å, reflecting a slight positional adjustment of the most distorted residues during the idealization.
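The model selection criteria described above (at least 90% coverage of the target, missing residues only at the termini) can be illustrated with a minimal sketch. The function name, its arguments and the representation of a model as a list of residue numbers are assumptions made for illustration; they are not part of the published pipeline.

```python
def passes_model_filter(target_length, modeled_residues, min_coverage=0.90):
    """Return True if a model would be kept under the stated criteria.

    target_length    -- number of residues in the target (or domain) sequence
    modeled_residues -- 1-based residue numbers present in the model
    min_coverage     -- minimal fraction of target residues that must be modeled
    (All names are hypothetical; only the two criteria come from the text.)
    """
    present = sorted(set(modeled_residues))
    if not present:
        return False
    coverage = len(present) / target_length
    # Missing residues are tolerated only at the termini, so the modeled
    # residues must form a single uninterrupted run of numbers.
    contiguous = present[-1] - present[0] + 1 == len(present)
    return coverage >= min_coverage and contiguous


# A model missing five N-terminal residues is kept; one with an internal
# deletion or with only 80% coverage is rejected.
print(passes_model_filter(100, range(6, 101)))
print(passes_model_filter(100, list(range(1, 50)) + list(range(51, 101))))
print(passes_model_filter(100, range(1, 81)))
```

Checking contiguity of the residue numbers is the simplest way to express "no internal deletions"; a real implementation would also have to map model numbering onto the target sequence.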
Nonetheless, the GDT_TS score of the idealized models remains 59, the same as for the original models, and the average RMSD with respect to the native structures changes negligibly from 2.00 to 2.01 Å, indicating approximately the same amount of movement towards and away from the native structures during 'idealization'. Analysis of the RMSD and GDT_TS values for models of different accuracy reveals that, on average, our 'idealization' has slightly improved the absolute accuracy of original models with GDT_TS < 90 and slightly decreased the quality of models with GDT_TS ≥ 90 (i.e. very good models) (Figure 1). Hereafter, the resulting set of models will be referred to as CASP5&6+.

Figure 1. The distribution of GDT_TS scores indicating (dis)similarity between the native structures of CASP targets in the CM and FR(H) categories and the corresponding models: original ones (the CASP5&6 all) and their idealized versions (the CASP5&6+ all).

The aim of our analysis was to develop a method that would be able to accurately estimate the deviation of C-α atoms with respect to the corresponding residues in the native structures without any knowledge of the native structure. Although we introduced 'idealization', we intended to make predictions for the original models. Thus, we trained MetaMQAPII with the deviations between the original models (not the idealized models) and the native structures, even though the other component of training was the MQAP score for the idealized models.

Test data

In the last part of this article we compare MetaMQAP with CASP7 winners in the MQAP category.
Thus, from the CASP7 website we downloaded both the server models and the Quality Assessment predictions made by winners of the MQAP category. The accuracy of server models calculated by the LGA method was taken from the CASP7 website [16]. CASP7 server models have been processed in the same way as the training CASP5&6 data, i.e. they have been 'idealized' with MODELLER and scored with MetaMQAP. As with the CASP5&6+ dataset, deviations between the models and the experimental structures were calculated for models before 'idealization'.

Statistical analyses

We applied a wide range of statistical tools, such as Pearson's and Spearman's rank correlation, ROC curve analysis, t-test, multivariable regression, and cluster analysis. All statistical analyses were done using STATISTICA 7 software (StatSoft, Inc. Tulsa, OK, USA).

Model Quality Assessment Programs (MQAPs)

For the evaluation of protein models from the training dataset and for the development of MetaMQAP we used 8 primary MQAP methods: VERIFY3D [4], PROSA2003 [17], PROVE [18], ANOLEA [19], BALA-SNAPP [20], TUNE [21], REFINER [22], and PROQRES [11]. VERIFY3D evaluates the environment of each residue in a model with respect to the expected environment as found in high-resolution X-ray structures. It operates on a '3D-1D profile' of a protein structure, which includes the statistical preferences for the following criteria: the area of the residue that is buried, the fraction of side-chain area that is covered by polar atoms (oxygen and nitrogen), and the local secondary structure [4,23]. In our own experience, VERIFY3D is rather permissive (i.e. it detects only relatively major errors, usually related to unusual contacts resulting from misalignments, e.g. burial of charged groups in a hydrophobic core).
On the other hand, VERIFY3D often fails to detect errors such as non-physical bond lengths or angles or some steric clashes (e.g. threading of a distorted aliphatic side chain through a distorted aromatic ring could be regarded as 'protein-like' by this method). PROSA 2003 relies on empirical energy potentials derived from pairwise interactions observed in high-resolution protein structures [17]. In our own experience, PROSA 2003 is very strict compared to VERIFY3D, i.e. it often detects even very minor errors, such as distorted geometry of hydrogen-bonded residues, and therefore may be more useful for the evaluation of nearly-native homology models than for the fold-recognition models that are plagued by local errors. ANOLEA is also based on a distance-dependent empirical potential. It evaluates the non-local environment (NLE) of each heavy atom in the model. The NLE is defined as the set of all heavy atoms within a distance of 7 Å that belong to amino acids farther than 11 residues away in the analyzed polypeptide. Owing to the focus on non-local contacts, ANOLEA is able to identify some errors that remain undetected by both VERIFY3D and PROSA [19]. PROVE analyzes the packing in protein models by evaluating the regularity of the atom volume, defined by the atom's radius and the planes separating it from other atoms [18]. BALA-SNAPP evaluates the structure by means of a four-body statistical potential, applied to tetrahedral quadruplets of spatially neighbouring residues [24]. TUNE uses a neural network to predict the local quality of a residue from both local and non-local residue contacts in the model [21]. REFINER is based on a statistical potential, which includes terms such as contact potential, long-distance potential, hydrogen bonds and burial pseudo energy [22].
Finally, PROQRES is the only method in this set that has been developed specifically to predict local errors in crude protein models. This method applies a neural network to estimate local structure quality from atom-atom contacts, residue-residue contacts, secondary structure context, and solvent accessibility [11].

In the final comparison, we analyzed the results of "blind" assessment done for the CASP7 dataset by 6 methods: QA556 (LEE, unpublished); QA704 (QA-ModFOLD [25,26], a method based on the nFOLD protocol [27]); QA633 (PROQ); QA692 (ProQlocal [11]); QA634 (PCONS6, a new variant of PCONS [28]); and QA713 (CircleQA). For more information see the CASP7 abstracts website [29].

Results and discussion

Preparation of a set of models for evaluation

The evaluation of the capability of MQAPs to predict the local accuracy of protein models requires a carefully prepared dataset. We aimed to identify the most native-like segments in a set of high-quality models, in particular those generated by comparative modeling and fold-recognition methods. Therefore, rather than analyzing popular sets of decoys with a clear majority of globally incorrect versions of various protein structures [2,30], we decided to use models of all CASP5&6 targets in the CM and FR(H) categories (corresponding to the 'template-based modeling' category in CASP7). We used only models that covered at least 90% of the residues of the experimentally solved structure and exhibited no missing internal residues (i.e. deletions were allowed only for the termini) (see Methods). Models from the CM and FR(H) categories are usually based on templates with the same fold as the native structure, and the major reasons for their deviation from the native structure are alignment shifts (misthreading) and/or structural divergence between the target and the template used.
These models are 'relatively good' only with respect to the correct position of the backbone atoms in the protein core, but they often contain various errors, such as steric clashes between the side chains, missing side chains, unmodeled residues corresponding to insertions or terminal extensions, and discontinuities in the place of deletions. Such models may be considered native-like in terms of C-α atoms, but non-physical in details. Unfortunately, most MQAP methods were optimized for structures of crystallographic quality, and all 'non-physical' details contribute to their scores in unpredictable ways, either as very serious errors (e.g. steric clashes in ANOLEA) or as artificially positive elements (e.g. some clashes in VERIFY3D). In addition, CASP models are generated by different modeling protocols which exhibit various peculiarities with respect to inclusion or omission of atoms. Variants include C-α atoms, backbone and C-β atoms, all heavy atoms, or all atoms including hydrogens, or different combinations of the above (i.e. in one model some residues may be complete and others may lack different types of atoms). Obviously, it is very difficult to compare the accuracy of residues modeled at such different levels of precision, even if the aim is to assess the accuracy of C-α coordinates only. Moreover, most MQAPs require complete models, without chain breaks or missing atoms, but often also without any hydrogen atoms.
Our CASP7 results (see below) clearly demonstrate that utilization of 'crude' CASP models leads to decreased performance of MQAPs, compared to the 'idealized' variants of the same models.

Taking into account the above-mentioned considerations, we constructed an 'idealized' set of models (hereafter referred to as the CASP5&6+ dataset) using MODELLER [15], which minimizes the violation of stereo-chemical constraints as well as restraints derived from the template and yields the canonical set of atoms for each residue. The restraints were derived from the original CASP models, instead of templates. This procedure reconstructed a heavy-atom representation for all residues except the omitted terminal residues and optimized the bond lengths, angles and packing. On the other hand, the backbone structure of such 'idealized' models maintained a conformation nearly identical to the starting structures (see Methods). We envisage only one situation that can lead to a false significant improvement in an idealized model: if the original model contains big gaps (>>3.6 Å) between C-α atoms of adjacent residues (obviously an error, as this should not occur in real proteins), MODELLER will attempt to seal the gap, causing conformational change in the neighboring regions and bringing it closer to what may be a native-like conformation (if the flanking regions have correct conformation). However, such cases are quite rare in practice.

The idealized models were used only to compute MQAP scores, while the deviation between the modeled and experimentally observed positions of C-α atoms (used as a measure of the local quality of the model) was calculated for the original, unmodified models.
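The local quality measure used here, the per-residue distance between modeled and experimental C-α positions, can be sketched as follows. The sketch assumes that both coordinate sets are already in a common frame (in the article the superposition is done with LGA, which is not reproduced here); the function names and the dictionary layout are hypothetical.

```python
import math


def ca_deviations(model_ca, native_ca):
    """Per-residue C-alpha deviation (in Å) between a model and the native structure.

    Both arguments map residue number -> (x, y, z). The coordinates are assumed
    to be superimposed beforehand; this sketch performs no superposition itself.
    Residues absent from the native structure are skipped.
    """
    return {
        res: math.dist(xyz, native_ca[res])
        for res, xyz in model_ca.items()
        if res in native_ca
    }


def rmsd(deviations):
    """Root mean square of the per-residue deviations (global summary)."""
    values = list(deviations.values())
    return math.sqrt(sum(d * d for d in values) / len(values))


# Toy coordinates: residue 1 is placed exactly, residue 2 is 5 Å away.
model = {1: (0.0, 0.0, 0.0), 2: (3.0, 4.0, 0.0)}
native = {1: (0.0, 0.0, 0.0), 2: (0.0, 0.0, 0.0)}
dev = ca_deviations(model, native)
print(dev)
print(rmsd(dev))
```

In the setup described above, such deviations would be computed from the original models, while the MQAP scores paired with them come from the idealized models.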
Additional file 1 shows the distribution of residue deviation in our set of 'original' models.

Critical assessment of MQAPs

All models in the CASP5&6+ dataset were evaluated with 8 popular MQAP methods that we found to be available for download and local installation: VERIFY3D, PROSA2003, PROVE, ANOLEA, BALA-SNAPP, TUNE, REFINER, PROQRES (see Methods). VERIFY3D, ANOLEA and REFINER report series of "raw" scores for individual residues. PROSA reports the composite score and its two components. ANOLEA and BALA report an additional score corresponding to the number of contacts/neighbours of each residue. TUNE and PROQRES report only a single score for each residue. We also analyzed the correlation between residue deviation and local residue features such as solvent accessibility calculated using NACCESS [31] and residue depth calculated using MSMS [32], as well as the agreement between secondary structure predicted with PSI-PRED [33] and calculated from the model using DSSP [34]. In addition, we studied the accuracy of a trivial score (calculated directly from the model), based on residue depth in the structure, the size of the protein and the amino acid type. TrivialScore divides the residues into 2000 classes based on 10 bins of model size (number of residues in the model), 10 bins of residue depth in the structure (ResDepth) and 20 bins defined by the amino acid type. The predicted TrivialScore values directly correspond to the average deviation of residues grouped in a given class. TrivialScore should be regarded as a baseline MQAP, which predicts that, on average, residues in the protein core are modeled well and residues on the surface are modeled poorly.

Figure 2 illustrates the comparison of the absolute value of Spearman's (R) correlation coefficient between the absolute deviations of the modeled residues from their counterparts in the native structures and all the above-mentioned parameters (MQAPs, local residue features, TrivialScore).
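The correlation measure summarized in Figure 2 can be illustrated with a minimal sketch of Spearman's rank correlation between a per-residue score and the per-residue deviation. The article's analyses were done in STATISTICA; the pure-Python functions and the toy numbers below are assumptions for illustration only.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: a score that decreases as the deviation (in Å) grows.
scores = [0.9, 0.7, 0.4, 0.1]
deviations = [0.5, 1.2, 3.0, 8.0]
print(spearman(scores, deviations))
```

Because only ranks enter the calculation, the coefficient captures any monotonic relationship between score and deviation, not just a linear one; taking its absolute value makes scores with opposite sign conventions comparable.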
In addition, the figure presents the result of a cluster analysis, which shows the relationship between the parameters discussed in this work. We applied the UPGMA method with the value of (1 – |Spearman's rank correlation coefficient|) as the linkage distance. Notably, the Spearman's rank correlation was used here because the relationship between the parameters studied here has a non-linear but monotonic character. For such a nonlinear relationship, ROC curve analysis can be used as an alternative to Spearman's rank correlation. However, ROC curve analysis does not provide the information required for the cluster analysis presented here.

According to our benchmark (Figure 2), the scores reported by PROQRES exhibit the best correlation with the real residue deviation (R = -0.50). This result was expected, since PROQRES is the only method in our test set that has been developed specifically to predict the local quality of theoretical models, such as those from CASP. However, our analysis also revealed that the PROQRES score performs only slightly better than the global model accuracy (the PROQ score, which is identical for all residues in a given model). Apart from PROQ and PROQRES, the correlations of primary MQAP scores with the local accuracy of the model appear very poor: only a few scores exhibit a correlation coefficient above 0.25. The scores that correlate best with the local model accuracy are BALA, the VERIFY3D score averaged in a window of 5 residues (VERIFY3Dw5), ANOLEAw5, and REFINER. It is noteworthy that the 'smoothened' VERIFY3Dw5 score is a much better predictor of local model accuracy than the corresponding "raw" score (VERIFY3D). The same is observed for ANOLEA and its 'smoothened' variant ANOLEAw5. Another interesting observation is that the composite score reported by PROSA (PROSA) comprises a relatively well-performing component score describing atom-atom interactions (PROSApair) and a much poorer component score describing atom-solvent interactions (PROSAsurf).
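The 'smoothened' scores mentioned above (VERIFY3Dw5, ANOLEAw5) are raw per-residue scores averaged over a window of 5 residues. A minimal sketch of such a sliding-window average is shown here; the treatment of chain termini (shrinking the window rather than padding) is an assumption, as the text does not specify it.

```python
def window_average(scores, window=5):
    """Average each per-residue score over a centered window of residues.

    Near the termini the window is truncated to the residues that exist
    (an assumption of this sketch, not stated in the article).
    """
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


raw = [1, 2, 3, 4, 5, 6, 7]
print(window_average(raw))  # [2.0, 2.5, 3.0, 4.0, 5.0, 5.5, 6.0]
```

Averaging over neighbours reduces residue-to-residue noise in the raw score, which is one plausible reason why the windowed variants correlate better with local deviation.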
We also observed that scores reported by different methods are poorly correlated with each other, which provides a stimulus to develop a method that com-