Identifying functionally important sites in natural protein families.

Bibliographic Details
Main Author: Nicola Dietler (12551766) (author)
Other Authors: Alia Abbara (19557026) (author), Subham Choudhury (19727478) (author), Anne-Florence Bitbol (299618) (author)
Published: 2024
Description
Summary:<p>The symmetrized AUC for predicting sites with large mutational effects is computed on 30 protein families with four methods (ICOD, SCA, MI and Conservation), using Deep Mutational Scan (DMS) data as ground truth. For ICOD and MI, the average product correction (APC) [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1012091#pcbi.1012091.ref004" target="_blank">4</a>] is applied to the matrix of interest (it was found to improve average performance for most families with these methods, but not with SCA). For ICOD, MI and SCA, the components of the eigenvector associated with the largest eigenvalue are used to predict mutational effects. Protein families are ordered by decreasing symmetrized AUC for ICOD. The mapping between protein family number and name is given in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1012091#pcbi.1012091.s015" target="_blank">S1 Table</a>. Protein families shaded in grey have DMS data with a unimodal shape; the others have DMS data with a bimodal shape.</p>
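
The pipeline described in the summary (APC, then the leading eigenvector, then a symmetrized AUC against DMS labels) might be sketched as follows. This is a minimal illustration, not the authors' code. All function names are hypothetical. The APC here uses plain row, column and overall means, whereas implementations often exclude the diagonal. The summary does not define "symmetrized", so the sketch assumes it means max(AUC, 1 - AUC), which makes the score independent of the arbitrary sign of an eigenvector. How DMS measurements are turned into binary "large effect" labels is also left open here.

```python
import numpy as np

def apc(m):
    """Average product correction: m_ij - (row mean_i * column mean_j) / overall mean.

    Simplified: means include the diagonal, and the overall mean must be nonzero.
    """
    m = np.asarray(m, dtype=float)
    row = m.mean(axis=1, keepdims=True)
    col = m.mean(axis=0, keepdims=True)
    return m - row * col / m.mean()

def leading_eigenvector(m):
    """Eigenvector of a symmetric matrix associated with its largest eigenvalue."""
    _, vecs = np.linalg.eigh(m)  # eigh returns eigenvalues in ascending order
    return vecs[:, -1]

def auc(scores, labels):
    """Area under the ROC curve: probability that a positive site outscores a negative one."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    diff = scores[labels][:, None] - scores[~labels][None, :]
    # Ties between a positive and a negative site count as half a win.
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

def symmetrized_auc(scores, labels):
    """Assumed definition: max(AUC, 1 - AUC), invariant under scores -> -scores."""
    a = auc(scores, labels)
    return max(a, 1.0 - a)
```

Given a symmetric L x L matrix `m` for a family (for example an MI matrix) and a boolean vector `large_effect` of length L derived from DMS data, a per-site score would be `leading_eigenvector(apc(m))`, evaluated with `symmetrized_auc(score, large_effect)`.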