10-fold cross-validation area-under-the-curve (AUC) scores (mean ± st. dev.) of protein representation methods in five protein family inference tasks. Hist-8000 (results in bold) outperforms SoT in identifying proteins from all 25 families tested here from the ProtVec study [17]. For each set of protein sequences belonging to a family to be predicted, we randomly sampled the same number of sequences from other families in Swiss-Prot [41] as negative data, giving a balanced task dataset. For brevity, we show results only for the top five families by number of proteins; see S1 File (supporting information) section ‘S
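
The evaluation protocol described in the caption (balanced negative sampling, 10-fold cross-validation, AUC reported as mean ± standard deviation) might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the authors' pipeline: every function name is hypothetical, the sequences are synthetic, the folds are stratified by class (the caption does not say whether they were), and the alanine-fraction scorer is only a placeholder standing in for a real representation such as Hist-8000 or SoT.

```python
import random
import statistics


def balanced_dataset(family_seqs, other_seqs, rng):
    # Draw as many negatives as there are positives, without replacement.
    negatives = rng.sample(other_seqs, k=len(family_seqs))
    return [(s, 1) for s in family_seqs] + [(s, 0) for s in negatives]


def auc(labels, scores):
    # AUC = P(random positive scores above random negative); ties count 0.5.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(data, k, rng):
    # Assumption: shuffle each class separately so every fold has both classes.
    pos = [i for i, (_, y) in enumerate(data) if y == 1]
    neg = [i for i, (_, y) in enumerate(data) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    return [pos[i::k] + neg[i::k] for i in range(k)]


def cross_validated_auc(data, fit, k=10, seed=0):
    rng = random.Random(seed)
    fold_scores = []
    for test_idx in stratified_folds(data, k, rng):
        held_out = set(test_idx)
        train = [d for i, d in enumerate(data) if i not in held_out]
        test = [data[i] for i in test_idx]
        score = fit(train)  # fit returns a function: sequence -> score
        fold_scores.append(
            auc([y for _, y in test], [score(s) for s, _ in test])
        )
    return statistics.mean(fold_scores), statistics.stdev(fold_scores)


def fit_alanine_fraction(train):
    # Placeholder "model": ignores the training data, scores by alanine content.
    return lambda seq: seq.count("A") / len(seq)


# Synthetic stand-ins for one family and for the pool of other families.
rng = random.Random(42)
family = ["".join(rng.choices("ACDE", weights=[5, 1, 1, 1], k=30))
          for _ in range(50)]
others = ["".join(rng.choices("ACDE", k=30)) for _ in range(500)]

data = balanced_dataset(family, others, rng)
mean_auc, sd_auc = cross_validated_auc(data, fit_alanine_fraction)
print(f"AUC: {mean_auc:.3f} ± {sd_auc:.3f}")
```

Swapping `fit_alanine_fraction` for a function that trains a classifier on the chosen sequence representation would leave the rest of the loop unchanged, which is why the scorer is passed in as a parameter.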

Main author: Frixos Papadopoulos (22001664) (author)
Other authors: Tilman Sanchez-Elsner (409950) (author), Mahesan Niranjan (32677) (author), Ashley I. Heinson (8124374) (author)
Published: 2025