Evaluation

SGDID     MedBlast  SGD  Overlap  Recall(%)  Precision(%)
S0000273       180   13       12       92.3           6.7
S0000285        33   40       28       70.0          84.8
S0000417        15   19       13       68.4          86.7
S0000854        33   30       21       70.0          63.6
S0001282        41   18        7       38.9          17.1
S0001498        29   26       24       92.3          82.8
S0001948        96   13       12       92.3          12.5
S0002306        47   13        6       46.2          12.8
S0002371        20   21       20       95.2         100.0
S0002567       144   20       17       85.0          11.8
S0002844       186   14       13       92.9           7.0
S0002905        64   14       10       71.4          15.6
S0003162       183   11        7       63.6           3.8
S0003181        19   22       16       72.7          84.2
S0003217        22   20       17       85.0          77.3
S0003391        36   29       22       75.9          61.1
S0003472       118   66       25       37.9          21.2
S0003677       152   18       18      100.0          11.8
S0004145        48   19        7       36.8          14.6
S0004200       157   30       18       60.0          11.5
S0004274        14   14       10       71.4          71.4
S0004345        14   12       10       83.3          71.4
S0004615        15   14       13       92.9          86.7
S0004623        35   36       29       80.6          82.9
S0004715       110   15       11       73.3          10.0
S0004950        11   10        6       60.0          54.5
S0005005        20   13        9       69.2          45.0
S0005154        25   20       20      100.0          80.0
S0005596        51   12       10       83.3          19.6
S0005750        23   29       16       55.2          69.6
S0005783        43   33       23       69.7          53.5
S0005786        20   26       20       76.9         100.0
S0005966        34   15       13       86.7          38.2
S0006013        15   10        9       90.0          60.0
S0007261       141   24       21       87.5          14.9
AVERAGE                                75.1          47.0

MedBlast and SGD give the number of articles each source lists for the gene; Overlap is the number of articles found in both. Recall = Overlap / SGD and Precision = Overlap / MedBlast, expressed as percentages.
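The percentage columns can be reproduced from the three count columns. The following is a minimal sketch, assuming recall is taken relative to the SGD count and precision relative to the MedBlast count, which agrees with the rows shown. The function name `recall_precision` is illustrative and not part of MedBlast.

```python
def recall_precision(medblast_count, sgd_count, overlap):
    """Return (recall %, precision %), each rounded to one decimal place.

    Assumption: recall = overlap / SGD count, precision = overlap / MedBlast count.
    """
    recall = 100.0 * overlap / sgd_count
    precision = 100.0 * overlap / medblast_count
    return round(recall, 1), round(precision, 1)


# Three rows copied from the table: (SGDID, MedBlast, SGD, Overlap)
rows = [
    ("S0000273", 180, 13, 12),
    ("S0000285", 33, 40, 28),
    ("S0002371", 20, 21, 20),
]

for sgdid, medblast, sgd, overlap in rows:
    recall, precision = recall_precision(medblast, sgd, overlap)
    print(sgdid, recall, precision)
```

A gene such as S0000273 shows why the two figures can diverge: nearly all curated SGD articles are recovered, but they make up a small share of the 180 articles MedBlast lists.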

 

Note:

At present there is no suitable test set for measuring the exact precision and recall of the program. We tested the program against SGD data (downloaded on March 15, 2003), which contains manually curated literature for yeast genes. The test set was randomly selected from genes with more than 10 references, because sequences with fewer related references may give unstable results.
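The selection step described above might look like the following sketch. It assumes a hypothetical mapping from gene ID to the number of curated references; the function name, the sample data, and the `seed` parameter are illustrative and do not come from the original evaluation.

```python
import random


def pick_test_genes(ref_counts, sample_size, min_refs=10, seed=None):
    """Randomly pick genes that have more than `min_refs` references.

    ref_counts: dict mapping gene ID -> number of curated references
    (hypothetical input format).
    """
    # Keep only genes above the reference threshold; sort for reproducibility.
    eligible = sorted(g for g, n in ref_counts.items() if n > min_refs)
    rng = random.Random(seed)
    # Never ask for more genes than are eligible.
    return rng.sample(eligible, min(sample_size, len(eligible)))


# Made-up counts, for illustration only.
counts = {"geneA": 25, "geneB": 4, "geneC": 11, "geneD": 10}
print(pick_test_genes(counts, sample_size=2, seed=0))
```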

Users should keep in mind that this is only a preliminary estimate: SGD collects only articles that focus on a single gene, and specifically in yeast, whereas MedBlast tries to find all related articles across all species.