# Model Rankings
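
Each table below ranks models on a single phenotype by balanced accuracy, alongside precision and the evaluation sample size. As a rough illustration of the two metrics, here is a minimal sketch assuming scikit-learn and macro-averaged precision; the label values are hypothetical and the benchmark's exact averaging settings are not stated here.

```python
# Minimal sketch: computing balanced accuracy and precision for one phenotype.
# The labels and the macro-averaging choice are illustrative assumptions,
# not the benchmark's documented configuration.
from sklearn.metrics import balanced_accuracy_score, precision_score

# Hypothetical ground-truth vs. model-predicted labels for a single phenotype.
y_true = ["motile", "non-motile", "motile", "motile", "non-motile"]
y_pred = ["motile", "motile", "motile", "non-motile", "non-motile"]

balanced_acc = balanced_accuracy_score(y_true, y_pred)  # mean of per-class recall
precision = precision_score(y_true, y_pred, average="macro", zero_division=0)

print(f"Balanced accuracy: {balanced_acc:.3f}")
print(f"Precision (macro): {precision:.3f}")
print(f"Sample size: {len(y_true)}")
```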

## Available Phenotypes

### Motility

| Model | Balanced Accuracy | Precision | Sample Size |
| --- | --- | --- | --- |
| anthropic/claude-3.5-sonnet | 0.901 | 0.832 | 4015 |
| openai/gpt-4o | 0.862 | 0.780 | 5927 |
| openai/gpt-4 | 0.839 | 0.832 | 5949 |
| microsoft/wizardlm-2-8x22b | 0.838 | 0.797 | 5910 |
| google/gemini-flash-1.5 | 0.820 | 0.752 | 3852 |
| anthropic/claude-3-haiku:beta | 0.816 | 0.715 | 3778 |
| google/gemini-pro | 0.808 | 0.743 | 4038 |
| meta-llama/llama-3-70b-instruct:nitro | 0.803 | 0.696 | 3999 |
| openai/gpt-3.5-turbo-0125 | 0.790 | 0.690 | 5770 |
| mistralai/mixtral-8x7b-instruct:nitro | 0.781 | 0.855 | 3732 |
| meta-llama/llama-3-8b-instruct:nitro | 0.768 | 0.664 | 4065 |
| google/palm-2-chat-bison-32k | 0.755 | 0.667 | 4844 |
| google/gemini-pro-1.5 | 0.754 | 0.645 | 4822 |
| openchat/openchat-7b | 0.704 | 0.775 | 3905 |
| mistralai/mistral-7b-instruct | 0.694 | 0.652 | 3940 |
| microsoft/phi-3-mini-128k-instruct | 0.693 | 0.781 | 5345 |
| perplexity/llama-3-sonar-small-32k-chat | 0.645 | 0.555 | 3767 |
| google/gemma-7b-it | 0.522 | 0.477 | 4160 |

### Gram staining

| Model | Balanced Accuracy | Precision | Sample Size |
| --- | --- | --- | --- |
| anthropic/claude-3.5-sonnet | 0.819 | 0.652 | 3875 |
| openai/gpt-4 | 0.812 | 0.646 | 3878 |
| meta-llama/llama-3-70b-instruct:nitro | 0.811 | 0.644 | 3876 |
| openai/gpt-4o | 0.811 | 0.653 | 3878 |
| openai/gpt-3.5-turbo-0125 | 0.808 | 0.643 | 3878 |
| google/gemini-pro-1.5 | 0.806 | 0.641 | 3871 |
| microsoft/wizardlm-2-8x22b | 0.805 | 0.641 | 3867 |
| anthropic/claude-3-haiku:beta | 0.804 | 0.640 | 3870 |
| openchat/openchat-7b | 0.799 | 0.606 | 3875 |
| google/gemini-flash-1.5 | 0.794 | 0.631 | 3877 |
| google/gemini-pro | 0.789 | 0.625 | 3842 |
| mistralai/mixtral-8x7b-instruct:nitro | 0.780 | 0.618 | 3878 |
| google/palm-2-chat-bison-32k | 0.779 | 0.620 | 3876 |
| perplexity/llama-3-sonar-small-32k-chat | 0.741 | 0.583 | 3875 |
| meta-llama/llama-3-8b-instruct:nitro | 0.707 | 0.584 | 3793 |
| mistralai/mistral-7b-instruct | 0.666 | 0.634 | 3843 |
| microsoft/phi-3-mini-128k-instruct | 0.657 | 0.536 | 3869 |
| google/gemma-7b-it | 0.595 | 0.530 | 3847 |

### Aerophilicity

| Model | Balanced Accuracy | Precision | Sample Size |
| --- | --- | --- | --- |
| openai/gpt-4o | 0.897 | 0.736 | 4356 |
| google/gemini-flash-1.5 | 0.841 | 0.702 | 1507 |
| google/gemini-pro-1.5 | 0.822 | 0.643 | 5635 |
| meta-llama/llama-3-70b-instruct:nitro | 0.782 | 0.564 | 2331 |
| anthropic/claude-3.5-sonnet | 0.772 | 0.522 | 3820 |
| openai/gpt-4 | 0.761 | 0.508 | 5551 |
| google/gemini-pro | 0.716 | 0.392 | 1509 |
| google/palm-2-chat-bison-32k | 0.709 | 0.406 | 2398 |
| microsoft/wizardlm-2-8x22b | 0.699 | 0.537 | 5085 |
| openai/gpt-3.5-turbo-0125 | 0.671 | 0.514 | 6486 |
| openchat/openchat-7b | 0.595 | 0.563 | 43 |
| mistralai/mixtral-8x7b-instruct:nitro | 0.534 | 0.551 | 2226 |
| mistralai/mistral-7b-instruct | 0.502 | 0.461 | 1179 |
| microsoft/phi-3-mini-128k-instruct | 0.496 | 0.426 | 707 |
| meta-llama/llama-3-8b-instruct:nitro | 0.399 | 0.396 | 1517 |
| anthropic/claude-3-haiku:beta | 0.397 | 0.267 | 7 |

### Extreme environment tolerance

| Model | Balanced Accuracy | Precision | Sample Size |
| --- | --- | --- | --- |
| anthropic/claude-3.5-sonnet | 0.723 | 0.240 | 4559 |
| openai/gpt-4 | 0.719 | 0.442 | 6739 |
| microsoft/wizardlm-2-8x22b | 0.687 | 0.228 | 6687 |
| openai/gpt-4o | 0.664 | 0.174 | 6704 |
| google/gemini-flash-1.5 | 0.639 | 0.152 | 4369 |
| google/gemini-pro-1.5 | 0.636 | 0.160 | 5520 |
| perplexity/llama-3-sonar-small-32k-chat | 0.633 | 0.181 | 4241 |
| openai/gpt-3.5-turbo-0125 | 0.624 | 0.619 | 6454 |
| meta-llama/llama-3-70b-instruct:nitro | 0.623 | 0.143 | 4542 |
| google/palm-2-chat-bison-32k | 0.617 | 0.648 | 5506 |
| mistralai/mixtral-8x7b-instruct:nitro | 0.612 | 0.148 | 4239 |
| google/gemini-pro | 0.604 | 0.407 | 4574 |
| microsoft/phi-3-mini-128k-instruct | 0.579 | 0.134 | 5993 |
| mistralai/mistral-7b-instruct | 0.579 | 0.120 | 4443 |
| meta-llama/llama-3-8b-instruct:nitro | 0.567 | 0.116 | 4594 |
| openchat/openchat-7b | 0.562 | 0.237 | 4443 |
| anthropic/claude-3-haiku:beta | 0.533 | 0.104 | 4287 |
| google/gemma-7b-it | 0.500 | 0.100 | 4711 |

### Biofilm formation

| Model | Balanced Accuracy | Precision | Sample Size |
| --- | --- | --- | --- |
| openai/gpt-4 | 0.619 | 0.774 | 507 |
| mistralai/mistral-7b-instruct | 0.602 | 0.757 | 332 |
| openchat/openchat-7b | 0.600 | 0.778 | 338 |
| google/gemini-pro-1.5 | 0.572 | 0.734 | 426 |
| openai/gpt-4o | 0.557 | 0.736 | 504 |
| google/gemini-pro | 0.551 | 0.743 | 357 |
| google/palm-2-chat-bison-32k | 0.529 | 0.721 | 411 |
| anthropic/claude-3.5-sonnet | 0.504 | 0.722 | 340 |
| mistralai/mixtral-8x7b-instruct:nitro | 0.500 | 0.716 | 328 |
| perplexity/llama-3-sonar-small-32k-chat | 0.500 | 0.719 | 317 |
| google/gemma-7b-it | 0.500 | 0.713 | 363 |
| meta-llama/llama-3-8b-instruct:nitro | 0.500 | 0.692 | 347 |
| anthropic/claude-3-haiku:beta | 0.498 | 0.719 | 321 |
| google/gemini-flash-1.5 | 0.497 | 0.707 | 319 |
| microsoft/wizardlm-2-8x22b | 0.497 | 0.708 | 505 |
| openai/gpt-3.5-turbo-0125 | 0.497 | 0.712 | 477 |
| microsoft/phi-3-mini-128k-instruct | 0.491 | 0.713 | 451 |
| meta-llama/llama-3-70b-instruct:nitro | 0.490 | 0.708 | 344 |

### Animal pathogenicity

| Model | Balanced Accuracy | Precision | Sample Size |
| --- | --- | --- | --- |
| google/gemini-flash-1.5 | 0.847 | 0.799 | 2670 |
| anthropic/claude-3-haiku:beta | 0.843 | 0.803 | 2598 |
| google/gemini-pro | 0.833 | 0.791 | 2758 |
| meta-llama/llama-3-70b-instruct:nitro | 0.833 | 0.848 | 2754 |
| openai/gpt-4o | 0.818 | 0.868 | 4097 |
| google/gemini-pro-1.5 | 0.817 | 0.822 | 3374 |
| microsoft/wizardlm-2-8x22b | 0.803 | 0.715 | 4093 |
| anthropic/claude-3.5-sonnet | 0.792 | 0.882 | 2805 |
| google/palm-2-chat-bison-32k | 0.790 | 0.831 | 3359 |
| meta-llama/llama-3-8b-instruct:nitro | 0.779 | 0.824 | 2831 |
| perplexity/llama-3-sonar-small-32k-chat | 0.758 | 0.696 | 2603 |
| openai/gpt-4 | 0.752 | 0.892 | 4120 |
| mistralai/mistral-7b-instruct | 0.742 | 0.860 | 2723 |
| openai/gpt-3.5-turbo-0125 | 0.742 | 0.843 | 3934 |
| openchat/openchat-7b | 0.740 | 0.861 | 2721 |
| mistralai/mixtral-8x7b-instruct:nitro | 0.696 | 0.893 | 2604 |
| google/gemma-7b-it | 0.685 | 0.580 | 2886 |
| microsoft/phi-3-mini-128k-instruct | 0.593 | 0.851 | 3657 |

### Biosafety level

| Model | Balanced Accuracy | Precision | Sample Size |
| --- | --- | --- | --- |
| anthropic/claude-3.5-sonnet | 0.916 | 0.671 | 5039 |
| openai/gpt-4o | 0.894 | 0.641 | 7438 |
| google/gemini-pro-1.5 | 0.888 | 0.597 | 6104 |
| google/gemini-pro | 0.884 | 0.576 | 5063 |
| meta-llama/llama-3-70b-instruct:nitro | 0.865 | 0.673 | 5027 |
| anthropic/claude-3-haiku:beta | 0.847 | 0.507 | 4743 |
| openai/gpt-4 | 0.842 | 0.603 | 7473 |
| google/gemini-flash-1.5 | 0.834 | 0.489 | 4863 |
| google/palm-2-chat-bison-32k | 0.831 | 0.749 | 6121 |
| microsoft/wizardlm-2-8x22b | 0.817 | 0.551 | 7418 |
| openai/gpt-3.5-turbo-0125 | 0.768 | 0.568 | 7180 |
| mistralai/mistral-7b-instruct | 0.759 | 0.442 | 4907 |
| perplexity/llama-3-sonar-small-32k-chat | 0.731 | 0.460 | 4717 |
| microsoft/phi-3-mini-128k-instruct | 0.725 | 0.388 | 6663 |
| mistralai/mixtral-8x7b-instruct:nitro | 0.713 | 0.616 | 4698 |
| meta-llama/llama-3-8b-instruct:nitro | 0.702 | 0.448 | 5100 |
| google/gemma-7b-it | 0.636 | 0.397 | 5220 |
| openchat/openchat-7b | 0.614 | 0.422 | 4926 |

### Health association

| Model | Balanced Accuracy | Precision | Sample Size |
| --- | --- | --- | --- |
| mistralai/mistral-7b-instruct | 0.667 | 0.500 | 15 |
| meta-llama/llama-3-8b-instruct:nitro | 0.636 | 0.467 | 18 |
| google/gemma-7b-it | 0.636 | 0.333 | 15 |
| openai/gpt-4 | 0.633 | 0.421 | 23 |
| openai/gpt-4o | 0.567 | 0.381 | 23 |
| google/gemini-pro | 0.550 | 0.471 | 18 |
| google/palm-2-chat-bison-32k | 0.536 | 0.350 | 21 |
| anthropic/claude-3-haiku:beta | 0.500 | 0.333 | 18 |
| anthropic/claude-3.5-sonnet | 0.500 | 0.294 | 17 |
| microsoft/wizardlm-2-8x22b | 0.500 | 0.348 | 23 |
| meta-llama/llama-3-70b-instruct:nitro | 0.500 | 0.316 | 19 |
| mistralai/mixtral-8x7b-instruct:nitro | 0.500 | 0.333 | 12 |
| perplexity/llama-3-sonar-small-32k-chat | 0.500 | 0.357 | 14 |
| google/gemini-flash-1.5 | 0.500 | 0.286 | 14 |
| google/gemini-pro-1.5 | 0.500 | 0.300 | 20 |
| openchat/openchat-7b | 0.482 | 0.300 | 16 |
| openai/gpt-3.5-turbo-0125 | 0.464 | 0.273 | 20 |
| microsoft/phi-3-mini-128k-instruct | 0.442 | 0.333 | 18 |

### Host association

| Model | Balanced Accuracy | Precision | Sample Size |
| --- | --- | --- | --- |
| openai/gpt-4o | 0.809 | 0.812 | 5332 |
| openai/gpt-4 | 0.787 | 0.840 | 5357 |
| anthropic/claude-3-haiku:beta | 0.777 | 0.662 | 3385 |
| google/gemini-pro-1.5 | 0.759 | 0.682 | 4381 |
| google/palm-2-chat-bison-32k | 0.759 | 0.805 | 4393 |
| google/gemini-pro | 0.734 | 0.613 | 3617 |
| anthropic/claude-3.5-sonnet | 0.733 | 0.652 | 3621 |
| openai/gpt-3.5-turbo-0125 | 0.693 | 0.853 | 5129 |
| microsoft/wizardlm-2-8x22b | 0.687 | 0.545 | 5327 |
| openchat/openchat-7b | 0.653 | 0.551 | 3513 |
| mistralai/mistral-7b-instruct | 0.651 | 0.492 | 3545 |
| meta-llama/llama-3-70b-instruct:nitro | 0.651 | 0.505 | 3607 |
| google/gemini-flash-1.5 | 0.631 | 0.484 | 3487 |
| meta-llama/llama-3-8b-instruct:nitro | 0.601 | 0.463 | 3657 |
| mistralai/mixtral-8x7b-instruct:nitro | 0.589 | 0.441 | 3369 |
| google/gemma-7b-it | 0.583 | 0.631 | 3773 |
| microsoft/phi-3-mini-128k-instruct | 0.582 | 0.461 | 4763 |
| perplexity/llama-3-sonar-small-32k-chat | 0.499 | 0.398 | 3387 |

### Plant pathogenicity

| Model | Balanced Accuracy | Precision | Sample Size |
| --- | --- | --- | --- |
| google/gemini-pro-1.5 | 0.797 | 0.263 | 6136 |
| google/gemini-flash-1.5 | 0.774 | 0.419 | 4884 |
| meta-llama/llama-3-70b-instruct:nitro | 0.772 | 0.262 | 5050 |
| anthropic/claude-3.5-sonnet | 0.756 | 0.689 | 5060 |
| microsoft/wizardlm-2-8x22b | 0.752 | 0.362 | 7453 |
| anthropic/claude-3-haiku:beta | 0.746 | 0.196 | 4760 |
| openai/gpt-4 | 0.739 | 0.202 | 7507 |
| openai/gpt-4o | 0.736 | 0.284 | 7467 |
| openai/gpt-3.5-turbo-0125 | 0.730 | 0.421 | 7206 |
| mistralai/mixtral-8x7b-instruct:nitro | 0.704 | 0.275 | 4712 |
| meta-llama/llama-3-8b-instruct:nitro | 0.699 | 0.490 | 5119 |
| google/palm-2-chat-bison-32k | 0.695 | 0.376 | 6139 |
| mistralai/mistral-7b-instruct | 0.692 | 0.273 | 4961 |
| openchat/openchat-7b | 0.689 | 0.158 | 4945 |
| google/gemini-pro | 0.667 | 0.248 | 5090 |
| perplexity/llama-3-sonar-small-32k-chat | 0.658 | 0.946 | 4744 |
| google/gemma-7b-it | 0.656 | 0.148 | 5249 |
| microsoft/phi-3-mini-128k-instruct | 0.629 | 0.216 | 6699 |

### Spore formation

| Model | Balanced Accuracy | Precision | Sample Size |
| --- | --- | --- | --- |
| google/gemini-pro-1.5 | 0.966 | 0.941 | 5944 |
| google/gemini-flash-1.5 | 0.951 | 0.909 | 4730 |
| anthropic/claude-3.5-sonnet | 0.938 | 0.982 | 4915 |
| openai/gpt-4o | 0.932 | 0.935 | 7240 |
| anthropic/claude-3-haiku:beta | 0.923 | 0.905 | 4606 |
| openai/gpt-4 | 0.922 | 0.962 | 7276 |
| meta-llama/llama-3-70b-instruct:nitro | 0.919 | 0.928 | 4899 |
| mistralai/mixtral-8x7b-instruct:nitro | 0.918 | 0.792 | 4568 |
| openai/gpt-3.5-turbo-0125 | 0.906 | 0.923 | 6996 |
| google/gemini-pro | 0.898 | 0.927 | 4938 |
| microsoft/wizardlm-2-8x22b | 0.887 | 0.955 | 7219 |
| perplexity/llama-3-sonar-small-32k-chat | 0.882 | 0.665 | 4605 |
| meta-llama/llama-3-8b-instruct:nitro | 0.866 | 0.562 | 4959 |
| mistralai/mistral-7b-instruct | 0.864 | 0.891 | 4806 |
| google/palm-2-chat-bison-32k | 0.860 | 0.937 | 5967 |
| openchat/openchat-7b | 0.760 | 0.987 | 4777 |
| microsoft/phi-3-mini-128k-instruct | 0.720 | 0.980 | 6475 |
| google/gemma-7b-it | 0.682 | 0.969 | 5082 |

### Hemolysis

| Model | Balanced Accuracy | Precision | Sample Size |
| --- | --- | --- | --- |
| google/palm-2-chat-bison-32k | 0.627 | 0.178 | 57 |
| google/gemini-flash-1.5 | 0.591 | 0.406 | 495 |
| openai/gpt-4o | 0.578 | 0.443 | 293 |
| anthropic/claude-3-haiku:beta | 0.568 | 0.415 | 460 |
| anthropic/claude-3.5-sonnet | 0.549 | 0.396 | 197 |
| microsoft/wizardlm-2-8x22b | 0.538 | 0.427 | 204 |
| google/gemini-pro-1.5 | 0.517 | 0.333 | 127 |
| meta-llama/llama-3-70b-instruct:nitro | 0.516 | 0.239 | 96 |
| openai/gpt-4 | 0.512 | 0.342 | 239 |
| openai/gpt-3.5-turbo-0125 | 0.507 | 0.296 | 254 |
| openchat/openchat-7b | 0.500 | 0.278 | 12 |
| mistralai/mixtral-8x7b-instruct:nitro | 0.500 | 0.300 | 10 |
| google/gemma-7b-it | 0.500 | 0.013 | 262 |
| perplexity/llama-3-sonar-small-32k-chat | 0.485 | 0.171 | 76 |
| meta-llama/llama-3-8b-instruct:nitro | 0.475 | 0.144 | 126 |
| microsoft/phi-3-mini-128k-instruct | 0.473 | 0.458 | 19 |
| google/gemini-pro | 0.460 | 0.585 | 152 |
| mistralai/mistral-7b-instruct | | | 3 |

### Cell shape

| Model | Balanced Accuracy | Precision | Sample Size |
| --- | --- | --- | --- |
| openai/gpt-4o | 0.785 | 0.685 | 6597 |
| anthropic/claude-3.5-sonnet | 0.784 | 0.603 | 4490 |
| meta-llama/llama-3-70b-instruct:nitro | 0.783 | 0.619 | 4462 |
| google/gemini-pro-1.5 | 0.780 | 0.583 | 5400 |
| microsoft/wizardlm-2-8x22b | 0.773 | 0.554 | 6573 |
| openai/gpt-3.5-turbo-0125 | 0.765 | 0.637 | 6343 |
| anthropic/claude-3-haiku:beta | 0.756 | 0.631 | 4206 |
| google/gemini-pro | 0.751 | 0.568 | 4467 |
| openai/gpt-4 | 0.738 | 0.601 | 6626 |
| google/gemini-flash-1.5 | 0.729 | 0.564 | 4306 |
| google/palm-2-chat-bison-32k | 0.712 | 0.576 | 5440 |
| mistralai/mixtral-8x7b-instruct:nitro | 0.702 | 0.563 | 4127 |
| mistralai/mistral-7b-instruct | 0.653 | 0.509 | 4371 |
| meta-llama/llama-3-8b-instruct:nitro | 0.641 | 0.434 | 4502 |
| openchat/openchat-7b | 0.640 | 0.324 | 4368 |
| microsoft/phi-3-mini-128k-instruct | 0.625 | 0.282 | 5886 |
| perplexity/llama-3-sonar-small-32k-chat | 0.593 | 0.279 | 4177 |
| google/gemma-7b-it | 0.523 | 0.433 | 4633 |