Model Rankings

Available Phenotypes

Motility: The ability of bacteria to move independently, typically via flagella. Species are classified as motile or non-motile, a distinction relevant to ecological adaptation and pathogenicity assessment.
Gram staining: A technique that classifies bacteria as Gram-positive, Gram-negative, or Gram-variable based on their cell wall structure; Gram-variable results arise from factors such as peptidoglycan thickness or inconsistent staining.
Aerophilicity: The ability to utilize or tolerate oxygen. Bacteria are classified as aerobic (require oxygen), aerotolerant (tolerate but do not use oxygen), anaerobic (cannot grow with oxygen), or facultatively anaerobic (can grow with or without oxygen), a classification that is crucial for understanding their metabolic processes and environmental adaptations.
Extreme environment tolerance: The ability of some species to survive and thrive in harsh conditions.
Biofilm formation: The ability to organize into structured communities within a self-produced protective matrix.
Animal pathogenicity: The ability to infect and cause illness in animal hosts.
Biosafety level: Lab safety measures for working with bacteria, ranging from BSL-1 (minimal risk) to BSL-3 (moderate to high risk), based on the bacterium's ability to cause infection and the severity of the disease.
Health association: The ability of certain bacterial strains to either support or harm human health.
Host association: The ability of certain bacterial strains to establish relationships with specific hosts.
Plant pathogenicity: The ability of bacterial strains to infect plants.
Spore formation: The ability of certain bacteria to form spores, which allows them to survive extreme environmental stress.
Hemolysis: The ability of bacteria to lyse red blood cells, leading to observable changes on blood agar, such as partial (alpha), complete (beta), or no hemolysis (gamma).
Cell shape: The morphology of the bacterial cell, with common forms including coccus (spherical), rod (cylindrical), ovoid (egg-shaped), and spiral (twisted or curved).

Motility

Model	Balanced Accuracy	Precision	Sample Size
anthropic/claude-3.5-sonnet	0.901	0.832	4015
openai/gpt-4o	0.862	0.780	5927
openai/gpt-4	0.839	0.832	5949
microsoft/wizardlm-2-8x22b	0.838	0.797	5910
google/gemini-flash-1.5	0.820	0.752	3852
anthropic/claude-3-haiku:beta	0.816	0.715	3778
google/gemini-pro	0.808	0.743	4038
meta-llama/llama-3-70b-instruct:nitro	0.803	0.696	3999
openai/gpt-3.5-turbo-0125	0.790	0.690	5770
mistralai/mixtral-8x7b-instruct:nitro	0.781	0.855	3732
meta-llama/llama-3-8b-instruct:nitro	0.768	0.664	4065
google/palm-2-chat-bison-32k	0.755	0.667	4844
google/gemini-pro-1.5	0.754	0.645	4822
openchat/openchat-7b	0.704	0.775	3905
mistralai/mistral-7b-instruct	0.694	0.652	3940
microsoft/phi-3-mini-128k-instruct	0.693	0.781	5345
perplexity/llama-3-sonar-small-32k-chat	0.645	0.555	3767
google/gemma-7b-it	0.522	0.477	4160
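
The tables report Balanced Accuracy and Precision without stating how they are computed. As a minimal sketch, assuming the standard definitions (balanced accuracy as the mean of per-class recall, precision as the share of positive predictions that are correct), the two metrics could be computed for a binary phenotype such as motility as follows. The function names and the toy labels are illustrative and not taken from the benchmark; how precision is aggregated for multi-class phenotypes is not specified in the source.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


def precision(y_true, y_pred, positive):
    """Fraction of predictions of `positive` that are correct."""
    predicted = sum(1 for p in y_pred if p == positive)
    if predicted == 0:
        return float("nan")  # undefined when the class is never predicted
    correct = sum(
        1 for t, p in zip(y_true, y_pred) if p == positive and t == positive
    )
    return correct / predicted


# Toy labels (hypothetical, for illustration only).
y_true = ["motile", "motile", "motile", "non-motile"]
y_pred = ["motile", "motile", "non-motile", "non-motile"]

print(round(balanced_accuracy(y_true, y_pred), 3))  # 0.833
print(precision(y_true, y_pred, "motile"))          # 1.0
```

Under this definition, a model that always predicts the same class scores 0.5 on a binary phenotype, which gives a baseline for reading the Balanced Accuracy column.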

Gram staining

Model	Balanced Accuracy	Precision	Sample Size
anthropic/claude-3.5-sonnet	0.819	0.652	3875
openai/gpt-4	0.812	0.646	3878
meta-llama/llama-3-70b-instruct:nitro	0.811	0.644	3876
openai/gpt-4o	0.811	0.653	3878
openai/gpt-3.5-turbo-0125	0.808	0.643	3878
google/gemini-pro-1.5	0.806	0.641	3871
microsoft/wizardlm-2-8x22b	0.805	0.641	3867
anthropic/claude-3-haiku:beta	0.804	0.640	3870
openchat/openchat-7b	0.799	0.606	3875
google/gemini-flash-1.5	0.794	0.631	3877
google/gemini-pro	0.789	0.625	3842
mistralai/mixtral-8x7b-instruct:nitro	0.780	0.618	3878
google/palm-2-chat-bison-32k	0.779	0.620	3876
perplexity/llama-3-sonar-small-32k-chat	0.741	0.583	3875
meta-llama/llama-3-8b-instruct:nitro	0.707	0.584	3793
mistralai/mistral-7b-instruct	0.666	0.634	3843
microsoft/phi-3-mini-128k-instruct	0.657	0.536	3869
google/gemma-7b-it	0.595	0.530	3847

Aerophilicity

Model	Balanced Accuracy	Precision	Sample Size
openai/gpt-4o	0.897	0.736	4356
google/gemini-flash-1.5	0.841	0.702	1507
google/gemini-pro-1.5	0.822	0.643	5635
meta-llama/llama-3-70b-instruct:nitro	0.782	0.564	2331
anthropic/claude-3.5-sonnet	0.772	0.522	3820
openai/gpt-4	0.761	0.508	5551
google/gemini-pro	0.716	0.392	1509
google/palm-2-chat-bison-32k	0.709	0.406	2398
microsoft/wizardlm-2-8x22b	0.699	0.537	5085
openai/gpt-3.5-turbo-0125	0.671	0.514	6486
openchat/openchat-7b	0.595	0.563	43
mistralai/mixtral-8x7b-instruct:nitro	0.534	0.551	2226
mistralai/mistral-7b-instruct	0.502	0.461	1179
microsoft/phi-3-mini-128k-instruct	0.496	0.426	707
meta-llama/llama-3-8b-instruct:nitro	0.399	0.396	1517
anthropic/claude-3-haiku:beta	0.397	0.267	7
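
Sample sizes in the Aerophilicity table range from 7 to 6486, so scores at the bottom of the range rest on very few predictions. The sketch below parses a few rows copied from the table and separates models that fall under a minimum sample size. The cutoff `MIN_SAMPLES` and the helper names are hypothetical choices for illustration, not part of the benchmark.

```python
import csv
import io

# Three rows copied from the Aerophilicity table (tab-separated).
TABLE = (
    "Model\tBalanced Accuracy\tPrecision\tSample Size\n"
    "openai/gpt-4o\t0.897\t0.736\t4356\n"
    "openchat/openchat-7b\t0.595\t0.563\t43\n"
    "anthropic/claude-3-haiku:beta\t0.397\t0.267\t7\n"
)

MIN_SAMPLES = 100  # hypothetical cutoff, not defined by the source


def load_rows(text):
    """Parse a tab-separated ranking table into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        {
            "model": r["Model"],
            "balanced_accuracy": float(r["Balanced Accuracy"]),
            "precision": float(r["Precision"]),
            "sample_size": int(r["Sample Size"]),
        }
        for r in reader
    ]


def split_by_support(rows, min_samples=MIN_SAMPLES):
    """Return (kept, flagged): rows at or above the cutoff, and rows below it."""
    kept = [r for r in rows if r["sample_size"] >= min_samples]
    flagged = [r for r in rows if r["sample_size"] < min_samples]
    return kept, flagged


kept, flagged = split_by_support(load_rows(TABLE))
for r in flagged:
    print(f"low support: {r['model']} (n={r['sample_size']})")
```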

Extreme environment tolerance

Model	Balanced Accuracy	Precision	Sample Size
anthropic/claude-3.5-sonnet	0.723	0.240	4559
openai/gpt-4	0.719	0.442	6739
microsoft/wizardlm-2-8x22b	0.687	0.228	6687
openai/gpt-4o	0.664	0.174	6704
google/gemini-flash-1.5	0.639	0.152	4369
google/gemini-pro-1.5	0.636	0.160	5520
perplexity/llama-3-sonar-small-32k-chat	0.633	0.181	4241
openai/gpt-3.5-turbo-0125	0.624	0.619	6454
meta-llama/llama-3-70b-instruct:nitro	0.623	0.143	4542
google/palm-2-chat-bison-32k	0.617	0.648	5506
mistralai/mixtral-8x7b-instruct:nitro	0.612	0.148	4239
google/gemini-pro	0.604	0.407	4574
microsoft/phi-3-mini-128k-instruct	0.579	0.134	5993
mistralai/mistral-7b-instruct	0.579	0.120	4443
meta-llama/llama-3-8b-instruct:nitro	0.567	0.116	4594
openchat/openchat-7b	0.562	0.237	4443
anthropic/claude-3-haiku:beta	0.533	0.104	4287
google/gemma-7b-it	0.500	0.100	4711

Biofilm formation

Model	Balanced Accuracy	Precision	Sample Size
openai/gpt-4	0.619	0.774	507
mistralai/mistral-7b-instruct	0.602	0.757	332
openchat/openchat-7b	0.600	0.778	338
google/gemini-pro-1.5	0.572	0.734	426
openai/gpt-4o	0.557	0.736	504
google/gemini-pro	0.551	0.743	357
google/palm-2-chat-bison-32k	0.529	0.721	411
anthropic/claude-3.5-sonnet	0.504	0.722	340
mistralai/mixtral-8x7b-instruct:nitro	0.500	0.716	328
perplexity/llama-3-sonar-small-32k-chat	0.500	0.719	317
google/gemma-7b-it	0.500	0.713	363
meta-llama/llama-3-8b-instruct:nitro	0.500	0.692	347
anthropic/claude-3-haiku:beta	0.498	0.719	321
google/gemini-flash-1.5	0.497	0.707	319
microsoft/wizardlm-2-8x22b	0.497	0.708	505
openai/gpt-3.5-turbo-0125	0.497	0.712	477
microsoft/phi-3-mini-128k-instruct	0.491	0.713	451
meta-llama/llama-3-70b-instruct:nitro	0.490	0.708	344

Animal pathogenicity

Model	Balanced Accuracy	Precision	Sample Size
google/gemini-flash-1.5	0.847	0.799	2670
anthropic/claude-3-haiku:beta	0.843	0.803	2598
google/gemini-pro	0.833	0.791	2758
meta-llama/llama-3-70b-instruct:nitro	0.833	0.848	2754
openai/gpt-4o	0.818	0.868	4097
google/gemini-pro-1.5	0.817	0.822	3374
microsoft/wizardlm-2-8x22b	0.803	0.715	4093
anthropic/claude-3.5-sonnet	0.792	0.882	2805
google/palm-2-chat-bison-32k	0.790	0.831	3359
meta-llama/llama-3-8b-instruct:nitro	0.779	0.824	2831
perplexity/llama-3-sonar-small-32k-chat	0.758	0.696	2603
openai/gpt-4	0.752	0.892	4120
mistralai/mistral-7b-instruct	0.742	0.860	2723
openai/gpt-3.5-turbo-0125	0.742	0.843	3934
openchat/openchat-7b	0.740	0.861	2721
mistralai/mixtral-8x7b-instruct:nitro	0.696	0.893	2604
google/gemma-7b-it	0.685	0.580	2886
microsoft/phi-3-mini-128k-instruct	0.593	0.851	3657

Biosafety level

Model	Balanced Accuracy	Precision	Sample Size
anthropic/claude-3.5-sonnet	0.916	0.671	5039
openai/gpt-4o	0.894	0.641	7438
google/gemini-pro-1.5	0.888	0.597	6104
google/gemini-pro	0.884	0.576	5063
meta-llama/llama-3-70b-instruct:nitro	0.865	0.673	5027
anthropic/claude-3-haiku:beta	0.847	0.507	4743
openai/gpt-4	0.842	0.603	7473
google/gemini-flash-1.5	0.834	0.489	4863
google/palm-2-chat-bison-32k	0.831	0.749	6121
microsoft/wizardlm-2-8x22b	0.817	0.551	7418
openai/gpt-3.5-turbo-0125	0.768	0.568	7180
mistralai/mistral-7b-instruct	0.759	0.442	4907
perplexity/llama-3-sonar-small-32k-chat	0.731	0.460	4717
microsoft/phi-3-mini-128k-instruct	0.725	0.388	6663
mistralai/mixtral-8x7b-instruct:nitro	0.713	0.616	4698
meta-llama/llama-3-8b-instruct:nitro	0.702	0.448	5100
google/gemma-7b-it	0.636	0.397	5220
openchat/openchat-7b	0.614	0.422	4926

Health association

Model	Balanced Accuracy	Precision	Sample Size
mistralai/mistral-7b-instruct	0.667	0.500	15
meta-llama/llama-3-8b-instruct:nitro	0.636	0.467	18
google/gemma-7b-it	0.636	0.333	15
openai/gpt-4	0.633	0.421	23
openai/gpt-4o	0.567	0.381	23
google/gemini-pro	0.550	0.471	18
google/palm-2-chat-bison-32k	0.536	0.350	21
anthropic/claude-3-haiku:beta	0.500	0.333	18
anthropic/claude-3.5-sonnet	0.500	0.294	17
microsoft/wizardlm-2-8x22b	0.500	0.348	23
meta-llama/llama-3-70b-instruct:nitro	0.500	0.316	19
mistralai/mixtral-8x7b-instruct:nitro	0.500	0.333	12
perplexity/llama-3-sonar-small-32k-chat	0.500	0.357	14
google/gemini-flash-1.5	0.500	0.286	14
google/gemini-pro-1.5	0.500	0.300	20
openchat/openchat-7b	0.482	0.300	16
openai/gpt-3.5-turbo-0125	0.464	0.273	20
microsoft/phi-3-mini-128k-instruct	0.442	0.333	18

Host association

Model	Balanced Accuracy	Precision	Sample Size
openai/gpt-4o	0.809	0.812	5332
openai/gpt-4	0.787	0.840	5357
anthropic/claude-3-haiku:beta	0.777	0.662	3385
google/gemini-pro-1.5	0.759	0.682	4381
google/palm-2-chat-bison-32k	0.759	0.805	4393
google/gemini-pro	0.734	0.613	3617
anthropic/claude-3.5-sonnet	0.733	0.652	3621
openai/gpt-3.5-turbo-0125	0.693	0.853	5129
microsoft/wizardlm-2-8x22b	0.687	0.545	5327
openchat/openchat-7b	0.653	0.551	3513
mistralai/mistral-7b-instruct	0.651	0.492	3545
meta-llama/llama-3-70b-instruct:nitro	0.651	0.505	3607
google/gemini-flash-1.5	0.631	0.484	3487
meta-llama/llama-3-8b-instruct:nitro	0.601	0.463	3657
mistralai/mixtral-8x7b-instruct:nitro	0.589	0.441	3369
google/gemma-7b-it	0.583	0.631	3773
microsoft/phi-3-mini-128k-instruct	0.582	0.461	4763
perplexity/llama-3-sonar-small-32k-chat	0.499	0.398	3387

Plant pathogenicity

Model	Balanced Accuracy	Precision	Sample Size
google/gemini-pro-1.5	0.797	0.263	6136
google/gemini-flash-1.5	0.774	0.419	4884
meta-llama/llama-3-70b-instruct:nitro	0.772	0.262	5050
anthropic/claude-3.5-sonnet	0.756	0.689	5060
microsoft/wizardlm-2-8x22b	0.752	0.362	7453
anthropic/claude-3-haiku:beta	0.746	0.196	4760
openai/gpt-4	0.739	0.202	7507
openai/gpt-4o	0.736	0.284	7467
openai/gpt-3.5-turbo-0125	0.730	0.421	7206
mistralai/mixtral-8x7b-instruct:nitro	0.704	0.275	4712
meta-llama/llama-3-8b-instruct:nitro	0.699	0.490	5119
google/palm-2-chat-bison-32k	0.695	0.376	6139
mistralai/mistral-7b-instruct	0.692	0.273	4961
openchat/openchat-7b	0.689	0.158	4945
google/gemini-pro	0.667	0.248	5090
perplexity/llama-3-sonar-small-32k-chat	0.658	0.946	4744
google/gemma-7b-it	0.656	0.148	5249
microsoft/phi-3-mini-128k-instruct	0.629	0.216	6699

Spore formation

Model	Balanced Accuracy	Precision	Sample Size
google/gemini-pro-1.5	0.966	0.941	5944
google/gemini-flash-1.5	0.951	0.909	4730
anthropic/claude-3.5-sonnet	0.938	0.982	4915
openai/gpt-4o	0.932	0.935	7240
anthropic/claude-3-haiku:beta	0.923	0.905	4606
openai/gpt-4	0.922	0.962	7276
meta-llama/llama-3-70b-instruct:nitro	0.919	0.928	4899
mistralai/mixtral-8x7b-instruct:nitro	0.918	0.792	4568
openai/gpt-3.5-turbo-0125	0.906	0.923	6996
google/gemini-pro	0.898	0.927	4938
microsoft/wizardlm-2-8x22b	0.887	0.955	7219
perplexity/llama-3-sonar-small-32k-chat	0.882	0.665	4605
meta-llama/llama-3-8b-instruct:nitro	0.866	0.562	4959
mistralai/mistral-7b-instruct	0.864	0.891	4806
google/palm-2-chat-bison-32k	0.860	0.937	5967
openchat/openchat-7b	0.760	0.987	4777
microsoft/phi-3-mini-128k-instruct	0.720	0.980	6475
google/gemma-7b-it	0.682	0.969	5082

Hemolysis

Model	Balanced Accuracy	Precision	Sample Size
google/palm-2-chat-bison-32k	0.627	0.178	57
google/gemini-flash-1.5	0.591	0.406	495
openai/gpt-4o	0.578	0.443	293
anthropic/claude-3-haiku:beta	0.568	0.415	460
anthropic/claude-3.5-sonnet	0.549	0.396	197
microsoft/wizardlm-2-8x22b	0.538	0.427	204
google/gemini-pro-1.5	0.517	0.333	127
meta-llama/llama-3-70b-instruct:nitro	0.516	0.239	96
openai/gpt-4	0.512	0.342	239
openai/gpt-3.5-turbo-0125	0.507	0.296	254
openchat/openchat-7b	0.500	0.278	12
mistralai/mixtral-8x7b-instruct:nitro	0.500	0.300	10
google/gemma-7b-it	0.500	0.013	262
perplexity/llama-3-sonar-small-32k-chat	0.485	0.171	76
meta-llama/llama-3-8b-instruct:nitro	0.475	0.144	126
microsoft/phi-3-mini-128k-instruct	0.473	0.458	19
google/gemini-pro	0.460	0.585	152
mistralai/mistral-7b-instruct			3
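
The last row of the Hemolysis table has empty Balanced Accuracy and Precision cells. A parser that calls `float()` on every cell would fail on that row. A minimal sketch of a tolerant parser, which maps empty cells to `None` and sorts such rows last, might look like this; the helper names are hypothetical.

```python
# Three rows copied from the Hemolysis table (tab-separated).
ROWS = [
    "google/gemini-pro\t0.460\t0.585\t152",
    "google/palm-2-chat-bison-32k\t0.627\t0.178\t57",
    "mistralai/mistral-7b-instruct\t\t\t3",
]


def parse_metric(cell):
    """Return the cell as a float, or None when the cell is empty."""
    cell = cell.strip()
    return float(cell) if cell else None


def parse_row(line):
    """Split one tab-separated row into its four fields."""
    model, bal_acc, prec, n = line.rstrip("\n").split("\t")
    return {
        "model": model,
        "balanced_accuracy": parse_metric(bal_acc),
        "precision": parse_metric(prec),
        "sample_size": int(n),
    }


def rank(rows):
    """Sort by balanced accuracy, descending; rows without a score go last."""
    return sorted(
        rows,
        key=lambda r: (
            r["balanced_accuracy"] is None,
            -(r["balanced_accuracy"] or 0.0),
        ),
    )


for r in rank([parse_row(line) for line in ROWS]):
    print(r["model"], r["balanced_accuracy"])
```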

Cell shape

Model	Balanced Accuracy	Precision	Sample Size
openai/gpt-4o	0.785	0.685	6597
anthropic/claude-3.5-sonnet	0.784	0.603	4490
meta-llama/llama-3-70b-instruct:nitro	0.783	0.619	4462
google/gemini-pro-1.5	0.780	0.583	5400
microsoft/wizardlm-2-8x22b	0.773	0.554	6573
openai/gpt-3.5-turbo-0125	0.765	0.637	6343
anthropic/claude-3-haiku:beta	0.756	0.631	4206
google/gemini-pro	0.751	0.568	4467
openai/gpt-4	0.738	0.601	6626
google/gemini-flash-1.5	0.729	0.564	4306
google/palm-2-chat-bison-32k	0.712	0.576	5440
mistralai/mixtral-8x7b-instruct:nitro	0.702	0.563	4127
mistralai/mistral-7b-instruct	0.653	0.509	4371
meta-llama/llama-3-8b-instruct:nitro	0.641	0.434	4502
openchat/openchat-7b	0.640	0.324	4368
microsoft/phi-3-mini-128k-instruct	0.625	0.282	5886
perplexity/llama-3-sonar-small-32k-chat	0.593	0.279	4177
google/gemma-7b-it	0.523	0.433	4633