BentoML

An inference platform built for speed and control. It deploys any AI/ML model anywhere, with tailored optimization, efficient scaling, and streamlined operations, and aims to simplify inference infrastructure while leaving you in full control of your deployments.

Free Limits

Hardware-dependent performance

Community Votes

1

■ Available Models

Llama 3 8B Instruct
OpenLLM Generic

■ Tags

Inference, Deployment, Model Serving, LLM Serving, MLOps, Containerization, Scalability, Cloud, On-Premise, Hybrid Cloud

■ Related Providers

An LLM routing platform that aggregates free models from multiple vendors. The latest free models as of May 2026 include Owl Alpha, NVIDIA Nemotron 3 Supe...

Google AI Studio

Google AI Studio is a web-based prototyping environment. In December 2025 its free quota was cut sharply, by 50-80%. Gemini 2.5 P...

Mistral (La Plateforme)

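The experimental plan from European AI giant Mistral. Requires phone number verification and consent to data being used for training. The free tier supports a rate of 1 request/second, 500K TPM, and, per model per month, roughly 1B T...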

Hugging Face Inference

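Hugging Face's serverless inference API, with access to 200+ models. About $0.10 of free credit per month; regular users get roughly a few hundred requests per hour. Suited to rapid prototyping...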