
litellm - A tool that simplifies calling large-model APIs

article 2025/2/21 3:02:01

More open-source AI software:

AI Open Source - 小众AI: https://www.aiinn.cn/sources

11,000 stars, 1,300 forks, 445 issues, 275 contributors, MIT License, written in Python

Code: GitHub - BerriAI/litellm: Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

Homepage: https://docs.litellm.ai/

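The litellm project converts the interfaces of many AI models and services into the OpenAI format, which simplifies switching between and managing different AI services and large models. It also supports setting budgets, rate-limiting requests, managing API keys, and configuring an OpenAI-compatible proxy server.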


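Key features

Call 100+ large language models (LLMs) using the same input/output format

Translates inputs into the format each provider requires for its completion, embedding, and image generation endpoints
Keeps output consistent: text responses are always available at ['choices'][0]['message']['content']
Retry/fallback logic across multiple deployments (e.g. Azure, OpenAI) - Router
Tracks spend and sets budgets per project - OpenAI proxy server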
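
The retry/fallback idea from the list above can be sketched in a few lines. This is not LiteLLM's Router, only a minimal illustration of the pattern: `complete_with_fallback` and its parameters are hypothetical names, and `call` stands for any completion-style callable (for example `litellm.completion`) that accepts `model` and `messages`.

```python
import time
from typing import Any, Callable, Sequence


def complete_with_fallback(
    call: Callable[..., Any],
    models: Sequence[str],
    messages: list,
    retries_per_model: int = 2,
    backoff_seconds: float = 0.0,
) -> Any:
    """Try each model in order, retrying a few times before moving to the next."""
    last_error = None
    for model in models:
        for attempt in range(retries_per_model):
            try:
                return call(model=model, messages=messages)
            except Exception as exc:  # a real router would only retry transient errors
                last_error = exc
                # linear backoff: wait longer after each failed attempt
                time.sleep(backoff_seconds * (attempt + 1))
    raise RuntimeError(f"all models failed: {list(models)}") from last_error
```

Passing the callable in, rather than importing it, keeps the sketch testable with a stub and makes no assumption about which exceptions a given provider raises.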

Usage

LiteLLM v1.0.0 now requires openai>=1.0.0. Migration guide here
LiteLLM v1.40.14+ now requires pydantic>=2.0.0. No changes required.

pip install litellm

from litellm import completion
import os

## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["COHERE_API_KEY"] = "your-cohere-key"

messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion(model="command-nightly", messages=messages)
print(response)
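
Since text responses are described as always sitting at ['choices'][0]['message']['content'], a small helper can pull the text out regardless of which provider answered. The sketch below uses a hand-written dict in place of a real response, so it runs without any API key; `response_text` is a hypothetical helper name, not part of litellm.

```python
from typing import Any


def response_text(response: Any) -> str:
    """Return the assistant text from an OpenAI-format completion response."""
    content = response["choices"][0]["message"]["content"]
    # content can be None (e.g. for some non-text replies), so normalise to ""
    return content or ""


# Hand-written stand-in with the same shape as a completion response.
sample = {"choices": [{"message": {"role": "assistant", "content": "I'm fine, thanks!"}}]}
print(response_text(sample))
```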

Call any model supported by a provider, with model=<provider_name>/<model_name>. There might be provider-specific details here, so refer to provider docs for more information
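
To make the `<provider_name>/<model_name>` convention concrete, here is a small sketch that splits such a string. It only illustrates the naming scheme; `split_model` is a hypothetical helper and not how litellm itself resolves providers. Splitting on the first slash only matters because some model names contain slashes themselves.

```python
from typing import Optional, Tuple


def split_model(model: str) -> Tuple[Optional[str], str]:
    """Split 'provider/model' on the first slash; provider is None if absent."""
    provider, sep, name = model.partition("/")
    if not sep:
        return None, model
    return provider, name


print(split_model("huggingface/bigcode/starcoder"))
print(split_model("gpt-3.5-turbo"))
```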

Async

from litellm import acompletion
import asyncio

async def test_get_response():
    user_message = "Hello, how are you?"
    messages = [{"content": user_message, "role": "user"}]
    response = await acompletion(model="gpt-3.5-turbo", messages=messages)
    return response

response = asyncio.run(test_get_response())
print(response)
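
The main benefit of the async call is running several requests concurrently. A minimal sketch, assuming `acall` is any coroutine function with the same `model`/`messages` keywords as `acompletion`; `ask_many` and `max_concurrency` are hypothetical names introduced for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


async def ask_many(
    acall: Callable[..., Awaitable[Any]],
    model: str,
    prompts: List[str],
    max_concurrency: int = 5,
) -> List[Any]:
    """Send one request per prompt with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(prompt: str) -> Any:
        async with semaphore:
            return await acall(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )

    # gather returns results in the same order as the prompts
    return await asyncio.gather(*(one(p) for p in prompts))
```

With litellm installed, this would be called as `asyncio.run(ask_many(acompletion, "gpt-3.5-turbo", ["Hi", "Bye"]))`. The semaphore is there so a long prompt list does not open hundreds of connections at once.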

Streaming

LiteLLM supports streaming the model response back. Pass stream=True to get a streaming iterator in the response.
Streaming is supported for all models (Bedrock, Huggingface, TogetherAI, Azure, OpenAI, etc.)

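from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call: each chunk carries an incremental delta
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for part in response:
    print(part.choices[0].delta.content or "")

# claude 2 (needs ANTHROPIC_API_KEY in the environment)
response = completion(model="claude-2", messages=messages, stream=True)
for part in response:
    print(part.choices[0].delta.content or "")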

Logging

LiteLLM exposes predefined callbacks to send data to Lunary, Langfuse, DynamoDB, S3 buckets, Helicone, Promptlayer, Traceloop, Athina, Slack

import os

import litellm
from litellm import completion

## set env variables for logging tools
os.environ["LUNARY_PUBLIC_KEY"] = "your-lunary-public-key"
os.environ["HELICONE_API_KEY"] = "your-helicone-auth-key"
os.environ["LANGFUSE_PUBLIC_KEY"] = ""
os.environ["LANGFUSE_SECRET_KEY"] = ""
os.environ["ATHINA_API_KEY"] = "your-athina-api-key"

os.environ["OPENAI_API_KEY"] = "your-openai-key"

# set callbacks
litellm.success_callback = ["lunary", "langfuse", "athina", "helicone"]  # log input/output to lunary, langfuse, athina, helicone

#openai call
response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])
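
Besides the named integrations, a plain Python function can act as a logging hook. The sketch below is written under an assumption: that a custom success callback receives `(kwargs, completion_response, start_time, end_time)` with datetime timestamps, as the LiteLLM custom-callback docs describe; verify that signature against the version you have installed. `record_call` and `call_log` are hypothetical names, and the function is exercised directly here so the block runs without litellm.

```python
from datetime import datetime, timedelta

call_log = []


def record_call(kwargs, completion_response, start_time, end_time):
    """Record which model was called and how long the call took."""
    call_log.append({
        "model": kwargs.get("model"),
        "latency_s": (end_time - start_time).total_seconds(),
    })


# Registration (assumption, see above):
#   litellm.success_callback = [record_call]

# Direct call with made-up timestamps to show the recorded shape.
start = datetime(2025, 1, 1, 12, 0, 0)
record_call({"model": "gpt-3.5-turbo"}, None, start, start + timedelta(seconds=1.5))
print(call_log)
```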
