Databricks

LiteLLM supports all models on Databricks

tip

We support ALL Databricks models. Just set model=databricks/<any-model-on-databricks> as a prefix when sending LiteLLM requests.

Authentication

LiteLLM supports multiple authentication methods for Databricks, listed in order of preference:

OAuth Machine-to-Machine (M2M) - Recommended

OAuth Machine-to-Machine authentication using Service Principal credentials is the recommended method for production deployments, per Databricks Partner requirements.

import os
from litellm import completion

# Set OAuth credentials (Service Principal)
os.environ["DATABRICKS_CLIENT_ID"] = "your-service-principal-application-id"
os.environ["DATABRICKS_CLIENT_SECRET"] = "your-service-principal-secret"
os.environ["DATABRICKS_API_BASE"] = "https://adb-xxx.azuredatabricks.net/serving-endpoints"

response = completion(
    model="databricks/databricks-dbrx-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)

Personal Access Token (PAT)

PAT authentication is supported for development and testing scenarios.

import os
from litellm import completion

os.environ["DATABRICKS_API_KEY"] = "dapi..." # Your Personal Access Token
os.environ["DATABRICKS_API_BASE"] = "https://adb-xxx.azuredatabricks.net/serving-endpoints"

response = completion(
    model="databricks/databricks-dbrx-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)

Databricks SDK Authentication (Automatic)

If no credentials are provided, LiteLLM will use the Databricks SDK for automatic authentication. This supports OAuth, Azure AD, and other unified auth methods configured in your environment.

from litellm import completion

# No environment variables needed - uses Databricks SDK unified auth
# Requires: pip install databricks-sdk
response = completion(
    model="databricks/databricks-dbrx-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)

Custom User-Agent for Partner Attribution

If you're building a product on top of LiteLLM that integrates with Databricks, you can pass your own partner identifier for proper attribution in Databricks telemetry.

The partner name will be prefixed to the LiteLLM user agent:

import os
from litellm import completion

# Via parameter
response = completion(
    model="databricks/databricks-dbrx-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
    user_agent="mycompany/1.0.0",
)
# Resulting User-Agent: mycompany_litellm/1.79.1

# Via environment variable
os.environ["DATABRICKS_USER_AGENT"] = "mycompany/1.0.0"
# Resulting User-Agent: mycompany_litellm/1.79.1
| Input                  | Resulting User-Agent           |
|------------------------|--------------------------------|
| (none)                 | litellm/1.79.1                 |
| mycompany/1.0.0        | mycompany_litellm/1.79.1       |
| partner_product/2.5.0  | partner_product_litellm/1.79.1 |
| acme                   | acme_litellm/1.79.1            |

Note: The version from your custom user agent is ignored; LiteLLM's version is always used.

Security

LiteLLM automatically redacts sensitive information (tokens, secrets, API keys) from all debug logs to prevent credential leakage. This includes:

  • Authorization headers
  • API keys and tokens
  • Client secrets
  • Personal access tokens (PATs)
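For example, with debug logging enabled, request headers are printed with credential values masked. A minimal sketch (litellm._turn_on_debug() enables LiteLLM's verbose debug output):

import os
import litellm
from litellm import completion

litellm._turn_on_debug()  # verbose debug logs; secrets are redacted

os.environ["DATABRICKS_API_KEY"] = "dapi..."  # appears masked in log output
os.environ["DATABRICKS_API_BASE"] = "https://adb-xxx.azuredatabricks.net/serving-endpoints"

response = completion(
    model="databricks/databricks-dbrx-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)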

Usage

ENV VAR

import os
os.environ["DATABRICKS_API_KEY"] = ""
os.environ["DATABRICKS_API_BASE"] = ""

Example Call

from litellm import completion
import os
## set ENV variables
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks base url" # e.g.: https://adb-3064715882934586.6.azuredatabricks.net/serving-endpoints

# Databricks dbrx-instruct call
response = completion(
    model="databricks/databricks-dbrx-instruct",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)

Passing additional params - max_tokens, temperature

See all litellm.completion supported params here

# !pip install litellm
from litellm import completion
import os
## set ENV variables
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks api base"

# databricks dbrx call
response = completion(
    model="databricks/databricks-dbrx-instruct",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    max_tokens=20,
    temperature=0.5,
)

proxy

model_list:
  - model_name: llama-3
    litellm_params:
      model: databricks/databricks-meta-llama-3-70b-instruct
      api_key: os.environ/DATABRICKS_API_KEY
      max_tokens: 20
      temperature: 0.5
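Once the proxy is running (litellm --config config.yaml), the model can be called through any OpenAI-compatible client. A minimal sketch, assuming the proxy listens on the default http://0.0.0.0:4000 with an example virtual key:

from openai import OpenAI

client = OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="llama-3",  # the model_name from the config above
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)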

Usage - Thinking / reasoning_content

LiteLLM translates OpenAI's reasoning_effort to Anthropic's thinking parameter:

| reasoning_effort | thinking               |
|------------------|------------------------|
| "low"            | "budget_tokens": 1024  |
| "medium"         | "budget_tokens": 2048  |
| "high"           | "budget_tokens": 4096  |

Known Limitations:

  • Passing thinking blocks back to Claude is not yet supported (tracked in a GitHub issue).

from litellm import completion
import os

# set ENV variables (can also be passed in to .completion() - e.g. `api_base`, `api_key`)
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks base url"

resp = completion(
    model="databricks/databricks-claude-3-7-sonnet",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    reasoning_effort="low",
)

Expected Response

ModelResponse(
    id='chatcmpl-c542d76d-f675-4e87-8e5f-05855f5d0f5e',
    created=1740470510,
    model='claude-3-7-sonnet-20250219',
    object='chat.completion',
    system_fingerprint=None,
    choices=[
        Choices(
            finish_reason='stop',
            index=0,
            message=Message(
                content="The capital of France is Paris.",
                role='assistant',
                tool_calls=None,
                function_call=None,
                provider_specific_fields={
                    'citations': None,
                    'thinking_blocks': [
                        {
                            'type': 'thinking',
                            'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
                            'signature': 'EuYBCkQYAiJAy6...'
                        }
                    ]
                }
            ),
            thinking_blocks=[
                {
                    'type': 'thinking',
                    'thinking': 'The capital of France is Paris. This is a very straightforward factual question.',
                    'signature': 'EuYBCkQYAiJAy6AGB...'
                }
            ],
            reasoning_content='The capital of France is Paris. This is a very straightforward factual question.'
        )
    ],
    usage=Usage(
        completion_tokens=68,
        prompt_tokens=42,
        total_tokens=110,
        completion_tokens_details=None,
        prompt_tokens_details=PromptTokensDetailsWrapper(
            audio_tokens=None,
            cached_tokens=0,
            text_tokens=None,
            image_tokens=None
        ),
        cache_creation_input_tokens=0,
        cache_read_input_tokens=0
    )
)

Citations

Anthropic models served through Databricks can return citation metadata. LiteLLM exposes these via response.choices[0].message.provider_specific_fields["citations"].
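A minimal sketch of requesting and reading citations, assuming Anthropic-style document content blocks (with citations enabled) are passed through to the Databricks-served Claude endpoint:

from litellm import completion

resp = completion(
    model="databricks/databricks-claude-3-7-sonnet",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    # Document block the model can cite from
                    "type": "document",
                    "source": {
                        "type": "text",
                        "media_type": "text/plain",
                        "data": "The grass is green. The sky is blue.",
                    },
                    "citations": {"enabled": True},
                },
                {"type": "text", "text": "What color is the grass?"},
            ],
        }
    ],
)

# Citation metadata, if returned by the endpoint
citations = resp.choices[0].message.provider_specific_fields["citations"]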

Pass thinking to Anthropic models

You can also pass the thinking parameter to Anthropic models.

from litellm import completion
import os

# set ENV variables (can also be passed in to .completion() - e.g. `api_base`, `api_key`)
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks base url"

response = completion(
    model="databricks/databricks-claude-3-7-sonnet",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    thinking={"type": "enabled", "budget_tokens": 1024},
)

Supported Databricks Chat Completion Models

tip

We support ALL Databricks models. Just set model=databricks/<any-model-on-databricks> as a prefix when sending LiteLLM requests.

| Model Name                               | Command                                                                                   |
|------------------------------------------|-------------------------------------------------------------------------------------------|
| databricks/databricks-claude-3-7-sonnet  | completion(model='databricks/databricks-claude-3-7-sonnet', messages=messages)            |
| databricks-meta-llama-3-1-70b-instruct   | completion(model='databricks/databricks-meta-llama-3-1-70b-instruct', messages=messages)  |
| databricks-meta-llama-3-1-405b-instruct  | completion(model='databricks/databricks-meta-llama-3-1-405b-instruct', messages=messages) |
| databricks-dbrx-instruct                 | completion(model='databricks/databricks-dbrx-instruct', messages=messages)                |
| databricks-meta-llama-3-70b-instruct     | completion(model='databricks/databricks-meta-llama-3-70b-instruct', messages=messages)    |
| databricks-llama-2-70b-chat              | completion(model='databricks/databricks-llama-2-70b-chat', messages=messages)             |
| databricks-mixtral-8x7b-instruct         | completion(model='databricks/databricks-mixtral-8x7b-instruct', messages=messages)        |
| databricks-mpt-30b-instruct              | completion(model='databricks/databricks-mpt-30b-instruct', messages=messages)             |
| databricks-mpt-7b-instruct               | completion(model='databricks/databricks-mpt-7b-instruct', messages=messages)              |

Embedding Models

Passing Databricks specific params - 'instruction'

For embedding models, Databricks lets you pass in an additional param 'instruction'. Full Spec

# !pip install litellm
from litellm import embedding
import os
## set ENV variables
os.environ["DATABRICKS_API_KEY"] = "databricks key"
os.environ["DATABRICKS_API_BASE"] = "databricks url"

# Databricks bge-large-en call
response = embedding(
    model="databricks/databricks-bge-large-en",
    input=["good morning from litellm"],
    instruction="Represent this sentence for searching relevant passages:",
)

proxy

model_list:
  - model_name: bge-large
    litellm_params:
      model: databricks/databricks-bge-large-en
      api_key: os.environ/DATABRICKS_API_KEY
      api_base: os.environ/DATABRICKS_API_BASE
      instruction: "Represent this sentence for searching relevant passages:"
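As with the chat config above, once the proxy is running you can hit the embeddings route with an OpenAI-compatible client. A minimal sketch, assuming the default proxy address and an example virtual key:

from openai import OpenAI

client = OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.embeddings.create(
    model="bge-large",  # the model_name from the config above
    input=["good morning from litellm"],
)
print(len(response.data[0].embedding))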

Supported Databricks Embedding Models

tip

We support ALL Databricks models. Just set model=databricks/<any-model-on-databricks> as a prefix when sending LiteLLM requests.

| Model Name              | Command                                                             |
|-------------------------|---------------------------------------------------------------------|
| databricks-bge-large-en | embedding(model='databricks/databricks-bge-large-en', input=input)  |
| databricks-gte-large-en | embedding(model='databricks/databricks-gte-large-en', input=input)  |