Mirror of https://github.com/BerriAI/litellm.git (synced 2025-04-25 10:44:24 +00:00)

Merge branch 'main' into litellm_web_search_2

Commit 3a454d00df: 19 changed files with 1099 additions and 731 deletions
@@ -15,6 +15,91 @@ Pass-through endpoints for Vertex AI - call provider-specific endpoint, in native format (no translation).

 Just replace `https://REGION-aiplatform.googleapis.com` with `LITELLM_PROXY_BASE_URL/vertex_ai`

+LiteLLM supports 3 flows for calling Vertex AI endpoints via pass-through:
+
+1. **Specific Credentials**: Admin sets passthrough credentials for a specific project/region.
+2. **Default Credentials**: Admin sets default credentials.
+3. **Client-Side Credentials**: User can send client-side credentials through to Vertex AI (default behavior - if no default or mapped credentials are found, the request is passed through directly).
+
+## Example Usage
+
+<Tabs>
+<TabItem value="specific_credentials" label="Specific Project/Region">
+
+```yaml
+model_list:
+  - model_name: gemini-1.0-pro
+    litellm_params:
+      model: vertex_ai/gemini-1.0-pro
+      vertex_project: adroit-crow-413218
+      vertex_region: us-central1
+      vertex_credentials: /path/to/credentials.json
+      use_in_pass_through: true # 👈 KEY CHANGE
+```
+
+</TabItem>
+<TabItem value="default_credentials" label="Default Credentials">
+
+<Tabs>
+<TabItem value="yaml" label="Set in config.yaml">
+
+```yaml
+default_vertex_config:
+  vertex_project: adroit-crow-413218
+  vertex_region: us-central1
+  vertex_credentials: /path/to/credentials.json
+```
+
+</TabItem>
+<TabItem value="env_var" label="Set in environment variables">
+
+```bash
+export DEFAULT_VERTEXAI_PROJECT="adroit-crow-413218"
+export DEFAULT_VERTEXAI_LOCATION="us-central1"
+export DEFAULT_GOOGLE_APPLICATION_CREDENTIALS="/path/to/credentials.json"
+```
+
+</TabItem>
+</Tabs>
+
+</TabItem>
+<TabItem value="client_credentials" label="Client Credentials">
+
+Try Gemini 2.0 Flash (curl):
+
+```bash
+MODEL_ID="gemini-2.0-flash-001"
+PROJECT_ID="YOUR_PROJECT_ID"
+```
+
+```bash
+curl \
+  -X POST \
+  -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
+  -H "Content-Type: application/json" \
+  "${LITELLM_PROXY_BASE_URL}/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:streamGenerateContent" -d \
+  $'{
+    "contents": {
+      "role": "user",
+      "parts": [
+        {
+          "fileData": {
+            "mimeType": "image/png",
+            "fileUri": "gs://generativeai-downloads/images/scones.jpg"
+          }
+        },
+        {
+          "text": "Describe this picture."
+        }
+      ]
+    }
+  }'
+```
+
+</TabItem>
+</Tabs>
+
 #### **Example Usage**
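To make the precedence among the three flows above concrete, here is a minimal sketch of the lookup order, assuming the `PassthroughEndpointRouter` fields introduced later in this commit (the function name itself is illustrative, not part of the commit):

```python
from typing import Optional

def resolve_vertex_credentials(router, project_id: Optional[str], location: Optional[str]):
    """Illustrative only: mirrors PassthroughEndpointRouter.get_vertex_credentials."""
    # Flow 1: credentials registered for this exact project/region (use_in_pass_through)
    key = f"{project_id}-{location}" if project_id and location else None
    if key and key in router.deployment_key_to_vertex_credentials:
        return router.deployment_key_to_vertex_credentials[key]
    # Flow 2: admin-set defaults (config.yaml default_vertex_config or DEFAULT_VERTEXAI_* env vars)
    if router.default_vertex_config is not None:
        return router.default_vertex_config
    # Flow 3: nothing configured -> the caller's own headers are forwarded to Vertex AI
    return None
```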
@@ -22,7 +107,7 @@ Just replace `https://REGION-aiplatform.googleapis.com` with `LITELLM_PROXY_BASE_URL/vertex_ai`

 <TabItem value="curl" label="curl">

 ```bash
-curl http://localhost:4000/vertex_ai/publishers/google/models/gemini-1.0-pro:generateContent \
+curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:generateContent \
   -H "Content-Type: application/json" \
   -H "x-litellm-api-key: Bearer sk-1234" \
   -d '{
@@ -101,7 +186,7 @@ litellm

 Let's call the Google AI Studio token counting endpoint

 ```bash
-curl http://localhost:4000/vertex-ai/publishers/google/models/gemini-1.0-pro:generateContent \
+curl http://localhost:4000/vertex-ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer sk-1234" \
   -d '{
@@ -140,7 +225,7 @@ LiteLLM Proxy Server supports two methods of authentication to Vertex AI:

 ```shell
-curl http://localhost:4000/vertex_ai/publishers/google/models/gemini-1.5-flash-001:generateContent \
+curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.5-flash-001:generateContent \
   -H "Content-Type: application/json" \
   -H "x-litellm-api-key: Bearer sk-1234" \
   -d '{"contents":[{"role": "user", "parts":[{"text": "hi"}]}]}'
@@ -152,7 +237,7 @@ curl http://localhost:4000/vertex_ai/publishers/google/models/gemini-1.5-flash-001:generateContent \

 ```shell
-curl http://localhost:4000/vertex_ai/publishers/google/models/textembedding-gecko@001:predict \
+curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/textembedding-gecko@001:predict \
   -H "Content-Type: application/json" \
   -H "x-litellm-api-key: Bearer sk-1234" \
   -d '{"instances":[{"content": "gm"}]}'
@@ -162,7 +247,7 @@ curl http://localhost:4000/vertex_ai/publishers/google/models/textembedding-gecko@001:predict \

 ### Imagen API

 ```shell
-curl http://localhost:4000/vertex_ai/publishers/google/models/imagen-3.0-generate-001:predict \
+curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/imagen-3.0-generate-001:predict \
   -H "Content-Type: application/json" \
   -H "x-litellm-api-key: Bearer sk-1234" \
   -d '{"instances":[{"prompt": "make an otter"}], "parameters": {"sampleCount": 1}}'
@@ -172,7 +257,7 @@ curl http://localhost:4000/vertex_ai/publishers/google/models/imagen-3.0-generate-001:predict \

 ### Count Tokens API

 ```shell
-curl http://localhost:4000/vertex_ai/publishers/google/models/gemini-1.5-flash-001:countTokens \
+curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.5-flash-001:countTokens \
   -H "Content-Type: application/json" \
   -H "x-litellm-api-key: Bearer sk-1234" \
   -d '{"contents":[{"role": "user", "parts":[{"text": "hi"}]}]}'
@@ -183,7 +268,7 @@ Create Fine Tuning Job

 ```shell
-curl http://localhost:4000/vertex_ai/tuningJobs \
+curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.5-flash-001:tuningJobs \
   -H "Content-Type: application/json" \
   -H "x-litellm-api-key: Bearer sk-1234" \
   -d '{
@@ -243,7 +328,7 @@ Expected Response

 ```bash
-curl http://localhost:4000/vertex_ai/publishers/google/models/gemini-1.0-pro:generateContent \
+curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent \
   -H "Content-Type: application/json" \
   -H "x-litellm-api-key: Bearer sk-1234" \
   -d '{
@@ -268,7 +353,7 @@ tags: ["vertex-js-sdk", "pass-through-endpoint"]

 <TabItem value="curl" label="curl">

 ```bash
-curl http://localhost:4000/vertex-ai/publishers/google/models/gemini-1.0-pro:generateContent \
+curl http://localhost:4000/vertex_ai/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent \
   -H "Content-Type: application/json" \
   -H "x-litellm-api-key: Bearer sk-1234" \
   -H "tags: vertex-js-sdk,pass-through-endpoint" \
@@ -1,3 +1,4 @@
+import re
 from typing import Dict, List, Literal, Optional, Tuple, Union

 import httpx
@@ -280,3 +281,81 @@ def _convert_vertex_datetime_to_openai_datetime(vertex_datetime: str) -> int:
     dt = datetime.strptime(vertex_datetime, "%Y-%m-%dT%H:%M:%S.%fZ")
     # Convert to Unix timestamp (seconds since epoch)
     return int(dt.timestamp())
+
+
+def get_vertex_project_id_from_url(url: str) -> Optional[str]:
+    """
+    Get the vertex project id from the url
+
+    `https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:streamGenerateContent`
+    """
+    match = re.search(r"/projects/([^/]+)", url)
+    return match.group(1) if match else None
+
+
+def get_vertex_location_from_url(url: str) -> Optional[str]:
+    """
+    Get the vertex location from the url
+
+    `https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:streamGenerateContent`
+    """
+    match = re.search(r"/locations/([^/]+)", url)
+    return match.group(1) if match else None
+
+
+def replace_project_and_location_in_route(
+    requested_route: str, vertex_project: str, vertex_location: str
+) -> str:
+    """
+    Replace project and location values in the route with the provided values
+    """
+    # Replace project and location values while keeping route structure
+    modified_route = re.sub(
+        r"/projects/[^/]+/locations/[^/]+/",
+        f"/projects/{vertex_project}/locations/{vertex_location}/",
+        requested_route,
+    )
+    return modified_route
+
+
+def construct_target_url(
+    base_url: str,
+    requested_route: str,
+    vertex_location: Optional[str],
+    vertex_project: Optional[str],
+) -> httpx.URL:
+    """
+    Allow user to specify their own project id / location.
+
+    If missing, use defaults
+
+    Handle cachedContent scenario - https://github.com/BerriAI/litellm/issues/5460
+
+    Constructed Url:
+    POST https://LOCATION-aiplatform.googleapis.com/{version}/projects/PROJECT_ID/locations/LOCATION/cachedContents
+    """
+    new_base_url = httpx.URL(base_url)
+    if "locations" in requested_route:  # contains the target project id + location
+        if vertex_project and vertex_location:
+            requested_route = replace_project_and_location_in_route(
+                requested_route, vertex_project, vertex_location
+            )
+        return new_base_url.copy_with(path=requested_route)
+
+    """
+    - Add endpoint version (e.g. v1beta for cachedContent, v1 for rest)
+    - Add default project id
+    - Add default location
+    """
+    vertex_version: Literal["v1", "v1beta1"] = "v1"
+    if "cachedContent" in requested_route:
+        vertex_version = "v1beta1"
+
+    base_requested_route = "{}/projects/{}/locations/{}".format(
+        vertex_version, vertex_project, vertex_location
+    )
+
+    updated_requested_route = "/" + base_requested_route + requested_route
+
+    updated_url = new_base_url.copy_with(path=updated_requested_route)
+    return updated_url
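A quick sketch of how these helpers behave (values are made up; the assertions just restate the regexes and routing rules above):

```python
from litellm.llms.vertex_ai.common_utils import (
    construct_target_url,
    get_vertex_location_from_url,
    get_vertex_project_id_from_url,
)

url = (
    "https://us-central1-aiplatform.googleapis.com/v1/projects/my-proj"
    "/locations/us-central1/publishers/google/models/gemini-1.5-flash:generateContent"
)
assert get_vertex_project_id_from_url(url) == "my-proj"    # first /projects/<id> segment
assert get_vertex_location_from_url(url) == "us-central1"  # first /locations/<loc> segment

# A route without /locations/ gets version + project + location prepended;
# cachedContent requests are sent to the v1beta1 API surface.
target = construct_target_url(
    base_url="https://us-central1-aiplatform.googleapis.com/",
    requested_route="/cachedContents",
    vertex_location="us-central1",
    vertex_project="my-proj",
)
assert str(target).endswith("/v1beta1/projects/my-proj/locations/us-central1/cachedContents")
```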
@@ -82,6 +82,31 @@
     "supports_system_messages": true,
     "supports_tool_choice": true
   },
+  "gpt-4o-search-preview-2025-03-11": {
+    "max_tokens": 16384,
+    "max_input_tokens": 128000,
+    "max_output_tokens": 16384,
+    "input_cost_per_token": 0.0000025,
+    "output_cost_per_token": 0.000010,
+    "input_cost_per_token_batches": 0.00000125,
+    "output_cost_per_token_batches": 0.00000500,
+    "cache_read_input_token_cost": 0.00000125,
+    "litellm_provider": "openai",
+    "mode": "chat",
+    "supports_function_calling": true,
+    "supports_parallel_function_calling": true,
+    "supports_response_schema": true,
+    "supports_vision": true,
+    "supports_prompt_caching": true,
+    "supports_system_messages": true,
+    "supports_tool_choice": true,
+    "supports_web_search": true,
+    "search_context_cost_per_query": {
+      "search_context_size_low": 0.030,
+      "search_context_size_medium": 0.035,
+      "search_context_size_high": 0.050
+    }
+  },
   "gpt-4o-search-preview": {
     "max_tokens": 16384,
     "max_input_tokens": 128000,
@@ -232,6 +257,31 @@
     "supports_system_messages": true,
     "supports_tool_choice": true
   },
+  "gpt-4o-mini-search-preview-2025-03-11": {
+    "max_tokens": 16384,
+    "max_input_tokens": 128000,
+    "max_output_tokens": 16384,
+    "input_cost_per_token": 0.00000015,
+    "output_cost_per_token": 0.00000060,
+    "input_cost_per_token_batches": 0.000000075,
+    "output_cost_per_token_batches": 0.00000030,
+    "cache_read_input_token_cost": 0.000000075,
+    "litellm_provider": "openai",
+    "mode": "chat",
+    "supports_function_calling": true,
+    "supports_parallel_function_calling": true,
+    "supports_response_schema": true,
+    "supports_vision": true,
+    "supports_prompt_caching": true,
+    "supports_system_messages": true,
+    "supports_tool_choice": true,
+    "supports_web_search": true,
+    "search_context_cost_per_query": {
+      "search_context_size_low": 0.025,
+      "search_context_size_medium": 0.0275,
+      "search_context_size_high": 0.030
+    }
+  },
   "gpt-4o-mini-search-preview": {
     "max_tokens": 16384,
     "max_input_tokens": 128000,
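As a sanity check on the pricing entries, a small illustrative cost calculation (hand-rolled, not litellm's internal cost-tracking code, and assuming `search_context_cost_per_query` is USD per query, as the key name suggests):

```python
# Cost of a hypothetical gpt-4o-search-preview-2025-03-11 call:
# 1,000 input tokens, 500 output tokens, one web search at medium context size.
entry = {
    "input_cost_per_token": 0.0000025,
    "output_cost_per_token": 0.000010,
    "search_context_cost_per_query": {"search_context_size_medium": 0.035},
}

cost = (
    1_000 * entry["input_cost_per_token"]   # $0.0025
    + 500 * entry["output_cost_per_token"]  # $0.0050
    + 1 * entry["search_context_cost_per_query"]["search_context_size_medium"]  # $0.0350
)
assert round(cost, 4) == 0.0425
```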
litellm/proxy/pass_through_endpoints/common_utils.py (new file, 16 lines)

@@ -0,0 +1,16 @@
from fastapi import Request


def get_litellm_virtual_key(request: Request) -> str:
    """
    Extract and format API key from request headers.
    Prioritizes `x-litellm-api-key` over the `Authorization` header.

    The Vertex JS SDK uses the `Authorization` header, so `x-litellm-api-key` is used to pass the litellm virtual key.
    """
    litellm_api_key = request.headers.get("x-litellm-api-key")
    if litellm_api_key:
        return f"Bearer {litellm_api_key}"
    return request.headers.get("Authorization", "")
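A short sketch of the two header paths (constructed with Starlette's `Request` the same way the tests later in this diff do; values are hypothetical):

```python
from starlette.requests import Request

from litellm.proxy.pass_through_endpoints.common_utils import get_litellm_virtual_key

def make_request(headers: list) -> Request:
    # minimal ASGI scope; header names must be lowercase bytes
    return Request(scope={"type": "http", "method": "POST", "path": "/", "headers": headers})

# x-litellm-api-key wins and is wrapped as a Bearer token
assert get_litellm_virtual_key(
    make_request([(b"x-litellm-api-key", b"sk-1234")])
) == "Bearer sk-1234"

# otherwise the Authorization header is returned unchanged (e.g. from the Vertex JS SDK)
assert get_litellm_virtual_key(
    make_request([(b"authorization", b"Bearer vertex-oauth-token")])
) == "Bearer vertex-oauth-token"
```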
@@ -12,10 +12,13 @@ import httpx
 from fastapi import APIRouter, Depends, HTTPException, Request, Response

 import litellm
+from litellm._logging import verbose_proxy_logger
 from litellm.constants import BEDROCK_AGENT_RUNTIME_PASS_THROUGH_ROUTES
+from litellm.llms.vertex_ai.vertex_llm_base import VertexBase
 from litellm.proxy._types import *
 from litellm.proxy.auth.route_checks import RouteChecks
 from litellm.proxy.auth.user_api_key_auth import user_api_key_auth
+from litellm.proxy.pass_through_endpoints.common_utils import get_litellm_virtual_key
 from litellm.proxy.pass_through_endpoints.pass_through_endpoints import (
     create_pass_through_route,
 )

@@ -23,6 +26,7 @@ from litellm.secret_managers.main import get_secret_str

 from .passthrough_endpoint_router import PassthroughEndpointRouter

+vertex_llm_base = VertexBase()
 router = APIRouter()
 default_vertex_config = None
@@ -417,6 +421,138 @@ async def azure_proxy_route(
     )


+@router.api_route(
+    "/vertex-ai/{endpoint:path}",
+    methods=["GET", "POST", "PUT", "DELETE", "PATCH"],
+    tags=["Vertex AI Pass-through", "pass-through"],
+    include_in_schema=False,
+)
+@router.api_route(
+    "/vertex_ai/{endpoint:path}",
+    methods=["GET", "POST", "PUT", "DELETE", "PATCH"],
+    tags=["Vertex AI Pass-through", "pass-through"],
+)
+async def vertex_proxy_route(
+    endpoint: str,
+    request: Request,
+    fastapi_response: Response,
+):
+    """
+    Call LiteLLM proxy via Vertex AI SDK.
+
+    [Docs](https://docs.litellm.ai/docs/pass_through/vertex_ai)
+    """
+    from litellm.llms.vertex_ai.common_utils import (
+        construct_target_url,
+        get_vertex_location_from_url,
+        get_vertex_project_id_from_url,
+    )
+
+    encoded_endpoint = httpx.URL(endpoint).path
+    verbose_proxy_logger.debug("requested endpoint %s", endpoint)
+    headers: dict = {}
+    api_key_to_use = get_litellm_virtual_key(request=request)
+    user_api_key_dict = await user_api_key_auth(
+        request=request,
+        api_key=api_key_to_use,
+    )
+    vertex_project: Optional[str] = get_vertex_project_id_from_url(endpoint)
+    vertex_location: Optional[str] = get_vertex_location_from_url(endpoint)
+    vertex_credentials = passthrough_endpoint_router.get_vertex_credentials(
+        project_id=vertex_project,
+        location=vertex_location,
+    )
+
+    headers_passed_through = False
+    # Use headers from the incoming request if no vertex credentials are found
+    if vertex_credentials is None or vertex_credentials.vertex_project is None:
+        headers = dict(request.headers) or {}
+        headers_passed_through = True
+        verbose_proxy_logger.debug(
+            "default_vertex_config not set, incoming request headers %s", headers
+        )
+        base_target_url = f"https://{vertex_location}-aiplatform.googleapis.com/"
+        headers.pop("content-length", None)
+        headers.pop("host", None)
+    else:
+        vertex_project = vertex_credentials.vertex_project
+        vertex_location = vertex_credentials.vertex_location
+        vertex_credentials_str = vertex_credentials.vertex_credentials
+
+        # Construct base URL for the target endpoint
+        base_target_url = f"https://{vertex_location}-aiplatform.googleapis.com/"
+
+        _auth_header, vertex_project = await vertex_llm_base._ensure_access_token_async(
+            credentials=vertex_credentials_str,
+            project_id=vertex_project,
+            custom_llm_provider="vertex_ai_beta",
+        )
+
+        auth_header, _ = vertex_llm_base._get_token_and_url(
+            model="",
+            auth_header=_auth_header,
+            gemini_api_key=None,
+            vertex_credentials=vertex_credentials_str,
+            vertex_project=vertex_project,
+            vertex_location=vertex_location,
+            stream=False,
+            custom_llm_provider="vertex_ai_beta",
+            api_base="",
+        )
+
+        headers = {
+            "Authorization": f"Bearer {auth_header}",
+        }
+
+    request_route = encoded_endpoint
+    verbose_proxy_logger.debug("request_route %s", request_route)
+
+    # Ensure endpoint starts with '/' for proper URL construction
+    if not encoded_endpoint.startswith("/"):
+        encoded_endpoint = "/" + encoded_endpoint
+
+    # Construct the full target URL using httpx
+    updated_url = construct_target_url(
+        base_url=base_target_url,
+        requested_route=encoded_endpoint,
+        vertex_location=vertex_location,
+        vertex_project=vertex_project,
+    )
+
+    verbose_proxy_logger.debug("updated url %s", updated_url)
+
+    ## check for streaming
+    target = str(updated_url)
+    is_streaming_request = False
+    if "stream" in str(updated_url):
+        is_streaming_request = True
+        target += "?alt=sse"
+
+    ## CREATE PASS-THROUGH
+    endpoint_func = create_pass_through_route(
+        endpoint=endpoint,
+        target=target,
+        custom_headers=headers,
+    )  # dynamically construct pass-through endpoint based on incoming path
+
+    try:
+        received_value = await endpoint_func(
+            request,
+            fastapi_response,
+            user_api_key_dict,
+            stream=is_streaming_request,  # type: ignore
+        )
+    except Exception as e:
+        if headers_passed_through:
+            raise Exception(
+                f"No credentials found on proxy for this request. Headers were passed through directly but request failed with error: {str(e)}"
+            )
+        else:
+            raise e
+
+    return received_value
+
+
 @router.api_route(
     "/openai/{endpoint:path}",
     methods=["GET", "POST", "PUT", "DELETE", "PATCH"],
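Given the docstring above ("Call LiteLLM proxy via Vertex AI SDK"), a sketch of client-side usage with the Vertex AI Python SDK; the endpoint and model values are placeholders, and authentication is elided (the proxy accepts a litellm virtual key via `x-litellm-api-key`, or a Google token via `Authorization`):

```python
import vertexai
from vertexai.generative_models import GenerativeModel

# Point the SDK at the proxy's pass-through route instead of
# https://REGION-aiplatform.googleapis.com (credential setup omitted here).
vertexai.init(
    project="adroit-crow-413218",
    location="us-central1",
    api_endpoint="http://localhost:4000/vertex_ai",  # LITELLM_PROXY_BASE_URL + /vertex_ai
    api_transport="rest",
)

model = GenerativeModel("gemini-1.5-flash-001")
print(model.generate_content("hi").text)
```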
@@ -1,7 +1,9 @@
 from typing import Dict, Optional

-from litellm._logging import verbose_logger
+from litellm._logging import verbose_router_logger
 from litellm.secret_managers.main import get_secret_str
+from litellm.types.llms.vertex_ai import VERTEX_CREDENTIALS_TYPES
+from litellm.types.passthrough_endpoints.vertex_ai import VertexPassThroughCredentials


 class PassthroughEndpointRouter:

@@ -11,6 +13,10 @@ class PassthroughEndpointRouter:

     def __init__(self):
         self.credentials: Dict[str, str] = {}
+        self.deployment_key_to_vertex_credentials: Dict[
+            str, VertexPassThroughCredentials
+        ] = {}
+        self.default_vertex_config: Optional[VertexPassThroughCredentials] = None

     def set_pass_through_credentials(
         self,

@@ -45,14 +51,14 @@ class PassthroughEndpointRouter:
             custom_llm_provider=custom_llm_provider,
             region_name=region_name,
         )
-        verbose_logger.debug(
+        verbose_router_logger.debug(
             f"Pass-through llm endpoints router, looking for credentials for {credential_name}"
         )
         if credential_name in self.credentials:
-            verbose_logger.debug(f"Found credentials for {credential_name}")
+            verbose_router_logger.debug(f"Found credentials for {credential_name}")
             return self.credentials[credential_name]
         else:
-            verbose_logger.debug(
+            verbose_router_logger.debug(
                 f"No credentials found for {credential_name}, looking for env variable"
             )
             _env_variable_name = (

@@ -62,6 +68,100 @@ class PassthroughEndpointRouter:
         )
         return get_secret_str(_env_variable_name)

+    def _get_vertex_env_vars(self) -> VertexPassThroughCredentials:
+        """
+        Helper to get vertex pass through config from environment variables
+
+        The following environment variables are used:
+        - DEFAULT_VERTEXAI_PROJECT (project id)
+        - DEFAULT_VERTEXAI_LOCATION (location)
+        - DEFAULT_GOOGLE_APPLICATION_CREDENTIALS (path to credentials file)
+        """
+        return VertexPassThroughCredentials(
+            vertex_project=get_secret_str("DEFAULT_VERTEXAI_PROJECT"),
+            vertex_location=get_secret_str("DEFAULT_VERTEXAI_LOCATION"),
+            vertex_credentials=get_secret_str("DEFAULT_GOOGLE_APPLICATION_CREDENTIALS"),
+        )
+
+    def set_default_vertex_config(self, config: Optional[dict] = None):
+        """Sets vertex configuration from provided config and/or environment variables
+
+        Args:
+            config (Optional[dict]): Configuration dictionary
+            Example: {
+                "vertex_project": "my-project-123",
+                "vertex_location": "us-central1",
+                "vertex_credentials": "os.environ/GOOGLE_CREDS"
+            }
+        """
+        # Initialize config dictionary if None
+        if config is None:
+            self.default_vertex_config = self._get_vertex_env_vars()
+            return
+
+        if isinstance(config, dict):
+            for key, value in config.items():
+                if isinstance(value, str) and value.startswith("os.environ/"):
+                    config[key] = get_secret_str(value)
+
+        self.default_vertex_config = VertexPassThroughCredentials(**config)
+
+    def add_vertex_credentials(
+        self,
+        project_id: str,
+        location: str,
+        vertex_credentials: VERTEX_CREDENTIALS_TYPES,
+    ):
+        """
+        Add the vertex credentials for the given project-id, location
+        """
+        deployment_key = self._get_deployment_key(
+            project_id=project_id,
+            location=location,
+        )
+        if deployment_key is None:
+            verbose_router_logger.debug(
+                "No deployment key found for project-id, location"
+            )
+            return
+        vertex_pass_through_credentials = VertexPassThroughCredentials(
+            vertex_project=project_id,
+            vertex_location=location,
+            vertex_credentials=vertex_credentials,
+        )
+        self.deployment_key_to_vertex_credentials[deployment_key] = (
+            vertex_pass_through_credentials
+        )
+
+    def _get_deployment_key(
+        self, project_id: Optional[str], location: Optional[str]
+    ) -> Optional[str]:
+        """
+        Get the deployment key for the given project-id, location
+        """
+        if project_id is None or location is None:
+            return None
+        return f"{project_id}-{location}"
+
+    def get_vertex_credentials(
+        self, project_id: Optional[str], location: Optional[str]
+    ) -> Optional[VertexPassThroughCredentials]:
+        """
+        Get the vertex credentials for the given project-id, location
+        """
+        deployment_key = self._get_deployment_key(
+            project_id=project_id,
+            location=location,
+        )
+
+        if deployment_key is None:
+            return self.default_vertex_config
+        if deployment_key in self.deployment_key_to_vertex_credentials:
+            return self.deployment_key_to_vertex_credentials[deployment_key]
+        else:
+            return self.default_vertex_config
+
     def _get_credential_name_for_provider(
         self,
         custom_llm_provider: str,
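A round-trip through the new router methods (a sketch with hypothetical values, not a test from this commit):

```python
from litellm.proxy.pass_through_endpoints.passthrough_endpoint_router import (
    PassthroughEndpointRouter,
)

router = PassthroughEndpointRouter()

# default credentials (flow 2)
router.set_default_vertex_config(
    config={"vertex_project": "default-proj", "vertex_location": "us-central1"}
)
# project/region-specific credentials (flow 1)
router.add_vertex_credentials(
    project_id="my-proj", location="us-east1", vertex_credentials="/path/to/creds.json"
)

exact = router.get_vertex_credentials(project_id="my-proj", location="us-east1")
assert exact is not None and exact.vertex_project == "my-proj"

# unknown (project, location) pairs fall back to the default config
fallback = router.get_vertex_credentials(project_id="other-proj", location="us-west1")
assert fallback is not None and fallback.vertex_project == "default-proj"
```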
@@ -235,6 +235,9 @@ from litellm.proxy.openai_files_endpoints.files_endpoints import (
     router as openai_files_router,
 )
 from litellm.proxy.openai_files_endpoints.files_endpoints import set_files_config
+from litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints import (
+    passthrough_endpoint_router,
+)
 from litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints import (
     router as llm_passthrough_router,
 )

@@ -272,8 +275,6 @@ from litellm.proxy.utils import (
 from litellm.proxy.vertex_ai_endpoints.langfuse_endpoints import (
     router as langfuse_router,
 )
-from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import router as vertex_router
-from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import set_default_vertex_config
 from litellm.router import (
     AssistantsTypedDict,
     Deployment,

@@ -2115,7 +2116,9 @@ class ProxyConfig:

         ## default config for vertex ai routes
         default_vertex_config = config.get("default_vertex_config", None)
-        set_default_vertex_config(config=default_vertex_config)
+        passthrough_endpoint_router.set_default_vertex_config(
+            config=default_vertex_config
+        )

         ## ROUTER SETTINGS (e.g. routing_strategy, ...)
         router_settings = config.get("router_settings", None)

@@ -8161,7 +8164,6 @@ app.include_router(batches_router)
 app.include_router(rerank_router)
 app.include_router(fine_tuning_router)
 app.include_router(credential_router)
-app.include_router(vertex_router)
 app.include_router(llm_passthrough_router)
 app.include_router(anthropic_router)
 app.include_router(langfuse_router)
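The net effect of these proxy_server.py changes, in miniature (hypothetical config dict; this is what `ProxyConfig` now runs instead of the deleted module-level `set_default_vertex_config`):

```python
from litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints import (
    passthrough_endpoint_router,
)

config = {
    "default_vertex_config": {
        "vertex_project": "adroit-crow-413218",
        "vertex_location": "us-central1",
        "vertex_credentials": "os.environ/GOOGLE_CREDS",  # resolved via get_secret_str
    }
}

## default config for vertex ai routes
default_vertex_config = config.get("default_vertex_config", None)
passthrough_endpoint_router.set_default_vertex_config(config=default_vertex_config)
```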
litellm/proxy/vertex_ai_endpoints/vertex_endpoints.py (deleted file)

@@ -1,274 +0,0 @@
import traceback
from typing import Optional

import httpx
from fastapi import APIRouter, HTTPException, Request, Response, status

import litellm
from litellm._logging import verbose_proxy_logger
from litellm.fine_tuning.main import vertex_fine_tuning_apis_instance
from litellm.proxy._types import *
from litellm.proxy.auth.user_api_key_auth import user_api_key_auth
from litellm.proxy.pass_through_endpoints.pass_through_endpoints import (
    create_pass_through_route,
)
from litellm.secret_managers.main import get_secret_str
from litellm.types.passthrough_endpoints.vertex_ai import *

from .vertex_passthrough_router import VertexPassThroughRouter

router = APIRouter()
vertex_pass_through_router = VertexPassThroughRouter()

default_vertex_config: VertexPassThroughCredentials = VertexPassThroughCredentials()


def _get_vertex_env_vars() -> VertexPassThroughCredentials:
    """
    Helper to get vertex pass through config from environment variables

    The following environment variables are used:
    - DEFAULT_VERTEXAI_PROJECT (project id)
    - DEFAULT_VERTEXAI_LOCATION (location)
    - DEFAULT_GOOGLE_APPLICATION_CREDENTIALS (path to credentials file)
    """
    return VertexPassThroughCredentials(
        vertex_project=get_secret_str("DEFAULT_VERTEXAI_PROJECT"),
        vertex_location=get_secret_str("DEFAULT_VERTEXAI_LOCATION"),
        vertex_credentials=get_secret_str("DEFAULT_GOOGLE_APPLICATION_CREDENTIALS"),
    )


def set_default_vertex_config(config: Optional[dict] = None):
    """Sets vertex configuration from provided config and/or environment variables

    Args:
        config (Optional[dict]): Configuration dictionary
        Example: {
            "vertex_project": "my-project-123",
            "vertex_location": "us-central1",
            "vertex_credentials": "os.environ/GOOGLE_CREDS"
        }
    """
    global default_vertex_config

    # Initialize config dictionary if None
    if config is None:
        default_vertex_config = _get_vertex_env_vars()
        return

    if isinstance(config, dict):
        for key, value in config.items():
            if isinstance(value, str) and value.startswith("os.environ/"):
                config[key] = litellm.get_secret(value)

    _set_default_vertex_config(VertexPassThroughCredentials(**config))


def _set_default_vertex_config(
    vertex_pass_through_credentials: VertexPassThroughCredentials,
):
    global default_vertex_config
    default_vertex_config = vertex_pass_through_credentials


def exception_handler(e: Exception):
    verbose_proxy_logger.error(
        "litellm.proxy.proxy_server.v1/projects/tuningJobs(): Exception occurred - {}".format(
            str(e)
        )
    )
    verbose_proxy_logger.debug(traceback.format_exc())
    if isinstance(e, HTTPException):
        return ProxyException(
            message=getattr(e, "message", str(e.detail)),
            type=getattr(e, "type", "None"),
            param=getattr(e, "param", "None"),
            code=getattr(e, "status_code", status.HTTP_400_BAD_REQUEST),
        )
    else:
        error_msg = f"{str(e)}"
        return ProxyException(
            message=getattr(e, "message", error_msg),
            type=getattr(e, "type", "None"),
            param=getattr(e, "param", "None"),
            code=getattr(e, "status_code", 500),
        )


def construct_target_url(
    base_url: str,
    requested_route: str,
    default_vertex_location: Optional[str],
    default_vertex_project: Optional[str],
) -> httpx.URL:
    """
    Allow user to specify their own project id / location.

    If missing, use defaults

    Handle cachedContent scenario - https://github.com/BerriAI/litellm/issues/5460

    Constructed Url:
    POST https://LOCATION-aiplatform.googleapis.com/{version}/projects/PROJECT_ID/locations/LOCATION/cachedContents
    """
    new_base_url = httpx.URL(base_url)
    if "locations" in requested_route:  # contains the target project id + location
        updated_url = new_base_url.copy_with(path=requested_route)
        return updated_url
    """
    - Add endpoint version (e.g. v1beta for cachedContent, v1 for rest)
    - Add default project id
    - Add default location
    """
    vertex_version: Literal["v1", "v1beta1"] = "v1"
    if "cachedContent" in requested_route:
        vertex_version = "v1beta1"

    base_requested_route = "{}/projects/{}/locations/{}".format(
        vertex_version, default_vertex_project, default_vertex_location
    )

    updated_requested_route = "/" + base_requested_route + requested_route

    updated_url = new_base_url.copy_with(path=updated_requested_route)
    return updated_url


@router.api_route(
    "/vertex-ai/{endpoint:path}",
    methods=["GET", "POST", "PUT", "DELETE", "PATCH"],
    tags=["Vertex AI Pass-through", "pass-through"],
    include_in_schema=False,
)
@router.api_route(
    "/vertex_ai/{endpoint:path}",
    methods=["GET", "POST", "PUT", "DELETE", "PATCH"],
    tags=["Vertex AI Pass-through", "pass-through"],
)
async def vertex_proxy_route(
    endpoint: str,
    request: Request,
    fastapi_response: Response,
):
    """
    Call LiteLLM proxy via Vertex AI SDK.

    [Docs](https://docs.litellm.ai/docs/pass_through/vertex_ai)
    """
    encoded_endpoint = httpx.URL(endpoint).path
    verbose_proxy_logger.debug("requested endpoint %s", endpoint)
    headers: dict = {}
    api_key_to_use = get_litellm_virtual_key(request=request)
    user_api_key_dict = await user_api_key_auth(
        request=request,
        api_key=api_key_to_use,
    )

    vertex_project: Optional[str] = (
        VertexPassThroughRouter._get_vertex_project_id_from_url(endpoint)
    )
    vertex_location: Optional[str] = (
        VertexPassThroughRouter._get_vertex_location_from_url(endpoint)
    )
    vertex_credentials = vertex_pass_through_router.get_vertex_credentials(
        project_id=vertex_project,
        location=vertex_location,
    )

    # Use headers from the incoming request if no vertex credentials are found
    if vertex_credentials.vertex_project is None:
        headers = dict(request.headers) or {}
        verbose_proxy_logger.debug(
            "default_vertex_config not set, incoming request headers %s", headers
        )
        base_target_url = f"https://{vertex_location}-aiplatform.googleapis.com/"
        headers.pop("content-length", None)
        headers.pop("host", None)
    else:
        vertex_project = vertex_credentials.vertex_project
        vertex_location = vertex_credentials.vertex_location
        vertex_credentials_str = vertex_credentials.vertex_credentials

        # Construct base URL for the target endpoint
        base_target_url = f"https://{vertex_location}-aiplatform.googleapis.com/"

        _auth_header, vertex_project = (
            await vertex_fine_tuning_apis_instance._ensure_access_token_async(
                credentials=vertex_credentials_str,
                project_id=vertex_project,
                custom_llm_provider="vertex_ai_beta",
            )
        )

        auth_header, _ = vertex_fine_tuning_apis_instance._get_token_and_url(
            model="",
            auth_header=_auth_header,
            gemini_api_key=None,
            vertex_credentials=vertex_credentials_str,
            vertex_project=vertex_project,
            vertex_location=vertex_location,
            stream=False,
            custom_llm_provider="vertex_ai_beta",
            api_base="",
        )

        headers = {
            "Authorization": f"Bearer {auth_header}",
        }

    request_route = encoded_endpoint
    verbose_proxy_logger.debug("request_route %s", request_route)

    # Ensure endpoint starts with '/' for proper URL construction
    if not encoded_endpoint.startswith("/"):
        encoded_endpoint = "/" + encoded_endpoint

    # Construct the full target URL using httpx
    updated_url = construct_target_url(
        base_url=base_target_url,
        requested_route=encoded_endpoint,
        default_vertex_location=vertex_location,
        default_vertex_project=vertex_project,
    )
    # base_url = httpx.URL(base_target_url)
    # updated_url = base_url.copy_with(path=encoded_endpoint)

    verbose_proxy_logger.debug("updated url %s", updated_url)

    ## check for streaming
    target = str(updated_url)
    is_streaming_request = False
    if "stream" in str(updated_url):
        is_streaming_request = True
        target += "?alt=sse"

    ## CREATE PASS-THROUGH
    endpoint_func = create_pass_through_route(
        endpoint=endpoint,
        target=target,
        custom_headers=headers,
    )  # dynamically construct pass-through endpoint based on incoming path
    received_value = await endpoint_func(
        request,
        fastapi_response,
        user_api_key_dict,
        stream=is_streaming_request,  # type: ignore
    )

    return received_value


def get_litellm_virtual_key(request: Request) -> str:
    """
    Extract and format API key from request headers.
    Prioritizes x-litellm-api-key over Authorization header.

    Vertex JS SDK uses `Authorization` header, we use `x-litellm-api-key` to pass litellm virtual key
    """
    litellm_api_key = request.headers.get("x-litellm-api-key")
    if litellm_api_key:
        return f"Bearer {litellm_api_key}"
    return request.headers.get("Authorization", "")
litellm/proxy/vertex_ai_endpoints/vertex_passthrough_router.py (deleted file)

@@ -1,121 +0,0 @@
import json
import re
from typing import Dict, Optional

from litellm._logging import verbose_proxy_logger
from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import (
    VertexPassThroughCredentials,
)
from litellm.types.llms.vertex_ai import VERTEX_CREDENTIALS_TYPES


class VertexPassThroughRouter:
    """
    Vertex Pass Through Router for Vertex AI pass-through endpoints

    - if request specifies a project-id, location -> use credentials corresponding to the project-id, location
    - if request does not specify a project-id, location -> use credentials corresponding to the DEFAULT_VERTEXAI_PROJECT, DEFAULT_VERTEXAI_LOCATION
    """

    def __init__(self):
        """
        Initialize the VertexPassThroughRouter
        Stores the vertex credentials for each deployment key
        ```
        {
            "project_id-location": VertexPassThroughCredentials,
            "adroit-crow-us-central1": VertexPassThroughCredentials,
        }
        ```
        """
        self.deployment_key_to_vertex_credentials: Dict[
            str, VertexPassThroughCredentials
        ] = {}
        pass

    def get_vertex_credentials(
        self, project_id: Optional[str], location: Optional[str]
    ) -> VertexPassThroughCredentials:
        """
        Get the vertex credentials for the given project-id, location
        """
        from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import (
            default_vertex_config,
        )

        deployment_key = self._get_deployment_key(
            project_id=project_id,
            location=location,
        )
        if deployment_key is None:
            return default_vertex_config
        if deployment_key in self.deployment_key_to_vertex_credentials:
            return self.deployment_key_to_vertex_credentials[deployment_key]
        else:
            return default_vertex_config

    def add_vertex_credentials(
        self,
        project_id: str,
        location: str,
        vertex_credentials: VERTEX_CREDENTIALS_TYPES,
    ):
        """
        Add the vertex credentials for the given project-id, location
        """
        from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import (
            _set_default_vertex_config,
        )

        deployment_key = self._get_deployment_key(
            project_id=project_id,
            location=location,
        )
        if deployment_key is None:
            verbose_proxy_logger.debug(
                "No deployment key found for project-id, location"
            )
            return
        vertex_pass_through_credentials = VertexPassThroughCredentials(
            vertex_project=project_id,
            vertex_location=location,
            vertex_credentials=vertex_credentials,
        )
        self.deployment_key_to_vertex_credentials[deployment_key] = (
            vertex_pass_through_credentials
        )
        verbose_proxy_logger.debug(
            f"self.deployment_key_to_vertex_credentials: {json.dumps(self.deployment_key_to_vertex_credentials, indent=4, default=str)}"
        )
        _set_default_vertex_config(vertex_pass_through_credentials)

    def _get_deployment_key(
        self, project_id: Optional[str], location: Optional[str]
    ) -> Optional[str]:
        """
        Get the deployment key for the given project-id, location
        """
        if project_id is None or location is None:
            return None
        return f"{project_id}-{location}"

    @staticmethod
    def _get_vertex_project_id_from_url(url: str) -> Optional[str]:
        """
        Get the vertex project id from the url

        `https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:streamGenerateContent`
        """
        match = re.search(r"/projects/([^/]+)", url)
        return match.group(1) if match else None

    @staticmethod
    def _get_vertex_location_from_url(url: str) -> Optional[str]:
        """
        Get the vertex location from the url

        `https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:streamGenerateContent`
        """
        match = re.search(r"/locations/([^/]+)", url)
        return match.group(1) if match else None
@@ -4495,11 +4495,11 @@ class Router:
         Each provider uses diff .env vars for pass-through endpoints, this helper uses the deployment credentials to set the .env vars for pass-through endpoints
         """
         if deployment.litellm_params.use_in_pass_through is True:
-            if custom_llm_provider == "vertex_ai":
-                from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import (
-                    vertex_pass_through_router,
-                )
+            from litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints import (
+                passthrough_endpoint_router,
+            )

+            if custom_llm_provider == "vertex_ai":
                 if (
                     deployment.litellm_params.vertex_project is None
                     or deployment.litellm_params.vertex_location is None

@@ -4508,16 +4508,12 @@ class Router:
                 raise ValueError(
                     "vertex_project, vertex_location, and vertex_credentials must be set in litellm_params for pass-through endpoints"
                 )
-                vertex_pass_through_router.add_vertex_credentials(
+                passthrough_endpoint_router.add_vertex_credentials(
                     project_id=deployment.litellm_params.vertex_project,
                     location=deployment.litellm_params.vertex_location,
                     vertex_credentials=deployment.litellm_params.vertex_credentials,
                 )
             else:
-                from litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints import (
-                    passthrough_endpoint_router,
-                )
-
                 passthrough_endpoint_router.set_pass_through_credentials(
                     custom_llm_provider=custom_llm_provider,
                     api_base=deployment.litellm_params.api_base,
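What triggers this Router code path, sketched as a deployment entry (values hypothetical; mirrors the `use_in_pass_through` yaml example at the top of this diff, using the `vertex_location` field the code above checks):

```python
import litellm

router = litellm.Router(
    model_list=[
        {
            "model_name": "gemini-1.0-pro",
            "litellm_params": {
                "model": "vertex_ai/gemini-1.0-pro",
                # all three are required for pass-through registration:
                "vertex_project": "adroit-crow-413218",
                "vertex_location": "us-central1",
                "vertex_credentials": "/path/to/credentials.json",
                "use_in_pass_through": True,  # registers creds with passthrough_endpoint_router
            },
        }
    ]
)
```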
tests/litellm/llms/vertex_ai/test_vertex_ai_common_utils.py (new file, 43 lines)

@@ -0,0 +1,43 @@
import os
import sys
from unittest.mock import MagicMock, call, patch

import pytest

sys.path.insert(
    0, os.path.abspath("../../..")
)  # Adds the parent directory to the system path

import litellm
from litellm.llms.vertex_ai.common_utils import (
    get_vertex_location_from_url,
    get_vertex_project_id_from_url,
)


@pytest.mark.asyncio
async def test_get_vertex_project_id_from_url():
    """Test get_vertex_project_id_from_url with various URLs"""
    # Test with valid URL
    url = "https://us-central1-aiplatform.googleapis.com/v1/projects/test-project/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent"
    project_id = get_vertex_project_id_from_url(url)
    assert project_id == "test-project"

    # Test with invalid URL
    url = "https://invalid-url.com"
    project_id = get_vertex_project_id_from_url(url)
    assert project_id is None


@pytest.mark.asyncio
async def test_get_vertex_location_from_url():
    """Test get_vertex_location_from_url with various URLs"""
    # Test with valid URL
    url = "https://us-central1-aiplatform.googleapis.com/v1/projects/test-project/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent"
    location = get_vertex_location_from_url(url)
    assert location == "us-central1"

    # Test with invalid URL
    url = "https://invalid-url.com"
    location = get_vertex_location_from_url(url)
    assert location is None
@ -1,7 +1,9 @@
 import json
 import os
 import sys
-from unittest.mock import MagicMock, patch
+import traceback
+from unittest import mock
+from unittest.mock import AsyncMock, MagicMock, Mock, patch

 import httpx
 import pytest

@ -17,7 +19,9 @@ from litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints import (
     BaseOpenAIPassThroughHandler,
     RouteChecks,
     create_pass_through_route,
+    vertex_proxy_route,
 )
+from litellm.types.passthrough_endpoints.vertex_ai import VertexPassThroughCredentials


 class TestBaseOpenAIPassThroughHandler:
@ -176,3 +180,279 @@ class TestBaseOpenAIPassThroughHandler:
         print(f"query_params: {call_kwargs['query_params']}")
         assert call_kwargs["stream"] is False
         assert call_kwargs["query_params"] == {"model": "gpt-4"}
+
+
+class TestVertexAIPassThroughHandler:
+    """
+    Case 1: User set passthrough credentials - confirm credentials used.
+
+    Case 2: User set default credentials, no exact passthrough credentials - confirm default credentials used.
+
+    Case 3: No default credentials, no mapped credentials - request passed through directly.
+    """
+
+    @pytest.mark.asyncio
+    async def test_vertex_passthrough_with_credentials(self, monkeypatch):
+        """
+        Test that when passthrough credentials are set, they are correctly used in the request
+        """
+        from litellm.proxy.pass_through_endpoints.passthrough_endpoint_router import (
+            PassthroughEndpointRouter,
+        )
+
+        vertex_project = "test-project"
+        vertex_location = "us-central1"
+        vertex_credentials = "test-creds"
+
+        pass_through_router = PassthroughEndpointRouter()
+
+        pass_through_router.add_vertex_credentials(
+            project_id=vertex_project,
+            location=vertex_location,
+            vertex_credentials=vertex_credentials,
+        )
+
+        monkeypatch.setattr(
+            "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.passthrough_endpoint_router",
+            pass_through_router,
+        )
+
+        endpoint = f"/v1/projects/{vertex_project}/locations/{vertex_location}/publishers/google/models/gemini-1.5-flash:generateContent"
+
+        # Mock request
+        mock_request = Request(
+            scope={
+                "type": "http",
+                "method": "POST",
+                "path": endpoint,
+                "headers": [
+                    (b"Authorization", b"Bearer test-creds"),
+                    (b"Content-Type", b"application/json"),
+                ],
+            }
+        )
+
+        # Mock response
+        mock_response = Response()
+
+        # Mock vertex credentials
+        test_project = vertex_project
+        test_location = vertex_location
+        test_token = vertex_credentials
+
+        with mock.patch(
+            "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.vertex_llm_base._ensure_access_token_async"
+        ) as mock_ensure_token, mock.patch(
+            "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.vertex_llm_base._get_token_and_url"
+        ) as mock_get_token, mock.patch(
+            "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.create_pass_through_route"
+        ) as mock_create_route:
+            mock_ensure_token.return_value = ("test-auth-header", test_project)
+            mock_get_token.return_value = (test_token, "")
+
+            # Call the route
+            try:
+                await vertex_proxy_route(
+                    endpoint=endpoint,
+                    request=mock_request,
+                    fastapi_response=mock_response,
+                )
+            except Exception as e:
+                print(f"Error: {e}")
+
+            # Verify create_pass_through_route was called with correct arguments
+            mock_create_route.assert_called_once_with(
+                endpoint=endpoint,
+                target=f"https://{test_location}-aiplatform.googleapis.com/v1/projects/{test_project}/locations/{test_location}/publishers/google/models/gemini-1.5-flash:generateContent",
+                custom_headers={"Authorization": f"Bearer {test_token}"},
+            )
+
+    @pytest.mark.parametrize(
+        "initial_endpoint",
+        [
+            "publishers/google/models/gemini-1.5-flash:generateContent",
+            "v1/projects/bad-project/locations/bad-location/publishers/google/models/gemini-1.5-flash:generateContent",
+        ],
+    )
+    @pytest.mark.asyncio
+    async def test_vertex_passthrough_with_default_credentials(
+        self, monkeypatch, initial_endpoint
+    ):
+        """
+        Test that when no passthrough credentials are set, default credentials are used in the request
+        """
+        from litellm.proxy.pass_through_endpoints.passthrough_endpoint_router import (
+            PassthroughEndpointRouter,
+        )
+
+        # Setup default credentials
+        default_project = "default-project"
+        default_location = "us-central1"
+        default_credentials = "default-creds"
+
+        pass_through_router = PassthroughEndpointRouter()
+        pass_through_router.default_vertex_config = VertexPassThroughCredentials(
+            vertex_project=default_project,
+            vertex_location=default_location,
+            vertex_credentials=default_credentials,
+        )
+
+        monkeypatch.setattr(
+            "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.passthrough_endpoint_router",
+            pass_through_router,
+        )
+
+        # Use different project/location in request than the default
+        endpoint = initial_endpoint
+
+        mock_request = Request(
+            scope={
+                "type": "http",
+                "method": "POST",
+                "path": f"/vertex_ai/{endpoint}",
+                "headers": {},
+            }
+        )
+        mock_response = Response()
+
+        with mock.patch(
+            "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.vertex_llm_base._ensure_access_token_async"
+        ) as mock_ensure_token, mock.patch(
+            "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.vertex_llm_base._get_token_and_url"
+        ) as mock_get_token, mock.patch(
+            "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.create_pass_through_route"
+        ) as mock_create_route:
+            mock_ensure_token.return_value = ("test-auth-header", default_project)
+            mock_get_token.return_value = (default_credentials, "")
+
+            try:
+                await vertex_proxy_route(
+                    endpoint=endpoint,
+                    request=mock_request,
+                    fastapi_response=mock_response,
+                )
+            except Exception as e:
+                traceback.print_exc()
+                print(f"Error: {e}")
+
+            # Verify default credentials were used
+            mock_create_route.assert_called_once_with(
+                endpoint=endpoint,
+                target=f"https://{default_location}-aiplatform.googleapis.com/v1/projects/{default_project}/locations/{default_location}/publishers/google/models/gemini-1.5-flash:generateContent",
+                custom_headers={"Authorization": f"Bearer {default_credentials}"},
+            )
+
+    @pytest.mark.asyncio
+    async def test_vertex_passthrough_with_no_default_credentials(self, monkeypatch):
+        """
+        Test that when neither mapped nor default credentials match, the
+        client's own credentials are passed through to Vertex AI
+        """
+        from litellm.proxy.pass_through_endpoints.passthrough_endpoint_router import (
+            PassthroughEndpointRouter,
+        )
+
+        vertex_project = "my-project"
+        vertex_location = "us-central1"
+        vertex_credentials = "test-creds"
+
+        test_project = "test-project"
+        test_location = "test-location"
+        test_token = "test-creds"
+
+        pass_through_router = PassthroughEndpointRouter()
+
+        pass_through_router.add_vertex_credentials(
+            project_id=vertex_project,
+            location=vertex_location,
+            vertex_credentials=vertex_credentials,
+        )
+
+        monkeypatch.setattr(
+            "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.passthrough_endpoint_router",
+            pass_through_router,
+        )
+
+        endpoint = f"/v1/projects/{test_project}/locations/{test_location}/publishers/google/models/gemini-1.5-flash:generateContent"
+
+        # Mock request
+        mock_request = Request(
+            scope={
+                "type": "http",
+                "method": "POST",
+                "path": endpoint,
+                "headers": [
+                    (b"authorization", b"Bearer test-creds"),
+                ],
+            }
+        )
+
+        # Mock response
+        mock_response = Response()
+
+        with mock.patch(
+            "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.vertex_llm_base._ensure_access_token_async"
+        ) as mock_ensure_token, mock.patch(
+            "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.vertex_llm_base._get_token_and_url"
+        ) as mock_get_token, mock.patch(
+            "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.create_pass_through_route"
+        ) as mock_create_route:
+            mock_ensure_token.return_value = ("test-auth-header", test_project)
+            mock_get_token.return_value = (test_token, "")
+
+            # Call the route
+            try:
+                await vertex_proxy_route(
+                    endpoint=endpoint,
+                    request=mock_request,
+                    fastapi_response=mock_response,
+                )
+            except Exception as e:
+                traceback.print_exc()
+                print(f"Error: {e}")
+
+            # Verify create_pass_through_route was called with correct arguments
+            mock_create_route.assert_called_once_with(
+                endpoint=endpoint,
+                target=f"https://{test_location}-aiplatform.googleapis.com/v1/projects/{test_project}/locations/{test_location}/publishers/google/models/gemini-1.5-flash:generateContent",
+                custom_headers={"authorization": f"Bearer {test_token}"},
+            )
+
+    @pytest.mark.asyncio
+    async def test_async_vertex_proxy_route_api_key_auth(self):
+        """
+        Critical
+
+        This is how the Vertex AI JS SDK authenticates to the LiteLLM Proxy
+        """
+        # Mock dependencies
+        mock_request = Mock()
+        mock_request.headers = {"x-litellm-api-key": "test-key-123"}
+        mock_request.method = "POST"
+        mock_response = Mock()
+
+        with patch(
+            "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.user_api_key_auth"
+        ) as mock_auth:
+            mock_auth.return_value = {"api_key": "test-key-123"}
+
+            with patch(
+                "litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints.create_pass_through_route"
+            ) as mock_pass_through:
+                mock_pass_through.return_value = AsyncMock(
+                    return_value={"status": "success"}
+                )
+
+                # Call the function
+                result = await vertex_proxy_route(
+                    endpoint="v1/projects/test-project/locations/us-central1/publishers/google/models/gemini-1.5-pro:generateContent",
+                    request=mock_request,
+                    fastapi_response=mock_response,
+                )
+
+        # Verify user_api_key_auth was called with the correct Bearer token
+        mock_auth.assert_called_once()
+        call_args = mock_auth.call_args[1]
+        assert call_args["api_key"] == "Bearer test-key-123"
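Read together, the three cases pin down a resolution order for pass-through auth: an exact project/location mapping wins, then the admin's default config, and only then does the client's own header travel through untouched. A condensed sketch of that fallback chain (the function name and signature are invented for illustration; the real logic lives in `vertex_proxy_route` and the router):

```python
from typing import Optional


def resolve_vertex_auth_header(
    router,  # assumed: a PassthroughEndpointRouter-like object
    project_id: Optional[str],
    location: Optional[str],
    client_auth_header: Optional[str],
) -> Optional[str]:
    # Cases 1 and 2: an exact project/location mapping, falling back to the
    # default config (router.get_vertex_credentials covers both lookups).
    creds = router.get_vertex_credentials(project_id, location)
    if creds is not None and creds.vertex_credentials is not None:
        # In the real flow an access token is minted from these credentials;
        # the tests mock _ensure_access_token_async / _get_token_and_url.
        return f"Bearer {creds.vertex_credentials}"
    # Case 3: nothing configured, so the client's own header passes through.
    return client_auth_header
```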
@ -0,0 +1,44 @@
import json
import os
import sys
import traceback
from unittest import mock
from unittest.mock import MagicMock, patch

import httpx
import pytest
from fastapi import Request, Response
from fastapi.testclient import TestClient

sys.path.insert(
    0, os.path.abspath("../../../..")
)  # Adds the parent directory to the system path

from unittest.mock import Mock

from litellm.proxy.pass_through_endpoints.common_utils import get_litellm_virtual_key


@pytest.mark.asyncio
async def test_get_litellm_virtual_key():
    """
    Test that the get_litellm_virtual_key function correctly handles the API key authentication
    """
    # Test with x-litellm-api-key
    mock_request = Mock()
    mock_request.headers = {"x-litellm-api-key": "test-key-123"}
    result = get_litellm_virtual_key(mock_request)
    assert result == "Bearer test-key-123"

    # Test with Authorization header
    mock_request.headers = {"Authorization": "Bearer auth-key-456"}
    result = get_litellm_virtual_key(mock_request)
    assert result == "Bearer auth-key-456"

    # Test with both headers (x-litellm-api-key should take precedence)
    mock_request.headers = {
        "x-litellm-api-key": "test-key-123",
        "Authorization": "Bearer auth-key-456",
    }
    result = get_litellm_virtual_key(mock_request)
    assert result == "Bearer test-key-123"
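The expected behavior is fully pinned down by these assertions; a sketch consistent with them (not necessarily the exact `common_utils` implementation) is:

```python
from fastapi import Request


def get_litellm_virtual_key(request: Request) -> str:
    # The x-litellm-api-key header takes precedence over Authorization,
    # and a bare key is normalized to "Bearer <key>" form.
    litellm_api_key = request.headers.get("x-litellm-api-key")
    if litellm_api_key:
        return f"Bearer {litellm_api_key}"
    return request.headers.get("Authorization", "")
```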
@ -339,9 +339,6 @@ def test_pass_through_routes_support_all_methods():
     from litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints import (
         router as llm_router,
     )
-    from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import (
-        router as vertex_router,
-    )

     # Expected HTTP methods
     expected_methods = {"GET", "POST", "PUT", "DELETE", "PATCH"}

@ -361,7 +358,6 @@ def test_pass_through_routes_support_all_methods():

     # Check both routers
     check_router_methods(llm_router)
-    check_router_methods(vertex_router)


 def test_is_bedrock_agent_runtime_route():
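`check_router_methods` is defined outside this hunk; given the `expected_methods` set above, a plausible shape for it (an assumption, not the file's actual helper) is:

```python
def check_router_methods(router) -> None:
    # Every pass-through route should accept all expected HTTP methods.
    expected_methods = {"GET", "POST", "PUT", "DELETE", "PATCH"}
    for route in router.routes:
        missing = expected_methods - set(route.methods)
        assert not missing, f"Route {route.path} is missing methods: {missing}"
```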
@ -11,6 +11,7 @@ from unittest.mock import patch
 from litellm.proxy.pass_through_endpoints.passthrough_endpoint_router import (
     PassthroughEndpointRouter,
 )
+from litellm.types.passthrough_endpoints.vertex_ai import VertexPassThroughCredentials

 passthrough_endpoint_router = PassthroughEndpointRouter()

@ -132,3 +133,185 @@ class TestPassthroughEndpointRouter(unittest.TestCase):
             ),
             "COHERE_API_KEY",
         )
+
+    def test_get_deployment_key(self):
+        """Test _get_deployment_key with various inputs"""
+        router = PassthroughEndpointRouter()
+
+        # Test with valid inputs
+        key = router._get_deployment_key("test-project", "us-central1")
+        assert key == "test-project-us-central1"
+
+        # Test with None values
+        key = router._get_deployment_key(None, "us-central1")
+        assert key is None
+
+        key = router._get_deployment_key("test-project", None)
+        assert key is None
+
+        key = router._get_deployment_key(None, None)
+        assert key is None
+
+    def test_add_vertex_credentials(self):
+        """Test add_vertex_credentials functionality"""
+        router = PassthroughEndpointRouter()
+
+        # Test adding valid credentials
+        router.add_vertex_credentials(
+            project_id="test-project",
+            location="us-central1",
+            vertex_credentials='{"credentials": "test-creds"}',
+        )
+
+        assert "test-project-us-central1" in router.deployment_key_to_vertex_credentials
+        creds = router.deployment_key_to_vertex_credentials["test-project-us-central1"]
+        assert creds.vertex_project == "test-project"
+        assert creds.vertex_location == "us-central1"
+        assert creds.vertex_credentials == '{"credentials": "test-creds"}'
+
+        # Test adding with None values
+        router.add_vertex_credentials(
+            project_id=None,
+            location=None,
+            vertex_credentials='{"credentials": "test-creds"}',
+        )
+        # Should not add None values
+        assert len(router.deployment_key_to_vertex_credentials) == 1
+
+    def test_default_credentials(self):
+        """
+        Test get_vertex_credentials with stored credentials.
+
+        When no default config is set and no exact project/location match
+        exists, no credentials should be returned.
+        """
+        router = PassthroughEndpointRouter()
+        router.add_vertex_credentials(
+            project_id="test-project",
+            location="us-central1",
+            vertex_credentials='{"credentials": "test-creds"}',
+        )
+
+        creds = router.get_vertex_credentials(
+            project_id="test-project", location="us-central2"
+        )
+
+        assert creds is None
+
+    def test_get_vertex_env_vars(self):
+        """Test that _get_vertex_env_vars correctly reads environment variables"""
+        # Set environment variables for the test
+        os.environ["DEFAULT_VERTEXAI_PROJECT"] = "test-project-123"
+        os.environ["DEFAULT_VERTEXAI_LOCATION"] = "us-central1"
+        os.environ["DEFAULT_GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/creds"
+
+        try:
+            result = self.router._get_vertex_env_vars()
+            print(result)
+
+            # Verify the result
+            assert isinstance(result, VertexPassThroughCredentials)
+            assert result.vertex_project == "test-project-123"
+            assert result.vertex_location == "us-central1"
+            assert result.vertex_credentials == "/path/to/creds"
+
+        finally:
+            # Clean up environment variables
+            del os.environ["DEFAULT_VERTEXAI_PROJECT"]
+            del os.environ["DEFAULT_VERTEXAI_LOCATION"]
+            del os.environ["DEFAULT_GOOGLE_APPLICATION_CREDENTIALS"]
+
+    def test_set_default_vertex_config(self):
+        """Test set_default_vertex_config with various inputs"""
+        # Test with None config - set environment variables first
+        os.environ["DEFAULT_VERTEXAI_PROJECT"] = "env-project"
+        os.environ["DEFAULT_VERTEXAI_LOCATION"] = "env-location"
+        os.environ["DEFAULT_GOOGLE_APPLICATION_CREDENTIALS"] = "env-creds"
+        os.environ["GOOGLE_CREDS"] = "secret-creds"
+
+        try:
+            # Test with None config
+            self.router.set_default_vertex_config()
+
+            assert self.router.default_vertex_config.vertex_project == "env-project"
+            assert self.router.default_vertex_config.vertex_location == "env-location"
+            assert self.router.default_vertex_config.vertex_credentials == "env-creds"
+
+            # Test with valid config.yaml settings on vertex_config
+            test_config = {
+                "vertex_project": "my-project-123",
+                "vertex_location": "us-central1",
+                "vertex_credentials": "path/to/creds",
+            }
+            self.router.set_default_vertex_config(test_config)
+
+            assert self.router.default_vertex_config.vertex_project == "my-project-123"
+            assert self.router.default_vertex_config.vertex_location == "us-central1"
+            assert (
+                self.router.default_vertex_config.vertex_credentials == "path/to/creds"
+            )
+
+            # Test with environment variable reference
+            test_config = {
+                "vertex_project": "my-project-123",
+                "vertex_location": "us-central1",
+                "vertex_credentials": "os.environ/GOOGLE_CREDS",
+            }
+            self.router.set_default_vertex_config(test_config)
+
+            assert (
+                self.router.default_vertex_config.vertex_credentials == "secret-creds"
+            )
+
+        finally:
+            # Clean up environment variables
+            del os.environ["DEFAULT_VERTEXAI_PROJECT"]
+            del os.environ["DEFAULT_VERTEXAI_LOCATION"]
+            del os.environ["DEFAULT_GOOGLE_APPLICATION_CREDENTIALS"]
+            del os.environ["GOOGLE_CREDS"]
+
+    def test_vertex_passthrough_router_init(self):
+        """Test PassthroughEndpointRouter initialization"""
+        router = PassthroughEndpointRouter()
+        assert isinstance(router.deployment_key_to_vertex_credentials, dict)
+        assert len(router.deployment_key_to_vertex_credentials) == 0
+
+    def test_get_vertex_credentials_none(self):
+        """Test get_vertex_credentials with various inputs"""
+        router = PassthroughEndpointRouter()
+
+        router.set_default_vertex_config(
+            config={
+                "vertex_project": None,
+                "vertex_location": None,
+                "vertex_credentials": None,
+            }
+        )
+
+        # Test with None project_id and location - should return default config
+        creds = router.get_vertex_credentials(None, None)
+        assert isinstance(creds, VertexPassThroughCredentials)
+
+        # Test with valid project_id and location but no stored credentials
+        creds = router.get_vertex_credentials("test-project", "us-central1")
+        assert isinstance(creds, VertexPassThroughCredentials)
+        assert creds.vertex_project is None
+        assert creds.vertex_location is None
+        assert creds.vertex_credentials is None
+
+    def test_get_vertex_credentials_stored(self):
+        """Test get_vertex_credentials with stored credentials"""
+        router = PassthroughEndpointRouter()
+        router.add_vertex_credentials(
+            project_id="test-project",
+            location="us-central1",
+            vertex_credentials='{"credentials": "test-creds"}',
+        )
+
+        creds = router.get_vertex_credentials(
+            project_id="test-project", location="us-central1"
+        )
+        assert creds.vertex_project == "test-project"
+        assert creds.vertex_location == "us-central1"
+        assert creds.vertex_credentials == '{"credentials": "test-creds"}'
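Taken together, these tests specify the router's whole lookup path: keys are `<project>-<location>` strings, `None` parts are rejected, and a failed exact match falls back to whatever default config was set (or `None` if none was). A hypothetical distillation of that contract (method and attribute names come from the tests; the bodies are assumptions):

```python
from typing import Dict, Optional


class RouterLookupSketch:
    """Hypothetical condensation of the behavior the tests assert."""

    def __init__(self) -> None:
        self.deployment_key_to_vertex_credentials: Dict[str, object] = {}
        self.default_vertex_config = None  # populated by set_default_vertex_config

    def _get_deployment_key(
        self, project_id: Optional[str], location: Optional[str]
    ) -> Optional[str]:
        # Both parts are required to form a "<project>-<location>" key.
        if project_id is None or location is None:
            return None
        return f"{project_id}-{location}"

    def add_vertex_credentials(self, project_id, location, vertex_credentials):
        key = self._get_deployment_key(project_id, location)
        if key is None:
            return  # None values are silently skipped, per the tests
        self.deployment_key_to_vertex_credentials[key] = vertex_credentials

    def get_vertex_credentials(self, project_id, location):
        key = self._get_deployment_key(project_id, location)
        if key in self.deployment_key_to_vertex_credentials:
            return self.deployment_key_to_vertex_credentials[key]
        # No exact match: fall back to the default config, which stays None
        # when no default was ever configured.
        return self.default_vertex_config
```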
@ -1,294 +0,0 @@
import json
import os
import sys
from datetime import datetime
from unittest.mock import AsyncMock, Mock, patch

sys.path.insert(
    0, os.path.abspath("../..")
)  # Adds the parent directory to the system-path


import httpx
import pytest
import litellm
from litellm.litellm_core_utils.litellm_logging import Logging as LiteLLMLoggingObj


from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import (
    get_litellm_virtual_key,
    vertex_proxy_route,
    _get_vertex_env_vars,
    set_default_vertex_config,
    VertexPassThroughCredentials,
    default_vertex_config,
)
from litellm.proxy.vertex_ai_endpoints.vertex_passthrough_router import (
    VertexPassThroughRouter,
)


@pytest.mark.asyncio
async def test_get_litellm_virtual_key():
    """
    Test that the get_litellm_virtual_key function correctly handles the API key authentication
    """
    # Test with x-litellm-api-key
    mock_request = Mock()
    mock_request.headers = {"x-litellm-api-key": "test-key-123"}
    result = get_litellm_virtual_key(mock_request)
    assert result == "Bearer test-key-123"

    # Test with Authorization header
    mock_request.headers = {"Authorization": "Bearer auth-key-456"}
    result = get_litellm_virtual_key(mock_request)
    assert result == "Bearer auth-key-456"

    # Test with both headers (x-litellm-api-key should take precedence)
    mock_request.headers = {
        "x-litellm-api-key": "test-key-123",
        "Authorization": "Bearer auth-key-456",
    }
    result = get_litellm_virtual_key(mock_request)
    assert result == "Bearer test-key-123"


@pytest.mark.asyncio
async def test_async_vertex_proxy_route_api_key_auth():
    """
    Critical

    This is how Vertex AI JS SDK will Auth to Litellm Proxy
    """
    # Mock dependencies
    mock_request = Mock()
    mock_request.headers = {"x-litellm-api-key": "test-key-123"}
    mock_request.method = "POST"
    mock_response = Mock()

    with patch(
        "litellm.proxy.vertex_ai_endpoints.vertex_endpoints.user_api_key_auth"
    ) as mock_auth:
        mock_auth.return_value = {"api_key": "test-key-123"}

        with patch(
            "litellm.proxy.vertex_ai_endpoints.vertex_endpoints.create_pass_through_route"
        ) as mock_pass_through:
            mock_pass_through.return_value = AsyncMock(
                return_value={"status": "success"}
            )

            # Call the function
            result = await vertex_proxy_route(
                endpoint="v1/projects/test-project/locations/us-central1/publishers/google/models/gemini-1.5-pro:generateContent",
                request=mock_request,
                fastapi_response=mock_response,
            )

    # Verify user_api_key_auth was called with the correct Bearer token
    mock_auth.assert_called_once()
    call_args = mock_auth.call_args[1]
    assert call_args["api_key"] == "Bearer test-key-123"


@pytest.mark.asyncio
async def test_get_vertex_env_vars():
    """Test that _get_vertex_env_vars correctly reads environment variables"""
    # Set environment variables for the test
    os.environ["DEFAULT_VERTEXAI_PROJECT"] = "test-project-123"
    os.environ["DEFAULT_VERTEXAI_LOCATION"] = "us-central1"
    os.environ["DEFAULT_GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/creds"

    try:
        result = _get_vertex_env_vars()
        print(result)

        # Verify the result
        assert isinstance(result, VertexPassThroughCredentials)
        assert result.vertex_project == "test-project-123"
        assert result.vertex_location == "us-central1"
        assert result.vertex_credentials == "/path/to/creds"

    finally:
        # Clean up environment variables
        del os.environ["DEFAULT_VERTEXAI_PROJECT"]
        del os.environ["DEFAULT_VERTEXAI_LOCATION"]
        del os.environ["DEFAULT_GOOGLE_APPLICATION_CREDENTIALS"]


@pytest.mark.asyncio
async def test_set_default_vertex_config():
    """Test set_default_vertex_config with various inputs"""
    # Test with None config - set environment variables first
    os.environ["DEFAULT_VERTEXAI_PROJECT"] = "env-project"
    os.environ["DEFAULT_VERTEXAI_LOCATION"] = "env-location"
    os.environ["DEFAULT_GOOGLE_APPLICATION_CREDENTIALS"] = "env-creds"
    os.environ["GOOGLE_CREDS"] = "secret-creds"

    try:
        # Test with None config
        set_default_vertex_config()
        from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import (
            default_vertex_config,
        )

        assert default_vertex_config.vertex_project == "env-project"
        assert default_vertex_config.vertex_location == "env-location"
        assert default_vertex_config.vertex_credentials == "env-creds"

        # Test with valid config.yaml settings on vertex_config
        test_config = {
            "vertex_project": "my-project-123",
            "vertex_location": "us-central1",
            "vertex_credentials": "path/to/creds",
        }
        set_default_vertex_config(test_config)
        from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import (
            default_vertex_config,
        )

        assert default_vertex_config.vertex_project == "my-project-123"
        assert default_vertex_config.vertex_location == "us-central1"
        assert default_vertex_config.vertex_credentials == "path/to/creds"

        # Test with environment variable reference
        test_config = {
            "vertex_project": "my-project-123",
            "vertex_location": "us-central1",
            "vertex_credentials": "os.environ/GOOGLE_CREDS",
        }
        set_default_vertex_config(test_config)
        from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import (
            default_vertex_config,
        )

        assert default_vertex_config.vertex_credentials == "secret-creds"

    finally:
        # Clean up environment variables
        del os.environ["DEFAULT_VERTEXAI_PROJECT"]
        del os.environ["DEFAULT_VERTEXAI_LOCATION"]
        del os.environ["DEFAULT_GOOGLE_APPLICATION_CREDENTIALS"]
        del os.environ["GOOGLE_CREDS"]


@pytest.mark.asyncio
async def test_vertex_passthrough_router_init():
    """Test VertexPassThroughRouter initialization"""
    router = VertexPassThroughRouter()
    assert isinstance(router.deployment_key_to_vertex_credentials, dict)
    assert len(router.deployment_key_to_vertex_credentials) == 0


@pytest.mark.asyncio
async def test_get_vertex_credentials_none():
    """Test get_vertex_credentials with various inputs"""
    from litellm.proxy.vertex_ai_endpoints import vertex_endpoints

    setattr(vertex_endpoints, "default_vertex_config", VertexPassThroughCredentials())
    router = VertexPassThroughRouter()

    # Test with None project_id and location - should return default config
    creds = router.get_vertex_credentials(None, None)
    assert isinstance(creds, VertexPassThroughCredentials)

    # Test with valid project_id and location but no stored credentials
    creds = router.get_vertex_credentials("test-project", "us-central1")
    assert isinstance(creds, VertexPassThroughCredentials)
    assert creds.vertex_project is None
    assert creds.vertex_location is None
    assert creds.vertex_credentials is None


@pytest.mark.asyncio
async def test_get_vertex_credentials_stored():
    """Test get_vertex_credentials with stored credentials"""
    router = VertexPassThroughRouter()
    router.add_vertex_credentials(
        project_id="test-project",
        location="us-central1",
        vertex_credentials='{"credentials": "test-creds"}',
    )

    creds = router.get_vertex_credentials(
        project_id="test-project", location="us-central1"
    )
    assert creds.vertex_project == "test-project"
    assert creds.vertex_location == "us-central1"
    assert creds.vertex_credentials == '{"credentials": "test-creds"}'


@pytest.mark.asyncio
async def test_add_vertex_credentials():
    """Test add_vertex_credentials functionality"""
    router = VertexPassThroughRouter()

    # Test adding valid credentials
    router.add_vertex_credentials(
        project_id="test-project",
        location="us-central1",
        vertex_credentials='{"credentials": "test-creds"}',
    )

    assert "test-project-us-central1" in router.deployment_key_to_vertex_credentials
    creds = router.deployment_key_to_vertex_credentials["test-project-us-central1"]
    assert creds.vertex_project == "test-project"
    assert creds.vertex_location == "us-central1"
    assert creds.vertex_credentials == '{"credentials": "test-creds"}'

    # Test adding with None values
    router.add_vertex_credentials(
        project_id=None,
        location=None,
        vertex_credentials='{"credentials": "test-creds"}',
    )
    # Should not add None values
    assert len(router.deployment_key_to_vertex_credentials) == 1


@pytest.mark.asyncio
async def test_get_deployment_key():
    """Test _get_deployment_key with various inputs"""
    router = VertexPassThroughRouter()

    # Test with valid inputs
    key = router._get_deployment_key("test-project", "us-central1")
    assert key == "test-project-us-central1"

    # Test with None values
    key = router._get_deployment_key(None, "us-central1")
    assert key is None

    key = router._get_deployment_key("test-project", None)
    assert key is None

    key = router._get_deployment_key(None, None)
    assert key is None


@pytest.mark.asyncio
async def test_get_vertex_project_id_from_url():
    """Test _get_vertex_project_id_from_url with various URLs"""
    # Test with valid URL
    url = "https://us-central1-aiplatform.googleapis.com/v1/projects/test-project/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent"
    project_id = VertexPassThroughRouter._get_vertex_project_id_from_url(url)
    assert project_id == "test-project"

    # Test with invalid URL
    url = "https://invalid-url.com"
    project_id = VertexPassThroughRouter._get_vertex_project_id_from_url(url)
    assert project_id is None


@pytest.mark.asyncio
async def test_get_vertex_location_from_url():
    """Test _get_vertex_location_from_url with various URLs"""
    # Test with valid URL
    url = "https://us-central1-aiplatform.googleapis.com/v1/projects/test-project/locations/us-central1/publishers/google/models/gemini-pro:streamGenerateContent"
    location = VertexPassThroughRouter._get_vertex_location_from_url(url)
    assert location == "us-central1"

    # Test with invalid URL
    url = "https://invalid-url.com"
    location = VertexPassThroughRouter._get_vertex_location_from_url(url)
    assert location is None
@ -30,9 +30,6 @@ from litellm.proxy._types import LiteLLM_UserTable, LitellmUserRoles, UserAPIKey
 from litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints import (
     router as llm_passthrough_router,
 )
-from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import (
-    router as vertex_router,
-)

 # Replace the actual hash_token function with our mock
 import litellm.proxy.auth.route_checks

@ -96,7 +93,7 @@ def test_is_llm_api_route():
     assert RouteChecks.is_llm_api_route("/key/regenerate/82akk800000000jjsk") is False
     assert RouteChecks.is_llm_api_route("/key/82akk800000000jjsk/delete") is False

-    all_llm_api_routes = vertex_router.routes + llm_passthrough_router.routes
+    all_llm_api_routes = llm_passthrough_router.routes

     # check all routes in llm_passthrough_router, ensure they are considered llm api routes
     for route in all_llm_api_routes:
@ -36,11 +36,11 @@ def test_initialize_deployment_for_pass_through_success():
     )

     # Verify the credentials were properly set
-    from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import (
-        vertex_pass_through_router,
+    from litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints import (
+        passthrough_endpoint_router,
     )

-    vertex_creds = vertex_pass_through_router.get_vertex_credentials(
+    vertex_creds = passthrough_endpoint_router.get_vertex_credentials(
         project_id="test-project", location="us-central1"
     )
     assert vertex_creds.vertex_project == "test-project"

@ -123,21 +123,21 @@ def test_add_vertex_pass_through_deployment():
     router.add_deployment(deployment)

     # Get the vertex credentials from the router
-    from litellm.proxy.vertex_ai_endpoints.vertex_endpoints import (
-        vertex_pass_through_router,
+    from litellm.proxy.pass_through_endpoints.llm_passthrough_endpoints import (
+        passthrough_endpoint_router,
     )

     # current state of pass-through vertex router
     print("\n vertex_pass_through_router.deployment_key_to_vertex_credentials\n\n")
     print(
         json.dumps(
-            vertex_pass_through_router.deployment_key_to_vertex_credentials,
+            passthrough_endpoint_router.deployment_key_to_vertex_credentials,
             indent=4,
             default=str,
         )
     )

-    vertex_creds = vertex_pass_through_router.get_vertex_credentials(
+    vertex_creds = passthrough_endpoint_router.get_vertex_credentials(
         project_id="test-project", location="us-central1"
     )
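These last hunks complete the migration: deployment initialization now registers Vertex credentials with the shared `passthrough_endpoint_router` instead of the removed vertex-specific router. A rough sketch of the registration step the first test exercises (the function name and the shape of `litellm_params` are assumptions based on the test, not the exact LiteLLM code):

```python
# Hypothetical wiring: when a deployment is configured for pass-through use,
# its Vertex credentials become retrievable from the shared router.
def initialize_deployment_for_pass_through(deployment, passthrough_endpoint_router):
    params = deployment.litellm_params  # assumed attribute layout
    passthrough_endpoint_router.add_vertex_credentials(
        project_id=params.vertex_project,
        location=params.vertex_location,
        vertex_credentials=params.vertex_credentials,
    )
```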