Workload identity for LLM connections is currently available to enterprise SaaS customers on CrewAI AMP. Contact your CrewAI account team to enable it for your organization before starting this guide.
Version requirements
| Component | Required version | Notes |
|---|---|---|
| CrewAI AMP | Early access (per-organization feature flag) | Contact CrewAI support to enable Workload Identity Configs and LLM workload identity on your org. |
| CrewAI Python SDK (crewai) | 1.14.3 or higher | Crews built from this version (or later) include the OIDC token fetch and GCP credential setup needed for Vertex workload identity. |
| LLM provider | Google Gen AI SDK (google/ model prefix) | Required. LiteLLM’s vertex_ai/* provider is not supported with workload identity. Use the google/ prefix on your LLM connection’s model field — for example google/gemini-2.5-pro, google/gemini-2.5-flash, google/gemini-2.0-flash. |
| Google Cloud APIs | iam.googleapis.com, iamcredentials.googleapis.com, sts.googleapis.com, aiplatform.googleapis.com | All four must be enabled on the target project (see Part 1, step 1). |
Overview
CrewAI AMP can authenticate to Google Vertex AI using GCP Workload Identity Federation instead of long-lived service account keys. At kickoff, your crew execution fetches a short-lived OIDC token from AMP scoped to your organization and writes a Google Application Default Credentials (ADC) external_account configuration that points at it. The Google Gen AI SDK (invoked via CrewAI’s google/ model prefix) then transparently exchanges that OIDC token at GCP STS, optionally impersonates a service account, and calls Vertex AI — all in-process inside the running crew.
The result:
- No Google credentials stored in CrewAI AMP — no service account JSON keys, no API keys. AMP holds only the OIDC signing key it uses to mint tokens.
- Trust is anchored in your GCP project. You decide which CrewAI organization can impersonate which service account.
- The STS exchange happens inside the crew execution, not in AMP’s control plane. AMP only mints OIDC tokens; the Google credentials returned by GCP are never seen or persisted by AMP — they live and die inside a single execution.
- Access tokens are refreshed automatically, and the underlying OIDC subject token is rotated before expiry — long-running crews are supported (with one edge case noted below).
How it works
GCP fetches AMP’s public signing keys from a standard OIDC discovery endpoint and validates each token before exchanging it. AMP never sees your GCP service account key, and the federated/SA tokens minted by GCP stay inside the crew execution that requested them — they are not returned to or persisted by AMP’s control plane.
Prerequisites
- A GCP project with Vertex AI enabled (aiplatform.googleapis.com).
- The gcloud CLI authenticated as a user with IAM admin on that project. See Appendix: minimum IAM for the specific roles required.
- Your CrewAI organization UUID. Find it in CrewAI AMP at Settings → Organization (use the UUID, not the numeric ID).
- Workload identity for LLM connections enabled on your AMP organization — contact CrewAI support.
Part 1 — GCP setup
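The GCP-side setup starts by enabling the required APIs and creating the workload identity pool. A hedged sketch of those first steps — PROJECT_ID is a placeholder for your project, and the pool name crewai-amp is an assumption chosen to match the provider resource name shown in Part 2:

```shell
# Step 1 — enable the four required APIs on the target project.
gcloud services enable \
  iam.googleapis.com \
  iamcredentials.googleapis.com \
  sts.googleapis.com \
  aiplatform.googleapis.com \
  --project "$PROJECT_ID"

# Step 2 — create the workload identity pool that will trust CrewAI AMP.
gcloud iam workload-identity-pools create crewai-amp \
  --project "$PROJECT_ID" \
  --location global \
  --display-name "CrewAI AMP"
```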
Create the OIDC provider inside the pool
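A sketch of the provider creation, assuming the pool name crewai-amp; the issuer URI and claim names (https://app.crewai.com, sub, organization_id) are taken from the rest of this guide:

```shell
# Step 3 — create the OIDC provider inside the pool. The attribute-condition
# pins the pool to a single CrewAI organization; replace YOUR_ORG_UUID.
gcloud iam workload-identity-pools providers create-oidc crewai-amp-oidc \
  --project "$PROJECT_ID" \
  --location global \
  --workload-identity-pool crewai-amp \
  --issuer-uri "https://app.crewai.com" \
  --attribute-mapping "google.subject=assertion.sub,attribute.organization=assertion.organization_id" \
  --attribute-condition "assertion.organization_id == 'YOUR_ORG_UUID'"

# Print the full provider resource name needed in Part 2.
gcloud iam workload-identity-pools providers describe crewai-amp-oidc \
  --project "$PROJECT_ID" \
  --location global \
  --workload-identity-pool crewai-amp \
  --format "value(name)"
```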
The attribute-condition is the critical security boundary — it restricts which CrewAI organization can assume any identity from this pool. Replace YOUR_ORG_UUID with your AMP organization UUID.
Record the full provider resource name — you’ll need it in Part 2.
Create a Vertex AI service account
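A sketch of the service-account steps, reusing the assumed names above (crewai-vertex, pool crewai-amp); PROJECT_NUMBER is a placeholder for the numeric project number that appears in principalSet identifiers:

```shell
# Step 4 — create the service account the crew will impersonate.
gcloud iam service-accounts create crewai-vertex \
  --project "$PROJECT_ID" \
  --display-name "CrewAI Vertex AI"

# Grant it the minimum Vertex role.
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member "serviceAccount:crewai-vertex@${PROJECT_ID}.iam.gserviceaccount.com" \
  --role "roles/aiplatform.user"

# Step 5 — allow only identities whose pool-mapped organization attribute
# matches your org UUID to impersonate the service account.
gcloud iam service-accounts add-iam-policy-binding \
  "crewai-vertex@${PROJECT_ID}.iam.gserviceaccount.com" \
  --project "$PROJECT_ID" \
  --role "roles/iam.workloadIdentityUser" \
  --member "principalSet://iam.googleapis.com/projects/${PROJECT_NUMBER}/locations/global/workloadIdentityPools/crewai-amp/attribute.organization/YOUR_ORG_UUID"
```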
crewai-vertex is an example name — pick anything that fits your naming conventions, but use the same value in the impersonation binding and on the LLM connection (Part 2).
roles/aiplatform.user is the minimum role needed for generateContent and predict. Tighten further with custom roles if your security policy requires it.
Part 2 — CrewAI AMP setup
Create a Workload Identity Config
In AMP, go to Settings → Workload Identity Configs → New and fill in:
Creating workload identity configs requires a role with manage access to LLM connections (see RBAC).
| Field | Value |
|---|---|
| Name | A memorable label, e.g. vertex-ai-prod |
| Cloud provider | GCP |
| GCP Workload Identity Provider | The full resource name from Part 1, step 3 (projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/crewai-amp/providers/crewai-amp-oidc) |
| Default for GCP | Optional — marks this as the default GCP config for new connections |
Attach the config to a Vertex LLM connection
Go to LLM Connections → New (or edit an existing one) and select:
- Provider: Vertex
- Workload Identity Config: the config from the previous step
- GCP Service Account Email: the SA you created in Part 1 (e.g., crewai-vertex@PROJECT_ID.iam.gserviceaccount.com)

No GOOGLE_API_KEY environment variable is required — leave that empty. For region, add a single connection-scoped env var: GOOGLE_CLOUD_LOCATION=global (recommended default). Vertex’s global endpoint provides higher availability and is supported by current Gemini 2.x and 3.x models. Set a specific region (e.g. us-central1, europe-west4) if you need data residency (the global endpoint does not guarantee in-region processing) or if you plan to use Vertex features that don’t run on global — notably tuning, batch prediction for Anthropic / OpenMaaS models, and RAG corpus management (RAG requests still work on global). For chat/completion crews, global is the right choice.
Service account impersonation is configured per-connection (not per-config) so a single workload identity pool can be reused for multiple service accounts with different Vertex permissions.
Bind the connection to a crew or deployment
Attach the LLM connection to a crew, Studio project, or deployment exactly as you would any other LLM connection. At kickoff, the running crew will request an OIDC token from AMP for this connection’s workload identity provider and exchange it for Vertex credentials in-process — no Google credentials are stored or pushed by AMP.
Runtime behavior
For Vertex connections backed by workload identity, the crew does not receive a GOOGLE_API_KEY or service account JSON as a static deploy-time env var. Instead, at kickoff, the running crew:
- Fetches an OIDC token from AMP, signed with AMP’s private key and scoped to your organization (audience = your workload identity provider).
- Writes the JWT to a temporary file in the execution environment.
- Writes a Google Application Default Credentials (ADC) config of type external_account that references the JWT file, your STS audience, and (optionally) the service account impersonation URL.
- Sets the following environment variables for the crew process:

| Env var | Value |
|---|---|
| GOOGLE_APPLICATION_CREDENTIALS | Path to the temporary ADC external_account config file |
| GOOGLE_CLOUD_PROJECT | Your GCP project number, parsed from the workload identity provider resource name (Google Gen AI SDK accepts either the project ID or the project number) |

No GOOGLE_API_KEY and no GOOGLE_CLOUD_LOCATION are set automatically. Configure GOOGLE_CLOUD_LOCATION on your LLM connection in AMP (recommended default: global).

From this point on, google-auth (used by the Google Gen AI SDK) does the STS exchange and SA impersonation transparently on the first Vertex API call, and caches/refreshes the resulting access token automatically.
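For reference, the temporary ADC file follows Google’s standard external_account layout. An illustrative example — the audience, token file path, and service-account email are placeholders, and the impersonation URL is present only when a service account email is set on the connection:

```json
{
  "type": "external_account",
  "audience": "//iam.googleapis.com/projects/123456789012/locations/global/workloadIdentityPools/crewai-amp/providers/crewai-amp-oidc",
  "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
  "token_url": "https://sts.googleapis.com/v1/token",
  "credential_source": { "file": "/tmp/crewai-oidc-token.jwt" },
  "service_account_impersonation_url": "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/crewai-vertex@PROJECT_ID.iam.gserviceaccount.com:generateAccessToken"
}
```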
Requires crewai>=1.14.3 (see Version requirements).
Long-running crews
Access tokens are automatically refreshed:
- Vertex access tokens (1-hour TTL) are refreshed by google-auth in-process, transparently to your crew code.
- The underlying OIDC subject token (also 1-hour TTL) is rotated before expiry on every kickoff entry point. The crew fetches a fresh OIDC JWT from AMP and rewrites the ADC token file; subsequent STS exchanges pick up the new JWT.
- Crews that run for less than 1 hour never trigger a refresh — the initial token covers the whole execution.
- Crews that run for multiple hours continue to function as long as kickoff entry points (sync hops, agent steps, etc.) fire during the execution; the refresh buffer ensures the OIDC token is rotated before STS rejects it.
- If a single Vertex API call runs for more than 1 hour (very unusual — typical Gemini responses return in seconds), the OIDC token can expire mid-request and the call will fail. This is the one scenario where token refresh cannot help.
Verification
Run a crew that uses the Vertex connection and tail the execution logs in AMP. A successful generateContent or predict call confirms the full chain — OIDC mint → STS exchange → SA impersonation → Vertex — is wired correctly.
If the crew fails, see Troubleshooting below. Most issues trace back to the GCP-side configuration — the OIDC provider’s attribute-condition or the service account’s principalSet binding.
Inspecting on the GCP side
You can confirm tokens are being exchanged by looking at Cloud Audit Logs in your GCP project:
- Service: sts.googleapis.com → method google.identity.sts.v1.SecurityTokenService.ExchangeToken
- Service: iamcredentials.googleapis.com → method GenerateAccessToken

A short execution produces one ExchangeToken and one GenerateAccessToken entry; longer executions produce additional entries each time the OIDC token is rotated. The protoPayload.authenticationInfo includes the sub and organization_id claims, useful for audit and incident response.
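The two entry types above can be queried with gcloud; a hedged sketch (entries appear only if audit logging for these services is enabled on the project):

```shell
# Token exchanges at STS (one per OIDC JWT used).
gcloud logging read \
  'protoPayload.serviceName="sts.googleapis.com" AND protoPayload.methodName="google.identity.sts.v1.SecurityTokenService.ExchangeToken"' \
  --project "$PROJECT_ID" --freshness 1d --limit 10

# Service-account impersonations.
gcloud logging read \
  'protoPayload.serviceName="iamcredentials.googleapis.com" AND protoPayload.methodName="GenerateAccessToken"' \
  --project "$PROJECT_ID" --freshness 1d --limit 10
```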
Troubleshooting
| Symptom | Likely cause |
|---|---|
| AMP UI doesn’t show Workload Identity Configs | Feature isn’t enabled for your organization — contact CrewAI support. |
| AMP UI rejects attaching a config to an LLM connection | The connection’s provider must be Vertex (GCP). |
| GCP STS returns PERMISSION_DENIED: The given credential is rejected by the attribute condition | Org UUID mismatch — typically the numeric org ID was used instead of the UUID, or the UUID in the attribute condition is wrong. |
| GCP STS returns INVALID_ARGUMENT: Invalid JWT | Issuer URL in the provider doesn’t match https://app.crewai.com, or GCP’s JWKS cache is stale (wait up to 1 hour, or recreate the provider). |
| generateAccessToken returns PERMISSION_DENIED | The pool member is missing roles/iam.workloadIdentityUser on the service account, or the principalSet in the binding uses the wrong attribute path. |
| Vertex returns PERMISSION_DENIED on generateContent | The service account is missing roles/aiplatform.user (or an equivalent custom role) on the project. |
| Crew fails immediately with DefaultCredentialsError: File <path> was not found | The ADC token file was cleaned up — typically because the execution process was forked after credentials initialized. Re-kickoff the crew. If it persists, bump crewai>=1.14.3 in your pyproject.toml and re-deploy. |
| Crew fails with DefaultCredentialsError and no GOOGLE_APPLICATION_CREDENTIALS is set in the execution env | Your crew was deployed against a pre-1.14.3 crewai, so no ADC file was written and no API-key fallback exists for workload identity connections. Bump crewai>=1.14.3 in your pyproject.toml and re-deploy. |
| Crew fails after ~1 hour with invalid_grant from STS | The OIDC subject token expired and refresh did not fire — typically because a single in-process call held the execution past the refresh buffer. If this reproduces, contact CrewAI support with the failing execution ID. |
| Vertex calls fail with Unable to locate project | GOOGLE_CLOUD_PROJECT was not parsed — your workload identity provider resource name in AMP doesn’t match the projects/PROJECT_NUMBER/... format. Re-check the provider value copied from gcloud iam workload-identity-pools providers describe. |
| Vertex calls fail with region/location errors | GOOGLE_CLOUD_LOCATION isn’t set on the LLM connection. Add it as a connection-scoped env var (global is the recommended default). |
| Vertex returns model not found or not available in location | The chosen region doesn’t host the requested model. Switch the connection’s GOOGLE_CLOUD_LOCATION to global, or pick a region known to host the model. |
| Vertex calls fail to authenticate despite a working WI config | The model identifier uses the vertex_ai/ (LiteLLM) prefix instead of google/. Workload identity only works through the Google Gen AI SDK route — change the model to google/<model-name>. |
Security notes
- The organization_id claim is your security boundary. Your GCP attribute condition must restrict to your organization UUID. Without it, any CrewAI AMP organization could exchange a token through your pool. The sub claim contains the same UUID prefixed with organization: — either could be used, but organization_id matches the bare-UUID form used in the attribute.organization mapping and principalSet binding.
- Service account impersonation is the second boundary. The principalSet binding restricts impersonation to identities whose organization attribute matches your UUID. Use it even when the attribute condition is set — defense in depth.
- Issuer trust is one-way. GCP fetches AMP’s public JWKS over HTTPS. AMP never receives any GCP credential.
Appendix: minimum IAM for setup
The user running the gcloud commands above needs, on the target project:
- roles/iam.workloadIdentityPoolAdmin — create pools and providers
- roles/iam.serviceAccountAdmin — create service accounts
- roles/resourcemanager.projectIamAdmin — bind project-level roles
- roles/serviceusage.serviceUsageAdmin — enable required APIs
Alternatively, roles/owner on the project covers all of the above.
Related
- Single Sign-On (SSO) — Authentication for the AMP UI and CLI (separate system from LLM workload identity)
- Azure OpenAI Setup — Static-key alternative for Azure OpenAI
- GCP: Workload Identity Federation — Google’s reference docs
