
chore: opensearch embedding model dimension configurability #248

Open

akhil-vamshi-konam wants to merge 1 commit into `master` from `chore-opensearch-embedding-dimension-config`

Conversation


@akhil-vamshi-konam akhil-vamshi-konam commented Mar 24, 2026

Description

This PR updates self-hosting docs for Plane AI embeddings and OpenSearch so they match the current configuration model:

  • Renames EMBEDDING_MODEL_ID to OPENSEARCH_ML_MODEL_ID wherever an existing OpenSearch ML model ID is documented.
  • Documents EMBEDDING_MODEL as required for query construction and Plane AI API startup behavior, and OPENSEARCH_EMBEDDING_DIMENSION as required so that knn_vector mappings match the chosen model.
  • Expands the supported embedding models table in the Plane AI docs with per-model dimensions (Cohere, OpenAI, and Bedrock variants).
  • Adds a startup tip explaining the embedding-dimension check against OpenSearch and the failure modes when the configuration and indices disagree.
  • Adds a “Changing the embedding dimension” section with manage_search_index commands to rebuild indices and revectorize after model or dimension changes.
  • Aligns the AWS OpenSearch embedding and Environment variables pages on the same variable names and Plane AI-related rows.
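Under the variable names described above, a minimal environment configuration might look like the following sketch. All values are placeholders, and which variables apply depends on your deployment; the 1536 default matches, for example, OpenAI's text-embedding-3-small model.

```shell
# Hypothetical .env fragment illustrating the renamed and added variables.
# Values are placeholders; consult the updated docs for your deployment.
EMBEDDING_MODEL=text-embedding-3-small   # required: model used for query construction
OPENSEARCH_ML_MODEL_ID=example-model-id  # only when reusing an existing OpenSearch ML model
OPENSEARCH_EMBEDDING_DIMENSION=1536      # must match the chosen model's output dimension
```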

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Improvement (non-breaking change that improves existing functionality)
  • Code refactoring
  • Performance improvements
  • Documentation update

Screenshots and Media (if applicable)

Test Scenarios

References

Summary by CodeRabbit

Release Notes

  • Documentation
    • Updated OpenSearch embedding configuration instructions with new environment variable requirements for model ID and embedding dimension specifications.
    • Enhanced embedding model setup guides to include dimension details for supported providers and revised configuration examples.
    • Added guidance for changing embedding dimensions in existing deployments.


vercel bot commented Mar 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| developer-docs | ✅ Ready | Preview, Comment | Mar 24, 2026 11:32am |



coderabbitai bot commented Mar 24, 2026

📝 Walkthrough

Documentation updates standardize Plane's OpenSearch embedding configuration by replacing EMBEDDING_MODEL_ID with OPENSEARCH_ML_MODEL_ID and introducing OPENSEARCH_EMBEDDING_DIMENSION across configuration, environment variable, and setup guides. Embedding model documentation expands to include per-provider dimension specifications and index management procedures.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **OpenSearch Configuration Variables**<br>docs/self-hosting/govern/aws-opensearch-embedding.md, docs/self-hosting/govern/environment-variables.md | Replaced EMBEDDING_MODEL_ID with OPENSEARCH_ML_MODEL_ID and added the OPENSEARCH_EMBEDDING_DIMENSION variable (default: 1536). Reclassified EMBEDDING_MODEL from conditional to required. Updated environment variable tables and descriptions to reflect OpenSearch-specific configuration requirements. |
| **Plane AI Embedding Documentation**<br>docs/self-hosting/govern/plane-ai.md | Expanded embedding model documentation with per-provider dimension specifications (Cohere, OpenAI, AWS Bedrock). Updated configuration steps to use OPENSEARCH_ML_MODEL_ID and include OPENSEARCH_EMBEDDING_DIMENSION. Added startup validation guidance and a new "Changing the embedding dimension" section with index recreation procedures. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 Dimensions and models now align,
OpenSearch variables shine and refine,
From EMBEDDING_MODEL_ID to its new name,
Configuration docs join the game! 🚀

🚥 Pre-merge checks | ✅ 3 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding configurability for OpenSearch embedding model dimensions across self-hosting documentation. |
| Docstring coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
docs/self-hosting/govern/plane-ai.md (1)

291-291: Clarify conditional OPENSEARCH_ML_MODEL_ID wording in dimension-change guidance.

This sentence reads as if OPENSEARCH_ML_MODEL_ID is always user-configured; for auto-deployment setups, that can be misleading. Consider phrasing it as “if set / if using existing model ID.”

✏️ Suggested wording tweak
-If you update the model or manually override the dimension size by setting `OPENSEARCH_EMBEDDING_DIMENSION`, you must recreate your search indices so they adopt the new dimension size, then reindex and revectorize your workspace. Ensure that the model associated with your `OPENSEARCH_ML_MODEL_ID` and your `EMBEDDING_MODEL` configuration share this same dimension size.
+If you update the model or manually override the dimension size by setting `OPENSEARCH_EMBEDDING_DIMENSION`, you must recreate your search indices so they adopt the new dimension size, then reindex and revectorize your workspace. Ensure your `EMBEDDING_MODEL` and configured embedding model deployment (for example, `OPENSEARCH_ML_MODEL_ID` when you use an existing model ID) share the same dimension size.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/self-hosting/govern/plane-ai.md` at line 291, Reword the guidance
sentence to clarify that the OPENSEARCH_ML_MODEL_ID is conditional (only
relevant if the user has set or is using an existing model ID) — update the text
that mentions OPENSEARCH_EMBEDDING_DIMENSION, OPENSEARCH_ML_MODEL_ID, and
EMBEDDING_MODEL to say something like “if set / if using an existing model ID”
so readers running auto-deployments aren’t misled that they must always
configure OPENSEARCH_ML_MODEL_ID; ensure the sentence still instructs to
recreate indices and reindex/revectorize when the embedding dimension changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 87149e28-33be-467a-8e2a-079a6f88379d

📥 Commits

Reviewing files that changed from the base of the PR and between cb28431 and 5904a2b.

📒 Files selected for processing (3)
  • docs/self-hosting/govern/aws-opensearch-embedding.md
  • docs/self-hosting/govern/environment-variables.md
  • docs/self-hosting/govern/plane-ai.md

Comment on lines +236 to +238
| **BR_AWS_ACCESS_KEY_ID** | AWS access key ID for Bedrock Titan embedding | Conditional |
| **BR_AWS_SECRET_ACCESS_KEY** | AWS secret access key for Bedrock Titan embedding | Conditional |
| **BR_AWS_REGION** | AWS region for Bedrock Titan embedding | Conditional |
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

In AWS Bedrock, do the same AWS credentials (access key, secret key, session token, region) apply for invoking both Amazon Titan embedding models and Cohere embedding models, with model ID being the main selector?

💡 Result:

Yes, the same AWS credentials (access key, secret key, session token, and region) apply for invoking both Amazon Titan embedding models and Cohere embedding models in AWS Bedrock. The primary difference is the model ID specified in the InvokeModel API call: e.g., "amazon.titan-embed-text-v2:0" for Titan and "cohere.embed-english-v3" for Cohere. Both use the Bedrock Runtime service (bedrock-runtime) with standard AWS SigV4 authentication via IAM permissions like bedrock:InvokeModel.


🏁 Script executed:

# Check if the file exists and examine lines around 236-238 and 316-318
wc -l docs/self-hosting/govern/environment-variables.md

Repository: makeplane/developer-docs

Length of output: 121


🏁 Script executed:

# View lines 230-245 to see the context of lines 236-238
sed -n '230,245p' docs/self-hosting/govern/environment-variables.md

Repository: makeplane/developer-docs

Length of output: 1734


🏁 Script executed:

# View lines 310-325 to see the context of lines 316-318
sed -n '310,325p' docs/self-hosting/govern/environment-variables.md

Repository: makeplane/developer-docs

Length of output: 3987


Update wording to indicate Bedrock credentials apply to all embedding models, not just Titan.

Lines 236-238 describe AWS credentials as "for Bedrock Titan embedding," but these same credentials work with any Bedrock embedding model (Titan, Cohere, etc.). The model ID is the selector, not the credentials. Change descriptions to reference "Amazon Bedrock" or "Bedrock embedding" generically to prevent confusion about model support.

Note: Lines 316-318 already use appropriate generic wording ("Amazon Bedrock integration") and do not require changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/self-hosting/govern/environment-variables.md` around lines 236 - 238,
Update the three environment variable descriptions for BR_AWS_ACCESS_KEY_ID,
BR_AWS_SECRET_ACCESS_KEY, and BR_AWS_REGION so they reference Amazon Bedrock or
"Bedrock embedding" generically instead of saying "Bedrock Titan embedding";
edit the text for each entry to read something like "AWS access key ID for
Amazon Bedrock embedding" / "AWS secret access key for Amazon Bedrock embedding"
/ "AWS region for Amazon Bedrock embedding" so it correctly indicates the
credentials apply to all Bedrock embedding models.

