Skip to content

[BUG](embedding-functions): compare values in validate_config_update#7115

Open
adityasingh2400 wants to merge 1 commit into
chroma-core:mainfrom
adityasingh2400:fix-validate-config-update-same-model-name
Open

[BUG](embedding-functions): compare values in validate_config_update#7115
adityasingh2400 wants to merge 1 commit into
chroma-core:mainfrom
adityasingh2400:fix-validate-config-update-same-model-name

Conversation

@adityasingh2400
Copy link
Copy Markdown

Several builtin embedding functions reject any config update whose new config contains a key such as model_name, even when the value matches the existing one. The check inside validate_config_update is if "model_name" in new_config: and overwrite_embedding_function passes update_embedding_function.get_config() as new_config, which always includes model_name. The result is that you cannot change any other persisted field, e.g. api_base, dimensions, organization_id, project_id, without also changing the model name, and the resulting error message claims the model name changed even when it did not.

This PR changes the check to compare values against old_config, matching the existing pattern in HuggingFaceEmbeddingServer and ChromaCloudSpladeEmbeddingFunction. The same shape is applied to every immutability guard in amazon_bedrock, cloudflare_workers_ai, chroma_cloud_qwen, cohere, google (Generative AI, Genai, Palm, Vertex), huggingface, jina, mistral, morph, nomic, ollama, open_clip, openai, perplexity, together_ai, and voyageai. Same-name updates now succeed, and changed values still raise the original error.

Adds parametrized regression tests in chromadb/test/ef/test_ef.py that construct each affected EF with a dummy API key, then assert that validate_config_update(cfg, cfg) does not raise and that swapping in a different model_name (or model) still raises. EFs whose third-party SDK is not installed locally are skipped rather than failing.

Several builtin EFs rejected any config update whose new config contained
a key like model_name, even when the value matched the existing one.
That made overwrite_embedding_function() impossible to use to change
other fields like api_base or dimensions, since get_config() always
includes model_name in the new config it produces.

The check now mirrors the existing pattern in HuggingFaceEmbeddingServer
and ChromaCloudSpladeEmbeddingFunction: only raise when the new value
differs from the old. Affects amazon_bedrock, cloudflare_workers_ai,
chroma_cloud_qwen, cohere, google, huggingface, jina, mistral, morph,
nomic, ollama, open_clip, openai, perplexity, together_ai, voyageai.

Adds parametrized regression tests covering same-value updates (must
not raise) and changed-value updates (must still raise).
@github-actions
Copy link
Copy Markdown

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of a unexpectedly high quality (Readability, Modularity, Intuitiveness)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant