CohereEmbeddingFunction

CohereEmbeddingFunction is a pymilvus class that encodes text into embeddings with Cohere embedding models, for use in embedding retrieval with Milvus.

pymilvus.model.dense.CohereEmbeddingFunction

Constructor

Constructs a CohereEmbeddingFunction for common use cases.

CohereEmbeddingFunction(
    model_name: str = "embed-english-light-v3.0",
    api_key: Optional[str] = None,
    input_type: str = "search_document",
    embedding_types: Optional[List[str]] = None,
    truncate: Optional[str] = None,
    **kwargs
)

PARAMETERS:

  • model_name (string)

    The name of the Cohere embedding model to use for encoding. You can specify any of the available Cohere embedding model names, for example, embed-english-v3.0, embed-multilingual-v3.0, etc. If you leave this parameter unspecified, embed-english-light-v3.0 will be used. For a list of available models, refer to Embed.

  • api_key (string)

    The API key for accessing the Cohere API.

  • input_type (string)

    The type of input passed to the model. Required for embedding models v3 and higher.

    • "search_document": Used for embeddings stored in a vector database for search use-cases.

    • "search_query": Used for embeddings of search queries run against a vector DB to find relevant documents.

    • "classification": Used for embeddings passed through a text classifier.

    • "clustering": Used for embeddings run through a clustering algorithm.

  • embedding_types (List[str])

    The type of embeddings to return. Optional; the default is None, which returns the Embed Floats response type. Currently, only a single value can be specified for this parameter. Possible values:

    • "float": Returns the default float embeddings. Valid for all models.

    • "binary": Returns signed binary embeddings. Valid only for v3 models.

    • "ubinary": Returns unsigned binary embeddings. Valid only for v3 models.

  • truncate (string)

    One of "NONE", "START", or "END". Specifies how the API handles inputs longer than the maximum token length.

    Passing START discards the start of the input; passing END discards the end of the input. In both cases, input is discarded until the remaining input is exactly the maximum input token length for the model.

    If NONE is selected, an error is returned when the input exceeds the maximum input token length.

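    Default: END. The constructor default is None, in which case no value is set explicitly and the API default of END applies.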

  • kwargs

    Allows additional keyword arguments to be passed to the model initialization. For more information, refer to Embed.
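
The truncate options described above can be pictured with a purely local sketch. It works on a plain list of tokens and does not call the Cohere API; simulate_truncate and max_tokens are hypothetical names used only for illustration, and real tokenization is done by the service.

```python
from typing import List


def simulate_truncate(tokens: List[str], max_tokens: int, truncate: str = "END") -> List[str]:
    """Local illustration of the NONE/START/END rules; not the Cohere API itself."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if len(tokens) <= max_tokens:
        return list(tokens)  # short enough: nothing is discarded
    if truncate == "START":
        return tokens[-max_tokens:]  # discard the start, keep the tail
    if truncate == "END":
        return tokens[:max_tokens]  # discard the end, keep the head
    if truncate == "NONE":
        raise ValueError("input exceeds the maximum token length")
    raise ValueError(f"unknown truncate value: {truncate!r}")


tokens = ["a", "b", "c", "d", "e"]
print(simulate_truncate(tokens, 3, "START"))  # ['c', 'd', 'e']
print(simulate_truncate(tokens, 3, "END"))    # ['a', 'b', 'c']
```

With "NONE", the same over-long input raises an error instead of being shortened, which mirrors the error the API is described as returning.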

Examples

from pymilvus.model.dense import CohereEmbeddingFunction

cohere_ef = CohereEmbeddingFunction(
    model_name="embed-english-light-v3.0",
    api_key="YOUR_COHERE_API_KEY",
    input_type="search_document",
    embedding_types=["float"]
)
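
The values listed under PARAMETERS can also be checked locally before the constructor is called, so that a typo surfaces before any request is made. The following is a minimal sketch under that assumption: cohere_ef_kwargs and make_cohere_ef are hypothetical helpers, not part of pymilvus, and the allowed values are simply those documented above.

```python
from typing import Any, Dict, List, Optional

# Allowed values copied from the parameter descriptions above.
VALID_INPUT_TYPES = {"search_document", "search_query", "classification", "clustering"}
VALID_EMBEDDING_TYPES = {"float", "binary", "ubinary"}


def cohere_ef_kwargs(
    input_type: str = "search_document",
    embedding_types: Optional[List[str]] = None,
    model_name: str = "embed-english-light-v3.0",
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments and return constructor kwargs."""
    if input_type not in VALID_INPUT_TYPES:
        raise ValueError(f"unsupported input_type: {input_type!r}")
    if embedding_types is not None:
        if len(embedding_types) != 1:
            raise ValueError("embedding_types accepts exactly one value")
        if embedding_types[0] not in VALID_EMBEDDING_TYPES:
            raise ValueError(f"unsupported embedding type: {embedding_types[0]!r}")
    return {
        "model_name": model_name,
        "input_type": input_type,
        "embedding_types": embedding_types,
    }


def make_cohere_ef(api_key: str, **options: Any):
    """Hypothetical wrapper: validate first, then build the embedding function."""
    kwargs = cohere_ef_kwargs(**options)
    # Imported lazily so the validation helper works without pymilvus installed.
    from pymilvus.model.dense import CohereEmbeddingFunction

    return CohereEmbeddingFunction(api_key=api_key, **kwargs)
```

A call such as make_cohere_ef("YOUR_COHERE_API_KEY", input_type="search_query") would then fail early on an unknown input_type rather than at request time.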