跳到主要内容

InstructorEmbeddingFunction

InstructorEmbeddingFunction is a class in pymilvus that handles encoding text into embeddings using the Instructor embedding model to support embedding retrieval in Milvus.

pymilvus.model.dense.InstructorEmbeddingFunction

Constructor

Constructs a MistralAIEmbeddingFunction for common use cases.

InstructorEmbeddingFunction(
model_name: str = "hkunlp/instructor-xl",
batch_size: int = 32,
query_instruction: str = "Represent the question for retrieval:",
doc_instruction: str = "Represent the document for retrieval:",
device: str = "cpu",
normalize_embeddings: bool = True,
**kwargs
)

PARAMETERS:

  • model_name (string)

    The name of the Mistral AI embedding model to use for encoding. The value defaults to hkunlp/instructor-xl. For more information, refer to Model List.

  • batch_size (int)

    The batch size used for the computation. It determines the number of sentences processed together in each batch.

  • query_instruction (string)

    Task-specific instruction that guides the model on how to generate an embedding for a query or question.

  • doc_instruction (string)

    Task-specific instruction that guides the model to generate an embedding for a document.

  • device (string)

    Specifies the torch.device to use for the computation. If not specified, the function uses the default device.

  • normalize_embeddings (bool)

    If set to True, the returned vectors will have a length of 1, indicating that they are normalized. In this case, similarity search would use the faster dot-product (util.dot_score), instead of cosine similarity.

  • kwargs

    Allows additional keyword arguments to be passed to the model initialization. For more information, refer to instructor-embedding.

Examples

from pymilvus.model.dense import InstructorEmbeddingFunction

ef = InstructorEmbeddingFunction(
model_name="hkunlp/instructor-xl", # Defaults to `hkunlp/instructor-xl`
query_instruction="Represent the question for retrieval:",
doc_instruction="Represent the document for retrieval:"
)