Text Embedding

Text embedding models take a text input and return a vector representation of the text. These models are useful for a variety of tasks, including text classification, clustering, and similarity search.
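
To make the idea of similarity search over vectors concrete, here is a minimal sketch in plain Python. It does not call the TrueState platform or any embedding model: the three-dimensional vectors, the document names, and the helper functions `cosine_similarity` and `rank` are all invented for illustration, and real embeddings have many more dimensions. The sketch assumes cosine similarity as the comparison measure, which is a common choice for text embeddings but is not confirmed here as the platform's own method.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return document names ordered from most to least similar to the query."""
    return sorted(
        docs,
        key=lambda name: cosine_similarity(query, docs[name]),
        reverse=True,
    )


# Toy vectors standing in for real embeddings (values are made up).
documents = {
    "red running shoes": [0.9, 0.1, 0.0],
    "blue trail sneakers": [0.6, 0.6, 0.1],
    "stainless steel kettle": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a query such as "trainers for jogging".
query_vector = [0.85, 0.2, 0.05]

ranked = rank(query_vector, documents)
print(ranked)
```

With these toy values, the two footwear entries rank above the kettle because their vectors point in nearly the same direction as the query vector.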

Currently the TrueState platform supports the following text embedding models:

  • BAAI/bge-base-en-v1.5: A lightweight yet performant text embedding model.

The following code snippet shows how to generate embeddings for a dataset to prepare it for natural language search.

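from truestate import Workflow
from truestate.jobs import ApplyEmbedding

# Job that applies the embedding model to the "description" column
# of the input dataset and writes the result to the output dataset.
make_data_searchable = ApplyEmbedding(
    name="my_embedding_job",
    description="Apply embedding model to input dataset",
    input_dataset="input_dataset",
    output_dataset="output_dataset",
    inference_column_name="description",
)

# Workflow that groups the job(s) to run.
my_workflow = Workflow(
    name="example_workflow",
    description="Product recommendation workflow",
    jobs=[
        make_data_searchable,
    ],
)

# Sync the workflow definition with the TrueState platform.
my_workflow.sync()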