Text Generation

Text generation models produce synthetic text from a prompt.

Use a text generation model

from truestate import models

# get the model
model = models.TextGenerator(name="Llama3.2-1b")

# generate text
text = model.generate(prompt="What is the meaning of life?")

Apply a text generation model to new data in batch

When applying a text generation model to new data, use the inference method to generate text for each row of a dataset. The prompt parameter takes a prompt template with placeholders for the input data; each placeholder is enclosed in curly braces {} and must match a column name in the dataset.

For example, if you want your prompt to contain the text column from the dataset, you can include {text} as a placeholder in the prompt.

from truestate import models, datasets

# get the model
model = models.get(name="Llama3.2-1b")

# get the dataset
dataset = datasets.get(name="my-dataset")

# apply the model to the dataset
predictions = model.inference(
    dataset=dataset,
    prompt="Summarise this long-form text: {text}",
)
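The placeholder syntax mirrors Python's built-in str.format. As an illustration only (this is an analogy for the substitution described above, not the truestate implementation), here is how one row's columns might fill the template:

```python
# Illustrative sketch: curly-brace placeholders resolving against
# dataset columns via Python's str.format. Not truestate's code.

template = "Summarise this long-form text: {text}"

# One row of a hypothetical dataset with a "text" column
row = {"text": "A very long article about renewable energy..."}

prompt = template.format(**row)
print(prompt)
# Summarise this long-form text: A very long article about renewable energy...
```

Any column in the dataset can be referenced this way, and a template may combine several placeholders in one prompt.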

Fine-tune a text generation model

Text generation models can be fine-tuned on a custom dataset with QLoRA via the train method, which requires a dataset and a target column.

Currently, the following models can be fine-tuned:

  • Llama3.2-1b
  • Llama3.2-3b
  • Phi3-mini-4k-instruct
  • Phi3-mini-128k-instruct

from truestate import datasets, models

# get the dataset
dataset = datasets.get(name="my-dataset")

# get the model
model = models.TextGenerator(name="Llama3.2-1b")

# fine-tune the model
fine_tuned_model = model.train(
    dataset=dataset,
    target_column="text",
)
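Because fine-tuning is limited to the models listed above, it can help to validate the model name before calling train. A minimal sketch, where the FINE_TUNABLE set is transcribed from this page rather than taken from an SDK constant:

```python
# Names transcribed from the list above; not an official SDK constant.
FINE_TUNABLE = {
    "Llama3.2-1b",
    "Llama3.2-3b",
    "Phi3-mini-4k-instruct",
    "Phi3-mini-128k-instruct",
}

def check_fine_tunable(name: str) -> None:
    """Raise early if the named model does not support fine-tuning."""
    if name not in FINE_TUNABLE:
        raise ValueError(
            f"{name!r} cannot be fine-tuned; "
            f"choose one of {sorted(FINE_TUNABLE)}"
        )

check_fine_tunable("Llama3.2-1b")  # passes silently for a supported model
```

Failing fast like this surfaces an unsupported model name before any training resources are provisioned.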