Predictive analytics

Predictive analytics is one of the highest-leverage capabilities available to modern data teams. It enables organisations to move beyond observing what has happened (descriptive analytics) to anticipating what will happen, and to act on it proactively. Typical applications include identifying which leads are most likely to convert, forecasting revenue, and flagging customers who are likely to churn. This guide explains how to use TrueState’s platform to implement predictive analytics, whether you’re starting out or expanding an existing program.

Why predictive analytics matters

Leveraging machine learning in a predictive capacity is a critical milestone in any organisation’s analytics maturity. Predictive models enable teams to:

Prioritise sales and support efforts more effectively
Allocate resources based on anticipated outcomes
Automate decisions that once required manual review
Run “what if” simulations to support planning

For many teams, transitioning from dashboards to models marks a shift from reactive to proactive strategy. This guide is designed to demystify that shift and show how AI-driven prediction can be deployed in your workflow using TrueState.

Key terms

Before diving into models, it’s important to clarify the foundational concepts:

Label (or Target): The outcome you are trying to predict (e.g., “Churned = Yes”)
Features: The input variables used by the model to make predictions
Training Data: A historical dataset where both features and labels are known
Inference: The act of applying a trained model to new, unseen data
Data Leakage: When the model is trained on information that would not be available at prediction time, leading to overly optimistic performance

These terms appear frequently throughout the modelling process in the TrueState platform.
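
To make these terms concrete, the following is a minimal Python sketch, not TrueState code. The dataset and the helper names `fit_line` and `predict` are hypothetical and chosen only for illustration; the model is a one-feature least-squares line, which is far simpler than anything you would use in practice.

```python
# Hypothetical training data: each pair is (feature, label).
# Feature: months as a customer. Label: monthly spend.
training_data = [(1, 3.0), (2, 5.0), (3, 7.0), (4, 9.0)]


def fit_line(pairs):
    """Fit label = slope * feature + intercept by ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, feature):
    """Apply a fitted (slope, intercept) model to one feature value."""
    slope, intercept = model
    return slope * feature + intercept


# Training: learn from rows where both feature and label are known.
model = fit_line(training_data)

# Inference: apply the trained model to a new row whose label is unknown.
new_customer_months = 5
predicted_spend = predict(model, new_customer_months)
print(predicted_spend)
```

Note that the feature used here (months as a customer) is known at prediction time. A feature derived from the label itself, such as next month's invoice total, would be an instance of data leakage.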

Underlying models

“Predictive analytics” is a broad term that encompasses many model types. At its core, prediction involves fitting a model to historical data in order to generalise to future, unseen cases. There are two primary types of predictive modelling:

Classification: Predicts discrete categories (e.g., Will this customer churn? Yes/No)
Regression: Predicts continuous values (e.g., What is the expected spend next month?)

Additionally, there are three core data modalities commonly used in business settings:

Structured data: Tabular, row-based data such as CRM exports, usage logs, or transactional records
Text data: Natural language found in support tickets, product feedback, or customer notes
Image data: Often seen in domains like manufacturing or insurance, though less common in standard enterprise analytics

TrueState currently supports the following model types:

Tree-based models for structured classification: Effective for high-cardinality categorical data with nonlinear relationships
Tree-based models for structured regression: Useful for forecasting KPIs, revenue, or demand
Linear regression for structured regression: Simple, interpretable model ideal for risk-averse or regulated use-cases
Transformer models (RoBERTa) for text classification: Powerful model architecture well-suited to classifying sentiment, intent, or support topics in unstructured text

We’ve chosen these model types based on their robustness, interpretability, and practical impact in real-world business settings. Future platform updates will expand this model library.

Training a model in TrueState

TrueState simplifies the process of training predictive models into a guided workflow. Here’s how it typically works:

Define your objective: What are you trying to predict? Clearly define the label and ensure it’s available in your data.
Select a dataset: Choose a historical dataset with representative examples of the outcome.
Configure features: The platform will suggest candidate features, but you can include/exclude based on your domain knowledge.
Choose model type: Select from available models based on task type and data format.
Validate and score: Evaluate model performance using relevant metrics (explained below).
Run predictions: Once validated, apply the model to live or batch data for inference.

All of this is done within a visual, explainable interface—so you can remain confident about what the model is doing and why.

Understanding precision, recall and F1-score

When evaluating classification models, it’s important to understand how well your model is performing beyond just accuracy.

Precision: Of the predictions your model made for a given class (e.g., “Churn = Yes”), how many were actually correct? High precision means few false positives.
Use when the cost of false positives is high.
Recall: Of all the actual cases in a class, how many did the model correctly identify? High recall means few false negatives.
Use when the cost of missing true cases is high.
F1-score: The harmonic mean of precision and recall. This balances the trade-off between them, giving a single measure that accounts for both false positives and false negatives.
Use when you need a single balanced measure, particularly when the class distribution is skewed.

For example, if you’re predicting customer churn and want to proactively retain users, recall might matter most—you want to identify as many at-risk customers as possible, even at the risk of a few false alarms.
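
The three metrics above can be computed directly from counts of true positives, false positives, and false negatives. The sketch below is plain Python for illustration only; the function name `precision_recall_f1` and the sample labels are hypothetical and do not reflect how TrueState computes or reports these values.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical churn labels: 1 = churned, 0 = retained.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this sample the model flags five customers, of whom three actually churned (precision 0.60), and it finds three of the four real churners (recall 0.75).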

Class balancing

Many real-world classification problems involve imbalanced classes—situations where one outcome is much more common than the other. For example:

Only 5% of users may churn
Only 2% of transactions may be fraudulent
Only 1 in 100 leads may convert

In such cases, models trained without class balancing may appear “accurate” simply by predicting the majority class every time—e.g., always predicting “No Churn” and still being 95% accurate. To address this, TrueState includes automatic class balancing techniques, including:

Resampling: Oversampling the minority class or undersampling the majority class during training to balance exposure.
Weighting: Assigning higher penalties to incorrect predictions on minority class examples, encouraging the model to take those cases seriously.

These adjustments are applied automatically in classification workflows where imbalance is detected, but users can override the behaviour or inspect the class distribution manually.

If you’re seeing high accuracy but low precision or recall for the minority class, check for imbalance and consider enabling manual weighting or resampling.
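
The sketch below illustrates both points: why accuracy is misleading on imbalanced data, and what class weights can look like. It is plain Python for illustration; the helper names are hypothetical, and the inverse-frequency weighting formula is one common heuristic, assumed here rather than taken from TrueState's implementation.

```python
from collections import Counter


def majority_baseline(labels):
    """Predict the most common class for every example."""
    majority, _ = Counter(labels).most_common(1)[0]
    return [majority] * len(labels)


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def recall(y_true, y_pred, positive):
    actual = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(1 for p in actual if p == positive) / len(actual) if actual else 0.0


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


# Hypothetical dataset: 5% of users churn (1), 95% do not (0).
labels = [1] * 5 + [0] * 95

baseline = majority_baseline(labels)
baseline_accuracy = accuracy(labels, baseline)       # high, but uninformative
baseline_recall = recall(labels, baseline, positive=1)  # no churners found
weights = balanced_class_weights(labels)

print(baseline_accuracy, baseline_recall, weights)
```

With these weights, a mistake on a churner counts roughly nineteen times as much as a mistake on a non-churner, which offsets the 95-to-5 imbalance.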

Avoiding data leakage

Data leakage is a common mistake that can invalidate a predictive model’s performance. It occurs when your model is trained on information that wouldn’t be known at prediction time. Examples of leakage include:

Using post-event data (e.g., renewal status) as a feature
Including time-based fields improperly (e.g., using future dates as inputs)
Accidentally leaking target variables through proxy features

To avoid this:

Carefully audit your feature set before training
Use timestamp-aware splitting for training vs. testing data
Simulate the real-time environment where predictions will be made

TrueState includes built-in checks and data lineage tracking to help you avoid leakage during training.
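
As an illustration of timestamp-aware splitting, the sketch below trains only on records dated before a cutoff and tests on records from the cutoff onward, so the model is never evaluated on data older than what it was trained on. This is plain Python under stated assumptions: the function `time_based_split`, the field name `snapshot_date`, and the sample rows are all hypothetical and are not part of the TrueState platform.

```python
from datetime import date


def time_based_split(rows, cutoff, time_key="snapshot_date"):
    """Split rows so that training data strictly precedes test data in time."""
    train = [r for r in rows if r[time_key] < cutoff]
    test = [r for r in rows if r[time_key] >= cutoff]
    return train, test


# Hypothetical customer snapshots, deliberately out of chronological order.
rows = [
    {"customer_id": 3, "snapshot_date": date(2024, 3, 5), "churned": 0},
    {"customer_id": 1, "snapshot_date": date(2024, 1, 15), "churned": 0},
    {"customer_id": 5, "snapshot_date": date(2024, 5, 2), "churned": 1},
    {"customer_id": 2, "snapshot_date": date(2024, 2, 10), "churned": 1},
    {"customer_id": 4, "snapshot_date": date(2024, 4, 20), "churned": 0},
]

train, test = time_based_split(rows, cutoff=date(2024, 4, 1))
print(len(train), len(test))
```

A random split would mix later records into the training set, which lets the model learn from the future and tends to overstate performance.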

Common predictive analytics patterns

Here are common use-case patterns well-suited to the models currently available in the platform:

Churn Prediction (Classification): Target customers likely to cancel or disengage. Use structured usage and support data.
Lead Scoring (Classification): Predict which leads are likely to convert based on past CRM activity and engagement.
Revenue Forecasting (Regression): Estimate future sales or subscriptions based on historical performance and seasonality.
Support Ticket Classification (Text Classification): Automatically categorise inbound tickets to prioritise routing and triage.
Product Usage Intent (Text Classification): Classify customer-submitted feature requests or feedback by intent and priority.

Each pattern can be configured and deployed with minimal setup using pre-built templates in TrueState. For advanced needs, you can bring your own feature engineering via the platform’s transformation layer.

Next steps

Visit the Capabilities page for a breakdown of supported prediction types
Try building your first model with the guided walkthrough
