Using AI-driven analytics to predict and classify outcomes
Predictive analytics is one of the highest-leverage capabilities available to modern data teams. It enables organisations to move beyond observing what has happened (descriptive analytics) to anticipating what will happen and acting proactively.
From identifying which leads are most likely to convert, to forecasting revenue, to flagging customers likely to churn, predictive analytics allows organisations to unlock substantial strategic and operational value.
This guide explains how to use TrueState’s platform to implement predictive analytics successfully, whether you’re starting out or expanding an existing program.
Leveraging machine learning in a predictive capacity is a critical milestone in any organisation’s analytics maturity.
Predictive models enable teams to:
Prioritise sales and support efforts more effectively
Allocate resources based on anticipated outcomes
Automate decisions that once required manual review
Run “what if” simulations to support planning
For many teams, transitioning from dashboards to models marks a shift from reactive to proactive strategy.
This guide is designed to demystify that shift and show how AI-driven prediction can be deployed in your workflow using TrueState.
“Predictive analytics” is a broad term that encompasses many model types. At its core, prediction involves fitting a model to historical data in order to generalise to future, unseen cases.
There are two primary types of predictive modelling:
Classification: Predicts discrete categories (e.g., Will this customer churn? Yes/No)
Regression: Predicts continuous values (e.g., What is the expected spend next month?)
Additionally, there are three core data modalities commonly used in business settings:
Structured data: Tabular, row-based data such as CRM exports, usage logs, or transactional records
Text data: Natural language found in support tickets, product feedback, or customer notes
Image data: Often seen in domains like manufacturing or insurance, though less common in standard enterprise analytics
TrueState currently supports the following model types:
Tree-based models for structured classification: Effective for high-cardinality categorical data with nonlinear relationships
Tree-based models for structured regression: Useful for forecasting KPIs, revenue, or demand
Linear regression for structured regression: Simple, interpretable model ideal for risk-averse or regulated use-cases
Transformer models (RoBERTa) for text classification: Powerful model architecture well-suited to classifying sentiment, intent, or support topics in unstructured text
We’ve chosen these model types based on their robustness, interpretability, and practical impact in real-world business settings. Future platform updates will expand this model library.
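To illustrate why linear regression is described above as simple and interpretable, here is a minimal sketch of a one-feature least-squares fit in plain Python. It is not TrueState's implementation; the function name and the spend/revenue figures are hypothetical and chosen only so the fitted coefficients are easy to read.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: monthly marketing spend vs. revenue (both in thousands).
spend = [1.0, 2.0, 3.0, 4.0]
revenue = [12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_simple_linear_regression(spend, revenue)
predicted_revenue = intercept + slope * 5.0  # prediction for an unseen spend level
```

The two fitted numbers can be read directly: the slope is the expected change in the target per unit of the feature, and the intercept is the prediction when the feature is zero. That directness is what makes the model easy to explain in regulated settings.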
When evaluating classification models, it’s important to understand how well your model is performing beyond just accuracy.
Precision: Of the predictions your model made for a given class (e.g., “Churn = Yes”), how many were actually correct? High precision means few false positives. Use when the cost of false positives is high.
Recall: Of all the actual cases in a class, how many did the model correctly identify? High recall means few false negatives. Use when the cost of missing true cases is high.
F1-score: The harmonic mean of precision and recall. This balances the trade-off between them, giving a single measure that accounts for both false positives and false negatives. Use when you need a balanced performance measure and the class distribution is skewed.
For example, if you’re predicting customer churn and want to proactively retain users, recall might matter most—you want to identify as many at-risk customers as possible, even at the risk of a few false alarms.
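The three metrics above can be computed directly from paired actual and predicted labels. The following is a minimal sketch in plain Python, not the platform's evaluation code; the function name and the churn labels are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive="yes"):
    """Compute precision, recall and F1 for one class from paired labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical churn labels: 4 real churners among 10 customers.
actual    = ["yes", "yes", "yes", "yes", "no",  "no",  "no", "no", "no", "no"]
predicted = ["yes", "yes", "yes", "no",  "yes", "yes", "no", "no", "no", "no"]

metrics = classification_metrics(actual, predicted)
```

In this toy data the model finds 3 of the 4 churners (recall 0.75) while raising 2 false alarms among its 5 churn predictions (precision 0.6), which is the kind of trade-off a retention team might accept.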
Many real-world classification problems involve imbalanced classes—situations where one outcome is much more common than the other. For example:
Only 5% of users may churn
Only 2% of transactions may be fraudulent
Only 1 in 100 leads may convert
In such cases, models trained without class balancing may appear “accurate” simply by predicting the majority class every time. For example, a model that always predicts “No Churn” on a dataset with 5% churn is still 95% accurate while identifying no churners at all.
To address this, TrueState provides automatic class balancing techniques, including:
Resampling: Oversampling the minority class or undersampling the majority class during training to balance exposure.
Weighting: Assigning higher penalties to incorrect predictions on minority class examples, encouraging the model to take those cases seriously.
These adjustments are applied automatically in classification workflows where imbalance is detected, but users can override the behaviour or inspect the class distribution manually.
If you’re seeing high accuracy but low precision or recall for the minority class, check for imbalance and consider enabling manual weighting or resampling.
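To make the two balancing ideas concrete, here is a minimal sketch in plain Python. It is an illustration only and does not describe how TrueState implements balancing internally: the weighting uses the common "balanced" heuristic (total samples divided by the number of classes times the class count), the resampling is simple random oversampling, and all names and data are hypothetical.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, lab in zip(rows, labels) if lab == cls]
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Hypothetical data: 5 churners out of 100 customers.
labels = ["churn"] * 5 + ["no_churn"] * 95
rows = [{"customer_id": i} for i in range(100)]

# A model that always predicts the majority class looks 95% accurate.
majority = Counter(labels).most_common(1)[0][0]
baseline_accuracy = sum(lab == majority for lab in labels) / len(labels)

weights = balanced_class_weights(labels)
balanced_rows, balanced_labels = oversample_minority(rows, labels)
```

With this heuristic the minority class receives a weight of 10 against roughly 0.53 for the majority class. If you resample manually, apply it to the training split only, so that evaluation still reflects the real class distribution.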
Data leakage is a common mistake that can invalidate a predictive model’s performance. It occurs when your model is trained on information that wouldn’t be known at prediction time.
Examples of leakage include:
Using post-event data (e.g., renewal status) as a feature
Including time-based fields improperly (e.g., using future dates as inputs)
Accidentally leaking target variables through proxy features
To avoid this:
Carefully audit your feature set before training
Use timestamp-aware splitting for training vs. testing data
Simulate the real-time environment where predictions will be made
TrueState includes built-in checks and data lineage tracking to help you avoid leakage during training.
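As a rough illustration of the manual safeguards listed above, the sketch below splits records by a cutoff date and drops a post-event field before training. It is plain Python under stated assumptions, not the platform's built-in checks; the field names (`event_date`, `renewal_status`) and the records are hypothetical.

```python
from datetime import date


def time_based_split(records, cutoff, time_key="event_date"):
    """Train on records strictly before the cutoff; test on the cutoff onward."""
    train = [r for r in records if r[time_key] < cutoff]
    test = [r for r in records if r[time_key] >= cutoff]
    return train, test


def drop_leaky_features(record, leaky_fields):
    """Remove fields that would not be known at prediction time."""
    return {k: v for k, v in record.items() if k not in leaky_fields}


# Hypothetical records; "renewal_status" is only known after the outcome.
records = [
    {"event_date": date(2024, 1, 15), "logins": 12, "renewal_status": "renewed", "churned": 0},
    {"event_date": date(2024, 2, 20), "logins": 3, "renewal_status": "lapsed", "churned": 1},
    {"event_date": date(2024, 3, 5), "logins": 8, "renewal_status": "renewed", "churned": 0},
    {"event_date": date(2024, 4, 10), "logins": 1, "renewal_status": "lapsed", "churned": 1},
]

train, test = time_based_split(records, cutoff=date(2024, 3, 1))
train = [drop_leaky_features(r, {"renewal_status"}) for r in train]
test = [drop_leaky_features(r, {"renewal_status"}) for r in test]
```

Splitting by time rather than at random means the model is always evaluated on records later than anything it trained on, which mirrors how predictions are made in production.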
Here are common use-case patterns well-suited to the models currently available in the platform:
Churn Prediction (Classification): Target customers likely to cancel or disengage. Use structured usage and support data.
Lead Scoring (Classification): Predict which leads are likely to convert based on past CRM activity and engagement.
Revenue Forecasting (Regression): Estimate future sales or subscriptions based on historical performance and seasonality.
Support Ticket Classification (Text Classification): Automatically categorise inbound tickets to prioritise routing and triage.
Product Usage Intent (Text Classification): Classify customer-submitted feature requests or feedback by intent and priority.
Each pattern can be configured and deployed with minimal setup using pre-built templates in TrueState. For advanced needs, you can bring your own feature engineering via the platform’s transformation layer.