Integrating data
Link mission-critical data into your AI-driven analytics engine
Integrating external data is the first step in powering AI-driven analytics. TrueState makes it simple to connect to key storage systems, databases, and business applications, or to upload structured datasets, bringing data directly into a pipeline.
Once connected, your data becomes accessible across the platform—enabling dashboards, transformations, AI agents, and enriched analytics workflows.
This guide explains how to bring data into the platform, what sources are supported, how to manage credentials securely, and how to upload datasets manually when needed.
Ways to bring in data
Data can be integrated into TrueState using two primary methods:
- Integration nodes – connect directly to external systems such as databases, cloud storage, or applications
- CSV uploads – import structured flat files manually via the platform
All data sources feed into the Pipeline canvas, where you can combine, transform, and enrich them using AI-driven tools.
You can mix both methods in the same pipeline for fast, multi-source integration.
Uploading CSV files
CSV files are a simple and effective way to get started with structured data. They’re especially useful for working with exports from other systems or manual data inputs.
Upload guidelines:
- Files must be formatted as clean tables
- The first row must be a header (column names)
- No metadata, empty rows, or extra formatting above the header
- Column types should be consistent across rows
- Remove Excel-specific formatting, summary rows, or totals
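If you want to sanity-check a file before uploading, a short script can catch most of these issues. Below is a minimal sketch using pandas (not part of TrueState); the file name is a placeholder.

```python
import csv
import pandas as pd

path = "export.csv"  # placeholder file name

# Read the raw header row so duplicate names are not auto-renamed by pandas.
with open(path, newline="") as f:
    header = next(csv.reader(f))

duplicates = sorted({name for name in header if header.count(name) > 1})
blanks = [name for name in header if not name.strip()]

df = pd.read_csv(path)

# Fully empty rows usually indicate leftover spreadsheet formatting.
empty_rows = int(df.isna().all(axis=1).sum())

# Columns read as "object" often contain mixed types (e.g., numbers and text).
mixed_type_candidates = [c for c in df.columns if df[c].dtype == "object"]

print("Duplicate headers:", duplicates)
print("Blank headers:", blanks)
print("Completely empty rows:", empty_rows)
print("Columns to review for mixed types:", mixed_type_candidates)
```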
Date orientation:
For time-based data, we recommend vertical orientation—where each row represents a different date. This ensures smooth parsing, summarisation, and time-series analysis.
Not recommended (horizontal orientation):
| Metric  | Jan 2023 | Feb 2023 | Mar 2023 |
|---------|----------|----------|----------|
| Revenue | 10000    | 11000    | 10500    |
Recommended (vertical orientation):
| Date       | Metric  | Value |
|------------|---------|-------|
| 2023-01-01 | Revenue | 10000 |
| 2023-02-01 | Revenue | 11000 |
| 2023-03-01 | Revenue | 10500 |
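If an export arrives in horizontal orientation, it can usually be reshaped before upload with a single unpivot step. The sketch below uses pandas to convert the wide example above into the recommended vertical layout; the output file name is a placeholder.

```python
import pandas as pd

# Wide table: one row per metric, one column per month (as in the example above).
wide = pd.DataFrame({
    "Metric": ["Revenue"],
    "Jan 2023": [10000],
    "Feb 2023": [11000],
    "Mar 2023": [10500],
})

# Unpivot the month columns into (Date, Value) pairs.
long = wide.melt(id_vars="Metric", var_name="Date", value_name="Value")

# Parse the month labels into real dates and match the recommended column order.
long["Date"] = pd.to_datetime(long["Date"], format="%b %Y")
long = long[["Date", "Metric", "Value"]].sort_values("Date")

long.to_csv("revenue_vertical.csv", index=False)
print(long)
```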
Use the Upload CSV option from the Datasets panel or add a CSV Upload node directly in the Pipeline canvas.
Very large files may take longer to process. Ensure all columns have unique and clearly labelled headers to avoid ingest errors.
Connecting to external systems
Use Integration nodes to bring in data from supported systems and cloud platforms. Each Integration node pulls data using credentials stored as secrets, which are securely managed through the platform.
Supported sources:
1. AWS S3
- Access: Access Key ID and Secret
- Format: CSV, JSON, Parquet
- Use: Load data lakes, logs, and cloud pipeline exports
2. Google Cloud Storage (GCS)
- Access: Service Account (JSON key)
- Format: CSV, JSON, Parquet
- Use: Bring in structured exports, training sets, and backup files
3. Azure Blob Storage
- Access: Storage Account Name and Key
- Format: CSV, JSON, Parquet
- Use: Import telemetry, reports, and cloud exports
4. Salesforce (SOQL)
- Access: OAuth via Connected App
- Format: Query result (SOQL)
- Use: Pull leads, opportunities, accounts, and custom object data
5. SharePoint (.pdf ingestion)
- Access: OAuth client credentials
- Format: PDF
- Use: Ingest policies, contracts, and documents for LLM processing
PDFs from SharePoint are processed using multimodal LLMs for advanced document analysis. This incurs higher per-document costs, so use it only where a cost of a few dollars per file is justified by the business impact.
6. SQL Server
- Access: Plain JSON secret with username and password
- Format: Table or SQL query result
- Use: Load internal ERP records, finance tables, or operational data
All credentials are securely encrypted and reusable across multiple pipelines.
Managing credentials (Secrets)
All external data sources require credentials stored securely as secrets. Secrets can be added in two places:
- From the Secrets section (via main navigation)
- Directly inside the Integration node configuration in the Pipeline canvas
Secrets are encrypted, versioned, and scoped to your organisation. Updating a secret does not break dependent pipelines.
Use consistent naming (e.g., gcs-prod, sql-finance, salesforce-sandbox) for clarity and reuse.
Credential setup by source
AWS S3
- Generate an Access Key ID and Secret Access Key via AWS IAM
- Go to Secrets → Create New → AWS S3
- Paste the credentials and test the connection
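Before saving the secret, you can optionally confirm the key pair works from your own environment. A minimal sketch using boto3 (this is not a TrueState API; the key values, region, and bucket name are placeholders):

```python
import boto3

# Placeholders: substitute your own key pair, region, and bucket.
s3 = boto3.client(
    "s3",
    aws_access_key_id="AKIA...",
    aws_secret_access_key="...",
    region_name="us-east-1",
)

# Listing a few objects is enough to prove the credentials and bucket access work.
response = s3.list_objects_v2(Bucket="my-data-bucket", MaxKeys=5)
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```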
Google Cloud Storage (GCS)
- Create a Service Account with access to your bucket
- Download the JSON key
- Go to Secrets → Create New → GCS
- Upload the key file and verify access
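As with S3, you can verify the key file outside the platform first. A minimal sketch using the google-cloud-storage client library; the key path and bucket name are placeholders:

```python
from google.cloud import storage

# Placeholders: path to the downloaded JSON key and your bucket name.
client = storage.Client.from_service_account_json("service-account-key.json")

# Listing a few blobs confirms the service account can read the bucket.
for blob in client.list_blobs("my-data-bucket", max_results=5):
    print(blob.name, blob.size)
```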
Azure Blob Storage
- In Azure Portal, retrieve your Storage Account Name and Access Key
- Go to Secrets → Create New → Azure Blob
- Paste in the credentials and test
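To check the account name and key before saving them, a quick test with the azure-storage-blob library works; the account, key, and container names below are placeholders:

```python
from itertools import islice
from azure.storage.blob import BlobServiceClient

account_name = "mystorageaccount"  # placeholder
account_key = "..."                # placeholder

service = BlobServiceClient(
    account_url=f"https://{account_name}.blob.core.windows.net",
    credential=account_key,
)

# Listing a handful of blobs confirms the account name and key are valid.
container = service.get_container_client("exports")  # placeholder container
for blob in islice(container.list_blobs(), 5):
    print(blob.name, blob.size)
```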
Salesforce
- Create a Connected App with API access
- Obtain the Consumer Key and Secret
- Go to Secrets → Create New → Salesforce or use the Integration node to add it inline
- Paste the credentials and complete OAuth authorisation
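As a rough connectivity check outside the platform, you can exchange the Connected App credentials for a token and run a small SOQL query against the standard Salesforce REST endpoints. The sketch below assumes the Connected App has the OAuth client credentials flow enabled; the domain, API version, and query are placeholders.

```python
import requests

# Placeholders: your My Domain URL and the Connected App's Consumer Key/Secret.
domain = "https://yourcompany.my.salesforce.com"
token_resp = requests.post(
    f"{domain}/services/oauth2/token",
    data={
        "grant_type": "client_credentials",
        "client_id": "CONSUMER_KEY",
        "client_secret": "CONSUMER_SECRET",
    },
)
token_resp.raise_for_status()
access_token = token_resp.json()["access_token"]

# A small SOQL query confirms API access works end to end.
query = "SELECT Id, Name, StageName FROM Opportunity LIMIT 5"
result = requests.get(
    f"{domain}/services/data/v59.0/query",
    headers={"Authorization": f"Bearer {access_token}"},
    params={"q": query},
)
result.raise_for_status()
for record in result.json()["records"]:
    print(record["Id"], record["Name"], record["StageName"])
```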
SharePoint
- Register an app in Azure Active Directory
- Retrieve the Client ID, Tenant ID, and Client Secret
- Go to Secrets → Create New → SharePoint or add inline in a node
- Authenticate using the credentials provided
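To confirm the app registration before saving the secret, you can request a token with the client credentials and list a document library via Microsoft Graph. This is a sketch only: it assumes the app has been granted Graph application permissions for SharePoint, and the site ID is a placeholder.

```python
import msal
import requests

# Placeholders: values from the Azure AD app registration.
tenant_id = "TENANT_ID"
client_id = "CLIENT_ID"
client_secret = "CLIENT_SECRET"

app = msal.ConfidentialClientApplication(
    client_id,
    authority=f"https://login.microsoftonline.com/{tenant_id}",
    client_credential=client_secret,
)
token = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
if "access_token" not in token:
    raise SystemExit(token.get("error_description", "Token request failed"))

# Listing the root of a site's default document library confirms access.
site_id = "SITE_ID"  # placeholder; resolve via the Graph /sites endpoint
resp = requests.get(
    f"https://graph.microsoft.com/v1.0/sites/{site_id}/drive/root/children",
    headers={"Authorization": f"Bearer {token['access_token']}"},
)
resp.raise_for_status()
for item in resp.json()["value"]:
    print(item["name"])
```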
SQL Server
- Gather:
  - Host (e.g., sql.company.com:1433)
  - Database name
  - Username and password
- Create a plain JSON secret:
  - Go to Secrets → Create New → Plain-JSON or use the Integration node directly
- Enter the host, port, and database in the Integration node; attach the secret to authenticate
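The sketch below shows an assumed shape for the plain JSON secret (the exact key names TrueState expects may differ) and a quick connectivity check with pyodbc; the host, database, and query are placeholders.

```python
import json
import pyodbc

# Assumed shape of the plain JSON secret (key names may differ in your workspace).
secret = {"username": "svc_truestate", "password": "********"}
print(json.dumps(secret, indent=2))

# Placeholder connection details; Encrypt/TrustServerCertificate depend on your server setup.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sql.company.com,1433;"
    "DATABASE=finance;"
    f"UID={secret['username']};PWD={secret['password']};"
    "Encrypt=yes;TrustServerCertificate=yes;"
)
cursor = conn.cursor()
cursor.execute("SELECT TOP 5 name FROM sys.tables")
for row in cursor.fetchall():
    print(row[0])
conn.close()
```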
Next steps
- Upload a clean CSV or configure an external connection
- Go to the Pipeline section
- Add an Integration node to your pipeline
- Clean your data with the help of the Pipeline agent (see our data cleaning guide for more information)
- Connect the output to dashboards, models, or agents