Raw bulk import

Automate ingestion of your raw datasets into Trebellar

Trebellar can automatically pull raw data you already produce — occupancy feeds, badge events, work-order exports, utility usage, and anything else you want to put to work in the platform — and import it on a recurring schedule.

This page explains the two transport options (SFTP and object storage) and the two hosting options (your infrastructure or Trebellar-managed) that combine to form the four setup modes available from the admin panel.

All raw bulk import options are configured from the Trebellar admin panel at my.trebellar.app, under Organization → Settings → Data Management. For non-standard origins (e.g. FTPS, Azure Blob, vendor APIs without a generic S3 interface), reach out to your Customer Success contact and we’ll coordinate a bespoke connector.

At a glance

	Bring your own	Trebellar-managed
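S3 / GCS bucket	You own the bucket; you grant Trebellar an IAM principal scoped to the configured prefix (read, plus write and delete so processed files can be moved to archive/ or errors/)	Trebellar provisions the bucket and shares write-only credentials with you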
SFTP	You operate the SFTP server; we connect with a key or password you provide	Trebellar hosts the SFTP endpoint; you receive a username and authenticate with an SSH key (recommended) or a generated password
Encryption at rest	Governed by your cloud provider / server configuration	AES-256 server-side, managed by Trebellar
Encryption in transit	TLS (S3/GCS) or SSH (SFTP) — enforced	TLS (S3/GCS) or SSH (SFTP) — enforced
Optional client-side encryption	Supported for SFTP via PGP (Trebellar-generated keypair by default, or your own keypair on request)	Supported for SFTP via PGP (Trebellar-generated keypair by default, or your own keypair on request)
Best for	Data residency or compliance regimes that require the data to stay in your tenancy	Teams that want the lowest-lift setup

How the pipeline works

Regardless of transport, the flow is the same:

1. You (or your upstream systems) drop one or more files into the configured location.
2. Trebellar polls the location on a schedule.
3. New files are fetched, validated, and enqueued for ingestion. Already-processed files are skipped based on a content hash, so re-uploads are idempotent.
4. Each file is transformed according to the dataset mapping you configured in the admin panel, then loaded into Trebellar.
5. The original file is moved to an archive/ prefix (or an archive/ directory on SFTP) with a timestamp suffix. If validation fails, it is moved to errors/ instead, alongside a sibling .error.json explaining what went wrong.
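The skip-and-archive behaviour described above can be pictured with a small sketch. It is illustrative only, not Trebellar's implementation: the hash algorithm (SHA-256), the timestamp format, the in-memory `seen` set, and the function names are all assumptions made for the example.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import PurePosixPath


def content_hash(data: bytes) -> str:
    """Hex digest of the file contents (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def archive_key(inbox_key: str, when: datetime) -> str:
    """Build an archive/ name with a timestamp suffix (hypothetical format)."""
    path = PurePosixPath(inbox_key)
    stamp = when.strftime("%Y%m%dT%H%M%SZ")
    return f"archive/{path.stem}_{stamp}{path.suffix}"


def poll_once(files: dict[str, bytes], seen: set[str], now: datetime) -> dict[str, str]:
    """Return {inbox key: archive key} for unseen content and record its hash."""
    moved = {}
    for key, data in sorted(files.items()):
        digest = content_hash(data)
        if digest in seen:
            continue  # identical bytes were already ingested, so skip them
        seen.add(digest)
        moved[key] = archive_key(key, now)
    return moved


if __name__ == "__main__":
    seen: set[str] = set()
    now = datetime(2026, 5, 2, 12, 0, 0, tzinfo=timezone.utc)
    batch = {"inbox/employees_2026-05-02.csv": b"id,name\n1,Ada\n"}
    print(poll_once(batch, seen, now))  # first upload is processed
    print(poll_once(batch, seen, now))  # same bytes again: nothing to do
```

Because the decision is keyed on the bytes rather than the filename, uploading the same content under a different name is also treated as already processed in this sketch.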

File-naming conventions and dataset schemas are defined per-organization in the admin panel. The transport options below only control how Trebellar reaches the files — the shape of the files themselves is independent.

Option 1 — Object storage (S3 or GCS)

Use object storage when your upstream systems already export to a bucket, or when you want to land files over HTTPS using the AWS/GCS SDKs.

Option 1a — Bring your own bucket

Recommended when the source data must stay in your cloud tenancy for compliance or contractual reasons.

You provide:

Field	Description
Bucket URL	Fully qualified URI — e.g. `s3://acme-trebellar-inbox` or `gs://acme-trebellar-inbox`
Region	Required for S3
Prefix (optional)	Subfolder inside the bucket — Trebellar will only look under this prefix. Defaults to `inbox/`
Client ID	Access key ID (S3) / service account client identifier (GCS) that Trebellar will use to access the bucket
Client secret	The matching secret. Trebellar encrypts this at rest and never displays it again after saving

Steps:

1. Create (or reuse) a bucket in your cloud provider.
2. Provision an IAM principal for Trebellar with the minimum permissions:
   - S3: s3:GetObject, s3:ListBucket, s3:PutObject (used to move files into archive/ or errors/), and s3:DeleteObject on the configured prefix only.
   - GCS: storage.objects.get, storage.objects.list, storage.objects.create, storage.objects.delete on the configured prefix only.
3. Generate long-lived credentials for that principal.
4. In Admin panel → Organization → Settings → Data Management, choose S3 / GCS → Bring your own bucket and paste the values above.
5. Click Test connection. Trebellar attempts to connect to the bucket using the values you entered.
6. Pick a polling schedule and save.

Scope the IAM principal to the prefix you configured, not the whole bucket. Trebellar never needs access outside of inbox/, archive/, and errors/ under that prefix.
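As a starting point, the S3 permissions listed above could be expressed as a prefix-scoped IAM policy along the following lines. This is a sketch to adapt, not an official Trebellar policy: the function name is hypothetical, the bucket name is the example used in the table above, and you should review the result against your own security standards.

```python
import json


def prefix_scoped_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy document limited to one prefix of one bucket."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # ListBucket is a bucket-level action, so the prefix limit
                # is applied through the s3:prefix condition key.
                "Sid": "ListUnderPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Object-level actions are limited by the resource ARN.
                "Sid": "ReadWriteDeleteUnderPrefix",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(prefix_scoped_policy("acme-trebellar-inbox", "inbox/"), indent=2))
```

The printed JSON can be attached to the IAM principal you created for Trebellar. A GCS equivalent would use an IAM condition on the object name prefix instead.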

Locking the bucket to Trebellar’s IP ranges

If your security policy requires bucket access to be pinned to a known set of source IPs, Trebellar publishes a stable list of egress addresses used by the ingestion workers. You can attach these to your bucket policy (aws:SourceIp on S3 or a VPC Service Controls perimeter on GCS) so that reads, writes, and deletes are only accepted from Trebellar’s infrastructure.

To avoid drift between these docs and the live set, we don’t publish the ranges inline. To request the current list:

Contact your Customer Success representative, or email support@trebellar.com.
Specify the cloud provider (AWS or GCP) and the region your bucket lives in — we’ll send back the narrowest range that covers ingestion for that region.
Trebellar announces changes to the egress set at least 30 days in advance in the changelog. Subscribe to the changelog to be notified before you need to update your bucket policy.
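Once you have the ranges, an aws:SourceIp restriction on S3 might look like the sketch below. Every concrete value is a placeholder: 203.0.113.0/24 is a reserved documentation range, not a Trebellar address, and the principal ARN is a made-up example. The deny is scoped to the ingestion principal so that your own uploaders are not locked out.

```python
import json


def source_ip_deny_statement(bucket: str, principal_arn: str, cidrs: list[str]) -> dict:
    """Deny the ingestion principal any S3 action from outside the given CIDRs."""
    return {
        "Sid": "DenyIngestionPrincipalOutsideAllowedIps",
        "Effect": "Deny",
        "Principal": {"AWS": principal_arn},
        "Action": "s3:*",
        "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        # NotIpAddress matches when the caller's IP is in none of the ranges.
        "Condition": {"NotIpAddress": {"aws:SourceIp": cidrs}},
    }


if __name__ == "__main__":
    statement = source_ip_deny_statement(
        bucket="acme-trebellar-inbox",
        principal_arn="arn:aws:iam::123456789012:user/trebellar-ingest",  # placeholder
        cidrs=["203.0.113.0/24"],  # placeholder; use the list Trebellar sends you
    )
    print(json.dumps({"Version": "2012-10-17", "Statement": [statement]}, indent=2))
```

Keeping the CIDR list as a parameter makes it straightforward to regenerate the policy when a change to the egress set is announced.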

Option 1b — Trebellar-managed bucket

Trebellar provisions a tenant-isolated bucket and issues you write-only credentials scoped to a single prefix. Your upstream systems use these to drop files exactly as they would against a bucket you own.

Steps:

1. In Admin panel → Organization → Settings → Data Management, choose S3 / GCS → Use Trebellar bucket.
2. The admin panel will display:
   - The bucket URI and prefix to upload to
   - A client ID and client secret that your systems should use
3. Point your upstream exporter at the bucket and start pushing files.

Credentials issued in this mode have PutObject only — they cannot list, read, or delete. This is by design so that a leaked credential cannot be used to exfiltrate previously-uploaded files.
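Because these credentials can only write, an exporter should upload without first listing or checking for existing objects. The sketch below shows one way to do that. It assumes an S3 client that exposes `put_object(Bucket=..., Key=..., Body=...)`, as the boto3 S3 client does, and it assumes the dated snapshot folder sits directly under the prefix shown in the admin panel. The function and class names are hypothetical.

```python
from pathlib import Path
from typing import Protocol


class PutOnlyClient(Protocol):
    """Anything with an S3-style put_object method, e.g. boto3.client("s3")."""

    def put_object(self, *, Bucket: str, Key: str, Body: bytes) -> object: ...


def upload_snapshot_file(
    client: PutOnlyClient,
    bucket: str,
    prefix: str,
    snapshot_date: str,
    local_path: Path,
) -> str:
    """Upload one file to <prefix>/<snapshot_date>/<filename> and return the key."""
    key = f"{prefix.strip('/')}/{snapshot_date}/{local_path.name}"
    # No list/head call beforehand: write-only credentials would reject it.
    client.put_object(Bucket=bucket, Key=key, Body=local_path.read_bytes())
    return key
```

With boto3 installed, `client` would typically be created with the client ID and client secret from the admin panel as the access key pair. Reading the whole file into memory keeps the sketch short and suits modest export sizes.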

Option 2 — SFTP

Use SFTP when your upstream systems can’t write to object storage — common with legacy BAS/BMS exports, facility-management vendors, and scheduled jobs running on on-prem servers.

Option 2a — Bring your own SFTP server

Trebellar connects to an SFTP endpoint you operate.

You provide:

Field	Description
Host	Hostname or IP of your SFTP server
Port	Defaults to `22`
Username	The account Trebellar will authenticate as
Auth method	Either an SSH public key or a password
Remote path	Absolute path on the server where files will be read from — e.g. `/home/trebellar/inbox`
Host key fingerprint (optional)	SHA-256 fingerprint of the server’s host key, pinned for strict host-key checking

Steps:

1. Create a dedicated SFTP user on your server with read/write access to the inbox path, and to the archive/ and errors/ subdirectories Trebellar will create.
2. In Admin panel → Organization → Settings → Data Management, choose SFTP → Bring your own server.
3. Enter the host, port, username, credential (password or SSH public key), and remote path.
4. (Optional) Paste the host key fingerprint to enforce strict host-key checking.
5. Click Test connection and then save.
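If you need to produce the SHA-256 host key fingerprint for the optional field, it can be derived from the server's public host key. The sketch below follows the OpenSSH convention (the same value `ssh-keygen -lf <host key .pub file>` prints); whether the admin panel expects the `SHA256:` prefix is an assumption, so check the field's help text.

```python
import base64
import hashlib


def sha256_fingerprint(public_key_line: str) -> str:
    """OpenSSH-style fingerprint from a '<type> <base64-blob> [comment]' line."""
    blob = base64.b64decode(public_key_line.split()[1])
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as base64 with the trailing padding removed.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


if __name__ == "__main__":
    # Typical location of an Ed25519 host key on an OpenSSH server.
    with open("/etc/ssh/ssh_host_ed25519_key.pub", encoding="ascii") as handle:
        print(sha256_fingerprint(handle.read()))
```

Pinning the fingerprint means a later host key change surfaces as a failed connection rather than a silent trust of the new key.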

Locking the SFTP server to Trebellar’s IP ranges

If your SFTP server is exposed to the internet and you’d like to restrict inbound connections to Trebellar only, request the current egress IP ranges for the ingestion workers from your Customer Success representative or by emailing support@trebellar.com. Specify the region you expect connections from, and we’ll send back the narrowest range. Add the ranges to your firewall or security group rules for the SFTP port (22 by default).

We keep the live list out of these docs on purpose so customers never pin a stale range. Changes are announced 30 days in advance in the changelog.

Option 2b — Trebellar-managed SFTP endpoint

Trebellar spins up a per-organization SFTP endpoint. This is the fastest path when you want an SFTP drop but don’t want to run a server.

You receive:

A hostname — e.g. sftp.trebellar.app
A username — scoped to your organization
A choice of SSH key (recommended — you upload your public key) or a generated password
A fixed remote path — /inbox

That’s it — point your upstream job at the endpoint and start pushing.
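A scheduled job could push files with the standard OpenSSH `sftp` client in batch mode. The sketch below only builds the batch script and the command line. The hostname is the example from above; the username, key path, and the assumption that the endpoint lets you create a dated subfolder under /inbox are hypothetical.

```python
def sftp_batch(snapshot_date: str, filenames: list[str], remote_root: str = "/inbox") -> str:
    """Batch script that uploads files into <remote_root>/<snapshot_date>/.

    Assumes file names without spaces.
    """
    remote_dir = f"{remote_root}/{snapshot_date}"
    # A leading "-" tells sftp not to abort the batch if the command fails,
    # which covers the case where the folder already exists.
    lines = [f"-mkdir {remote_dir}"]
    lines += [f"put {name} {remote_dir}/{name}" for name in filenames]
    return "\n".join(lines) + "\n"


def sftp_command(host: str, username: str, key_path: str, batch_path: str, port: int = 22) -> list[str]:
    """Argument list for a non-interactive sftp run with key authentication."""
    return ["sftp", "-i", key_path, "-P", str(port), "-b", batch_path, f"{username}@{host}"]


if __name__ == "__main__":
    print(sftp_batch("2026-05-02", ["employees_2026-05-02.csv", "buildings_2026-05-02.csv"]))
    print(" ".join(sftp_command("sftp.trebellar.app", "acme", "~/.ssh/trebellar_ed25519", "upload.batch")))
```

Write the batch text to the file passed with `-b` and run the command from the directory that holds the export files.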

Client-side encryption (both SFTP modes)

For SFTP you can optionally configure client-side PGP encryption so that the file contents are encrypted end-to-end, from your exporter all the way to Trebellar’s decryption step.

How it works:

1. In the admin panel, enable Client-side encryption for the SFTP configuration.
2. Trebellar generates a PGP keypair scoped to your organization and displays the public key. Download it and hand it to whoever writes files into the SFTP inbox.
3. Your exporter encrypts each file against the public key before upload, e.g. gpg --encrypt --recipient trebellar@yourorg .... Encrypted files should use the .gpg or .pgp extension.
4. Upon ingestion, Trebellar decrypts the file in memory using the private key (which never leaves our KMS) and then runs the usual validation pipeline.
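The encryption step can be scripted around the `gpg` command line. The sketch below assumes GnuPG is installed and the downloaded public key has already been imported with `gpg --import`. The recipient value is whatever identity the key carries (`trebellar@yourorg` here simply mirrors the example above), and `--trust-model always` is one way to avoid an interactive trust prompt in unattended jobs, which you may prefer to replace with explicit key trust.

```python
import subprocess
from pathlib import Path


def encrypted_name(source: Path) -> Path:
    """Target path with .gpg appended, e.g. employees.csv -> employees.csv.gpg."""
    return source.with_name(source.name + ".gpg")


def gpg_encrypt_command(source: Path, recipient: str) -> list[str]:
    """Argument list for a non-interactive public-key encryption with gpg."""
    return [
        "gpg", "--batch", "--yes",
        "--trust-model", "always",  # assumption: skip the trust prompt in batch jobs
        "--output", str(encrypted_name(source)),
        "--recipient", recipient,
        "--encrypt", str(source),
    ]


def encrypt_file(source: Path, recipient: str) -> Path:
    """Run gpg and return the path of the encrypted file; raises if gpg fails."""
    subprocess.run(gpg_encrypt_command(source, recipient), check=True)
    return encrypted_name(source)
```

Calling `encrypt_file(Path("employees_2026-05-02.csv"), "trebellar@yourorg")` would leave `employees_2026-05-02.csv.gpg` next to the original, ready to upload in place of the plaintext file.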

You can also use your own PGP keypair instead of the Trebellar-generated one: upload the public key to the admin panel and keep the private key on your side, so that decryption happens in your environment. Ask Customer Success if you need this mode; it’s a small variant of Option 2a.

Preparing your files

Initial data requirements

Before uploading, see the Initial Data Requirements article in our help center for the complete list of supported data types and the required fields for each.

Naming & folder conventions

Each upload folder holds one dated snapshot of your data, identified by the date in the folder’s name.

For every snapshot, create a folder named in YYYY-MM-DD format using the date the data reflects. All files placed in that folder should correspond to that same snapshot date.

Each file within a folder should follow this naming pattern:

<datatype>_<date>

For example, if your employee and building data reflects your organization as of 2026-05-02, upload both files into:

2026-05-02/
   employees_2026-05-02.csv
   buildings_2026-05-02.csv

This tells Trebellar that all files in that folder are semantically linked and belong to the same reporting snapshot.
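An exporter can generate and sanity-check these paths before uploading. The helper below is a sketch: the function names are hypothetical, the `.csv` default mirrors the example above, and the allowed characters in a data type name (lowercase letters and underscores) are an assumption rather than a documented rule.

```python
import re
from datetime import date

# <YYYY-MM-DD>/<datatype>_<YYYY-MM-DD>.<extension>
_PATTERN = re.compile(
    r"^(?P<folder>\d{4}-\d{2}-\d{2})/(?P<datatype>[a-z_]+)_(?P<date>\d{4}-\d{2}-\d{2})\.\w+$"
)


def snapshot_path(datatype: str, snapshot: date, extension: str = "csv") -> str:
    """Relative upload path for one file of a dated snapshot."""
    stamp = snapshot.isoformat()
    return f"{stamp}/{datatype}_{stamp}.{extension}"


def is_consistent(path: str) -> bool:
    """True if the folder date and file date match and form a real date."""
    match = _PATTERN.match(path)
    if match is None or match["folder"] != match["date"]:
        return False
    try:
        date.fromisoformat(match["date"])
    except ValueError:
        return False  # e.g. month 13
    return True


if __name__ == "__main__":
    path = snapshot_path("employees", date(2026, 5, 2))
    print(path, is_consistent(path))
```

Running the check in the export job catches a mismatched folder and file date before the file ever reaches the inbox.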

Other origins

Trebellar’s raw bulk importer is focused on SFTP and S3/GCS because those cover the vast majority of enterprise data export paths. If your source is something else — FTPS, Azure Blob Storage, a vendor-specific REST API, an email attachment pipeline, or a file drop behind a VPN — contact your Customer Success representative. We’ll scope a bespoke connector; most common variants can be onboarded in under a week.

Troubleshooting

Symptom	Likely cause
Files sitting in `inbox/` past the next poll	Credentials rotated upstream, or the IAM principal lost a grant. Check the Last sync status
File moved to `errors/` with a `schema_mismatch` error	The file does not match the dataset mapping — compare the file against the mapping specification
SFTP Test connection fails with `host_key_mismatch`	The pinned fingerprint no longer matches. Re-pin it from the admin panel if the change is expected
Ingestion stalls after PGP rotation	Upstream is still encrypting to the old public key. Cutover window is 7 days — update your exporter
Duplicate data appears after a re-upload	Shouldn’t happen — ingestion is idempotent per content hash. Contact support with the filename
