# Raw bulk import

Trebellar can automatically pull raw data you already produce — occupancy feeds,
badge events, work-order exports, utility usage, and anything else you want to put
to work in the platform — and import it on a recurring schedule.

This page explains the **two transport options** (SFTP and object storage) and the
**two hosting options** (your infrastructure or Trebellar-managed) that combine to
form the four setup modes available from the admin panel.

All raw bulk import options are configured from the Trebellar admin panel at
[my.trebellar.app](https://my.trebellar.app), under **Organization → Settings →
Data Management**. For non-standard origins (e.g. FTPS, Azure Blob, vendor APIs
without a generic S3 interface), reach out to your Customer Success contact and
we'll coordinate a bespoke connector.

## At a glance

|                                     | **Bring your own**                                                                 | **Trebellar-managed**                                                 |
| ----------------------------------- | ---------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| **S3 / GCS bucket**                 | You own the bucket; you grant Trebellar an IAM principal scoped to the configured prefix | Trebellar provisions the bucket and shares write-only credentials with you |
| **SFTP**                            | You operate the SFTP server; we connect with a key or password you provide         | Trebellar hosts the SFTP endpoint; you authenticate with an SSH key or a generated password |
| **Encryption at rest**              | Governed by your cloud provider / server configuration                             | AES-256 server-side, managed by Trebellar                             |
| **Encryption in transit**           | TLS (S3/GCS) or SSH (SFTP) — enforced                                              | TLS (S3/GCS) or SSH (SFTP) — enforced                                 |
| **Optional client-side encryption** | Supported for SFTP via PGP (Trebellar-generated or customer-provided keypair)      | Supported for SFTP via PGP (Trebellar-generated or customer-provided keypair) |
| **Best for**                        | Data residency or compliance regimes that require the data to stay in your tenancy | Teams that want the lowest-lift setup                                 |

## How the pipeline works

Regardless of transport, the flow is the same:

1. You (or your upstream systems) drop one or more files into the configured location.
2. Trebellar polls the location on a schedule.
3. New files are fetched, validated, and enqueued for ingestion. Already-processed
   files are skipped based on a content hash, so re-uploads are idempotent.
4. Each file is transformed according to the dataset mapping you configured in the
   admin panel, then loaded into Trebellar.
5. The original file is moved to an `archive/` prefix (or an `archive/` directory
   on SFTP) with a timestamp suffix. If validation fails, it is moved to `errors/`
   instead, alongside a sibling `.error.json` explaining what went wrong.
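
The content-hash deduplication in step 3 can be pictured with a minimal sketch. The hash algorithm (SHA-256), the in-memory `seen` set, and the function names are assumptions made for illustration; this page does not document how Trebellar implements the check internally.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hex digest of the file contents (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def select_new_files(files: dict[str, bytes], seen: set[str]) -> list[str]:
    """Return the names of files whose contents have not been processed yet.

    `files` maps a file name to its bytes. `seen` holds the hashes of
    already-processed contents and is updated in place.
    """
    fresh = []
    for name, data in sorted(files.items()):
        digest = content_hash(data)
        if digest in seen:
            continue  # identical bytes were ingested earlier, so skip
        seen.add(digest)
        fresh.append(name)
    return fresh
```

Because the decision depends on the bytes rather than the file name, re-uploading the same export under a different name would also be skipped in this sketch.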

File-naming conventions and dataset schemas are defined per-organization in the
admin panel. The transport options below only control **how Trebellar reaches
the files** — the shape of the files themselves is independent.

## Option 1 — Object storage (S3 or GCS)

Use object storage when your upstream systems already export to a bucket, or when
you want to land files over HTTPS using the AWS/GCS SDKs.

### Option 1a — Bring your own bucket

Recommended when the source data must stay in your cloud tenancy for compliance or
contractual reasons.

You provide:

| Field                   | Description                                                                                    |
| ----------------------- | ---------------------------------------------------------------------------------------------- |
| **Bucket URL**          | Fully qualified URI — e.g. `s3://acme-trebellar-inbox` or `gs://acme-trebellar-inbox`          |
| **Region**              | Required for S3                                                                                |
| **Prefix** *(optional)* | Subfolder inside the bucket — Trebellar will only look under this prefix. Defaults to `inbox/` |
| **Client ID**           | Access key ID (S3) / service account client identifier (GCS) that Trebellar will use to read   |
| **Client secret**       | The matching secret. Trebellar encrypts this at rest and never displays it again after saving  |

Steps:

1. Create (or reuse) a bucket in your cloud provider.
2. Provision an IAM principal for Trebellar with the **minimum** permissions:
   * **S3**: `s3:GetObject`, `s3:ListBucket`, `s3:PutObject` (used to move files into
     `archive/` or `errors/`), and `s3:DeleteObject` on the configured prefix only.
   * **GCS**: `storage.objects.get`, `storage.objects.list`, `storage.objects.create`,
     `storage.objects.delete` on the configured prefix only.
3. Generate long-lived credentials for that principal.
4. In **Admin panel → Organization → Settings → Data Management**, choose
   **S3 / GCS → Bring your own bucket** and paste the values above.
5. Click **Test connection**. Trebellar attempts to connect to the bucket using the
   credentials you entered.
6. Pick a polling schedule and save.

Scope the IAM principal to the **prefix you configured**, not the whole bucket.
Trebellar never needs access outside of `inbox/`, `archive/`, and `errors/` under
that prefix.
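
For S3, the prefix scoping described above might look like the following sketch, which builds an IAM policy document as a Python dictionary. The function name and `Sid` values are illustrative, and the bucket name is the example from the table above. Note that `s3:ListBucket` is a bucket-level action in IAM, so it is narrowed with the `s3:prefix` condition key rather than through the resource ARN.

```python
import json


def trebellar_s3_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy limited to one prefix of one bucket."""
    prefix = prefix.strip("/") + "/"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket, restricted to keys under the prefix.
                "Sid": "ListConfiguredPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {
                # Object actions are granted only on keys under the prefix.
                "Sid": "ReadMoveDeleteUnderPrefix",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(trebellar_s3_policy("acme-trebellar-inbox", "inbox/"), indent=2))
```

Treat the output as a starting point and review it against your own IAM conventions before attaching it to the principal.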

#### Locking the bucket to Trebellar's IP ranges

If your security policy requires bucket access to be pinned to a known set of
source IPs, Trebellar publishes a stable list of **egress addresses** used by the
ingestion workers. You can attach these to your bucket policy (`aws:SourceIp` on
S3 or a VPC Service Controls perimeter on GCS) so that reads, writes, and deletes
are only accepted from Trebellar's infrastructure.

To avoid drift between these docs and the live set, we don't publish the ranges
inline. To request the current list:

1. Contact your **Customer Success** representative, or email
   [support@trebellar.com](mailto:support@trebellar.com).
2. Specify the cloud provider (AWS or GCP) and the region your bucket lives in.
   We'll send back the narrowest range that covers ingestion for that region.

Trebellar announces changes to the egress set at least **30 days in advance**
in the [changelog](/changelog). Subscribe to the changelog to be notified
before you need to update your bucket policy.
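
Once you have the ranges, an S3 bucket policy statement using `aws:SourceIp` could be assembled along these lines. This is a sketch under assumptions: the principal ARN is hypothetical, and the CIDR in the usage example is a documentation-only placeholder (RFC 5737), not a Trebellar address. The statement denies the Trebellar principal whenever a request arrives from outside the listed ranges.

```python
import ipaddress


def source_ip_deny_statement(bucket: str, principal_arn: str, cidrs: list[str]) -> dict:
    """Deny the given principal unless the request comes from one of `cidrs`."""
    for cidr in cidrs:
        ipaddress.ip_network(cidr)  # raises ValueError on a malformed range
    return {
        "Sid": "DenyTrebellarPrincipalOutsideEgressRanges",
        "Effect": "Deny",
        "Principal": {"AWS": principal_arn},
        "Action": "s3:*",
        "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        "Condition": {"NotIpAddress": {"aws:SourceIp": cidrs}},
    }


if __name__ == "__main__":
    statement = source_ip_deny_statement(
        "acme-trebellar-inbox",
        "arn:aws:iam::123456789012:user/trebellar-ingest",  # hypothetical principal
        ["203.0.113.0/24"],  # placeholder range, replace with the list you receive
    )
    print(statement)
```

Scoping the deny to the Trebellar principal keeps your own users and exporters unaffected by the IP restriction.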

### Option 1b — Trebellar-managed bucket

Trebellar provisions a tenant-isolated bucket and issues you **write-only** credentials
scoped to a single prefix. Your upstream systems use these to drop files exactly as they
would against a bucket you own.

Steps:

1. In **Admin panel → Organization → Settings → Data Management**, choose
   **S3 / GCS → Use Trebellar bucket**.
2. The admin panel will display:
   * The **bucket URI** and **prefix** to upload to
   * A **client ID** and **client secret** that your systems should use
3. Point your upstream exporter at the bucket and start pushing files.

Credentials issued in this mode have `PutObject` only — they cannot list, read,
or delete. This is by design so that a leaked credential cannot be used to
exfiltrate previously-uploaded files.
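
An upstream exporter for an S3-backed managed bucket might push files as sketched below. The function takes any client object exposing boto3's `put_object(Bucket=..., Key=..., Body=...)` call; the function name and the placeholder values in the wiring comment are assumptions, and the real bucket, prefix, and credentials come from the admin panel.

```python
from pathlib import Path


def upload_file(s3_client, bucket: str, prefix: str, path: Path) -> str:
    """Upload one file with a single PutObject call and return the object key."""
    key = f"{prefix.strip('/')}/{path.name}"
    with path.open("rb") as fh:
        s3_client.put_object(Bucket=bucket, Key=key, Body=fh)
    return key


# Example wiring (requires `pip install boto3`; every value is a placeholder):
#   import boto3
#   client = boto3.client(
#       "s3",
#       aws_access_key_id="<client ID>",
#       aws_secret_access_key="<client secret>",
#   )
#   upload_file(client, "<bucket>", "<prefix>", Path("employees_2026-05-02.csv"))
```

Since the credentials cannot list or read, the exporter should treat a successful `put_object` response as its confirmation instead of checking for the object afterwards.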

## Option 2 — SFTP

Use SFTP when your upstream systems can't write to object storage — common with
legacy BAS/BMS exports, facility-management vendors, and scheduled jobs running
on on-prem servers.

### Option 2a — Bring your own SFTP server

Trebellar connects to an SFTP endpoint you operate.

You provide:

| Field                                 | Description                                                                              |
| ------------------------------------- | ---------------------------------------------------------------------------------------- |
| **Host**                              | Hostname or IP of your SFTP server                                                       |
| **Port**                              | Defaults to `22`                                                                         |
| **Username**                          | The account Trebellar will authenticate as                                               |
| **Auth method**                       | Either an **SSH public key** or a **password**                                           |
| **Remote path**                       | Absolute path on the server where files will be read from — e.g. `/home/trebellar/inbox` |
| **Host key fingerprint** *(optional)* | SHA-256 fingerprint of the server's host key, pinned for strict host-key checking        |

Steps:

1. Create a dedicated SFTP user on your server with read/write access to the inbox
   path, and to the `archive/` and `errors/` subdirectories Trebellar will create.
2. In **Admin panel → Organization → Settings → Data Management**, choose
   **SFTP → Bring your own server**.
3. Paste the host, port, username, password or SSH public key, and remote path.
4. (Optional) Paste the host key fingerprint to enforce strict host-key checking.
5. Click **Test connection** and then save.
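
If you need to derive the SHA-256 fingerprint for step 4 yourself, the sketch below computes it from a host public key line in OpenSSH `.pub` format (key type, then the base64 blob). It mirrors the `SHA256:...` form printed by `ssh-keygen -lf`; whether the admin panel expects the `SHA256:` prefix is an assumption, so compare against what the field accepts.

```python
import base64
import hashlib


def sha256_fingerprint(public_key_line: str) -> str:
    """Return the OpenSSH-style SHA-256 fingerprint of a public key line.

    `public_key_line` looks like `ssh-ed25519 AAAA... optional-comment`.
    """
    fields = public_key_line.split()
    blob = base64.b64decode(fields[1])  # second field is the key blob
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the base64 digest without trailing '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")
```

Run it against the server's own host key file (for example the contents of `/etc/ssh/ssh_host_ed25519_key.pub`) rather than a key fetched over an untrusted network path.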

#### Locking the SFTP server to Trebellar's IP ranges

If your SFTP server is exposed to the internet and you'd like to restrict inbound
connections to Trebellar only, request the current **egress IP ranges** for the
ingestion workers from your **Customer Success** representative or by emailing
[support@trebellar.com](mailto:support@trebellar.com). Specify the region you
expect connections from, and we'll send back the narrowest range. Add the ranges
to your firewall or security group rules for the SFTP port (`22` by default).

We keep the live list out of these docs on purpose so customers never pin a stale
range. Changes are announced at least **30 days in advance** in the
[changelog](/changelog).

### Option 2b — Trebellar-managed SFTP endpoint

Trebellar spins up a per-organization SFTP endpoint. This is the fastest path when
you want an SFTP drop but don't want to run a server.

You receive:

* A **hostname** — e.g. `sftp.trebellar.app`
* A **username** — scoped to your organization
* A choice of **SSH key** (recommended — you upload your public key) or a
  **generated password**
* A fixed **remote path** — `/inbox`

That's it — point your upstream job at the endpoint and start pushing.

### Client-side encryption (both SFTP modes)

For SFTP you can optionally configure **client-side PGP encryption** so that the
file contents are encrypted end-to-end, from your exporter all the way to
Trebellar's decryption step.

How it works:

1. In the admin panel, enable **Client-side encryption** for the SFTP configuration.
2. Trebellar generates a **PGP keypair** scoped to your organization and displays the
   **public key**. Download it and hand it to whoever writes files into the SFTP inbox.
3. Your exporter encrypts each file against the public key before upload — e.g.
   `gpg --encrypt --recipient trebellar@yourorg ...`. Encrypted files should use the
   `.gpg` or `.pgp` extension.
4. Upon ingestion, Trebellar decrypts the file in memory using the private key (which
   never leaves our KMS) and then runs the usual validation pipeline.

You can also use your **own PGP keypair** instead of the Trebellar-generated one. Upload
the public key in the admin panel and keep the private key on your side; files are then
decrypted by you, via an egress to your environment, rather than by Trebellar. Ask
Customer Success if you need this mode; it's a small variant of Option 2a.
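
For the Trebellar-generated key flow, the encryption step in an exporter could be scripted roughly as follows. The sketch shells out to the `gpg` CLI; the function names are illustrative, the recipient reuses the `trebellar@yourorg` placeholder from step 3, and it assumes the public key has already been imported into the exporter's keyring.

```python
import subprocess
from pathlib import Path


def gpg_encrypt_command(source: Path, recipient: str) -> list[str]:
    """Build the gpg argv that encrypts `source` to `<source>.gpg`."""
    target = source.with_name(source.name + ".gpg")
    return [
        "gpg", "--batch", "--yes",
        "--encrypt", "--recipient", recipient,
        "--output", str(target),
        str(source),
    ]


def encrypt_for_upload(source: Path, recipient: str) -> Path:
    """Run gpg and return the path of the encrypted file."""
    subprocess.run(gpg_encrypt_command(source, recipient), check=True)
    return source.with_name(source.name + ".gpg")
```

In batch mode, `gpg` refuses to encrypt to a key it does not consider trusted, so a scheduled job typically needs the imported key's trust set beforehand (or an explicit `--trust-model` choice you are comfortable with).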

## Preparing your files

### Initial data requirements

Before uploading, see the [Initial Data Requirements](https://help.trebellar.com/en/articles/13566406-initial-data-requirements) article in our help center for the complete list of supported data types and the required fields for each.

### Naming & folder conventions

For every snapshot, create a folder named in `YYYY-MM-DD` format using the date the data reflects. Every file placed in that folder should correspond to that same snapshot date.

Each file within a folder should follow the naming pattern below, where `<date>` repeats the folder's `YYYY-MM-DD` date and is followed by the file extension (`.csv` in the example further down):

```
<datatype>_<date>
```

For example, if your employee and building data reflects your organization as of 2026-05-02, upload both files into:

```
2026-05-02/
   employees_2026-05-02.csv
   buildings_2026-05-02.csv
```

This tells Trebellar that all files in that folder are semantically linked and belong to the same reporting snapshot.
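
An exporter can generate and sanity-check these paths before uploading. The helper names below are hypothetical, and the default `csv` extension simply follows the example above.

```python
import re
from datetime import date

# <YYYY-MM-DD>/<datatype>_<YYYY-MM-DD>.<extension>
_SNAPSHOT_PATH = re.compile(r"^(\d{4}-\d{2}-\d{2})/[^/]+_(\d{4}-\d{2}-\d{2})\.[^/.]+$")


def snapshot_path(datatype: str, snapshot: date, extension: str = "csv") -> str:
    """Build the relative upload path for one file of a dated snapshot."""
    stamp = snapshot.isoformat()  # YYYY-MM-DD
    return f"{stamp}/{datatype}_{stamp}.{extension}"


def dates_match(path: str) -> bool:
    """True when the folder date and the date in the file name agree."""
    match = _SNAPSHOT_PATH.match(path)
    return bool(match) and match.group(1) == match.group(2)
```

Calling `dates_match` on each path before upload catches the common mistake of dropping a file into the wrong snapshot folder.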

## Other origins

Trebellar's raw bulk importer is focused on SFTP and S3/GCS because those cover the vast
majority of enterprise data export paths. If your source is something else — FTPS, Azure
Blob Storage, a vendor-specific REST API, an email attachment pipeline, or a file drop
behind a VPN — contact your **Customer Success** representative. We'll scope a bespoke
connector; most common variants can be onboarded in under a week.

## Troubleshooting

| Symptom                                                 | Likely cause                                                                                        |
| ------------------------------------------------------- | --------------------------------------------------------------------------------------------------- |
| Files sitting in `inbox/` past the next poll            | Credentials rotated upstream, or the IAM principal lost a grant. Check the **Last sync** status     |
| File moved to `errors/` with a `schema_mismatch` error  | The file does not match the dataset mapping — compare the file against the mapping specification    |
| SFTP **Test connection** fails with `host_key_mismatch` | The pinned fingerprint no longer matches. Re-pin it from the admin panel if the change is expected  |
| Ingestion stalls after PGP rotation                     | Upstream is still encrypting to the old public key. Cutover window is 7 days — update your exporter |
| Duplicate data appears after a re-upload                | Shouldn't happen — ingestion is idempotent per content hash. Contact support with the filename      |