Databricks

JustAI can export all downstream-ready variant metadata and message-level send records into your Databricks environment using a secure, shared S3 bucket (recommended) or Delta Sharing. This allows you to join JustAI variants with your internal engagement, attribution, LTV, and conversion models inside Databricks.

JustAI will write daily exports containing:

  1. Variant Metadata (copy_id → template_id, vars, attrs)
  2. Message-Level Send Records (event_timestamp, message IDs, copy_id)

You will ingest these into Databricks using:

  • A Databricks-accessible S3 bucket provided by the client (recommended), or
  • Delta Sharing to receive a table directly (alternative).

Option A (Recommended) — S3 Export Into Your Databricks Workspace

1) Create a shared S3 bucket or prefix for JustAI

You may use an existing bucket; JustAI only needs a dedicated prefix.
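
If you want to set up the prefix ahead of time, here is a minimal sketch using boto3. The bucket name is a placeholder, and the zero-byte marker object is optional — S3 prefixes are implicit, the marker just makes the agreed-upon prefix visible in the console:

import boto3

s3 = boto3.client("s3")

# Placeholder bucket; JustAI only needs a dedicated prefix inside a
# bucket you already own (the examples below use justai/exports/).
s3.put_object(Bucket="<CLIENT_BUCKET_NAME>", Key="justai/exports/")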

2) Create an IAM role that JustAI can assume

Minimum permissions:

  • s3:ListBucket
  • s3:GetObject
  • s3:PutObject
  • s3:DeleteObject

Example pattern:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowListingOfExportPrefix",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::<CLIENT_BUCKET_NAME>",
      "Condition": {
        "StringLike": {
          "s3:prefix": "<YOUR_EXPORT_PREFIX>/*"
        }
      }
    },
    {
      "Sid": "AllowReadWriteDeleteOnExportObjects",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::<CLIENT_BUCKET_NAME>/<YOUR_EXPORT_PREFIX>/*"
    }
  ]
}
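
For JustAI to assume this role, the role also needs a trust policy naming JustAI's account. Below is a minimal sketch of creating such a role with boto3; the role name and external ID are illustrative, and whether JustAI requires an external ID should be confirmed during onboarding:

import json
import boto3

iam = boto3.client("iam")

# Trust policy allowing the JustAI export role to assume this role.
# The account ID placeholder and external ID are illustrative.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {
            "AWS": "arn:aws:iam::<JUSTAI_AWS_ACCOUNT_ID>:role/<JUSTAI_EXPORT_ROLE_NAME>"
        },
        "Action": "sts:AssumeRole",
        "Condition": {"StringEquals": {"sts:ExternalId": "<EXTERNAL_ID>"}}
    }]
}

iam.create_role(
    RoleName="justai-export-access",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)
# Then attach the permissions policy above to the role
# (e.g. with iam.put_role_policy).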

3) Add bucket policy to allow the JustAI role

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowJustAIList",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<JUSTAI_AWS_ACCOUNT_ID>:role/<JUSTAI_EXPORT_ROLE_NAME>"
      },
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": "arn:aws:s3:::<CLIENT_BUCKET_NAME>",
      "Condition": {
        "StringLike": {
          "s3:prefix": "<YOUR_EXPORT_PREFIX>/*"
        }
      }
    },
    {
      "Sid": "AllowJustAIObjectRW",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<JUSTAI_AWS_ACCOUNT_ID>:role/<JUSTAI_EXPORT_ROLE_NAME>"
      },
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::<CLIENT_BUCKET_NAME>/<YOUR_EXPORT_PREFIX>/*"
    }
  ]
}
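
Before handing the bucket details to JustAI, you can sanity-check the prefix with your own credentials. A minimal sketch (bucket and prefix are placeholders; this confirms the bucket and prefix exist, not the cross-account grant itself):

import boto3

s3 = boto3.client("s3")
bucket = "<CLIENT_BUCKET_NAME>"   # placeholder
prefix = "<YOUR_EXPORT_PREFIX>"   # placeholder, e.g. justai/exports

# Write a marker object under the export prefix, then list it back.
s3.put_object(Bucket=bucket, Key=f"{prefix}/_setup_check", Body=b"ok")
resp = s3.list_objects_v2(Bucket=bucket, Prefix=f"{prefix}/")
print([obj["Key"] for obj in resp.get("Contents", [])])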

4) Ingest the Parquet exports into Databricks

Use any of the following:

  • Databricks Auto Loader
  • Scheduled notebook
  • Delta Live Tables pipeline
  • Custom Spark job

JustAI provides the ARN of its export role so you can allowlist it in the bucket policy above.

Written to: s3://<client-bucket>/justai/exports/YYYY/MM/DD/HH/

  • Format: Parquet, partitioned by date/hour
  • Backfills run ad hoc and overwrite existing partitions
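
For the scheduled-notebook or Spark-job routes, here is a minimal batch-read sketch against the partitioned layout; the bucket name and date values are placeholders:

# Read a single hourly partition (placeholder date/hour).
hour_df = spark.read.parquet("s3://<client-bucket>/justai/exports/2025/01/09/00/")

# Or read everything under the export prefix; the YYYY/MM/DD/HH layout is
# plain nested directories, so enable recursive file lookup.
all_df = (spark.read
    .option("recursiveFileLookup", "true")
    .parquet("s3://<client-bucket>/justai/exports/"))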

JustAI writes two companion files:

Variant metadata, containing:

  • copy_id (JustAI copy ID)
  • template_id (JustAI template ID)
  • vars (subject/preheader/body/etc.)
  • attrs (persona, etc.)

Message-level send records, containing:

  • copy_id (JustAI copy ID)
  • template_id (JustAI template ID)
  • event_timestamp (Unix timestamp)
  • message_id (Iterable Message ID)
  • itbl_campaign_id (Iterable campaign ID)
  • itbl_template_id (Iterable template ID)
Example send record:

{
  "event_timestamp": 1736387200,
  "user_id": "abc123",
  "message_id": "000000000",
  "copy_id": "6b3f2dd3-1c57-4f56-bc26-89af7bb6cb30"
}

Example variant metadata record:

{
  "copy_id": "6b3f2dd3-1c57-4f56-bc26-89af7bb6cb30",
  "template_id": "welcome_email_1",
  "vars": {
    "subject": "Welcome to our Community",
    "preheader": "Let’s get started!",
    "body": "<html>..."
  },
  "attrs": {
    "persona": "learner",
    "age": "18-24"
  }
}
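
Once both exports are loaded, sends can be tied back to the delivered copy by joining on copy_id. A minimal sketch, assuming the two files have been loaded into DataFrames named send_events and copy_variants and that vars/attrs land as struct columns:

from pyspark.sql import functions as F

# Attach variant metadata (subject line, persona) to each send record.
# Only copy_id, vars, and attrs are pulled from the variant side to avoid
# an ambiguous template_id column after the join.
joined = (
    send_events.join(
        copy_variants.select("copy_id", "vars", "attrs"),
        on="copy_id",
        how="left",
    )
    .select(
        "event_timestamp",
        "user_id",
        "message_id",
        "copy_id",
        F.col("vars.subject").alias("subject"),
        F.col("attrs.persona").alias("persona"),
    )
)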

Option B — Delta Sharing (Databricks Native)

Delta Sharing is supported as an alternative, Databricks-native export mechanism.

  1. Client creates a share in Unity Catalog
  2. JustAI publishes tables into that share:
    • justai.send_events
    • justai.copy_variants

Client reads them directly in Databricks, for example:

SELECT * FROM <shared_catalog>.justai.send_events

This avoids S3 roles entirely but requires Unity Catalog.
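
The same shared tables can also be read from a notebook via the catalog the share is mounted under. A minimal sketch; the catalog name is a placeholder chosen by the client when mounting the share:

# "<shared_catalog>" is a placeholder for the Unity Catalog name the
# Delta Share is mounted under.
send_events = spark.table("<shared_catalog>.justai.send_events")
copy_variants = spark.table("<shared_catalog>.justai.copy_variants")

send_events.join(copy_variants, "copy_id").show(5)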

Client checklist:

  1. Create S3 bucket + prefix
  2. Create IAM role with Put/Get/List
  3. Add bucket policy for JustAI role
  4. Pull data into Databricks (Auto Loader / DLT / Spark job)

JustAI responsibilities:

  1. Provide IAM role ARN
  2. Produce hourly/daily Parquet exports (partitioned)
  3. Export variant metadata table
  4. Export send events table
  5. Perform backfills on request
Example Auto Loader read (the schema location path is a placeholder):

df = (spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    # Auto Loader needs a schema location for schema inference/evolution
    .option("cloudFiles.schemaLocation", "s3://<bucket>/justai/_schema/")
    .load("s3://<bucket>/justai/exports/"))
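
To land that stream in a Delta table, pair the read with a writeStream; the checkpoint path and target table name below are placeholders:

# Append newly arrived export files into a Delta table, then stop.
(df.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://<bucket>/justai/_checkpoints/exports/")
    .trigger(availableNow=True)   # process all new files, then shut down
    .toTable("<catalog>.<schema>.justai_exports"))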