Customer.io

JustAI can share data about which content each user received for JustAI-integrated Customer.io campaigns. The simplest approach is a daily data export to a shared S3 bucket in the client's AWS account that JustAI can read from and write to.

Each record would reflect one JustAI API call, with a user ID, a campaign ID, an action ID, and a journey ID that together uniquely identify an email/notification in Customer.io.

{
  "event_timestamp": <unix_timestamp>,
  "user_id": <string>,        // Customer.io
  "campaign_id": <number>,    // Customer.io
  "action_id": <number>,      // Customer.io
  "journey_id": <string>,     // Customer.io
  "copy_id": <uuid_string>,   // JustAI
  "template_id": <string>,    // JustAI
  // Record of strings; keys depend on the template
  "vars": {
    "subject": <string>,
    "preheader": <string>,
    "body": <string>
  },
  // Record of strings; keys depend on the template
  "attrs": {
    "persona": <string>,
    "age": <string>
  }
}
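To make the schema concrete, here is a hedged sketch of one exported record as a Python dict; every ID and value below is a made-up placeholder, not real data:

```python
import json
import time
import uuid

# Hypothetical example record matching the schema above; all IDs are invented.
record = {
    "event_timestamp": int(time.time()),
    "user_id": "u_12345",            # Customer.io
    "campaign_id": 42,               # Customer.io
    "action_id": 7,                  # Customer.io
    "journey_id": "jrn_abc123",      # Customer.io
    "copy_id": str(uuid.uuid4()),    # JustAI
    "template_id": "welcome_email",  # JustAI (placeholder name)
    # Keys depend on the template
    "vars": {
        "subject": "Welcome aboard!",
        "preheader": "Thanks for signing up",
        "body": "Hi there...",
    },
    "attrs": {
        "persona": "new_user",
        "age": "25-34",
    },
}

# One JSON object per exported record
line = json.dumps(record)
```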

In JustAI, each variant has a UUID (copy_id) and a template ID. A template ID corresponds 1:1 with an email/push/etc. within a Customer.io campaign, but a template can have many variants.

The journey ID and action ID together uniquely identify an instance of an email/push/etc. and are generated by Customer.io. In our dashboards, we'll aggregate the engagement metrics produced by Customer.io, grouped by copy_id and date, to see the performance of each variant over time.
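That rollup can be sketched in pandas; the frames and column names below (opens, clicks, a date column on the engagement side) are assumptions for illustration, not the actual Customer.io export schema:

```python
import pandas as pd

# Hypothetical JustAI export rows: which variant each send received.
exports = pd.DataFrame({
    "journey_id": ["j1", "j2", "j3"],
    "action_id": [1, 1, 2],
    "copy_id": ["a", "a", "b"],
})

# Hypothetical Customer.io engagement metrics, keyed the same way.
engagement = pd.DataFrame({
    "journey_id": ["j1", "j2", "j3"],
    "action_id": [1, 1, 2],
    "date": ["2024-01-01", "2024-01-01", "2024-01-01"],
    "opens": [1, 0, 1],
    "clicks": [0, 0, 1],
})

# Join engagement to the variant behind each send, then roll up per variant/day.
joined = engagement.merge(exports, on=["journey_id", "action_id"])
per_variant = (
    joined.groupby(["copy_id", "date"])[["opens", "clicks"]]
    .sum()
    .reset_index()
)
```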

This is just a default, and there may be other preferred approaches (direct to Snowflake, etc).

  1. JustAI to provision an IAM role (identified by its ARN) that will read/write to the shared bucket.
  2. Client to create a bucket (or a path in an existing bucket) and grant read/write access to the role from step 1.
  3. JustAI to export a backfill of data & to set up a daily export for new records.
  4. Client to transfer data from S3 into Snowflake (for example).
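Step 2 would typically be a bucket policy on the client side. The sketch below shows the general shape; the role ARN, account ID, bucket name, and path prefix are all placeholders, and the exact statement layout should follow the client's own IAM conventions:

```python
import json

# Hedged sketch of a bucket policy granting the JustAI role read/write.
# All ARNs and names here are placeholders, not real values.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "JustAIExportAccess",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:role/justai-export"},
            "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::client-bucket",
                "arn:aws:s3:::client-bucket/justai-exports/*",
            ],
        }
    ],
}

print(json.dumps(policy, indent=2))
```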

This is just a default, and there may be other preferred approaches (Avro, etc.).

  • The exported data to be in Parquet and written to a partitioned path like “…/YYYY/MM/DD/HH”
  • Backfills to be run ad hoc and would overwrite any existing data.
  • The copy variables can be modified in the frontend, so the mapping from copy_id to vars could change over time. Generally we will not modify variants once they are being served, except to fix a typo or similar.
  • The copy metadata could be kept in a separate table rather than flattened into each record, if that is easier for downstream analysis or better for storage.
  • Retention can be handled as a bucket policy.
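The hourly partition layout above can be sketched as a small key-building helper; the prefix and file name are placeholders:

```python
from datetime import datetime, timezone

def partition_key(prefix: str, ts: datetime) -> str:
    """Build an hourly-partitioned S3 key: <prefix>/YYYY/MM/DD/HH/export.parquet."""
    return f"{prefix}/{ts:%Y/%m/%d/%H}/export.parquet"

key = partition_key("justai-exports", datetime(2024, 1, 5, 13, tzinfo=timezone.utc))
# "justai-exports/2024/01/05/13/export.parquet"
```

Partitioning by hour keeps backfill overwrites scoped to the affected hours and lets the downstream Snowflake load filter by path.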