Batch Egress Options

GCS

Overview

Banyan supports sending and receiving batch data at rest through Google Cloud Storage (GCS). We provide roles for reading your enriched transaction data, along with some common practices that make it easier to test your integration.

Data is stored in the us-central1 (Iowa) region, and the bucket's default retention period is 10 years.

Folder Structure

Within your GCS bucket there will be an /output folder that contains all of your offer redemption data from Banyan.

Output
Currently Banyan has two different data products: Enrich and Confirm. The /offer path within the output folder contains the data product for Confirm.

  • Example Path: gs://test-bucket/output/offer/2022-07-26/
  • Example File: 2022-07-26T16:23:09Z+22837398682.avro
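
To make the path and file naming above concrete, here is a minimal sketch that builds the day prefix and splits a file name into its parts. It assumes file names always follow the `<UTC timestamp>+<digits>.avro` shape of the example; the meaning of the digits after the `+` is not documented here, so the sketch only carries them along as a string. The function and field names are hypothetical.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class OutputFile(NamedTuple):
    written_at: datetime  # UTC timestamp taken from the file name
    suffix: str           # digits after the "+"; their meaning is not documented


def parse_output_filename(name: str) -> OutputFile:
    """Split '<timestamp>Z+<digits>.avro' into its timestamp and suffix."""
    stem, ext = name.rsplit(".", 1)
    if ext != "avro":
        raise ValueError(f"not an .avro file: {name!r}")
    timestamp, suffix = stem.split("+", 1)
    written_at = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    # The trailing "Z" marks the timestamp as UTC.
    return OutputFile(written_at.replace(tzinfo=timezone.utc), suffix)


def offer_prefix(bucket: str, day: str) -> str:
    """Build the gs:// prefix for one day of Confirm (offer) output."""
    return f"gs://{bucket}/output/offer/{day}/"
```

For example, `offer_prefix("test-bucket", "2022-07-26")` gives the example path shown above.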

AWS S3

Overview

Banyan supports data retrieval from S3 using a secure bucket policy approach. This lets you access your enriched data with your standard AWS credentials, with no cross-account role assumption required.

Important

Unlike ingress, egress uses bucket policies rather than IAM roles. This means you'll use your regular AWS credentials to retrieve data, not a cross-account role.

Prerequisites

To retrieve data from S3, you'll need:

  • An S3 bucket provisioned by Banyan (either from your ingress setup or created specifically for egress)
  • Your AWS Account ID (provide this to Banyan so we can enable egress for your account)
  • AWS CLI installed and configured with your credentials

How It Works

Banyan grants your entire AWS account read-only access to your dedicated S3 bucket through a bucket policy. This means any IAM user or role in your account can retrieve the data using standard AWS credentials.
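
For readers who have not worked with bucket policies, the sketch below shows what a read-only, account-wide grant typically looks like in S3. It is an illustration only: Banyan's actual policy is not shown in this document, and the statement IDs, bucket name, and account ID are placeholders. The policy grammar itself (`Version`, `Principal`, `s3:ListBucket`, `s3:GetObject`) is standard S3.

```python
import json


def read_only_bucket_policy(bucket: str, account_id: str) -> dict:
    """Return an illustrative read-only bucket policy for one AWS account."""
    principal = {"AWS": f"arn:aws:iam::{account_id}:root"}  # the whole account
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PartnerListBucket",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:ListBucket",   # listing applies to the bucket itself
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Sid": "PartnerGetObject",   # hypothetical statement ID
                "Effect": "Allow",
                "Principal": principal,
                "Action": "s3:GetObject",    # reads apply to objects in the bucket
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(read_only_bucket_policy("example-bucket", "123456789012"), indent=2))
```

Note that with cross-account access in AWS, the IAM user or role making the request generally also needs an identity policy in your own account that allows these S3 actions. If a user gets `AccessDenied` after egress is enabled, that is a reasonable first thing to check.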

Folder Structure

Within your S3 bucket, you'll find an /output folder containing your enriched data:

s3://byn-{environment}-{type}-{partner_id}-{partner_name}/
└── output/
    ├── enrich/                              # Enriched transaction data
    │   └── 2024-01-15/
    │       └── 2024-01-15T16:23:09Z+22837398682.avro
    └── offer/                               # Offer redemption data (Confirm product)
        └── 2024-01-15/
            └── 2024-01-15T18:45:22Z+33948509793.avro

Data Products:

  • Enrich: Transaction enrichment data in the /output/enrich/ folder
  • Confirm: Offer redemption data in the /output/offer/ folder
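
The layout above can be turned into a small helper for building S3 prefixes. This is a sketch: the function names are hypothetical, and the values you pass for `environment`, `type_`, `partner_id`, and `partner_name` must come from the bucket name Banyan gives you.

```python
from datetime import date

# Data product name -> folder under /output, as listed above.
PRODUCT_FOLDERS = {"enrich": "enrich", "confirm": "offer"}


def bucket_name(environment: str, type_: str, partner_id: str, partner_name: str) -> str:
    """Fill in the byn-{environment}-{type}-{partner_id}-{partner_name} pattern."""
    return f"byn-{environment}-{type_}-{partner_id}-{partner_name}"


def product_prefix(bucket: str, product: str, day: date) -> str:
    """Build the s3:// prefix holding one day of output for a data product."""
    folder = PRODUCT_FOLDERS[product.lower()]  # KeyError for unknown products
    return f"s3://{bucket}/output/{folder}/{day.isoformat()}/"
```

Keeping the product-to-folder mapping in one place avoids the easy mistake of looking for Confirm data under a `confirm/` folder when it actually lives under `offer/`.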

Retrieving Your Data

Using AWS CLI

List available data:

# List all output data
aws s3 ls s3://byn-production-{type}-{partner_id}-{partner_name}/output/

# List enriched data for a specific date
aws s3 ls s3://byn-production-{type}-{partner_id}-{partner_name}/output/enrich/2024-01-15/

Download files:

# Download a specific file
aws s3 cp s3://byn-production-{type}-{partner_id}-{partner_name}/output/enrich/2024-01-15/data.avro ./local-data.avro

# Download all files for a date
aws s3 cp s3://byn-production-{type}-{partner_id}-{partner_name}/output/enrich/2024-01-15/ ./local-folder/ --recursive

# Sync data (efficient for regular transfers)
aws s3 sync s3://byn-production-{type}-{partner_id}-{partner_name}/output/ ./enriched-data/

Copy to your bucket:

# Copy all enriched data to your analytics bucket
aws s3 sync s3://byn-production-{type}-{partner_id}-{partner_name}/output/enrich/ \
            s3://your-analytics-bucket/banyan-enriched/ \
            --exclude "*.tmp"
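
If you would rather script retrieval than call the CLI, the recursive download above can be sketched in Python. The function below expects a client that behaves like `boto3.client("s3")` (it uses the `list_objects_v2` paginator and `download_file`); boto3 itself is an assumption, since this document only covers the AWS CLI, and `download_prefix` is a hypothetical helper name.

```python
from pathlib import Path


def download_prefix(client, bucket: str, prefix: str, dest: str) -> list[str]:
    """Download every object under `prefix` into `dest`; return the keys fetched.

    `client` is expected to behave like boto3.client("s3").
    """
    fetched = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):  # empty pages have no "Contents"
            key = obj["Key"]
            if key.endswith("/"):  # skip zero-byte "folder" marker objects
                continue
            # Mirror the key's path below the prefix inside the local folder.
            target = Path(dest) / key[len(prefix):]
            target.parent.mkdir(parents=True, exist_ok=True)
            client.download_file(bucket, key, str(target))
            fetched.append(key)
    return fetched
```

A call would look like `download_prefix(boto3.client("s3"), "<your bucket>", "output/enrich/2024-01-15/", "./local-folder")`, using the same credentials your AWS CLI is configured with.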

Snowflake

Banyan will share processed data with you through a secure view. You can access this data directly for analysis.

Steps:

  1. Access Shared Data:

    • Banyan will share a secure view with your account. Query the shared data as follows:
      SELECT * FROM <banyan_shared_view>;
      
  2. Continuous Updates:

    • Banyan flushes files every hour, or sooner if a file reaches XX MB. Data therefore updates throughout the day rather than all at once. Merchant data often arrives in batches, so your matches against that data will effectively be batched as well.

Actions:

  • Integrate Banyan's shared data into your analytics workflows using tools like Looker, Tableau, etc.

Credentials & Security

  • Banyan AWS US East Account Locator: GTB18971
  • For other regions or cloud platforms, contact Banyan support.
  • Data is shared securely via direct shares and secure views, ensuring compliance with data privacy standards.

SFTP

Connection requirements

You can use any SFTP client (for example, FileZilla) that supports SSH private key authentication.

Credentials

Once the contract is signed, you will generate an SSH public/private key pair, and share the public key with us. Banyan will then provide you with the hostname and user name for your SFTP server, which you will access with your private SSH key.

In general:

  • Server address: YOUR_COMPANY_NAME.sftp.getbanyan.com
  • User: sftpuser
  • Authentication: SSH Private Key

Where to Retrieve Data

Once you are logged in to the server, please use the output folder under the data folder to retrieve your data.

Schedule

Banyan's egress system is event-based: an event is created whenever a match or an offer redemption occurs. Banyan flushes files to the SFTP server every hour or once 15MB of data has accumulated, whichever happens first.
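
The flush rule can be written down as a one-line condition, which may help when deciding how often to poll for new files. This is only an illustration of the rule as stated, not Banyan's implementation. The constant names are hypothetical, and treating 15MB as decimal megabytes is an assumption.

```python
FLUSH_INTERVAL_SECONDS = 60 * 60     # "every hour"
FLUSH_SIZE_BYTES = 15 * 1000 * 1000  # "15MB"; decimal megabytes assumed


def should_flush(buffered_bytes: int, seconds_since_flush: float) -> bool:
    """True when either threshold is reached, whichever happens first."""
    return (
        buffered_bytes >= FLUSH_SIZE_BYTES
        or seconds_since_flush >= FLUSH_INTERVAL_SECONDS
    )
```

In practice this means a busy hour can produce several files, each around the size threshold, while a quiet hour produces at most one smaller file. Whether a file is written when no events occurred during the hour is not stated here.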

File format

  • Files are delivered in Avro format
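
A quick sanity check after download can catch truncated or mislabeled files before they reach your pipeline. The sketch below assumes the files are standard Avro object container files, which begin with the four bytes `Obj` followed by `0x01`; `looks_like_avro` is a hypothetical helper name.

```python
AVRO_MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def looks_like_avro(path: str) -> bool:
    """Return True if the file starts with the Avro container magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(4) == AVRO_MAGIC
```

This only checks the header. To read the records you will still need an Avro library for your language.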

Caveats

When you write a custom implementation that works with our SFTP server, make sure to take the following scenarios into account:

  1. The server can become unavailable for a short period of time. Make sure you have a retry mechanism in place.
  2. The signature of the server might change due to hardware failure or changes in the hardware configuration. Make sure your client can handle this (the URI won't change).
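
The retry advice in the first caveat can be sketched as a small helper with exponential backoff. It is library-agnostic: `operation` is whatever function performs your SFTP transfer, and the name `with_retries`, the default of five attempts, and the one-second base delay are all assumptions to tune for your setup.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    retry_on: tuple = (OSError,),  # network-level failures by default
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying with exponential backoff on `retry_on` errors."""
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... by default
    raise ValueError("attempts must be at least 1")
```

Only connection-type errors are retried by default. A changed server signature is a different situation: rather than retrying blindly, it is safer to have your client surface the mismatch so the new key can be confirmed with Banyan before it is trusted.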