Batch Egress Options
GCS
Overview
Banyan supports sending and receiving batch data at rest in Google Cloud Storage. We provide roles for reading your enriched transaction data and follow some common practices that make it easy to test your integration.
Data is held in the us-central1 (Iowa) region, and bucket retention is set to 10 years by default.
Folder Structure
Within your GCS bucket there will be an /output folder that contains all of your offer redemption data from Banyan.
Output
Currently Banyan has two different data products: Enrich and Confirm. The /offer path within the output folder contains the data product for Confirm.
- Example Path: gs://test-bucket/output/offer/2022-07-26/
- Example File: 2022-07-26T16:23:09Z+22837398682.avro
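The example path and file name suggest a date-partitioned layout in which each file name starts with a UTC timestamp. The following is a minimal sketch for building a dated prefix and splitting a file name, assuming every file follows the pattern in the examples. The function names are hypothetical, and the numeric part after `+` is not documented here, so it is kept as an opaque string.

```python
from datetime import date, datetime, timezone
from typing import Tuple


def offer_prefix(bucket: str, day: date) -> str:
    """Build the dated Confirm prefix, e.g. gs://test-bucket/output/offer/2022-07-26/."""
    return f"gs://{bucket}/output/offer/{day.isoformat()}/"


def split_file_name(name: str) -> Tuple[datetime, str]:
    """Split '<timestamp>Z+<suffix>.avro' into a UTC datetime and the opaque suffix."""
    stem = name[: -len(".avro")] if name.endswith(".avro") else name
    stamp, _, suffix = stem.partition("+")
    written_at = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return written_at, suffix
```

Sorting files by the parsed timestamp gives a simple processing order within a day's folder.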
AWS S3
Overview
Banyan supports data retrieval from S3 using a secure bucket policy approach. This allows you to access your enriched data using your standard AWS credentials - no complex cross-account role assumptions required.
Important
Unlike ingress, egress uses bucket policies rather than IAM roles. This means you'll use your regular AWS credentials to retrieve data, not a cross-account role.
Prerequisites
To retrieve data from S3, you'll need:
- An S3 bucket provisioned by Banyan (either from your ingress setup or created specifically for egress)
- Your AWS Account ID (provide this to us so that we can enable egress for you)
- AWS CLI installed and configured with your credentials
How It Works
Banyan grants your entire AWS account read-only access to your dedicated S3 bucket through a bucket policy. This means any IAM user or role in your account can retrieve the data using standard AWS credentials.
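To make the mechanism concrete, here is a sketch of the kind of bucket policy that grants a whole AWS account read-only access. It is illustrative only: the policy Banyan actually applies is not shown in this document, the statement ID, bucket name, and account ID are placeholders, and the two actions are an assumption about what "read-only" covers.

```python
import json


def read_only_bucket_policy(bucket: str, account_id: str) -> dict:
    """Return an illustrative policy granting one AWS account read-only access."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PartnerReadOnly",  # hypothetical statement ID
                "Effect": "Allow",
                # The account root principal stands for the whole account,
                # so its IAM users and roles can be given access.
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",    # bucket ARN, used by ListBucket
                    f"arn:aws:s3:::{bucket}/*",  # object ARNs, used by GetObject
                ],
            }
        ],
    }


def policy_as_json(bucket: str, account_id: str) -> str:
    """Serialize the illustrative policy for inspection."""
    return json.dumps(read_only_bucket_policy(bucket, account_id), indent=2)
```

In general AWS cross-account setups, the calling user or role also needs S3 read permissions in its own IAM policy, so it is worth checking that if a listing is denied.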
Folder Structure
Within your S3 bucket, you'll find an /output folder containing your enriched data:
s3://byn-{environment}-{type}-{partner_id}-{partner_name}/
└── output/
    ├── enrich/    # Enriched transaction data
    │   └── 2024-01-15/
    │       └── 2024-01-15T16:23:09Z+22837398682.avro
    └── offer/     # Offer redemption data (Confirm product)
        └── 2024-01-15/
            └── 2024-01-15T18:45:22Z+33948509793.avro
Data Products:
- Enrich: Transaction enrichment data in the /output/enrich/ folder
- Confirm: Offer redemption data in the /output/offer/ folder
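The naming scheme and product folders above can be captured in two small helpers. This is a sketch under stated assumptions: the function names are hypothetical, and the values passed for type, partner ID, and partner name are placeholders, since the real values come from Banyan.

```python
# Maps each data product to its folder under /output/, as listed above.
PRODUCT_FOLDERS = {"enrich": "enrich", "confirm": "offer"}


def bucket_name(environment: str, bucket_type: str, partner_id: str, partner_name: str) -> str:
    """Assemble byn-{environment}-{type}-{partner_id}-{partner_name}."""
    return f"byn-{environment}-{bucket_type}-{partner_id}-{partner_name}"


def product_prefix(bucket: str, product: str, day: str) -> str:
    """Return the dated S3 prefix for a product; day is formatted YYYY-MM-DD."""
    folder = PRODUCT_FOLDERS[product.lower()]  # KeyError for unknown products
    return f"s3://{bucket}/output/{folder}/{day}/"
```

Keeping the product-to-folder mapping in one place avoids confusing the Confirm product with its `offer` folder name.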
Retrieving Your Data
Using AWS CLI
List available data:
# List all output data
aws s3 ls s3://byn-production-{type}-{partner_id}-{partner_name}/output/
# List enriched data for a specific date
aws s3 ls s3://byn-production-{type}-{partner_id}-{partner_name}/output/enrich/2024-01-15/
Download files:
# Download a specific file
aws s3 cp s3://byn-production-{type}-{partner_id}-{partner_name}/output/enrich/2024-01-15/data.avro ./local-data.avro
# Download all files for a date
aws s3 cp s3://byn-production-{type}-{partner_id}-{partner_name}/output/enrich/2024-01-15/ ./local-folder/ --recursive
# Sync data (efficient for regular transfers)
aws s3 sync s3://byn-production-{type}-{partner_id}-{partner_name}/output/ ./enriched-data/
Copy to your bucket:
# Copy all enriched data to your analytics bucket
aws s3 sync s3://byn-production-{type}-{partner_id}-{partner_name}/output/enrich/ \
s3://your-analytics-bucket/banyan-enriched/ \
--exclude "*.tmp"
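For scheduled jobs it can help to sync one dated folder at a time instead of the whole output tree. The sketch below only builds the `aws s3 sync` argument lists for a date range; the function name, destination layout, and bucket value are assumptions for illustration, and nothing is executed.

```python
from datetime import date, timedelta
from typing import Iterator, List


def daily_sync_commands(
    bucket: str, product_folder: str, start: date, end: date, dest: str
) -> Iterator[List[str]]:
    """Yield one 'aws s3 sync' argument list per day from start to end, inclusive."""
    day = start
    while day <= end:
        stamp = day.isoformat()
        yield [
            "aws", "s3", "sync",
            f"s3://{bucket}/output/{product_folder}/{stamp}/",
            f"{dest.rstrip('/')}/{product_folder}/{stamp}/",
        ]
        day += timedelta(days=1)
```

Each yielded list can be passed to `subprocess.run(cmd, check=True)` on a machine where the AWS CLI is installed and configured.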
Snowflake
Banyan will share processed data with you through a secure view. You can access this data directly for analysis.
Steps:
- Access Shared Data: Banyan will share a secure view with your account. Query the shared data as follows:
SELECT * FROM <banyan_shared_view>;
- Continuous Updates: Banyan writes files and flushes them every hour or once they reach XX MB. As a result, data arrives throughout the day rather than all at once. Merchant data often comes in batches, so your matches to that data will effectively be batched as well.
Actions:
- Integrate Banyan's shared data into your analytics workflows using tools like Looker, Tableau, etc.
Credentials & Security
- Banyan AWS US East Account Locator: GTB18971
- For other regions or cloud platforms, contact Banyan support.
- Data is shared securely via direct shares and secure views, ensuring compliance with data privacy standards.
SFTP
Connection requirements
You can use any SFTP client (for example, FileZilla) that supports SSH private key authentication.
Credentials
Once the contract is signed, you will generate an SSH public/private key pair, and share the public key with us. Banyan will then provide you with the hostname and user name for your SFTP server, which you will access with your private SSH key.
In general:
- Server address: YOUR_COMPANY_NAME.sftp.getbanyan.com
- User: sftpuser
- Authentication: SSH Private Key
Where to Retrieve Data
Once you are logged in to the server, retrieve your data from the output folder under the data folder.
Schedule
Banyan's egress system is event based. We create events as they occur, whether the event is a match or an offer redemption. Banyan flushes files to the SFTP server every hour or once 15MB of data has accumulated, whichever happens first.
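The flush rule can be modeled as a simple "time or size, whichever comes first" check. This is a reading aid rather than Banyan's implementation: the names are hypothetical, and interpreting 15MB as 15 * 1024 * 1024 bytes is an assumption.

```python
FLUSH_INTERVAL_SECONDS = 60 * 60      # one hour
FLUSH_SIZE_BYTES = 15 * 1024 * 1024   # assumption: "15MB" read as 15 MiB


def should_flush(seconds_since_last_flush: float, buffered_bytes: int) -> bool:
    """Return True once either the time limit or the size limit is reached."""
    return (
        seconds_since_last_flush >= FLUSH_INTERVAL_SECONDS
        or buffered_bytes >= FLUSH_SIZE_BYTES
    )
```

For a consumer, the practical consequence is that new files can appear several times a day, so polling the output folder at least hourly is a reasonable starting point.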
File format
- Files are delivered in Avro format
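A quick sanity check after download is to confirm that a file starts with the Avro object container magic bytes. The sketch assumes the delivered `.avro` files are Avro object container files, which this document does not state explicitly; the function name is hypothetical.

```python
AVRO_MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def looks_like_avro(path: str) -> bool:
    """Return True if the file begins with the Avro container magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(4) == AVRO_MAGIC
```

This only checks the header. Decoding records needs an Avro library, for example the `avro` or `fastavro` packages.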
Caveats
When you write a custom implementation that connects to our SFTP server, make sure to take the following scenarios into account:
- The server can become unavailable for a short period of time. Make sure you have a retry mechanism in place.
- The server's signature (host key) might change due to hardware failure or changes in the hardware configuration. Make sure to take this into account (the URI won't change).
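One common way to handle short outages is to retry with exponential backoff. The sketch below is generic and not tied to any SFTP library: `with_retries` is a hypothetical helper, the default attempt count and delays are arbitrary, and `operation` stands for whatever callable performs your connection or transfer.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on the given errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
    raise ValueError("attempts must be at least 1")
```

Retrying does not help with a changed server signature. That case usually needs the stored host key to be verified and updated before reconnecting.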