# Setting up event batch file delivery

In addition to [real-time notifications](https://developers.pismo.io/pismo-docs/docs/setting-up-real-time-event-delivery), event notifications and data can be sent in files via a configured integration channel to cloud storage - Amazon Web Services (AWS) or Google Cloud Platform (GCP) - or downloaded using a Secure File Transfer Protocol (SFTP) service. This article describes the setup required on Pismo's part and on your part for each method.

Event data is batched for 5 minutes or until it reaches 10 MB in size, whichever comes first, and is then saved to the `/main_stream/` path as a new object in an AWS [S3 bucket](https://aws.amazon.com/s3/). The object contains event [JSON](https://www.json.org/json-en.html) data.

For information security, no S3 bucket is ever exposed publicly. A [Key Management Service (KMS)](https://aws.amazon.com/kms/) key is created for each bucket so that stored objects are encrypted and are decrypted only when you download them.

# Setting up event file delivery to an AWS account

The following shows the data flow for event file delivery to an AWS account:

![Image shows the data flow for event file delivery to an AWS account.](https://files.readme.io/d3c4ba5-AWSedFlow.JPG "AWSedFlow.JPG")

1. As processing occurs, event files are generated and saved in a dedicated S3 bucket exclusive to your Org.
2. A listener registers the new file and generates a [new bucket object event](https://developers.pismo.io/pismo-docs/docs/event-data#new-event-file-notification).
3. The event is delivered as a real-time notification to your [SNS topic](https://docs.aws.amazon.com/sns/latest/dg/sns-create-topic.html). This requires you to also have set up [real-time event delivery to your AWS account](https://developers.pismo.io/pismo-docs/docs/setting-up-real-time-event-delivery#setting-up-real-time-delivery-to-an-aws-account).
4. Your AWS account receives the event and identifies the bucket, the object's path, and the object's size.
5. Your routine creates an S3 session, using a pre-authorized [Identity and Access Management (IAM) Role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html) to assume Pismo's IAM Role ([`sts:AssumeRole`](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html)) and the [AWS SDK/CLI](https://aws.amazon.com/cli/) to access Pismo's S3 bucket. The file is then downloaded ([`s3:GetObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html)) and copied to a local target in your environment.

Steps 3, 4, and 5 can be changed according to your preferences. For example, a worker or application can replace a [Lambda function](https://aws.amazon.com/lambda/) to receive notifications and download files. A file-based system, database, or other storage solution can replace a destination S3 bucket.

## Setup tutorial

Pismo provides a tutorial that steps you through the configurations below: [AWS event file configuration tutorial](https://developers.pismo.io/pismo-docs/docs/aws-event-file-configuration-tutorial).

## Pismo configuration

* **S3 bucket** - An S3 bucket is created to handle file transfers exclusively for your Org. S3 bucket's [Amazon Resource Name (ARN)](https://docs.aws.amazon.com/general/latest/gr/aws-arns-and-namespaces.html) sample:\
  `arn:aws:s3:::pismo-dataplatform-tn-376a4170-3c93-4676-b294-ec0b4241c7ab`
* **IAM Role** - An IAM Role is created to allow file downloading from the bucket. This Role grants `s3:GetObject` permission on your data bucket, which allows file transfers, and ties the `sts:AssumeRole` action to the ARN of the resource responsible for downloading files (a Lambda function, for example). IAM Role's ARN sample:\
  `arn:aws:iam::{pismo_aws_account_id}:role/dataplatform-consumer-tn-376a4170-3c93-4676-b294-ec0b4241c7ab`

Your routine must assume this IAM Role before accessing the S3 bucket to retrieve files. Initially, no resource has permission to execute `AssumeRole`.
Permission for this action is set after you let Pismo know your IAM Role.

## Your configuration

Configure the following in your AWS environment:

* **Set up real-time event delivery** - You must be able to receive Pismo event notifications in real time to know when a new event file is available for download. To set this up, see [Setting up real-time event delivery to an AWS account](https://developers.pismo.io/pismo-docs/docs/setting-up-real-time-event-delivery#setting-up-real-time-event-delivery-to-an-aws-account).
* **IAM Role** - Each processing resource in AWS needs an IAM Role. Pismo authorizes the Role your resource uses to execute `sts:AssumeRole` on the Pismo side. Whenever your routine (Lambda, worker, application, or other method) runs, it should explicitly invoke `sts:AssumeRole` on Pismo's IAM Role (using the ARN described previously).

Once your configuration is complete, you can execute `sts:AssumeRole` and then download S3 objects.

# Setting up event file delivery to a GCP account

The following shows the data flow for event file delivery to a GCP account:

![Image shows the data flow for event file delivery to a GCP account.](https://files.readme.io/4129cfc-GCPedFlow.JPG "GCPedFlow.JPG")

1. As processing occurs, event files are generated and saved in a dedicated S3 bucket exclusive to your Org.
2. A listener registers the new file and generates a [new bucket object event](https://developers.pismo.io/pismo-docs/docs/event-data#new-event-file-notification).
3. The event is delivered as a real-time notification to your [Pub/Sub](https://cloud.google.com/pubsub). This requires you to also have set up [real-time event delivery to your GCP account](https://developers.pismo.io/pismo-docs/docs/setting-up-real-time-event-delivery#setting-up-real-time-event-delivery-to-a-gcp-account).
4. A Pismo account event consumer receives the notification.
5. The file is copied to your GCP storage.
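
If you consume the notification from step 3 through a Pub/Sub push subscription, the event arrives wrapped in Pub/Sub's push envelope with the payload base64-encoded in `message.data`. The following is a minimal sketch of unwrapping it, assuming a push subscription and a JSON payload. The function name `decode_pubsub_push` is hypothetical, and the sketch deliberately makes no assumption about the field names inside the Pismo event itself.

```python
import base64
import json


def decode_pubsub_push(body: bytes) -> dict:
    """Unwrap a Pub/Sub push request body and return the JSON payload.

    Assumes the standard push envelope, where the published message is
    under "message" and its bytes are base64-encoded in "data".
    """
    envelope = json.loads(body)
    message = envelope["message"]
    payload = base64.b64decode(message.get("data", ""))
    # Assumption: the notification payload is JSON, like other event data.
    return json.loads(payload)


if __name__ == "__main__":
    # Illustrative payload only; real notifications follow Pismo's schema.
    sample_event = {"example_key": "example_value"}
    sample_body = json.dumps({
        "message": {
            "data": base64.b64encode(json.dumps(sample_event).encode()).decode(),
            "messageId": "1",
        },
        "subscription": "projects/my-project/subscriptions/my-subscription",
    }).encode()
    print(decode_pubsub_push(sample_body))
```

A pull subscriber receives the message bytes already decoded, so the base64 step applies only to push delivery.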
## Setup tutorial

Pismo provides a tutorial that steps you through the configurations below: [GCP event file configuration tutorial](https://developers.pismo.io/pismo-docs/docs/gcp-event-file-configuration-tutorial).

## Pismo configuration

Pismo creates an Identity and Access Management (IAM) [service account](https://cloud.google.com/iam/docs/service-accounts) to interact with your GCP account resources.

## Your configuration

Configure the following in your GCP environment:

* **Project identifier** - Provide Pismo with your account's `project_id`.
* **Bucket cloud storage** - Provide Pismo with a [Google Cloud Storage](https://cloud.google.com/storage) bucket identifier.
* **Bucket access** - Configure write permission for Pismo's service account so it can write objects to your GCP bucket.

To begin event file delivery, contact your Technical Account Manager to request the service account ID related to your GCP integration, and provide your GCP project ID and bucket identifier.

# Setting up event file delivery using SFTP

If you don't want to, or are unable to, exchange cloud account integration data between your organization and Pismo, or if you are not using AWS or GCP (the cloud providers supported for event file delivery), you can consume the event files using an SFTP service.

The Pismo SFTP service abstracts S3 and serves it as an SFTP layer. When you connect using SFTP, you are, in fact, accessing the Pismo S3 bucket.

> 📘 Accessing JSON schemas
>
> Keep in mind that, while you can access event [JSON schemas rendered as HTML](https://developers.pismo.io/events/docs/event-base-main-event), you need an [AWS account to access JSON schemas](https://developers.pismo.io/pismo-docs/docs/event-data#json-schemas-in-s3-bucket) in Pismo's AWS documentation S3 storage bucket.

SFTP access is granted using a login username and an [SSH RSA key](https://serverpilot.io/docs/how-to-use-ssh-public-key-authentication/).
No password is generated. Create an SSH RSA key pair (or use an existing one) and provide Pismo with its public key, which is then linked to your login username for access. The public key should look similar to this:

```
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCwskg9UmvUCUCscqNPgpSMzMUOcpSLzESz+d8RFa+YLkMEO5doyLskdisssjtxFz8A+fW/m4XVB+DyLHoS8pxmRfML/+DyhIb40GJbGV0xcAYC0IrmNXb8ldzMU0FGyIh5r2dZtH5mK9MeDBIZeYASrVLjbyflUy6JWCUUKYFSnm1eCIzmgGYHfYQqf+doCSKItGxpB9G3HhvBXsEvlka93aZRkGiEGTNHDGQ1NooksIKJSltk2ik1XfSADZfex0xrKBmlq/uy2/HXZ3lPQrIaN9fswA7+BLES/s9LZ9C0FC2uD2AwQbpJigqUihHuC7q+zIssWWWksjdkxb1jOljvej
```

Open a ticket at the Pismo [Service Desk](https://developers.pismo.io/pismo-docs/docs/service-desk) with your public keys attached and request SFTP access for your organization.

> 📘
>
> You can associate up to 8 keys with your organization. Each Org ID has its own set of credentials, so sandbox and production access are not the same.

## Accessing the SFTP service

After you receive your login information from Pismo, you can connect to the SFTP service. Different environments have different access endpoints:

| Environment         | Endpoint                    | Port |
| :------------------ | :-------------------------- | :--- |
| Development/Sandbox | sftp-main.data.pismolabs.io | 22   |
| Production          | sftp-main.data.pismo.io     | 22   |

Make sure you use the correct endpoint with the matching credentials to retrieve your files. Connecting to production with sandbox credentials does not work.

To connect to the SFTP service, you can use an SFTP client like [FileZilla](https://filezilla-project.org/client_features.php) or [WinSCP](https://winscp.net/eng/index.php), or implement a system integration/automation in Java, .NET, Go, Python, or another programming language.
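
As a starting point for automation, the sketch below maps each environment to its endpoint and builds the argument list for the OpenSSH `sftp` command-line client, so sandbox and production settings cannot be mixed up by accident. It assumes the OpenSSH client is installed. The function name `build_sftp_command`, the username, and the key path are hypothetical placeholders, and the environment keys `sandbox` and `production` are labels chosen for this example.

```python
import shlex

# Endpoints and port as listed in the endpoint table for each environment.
SFTP_ENDPOINTS = {
    "sandbox": ("sftp-main.data.pismolabs.io", 22),
    "production": ("sftp-main.data.pismo.io", 22),
}


def build_sftp_command(environment: str, username: str, private_key_path: str) -> list:
    """Return the argument list for an OpenSSH `sftp` invocation."""
    try:
        host, port = SFTP_ENDPOINTS[environment]
    except KeyError:
        raise ValueError(f"unknown environment: {environment!r}") from None
    return [
        "sftp",
        "-P", str(port),         # remote port
        "-i", private_key_path,  # private half of the key pair registered with Pismo
        f"{username}@{host}",
    ]


if __name__ == "__main__":
    # Placeholder username and key path; substitute the values for your Org.
    command = build_sftp_command("sandbox", "my-org-user", "/home/me/.ssh/pismo_sandbox_rsa")
    print(shlex.join(command))
```

The returned list can be passed to `subprocess.run` as-is. Keeping the endpoint lookup in one place means the credentials and endpoint are chosen together per environment.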
## Knowing when files are available

To be notified when there is a new file available, you need to [set up real-time event notifications](https://developers.pismo.io/pismo-docs/docs/setting-up-real-time-event-delivery) through a cloud account. Otherwise, you will have to periodically open a connection and check for new files.
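
If you poll instead of receiving notifications, the core of the routine is remembering which files you have already seen. The following is a minimal sketch of that bookkeeping. `list_files` is a hypothetical callable you would implement with your SFTP client or library of choice, and the 300-second interval is only an assumption that mirrors the 5-minute batching window.

```python
import time
from typing import Callable, Iterable, Iterator, List, Set


def new_files(listing: Iterable[str], seen: Set[str]) -> List[str]:
    """Return the names in `listing` that are not in `seen`, and record them."""
    fresh = sorted(set(listing) - seen)
    seen.update(fresh)
    return fresh


def poll(
    list_files: Callable[[], Iterable[str]],
    interval_seconds: float = 300,
    max_polls: int = 3,
) -> Iterator[List[str]]:
    """Yield the newly discovered file names after each listing."""
    seen: Set[str] = set()
    for attempt in range(max_polls):
        yield new_files(list_files(), seen)
        if attempt < max_polls - 1:
            time.sleep(interval_seconds)  # wait before listing again


if __name__ == "__main__":
    # Fake listings stand in for real SFTP directory listings.
    listings = iter([["a.json"], ["a.json", "b.json"]])
    for batch in poll(lambda: next(listings), interval_seconds=0, max_polls=2):
        print(batch)
```

In this sketch a file is marked as seen as soon as it is listed. A production routine would more likely record a file only after it has been downloaded successfully, and persist the set between runs.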