Scheduled reports

On request, Pismo can run batch jobs that generate report files, such as reports on daily account limits and reports required by financial regulations.

Report file generation and paths

Report files are written to your organization's dedicated S3 bucket. To start receiving report files, contact your Pismo representative to configure access to your cloud storage container.

📘

Most of Pismo’s clients use Amazon Web Services (AWS) S3 buckets as their cloud storage containers, so this section assumes you're using AWS. However, Pismo does support other cloud services, such as Google Cloud Platform (GCP) and Microsoft Azure. Contact your Pismo representative for information about using a cloud service other than AWS.

All report jobs run once a day at 5:00 AM UTC. When a job executes, it searches for records that meet its filter conditions. If any are found, it writes the data to a new file in your cloud storage container. Files are saved in Parquet format.
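If you poll for new files, it can help to know when the next job run starts. The following is a minimal sketch using only the Python standard library; the function name `next_report_run` is hypothetical and not part of any Pismo tooling.

```python
from datetime import datetime, time, timedelta, timezone

# Report jobs start once a day at 5:00 AM UTC (per the text above).
RUN_TIME_UTC = time(5, 0, tzinfo=timezone.utc)


def next_report_run(now: datetime) -> datetime:
    """Return the next 5:00 AM UTC start time strictly after `now`.

    `now` must be timezone-aware; naive datetimes are rejected to avoid
    silently interpreting them as local time.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), RUN_TIME_UTC)
    if candidate <= now_utc:
        # Today's run has already started, so the next one is tomorrow.
        candidate += timedelta(days=1)
    return candidate
```

This returns the time the job starts, not the time the file becomes available, so allow for processing time before checking the bucket.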

📘

In most cases, a report file is not generated if no records are found that match the filter conditions.

Files are written and saved according to type (daily, monthly, or full) and date. The full type is explained below. The path takes the form [path to S3 bucket]/reports/[job_name]/type=[type]/[date values]/[filename].parquet, where [filename] is a hash generated by PySpark.

For example:
s3://pismo-dataplatform-tn-55317847-57cd-45a3-8aed-a8dadd63cc6b/reports/accounting_events/type=daily/year=2020/month=1/day=10/[filename].parquet
In this example, the value for job_name is "accounting_events". You can find the value for job_name in the documentation for the specific report file.
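When processing files from the bucket, you may want to recover the job name, type, and date values from a path. The sketch below uses a regular expression based on the path layout shown above. The function name `parse_report_path` and the bucket and file names in the usage example are hypothetical, and the pattern assumes date values may or may not be zero-padded.

```python
import re
from typing import Optional

# Matches: .../reports/<job_name>/type=<type>/year=YYYY/month=M[/day=D]/<filename>.parquet
REPORT_PATH_RE = re.compile(
    r"reports/(?P<job_name>[^/]+)"
    r"/type=(?P<type>daily|monthly|full)"
    r"/year=(?P<year>\d{4})/month=(?P<month>\d{1,2})"
    r"(?:/day=(?P<day>\d{1,2}))?"
    r"/(?P<filename>[^/]+\.parquet)$"
)


def parse_report_path(path: str) -> Optional[dict]:
    """Split a report path into its components, or return None if it does not match."""
    match = REPORT_PATH_RE.search(path)
    if match is None:
        return None
    parts = match.groupdict()
    # Convert the date values to integers; day is absent for monthly files.
    for field in ("year", "month", "day"):
        if parts[field] is not None:
            parts[field] = int(parts[field])
    return parts
```

For example, `parse_report_path("s3://example-bucket/reports/accounting_events/type=daily/year=2020/month=1/day=10/part-0000.parquet")` yields a dictionary whose `job_name` is `"accounting_events"` and whose `type` is `"daily"`.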

Files and paths are generated on the following basis:

  • Daily – In this case, the type is daily and the date partitioning is year=YYYY/month=MM/day=DD. For example, .../reports/accounting_events/type=daily/year=2020/month=1/day=10/[filename].parquet

  • Monthly – At your request, or when there is a need for periodic reprocessing, a job executes for a closed month of data. In this case, the type is monthly and the date partitioning is year=YYYY/month=MM. For example, .../reports/accounting_events/type=monthly/year=2020/month=1/[filename].parquet

  • Full – You can generate a complete file that includes all past job data, with no date filter applied. In this case, the type is full and the partitioning corresponds to the file's generation date. For example, .../reports/accounting_events/type=full/year=2020/month=1/day=10/[filename].parquet
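The three layouts above can be assembled with a small helper. This is a sketch only: the function name `report_prefix` and the bucket name in the usage example are hypothetical. It writes date values without zero-padding because that is how they appear in the examples above, so check the actual paths in your bucket before relying on it.

```python
from datetime import date

VALID_TYPES = ("daily", "monthly", "full")


def report_prefix(bucket_path: str, job_name: str, report_type: str, partition_date: date) -> str:
    """Build the folder prefix under which a report file is stored.

    Daily and full reports are partitioned by year, month, and day;
    monthly reports are partitioned by year and month only.
    """
    if report_type not in VALID_TYPES:
        raise ValueError(f"report_type must be one of {VALID_TYPES}")
    parts = [
        bucket_path.rstrip("/"),
        "reports",
        job_name,
        f"type={report_type}",
        f"year={partition_date.year}",
        f"month={partition_date.month}",  # unpadded, as in the examples above (assumption)
    ]
    if report_type != "monthly":
        parts.append(f"day={partition_date.day}")
    return "/".join(parts) + "/"
```

For example, `report_prefix("s3://example-bucket", "accounting_events", "monthly", date(2020, 1, 31))` returns `s3://example-bucket/reports/accounting_events/type=monthly/year=2020/month=1/`.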

🚧

The date partition values in the report correspond to the last available data date, not to the date when the job was executed. For example, for a daily job that runs on March 11, the date partition values correspond to March 10.
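To find the partition for a given daily run, you can subtract the offset from the run date. The sketch below assumes the last available data date is always the previous calendar day, as in the March 11 example; the function name `daily_partition_for_run` is hypothetical, and the unpadded date values follow the path examples in this document.

```python
from datetime import date, timedelta


def daily_partition_for_run(run_date: date) -> str:
    """Return the date partition written by a daily job that runs on `run_date`.

    Assumes the data date is the day before the run (see the lead-in).
    """
    data_date = run_date - timedelta(days=1)
    return f"year={data_date.year}/month={data_date.month}/day={data_date.day}"
```

Using date arithmetic rather than decrementing the day number keeps month and year boundaries correct: a run on March 1 maps to the last day of February.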