Scheduled report files

At client request, Pismo can run batch jobs to generate report files, including those that report daily account limits and those required to meet Brazilian Central Bank regulations.

Report file generation and paths

📘

Most of Pismo’s clients use Amazon Web Services (AWS) S3 buckets as their cloud storage containers, so this section assumes you're using AWS. Pismo does support other cloud services, however, such as Google Cloud Platform (GCP) and Microsoft Azure. Contact your customer representative if you would like to use a cloud service other than AWS.

Report files are written to the AWS S3 bucket that serves as your organization's cloud storage container. To start receiving report files, contact your Pismo representative to configure access to that container.

The jobs that generate report files all run once a day at 2:00 AM. When a job executes, it searches for records that meet its filter conditions. If any are found, it writes the data to a file that it creates in your cloud storage container. Files are saved in Apache Parquet format.

📘

In most cases, if no records are found that match the filter conditions, a report file is not generated. However, in some cases, the job might generate a file with no data.

Files are written and saved according to type – daily, monthly, or full – and date. The full type is explained below. The URL takes the form [path to S3 bucket]/reports/[job_name]/[type]/[date values]/[filename].parquet, where [type] and [date values] appear as key=value partitions (such as type=daily) and [filename] represents a Spark hash.

For example:
s3://pismo-dataplatform-tn-55317847-57cd-45a3-8aed-a8dadd63cc6b/reports/accounting_events/type=daily/year=2020/month=1/day=10/[filename].parquet
In this example, the value for job_name is "accounting_events". You can find the value for job_name in the documentation for the specific report file.
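
If you need to recover the job name, type, and date from a report URL, the layout above can be split apart with the standard library. The following is a minimal sketch, not Pismo's own tooling: the function name parse_report_path, the bucket name example-bucket, and the file name part-0000.parquet are hypothetical placeholders chosen for illustration.

```python
from urllib.parse import urlparse


def parse_report_path(url: str) -> dict:
    """Split a report URL into bucket, job name, partitions, and file name."""
    parsed = urlparse(url)
    segments = parsed.path.strip("/").split("/")
    # Expected layout: reports/[job_name]/[key=value partitions...]/[filename]
    if len(segments) < 4 or segments[0] != "reports":
        raise ValueError(f"not a report path: {url!r}")

    info = {
        "bucket": parsed.netloc,
        "job_name": segments[1],
        "filename": segments[-1],
    }
    # Everything between the job name and the file name is a key=value partition.
    for segment in segments[2:-1]:
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"unexpected path segment: {segment!r}")
        info[key] = int(value) if value.isdigit() else value
    return info


# Hypothetical bucket and file name, used only for illustration.
example = (
    "s3://example-bucket/reports/accounting_events/"
    "type=daily/year=2020/month=1/day=10/part-0000.parquet"
)
print(parse_report_path(example))
```

Numeric partition values are converted to integers so that year, month, and day can be passed straight to a date constructor.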

Files and paths are generated on the following basis:

  • Daily – In this case, the type is daily and the date partitioning is year=YYYY/month=MM/day=DD. For example, .../reports/accounting_events/type=daily/year=2020/month=1/day=10/[filename].parquet

  • Monthly – At your request, or when there is a need for periodic reprocessing, a job executes for a closed month of data. In this case, the type is monthly and the date partitioning is year=YYYY/month=MM. For example, .../reports/accounting_events/type=monthly/year=2020/month=1/[filename].parquet

  • Full – You can generate a complete file, taking into account all past job data without a date filter. In this case, the type is full and the partitioning corresponds to the file's generation date. For example, .../reports/accounting_events/type=full/year=2020/month=1/day=10/[filename].parquet
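
The three layouts can be expressed as a single helper that builds the key prefix under which a report's files would be found. This is an illustrative sketch only: report_prefix is a hypothetical name, and the code assumes that month and day values are not zero-padded, which matches the examples above (month=1) but is not stated explicitly.

```python
from datetime import date

REPORT_TYPES = ("daily", "monthly", "full")


def report_prefix(job_name: str, report_type: str, partition_date: date) -> str:
    """Build the key prefix (relative to the bucket) for one report partition."""
    if report_type not in REPORT_TYPES:
        raise ValueError(f"unknown report type: {report_type!r}")

    parts = [
        "reports",
        job_name,
        f"type={report_type}",
        f"year={partition_date.year}",
        f"month={partition_date.month}",  # assumption: not zero-padded
    ]
    # Daily and full reports are partitioned down to the day;
    # monthly reports stop at the month.
    if report_type != "monthly":
        parts.append(f"day={partition_date.day}")
    return "/".join(parts) + "/"


print(report_prefix("accounting_events", "daily", date(2020, 1, 10)))
print(report_prefix("accounting_events", "monthly", date(2020, 1, 1)))
print(report_prefix("accounting_events", "full", date(2020, 1, 10)))
```

Because [filename] is a generated hash, a consumer would typically list the objects under this prefix instead of constructing a complete file name.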

🚧

The date partition values in the report correspond to the last available data date, not to when the job is executed. For example, for a daily job that runs on 2022/03/11, the date partition values correspond to 2022/03/10.
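
When looking up a daily report, it is easy to use the run date by mistake. The sketch below derives the partition from the run date under the assumption that, for a daily job, the last available data date is the previous calendar day, as in the example above. The function name daily_partition is hypothetical, and the values are left unpadded to match the path examples in this section.

```python
from datetime import date, timedelta


def daily_partition(run_date: date) -> str:
    """Return the date partition for a daily job that ran on run_date."""
    # Assumption: the last available data date is the day before the run.
    data_date = run_date - timedelta(days=1)
    return f"year={data_date.year}/month={data_date.month}/day={data_date.day}"


print(daily_partition(date(2022, 3, 11)))  # year=2022/month=3/day=10
```

Subtracting a timedelta also handles month and year boundaries, so a run on the first day of a month maps to the last day of the previous month.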