Scheduled report files
At a client's request, Pismo can run batch jobs that generate report files, including reports of daily account limits and reports required to meet Brazilian Central Bank regulations.
Report file generation and paths
Most of Pismo’s clients use Amazon Web Services (AWS) S3 buckets as their cloud storage containers, so this section assumes you're using AWS. Pismo does support other cloud services, however, such as Google Cloud Platform (GCP) and Microsoft Azure. Contact your customer representative if you would like to use a cloud service other than AWS.
Report files are written to the S3 bucket that serves as your organization's cloud storage container. To start receiving report files, contact your Pismo representative to configure access to that bucket.
The jobs that generate report files all run once a day at 2:00 AM. When a job executes, it searches for records that meet its filter conditions. If it finds any, it writes the data to a new file in your cloud storage container. Files are saved in Apache Parquet format.
In most cases, if no records are found that match the filter conditions, a report file is not generated. However, in some cases, the job might generate a file with no data.
Files are written and saved according to type – daily, monthly, or full – and date. The full type is explained below. The URL takes the form [path to S3 bucket]/reports/[job_name]/[type]/[date values]/[filename].parquet, where [filename] represents a Spark hash.
For example:
s3://pismo-dataplatform-tn-55317847-57cd-45a3-8aed-a8dadd63cc6b/reports/accounting_events/type=daily/year=2020/month=1/day=10/[filename].parquet
In this example, the value for job_name is "accounting_events". You can find the value for job_name in the documentation for the specific report file.
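As an illustration of the path layout described above, the following sketch splits a report file path into its components. The function name parse_report_path, the regular expression, and the bucket and file names in the usage note are hypothetical and are not part of Pismo's tooling. The pattern is inferred from the example path, so adjust it if your paths differ.

```python
import re
from typing import Optional

# Hypothetical pattern inferred from the documented layout:
# .../reports/[job_name]/type=[type]/year=YYYY/month=MM[/day=DD]/[filename].parquet
REPORT_PATH = re.compile(
    r"/reports/(?P<job_name>[^/]+)"
    r"/type=(?P<type>daily|monthly|full)"
    r"/year=(?P<year>\d{4})/month=(?P<month>\d{1,2})"
    r"(?:/day=(?P<day>\d{1,2}))?"
    r"/(?P<filename>[^/]+\.parquet)$"
)


def parse_report_path(path: str) -> Optional[dict]:
    """Return the path components, or None if the path does not match."""
    match = REPORT_PATH.search(path)
    if match is None:
        return None
    parts = match.groupdict()
    # Convert the date partition values to integers; day is absent for monthly files.
    for key in ("year", "month", "day"):
        if parts[key] is not None:
            parts[key] = int(parts[key])
    return parts
```

For instance, calling parse_report_path on s3://example-bucket/reports/accounting_events/type=daily/year=2020/month=1/day=10/part-0000.parquet would yield the job name "accounting_events", the type "daily", and the date values 2020, 1, and 10.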
Files and paths are generated on the following basis:
- Daily – In this case, the type is daily and the date partitioning values are /year=YYYY/month=MM/day=DD/. For example, .../reports/accounting_events/type=daily/year=2020/month=1/day=10/[filename].parquet
- Monthly – At your request, or when there is a need for periodic reprocessing, a job executes for a closed month of data. In this case, the type is monthly and the date partitioning is year=YYYY/month=MM. For example, .../reports/accounting_events/type=monthly/year=2020/month=1/[filename].parquet
- Full – You can generate a complete file, taking into account all past job data without a date filter. In this case, the type is full and the partitioning corresponds to the file's generation date. For example, .../reports/accounting_events/type=full/year=2020/month=1/day=10/[filename].parquet
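The three layouts above can be sketched as a small helper that builds the folder prefix for a given report type. The function name report_prefix and the bucket name in the usage note are hypothetical. The sketch also assumes that month and day values are not zero-padded, because the examples show month=1 rather than month=01; verify this against the files in your own bucket.

```python
from datetime import date


def report_prefix(base: str, job_name: str, report_type: str, partition_date: date) -> str:
    """Build the folder prefix under which report files of one type are stored."""
    if report_type not in ("daily", "monthly", "full"):
        raise ValueError(f"unknown report type: {report_type}")
    prefix = (
        f"{base.rstrip('/')}/reports/{job_name}/type={report_type}"
        f"/year={partition_date.year}/month={partition_date.month}/"
    )
    # Daily and full files add a day partition; monthly files stop at the month.
    if report_type != "monthly":
        prefix += f"day={partition_date.day}/"
    return prefix
```

For instance, report_prefix("s3://example-bucket", "accounting_events", "monthly", date(2020, 1, 1)) gives s3://example-bucket/reports/accounting_events/type=monthly/year=2020/month=1/. Listing the objects under that prefix then returns the Parquet files for the partition.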
The date partition values in the report correspond to the last available data date, not to when the job is executed. For example, for a daily job that runs on 2022/03/11, the date partition values correspond to 2022/03/10.
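To make the date rule concrete, here is a minimal sketch that derives the daily partition values from the day on which the job runs. It assumes that the last available data date is always the previous calendar day, as in the example above; the function names are hypothetical, and the values are written without zero-padding to match the example paths.

```python
from datetime import date, timedelta


def daily_partition_date(run_date: date) -> date:
    """Assumption: a daily job reports on the calendar day before it runs."""
    return run_date - timedelta(days=1)


def daily_partition_path(run_date: date) -> str:
    """Return the date partition segment for a daily job run on run_date."""
    data_date = daily_partition_date(run_date)
    return f"year={data_date.year}/month={data_date.month}/day={data_date.day}"
```

A job that runs on 2022/03/11 therefore writes under year=2022/month=3/day=10, and a job that runs on the first day of a month writes under the last day of the previous month.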