Scheduled reports
At client request, Pismo can run batch jobs to generate report files, including those that report daily account limits and those required by financial regulations.
Report file generation and paths
Report files are written to your organization's dedicated S3 bucket. To start receiving report files, contact your Pismo representative to configure access to your cloud storage container.
Most of Pismo’s clients use Amazon Web Services (AWS) S3 buckets as their cloud storage containers, so this section assumes you're using AWS. However, Pismo does support other cloud services, such as Google Cloud Platform (GCP) and Microsoft Azure. Contact your Pismo representative for information about using a cloud service other than AWS.
The jobs that generate report files all run once a day at 5:00 AM UTC. When a job executes, it searches for records that meet filter conditions. If any are found, it writes the data to a file that it creates in your cloud storage container. Files are saved in Parquet format.
In most cases, a report file is not generated if no records are found that match the filter conditions.
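If you want to inspect a report after it lands, you can load its Parquet file directly from the bucket. The following is a minimal sketch, assuming pandas, pyarrow, and s3fs are installed and AWS credentials with read access to your bucket are configured; the path shown is a placeholder, not a real object key.

```python
# Minimal sketch: load one report file into pandas for inspection.
# The path below is a placeholder; use the actual object key of a
# report file in your organization's bucket.
import pandas as pd

df = pd.read_parquet(
    "s3://your-org-bucket/reports/accounting_events/"
    "type=daily/year=2020/month=1/day=10/example.parquet"
)
print(df.dtypes)
print(df.head())
```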
Files are written and saved according to type – daily, monthly, or full – and date. The full type is explained below. The URL takes the form [path to S3 bucket]/reports/[job_name]/[type]/[date values]/[filename].parquet, where [filename] represents a PySpark hash.
For example:
s3://pismo-dataplatform-tn-55317847-57cd-45a3-8aed-a8dadd63cc6b/reports/accounting_events/type=daily/year=2020/month=1/day=10/[filename].parquet
In this example, the value for job_name is "accounting_events". You can find the value for job_name in the documentation for the specific report file.
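If you need to work the other way around and recover the job name, type, and date from a file's location, a small helper can split the object key into its parts. The function below is illustrative only (it is not part of the Pismo platform), and the bucket name and filename in the usage example are placeholders.

```python
# Illustrative helper: split a report object key of the form
# reports/[job_name]/type=[type]/year=YYYY/month=M/day=D/[filename].parquet
# into its components.
from urllib.parse import urlparse

def parse_report_key(s3_url: str) -> dict:
    key = urlparse(s3_url).path.lstrip("/")
    parts = key.split("/")
    # parts: ["reports", job_name, "type=daily", "year=2020", "month=1", "day=10", filename]
    info = {"job_name": parts[1], "filename": parts[-1]}
    for part in parts[2:-1]:
        name, _, value = part.partition("=")
        info[name] = value
    return info

print(parse_report_key(
    "s3://my-bucket/reports/accounting_events/type=daily/year=2020/month=1/day=10/part-0000.parquet"
))
# {'job_name': 'accounting_events', 'filename': 'part-0000.parquet',
#  'type': 'daily', 'year': '2020', 'month': '1', 'day': '10'}
```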
Files and paths are generated on the following basis (see the prefix-building sketch after this list):
- Daily – The type is daily and the date partitioning values are /year=YYYY/month=MM/day=DD/. For example, .../reports/accounting_events/type=daily/year=2020/month=1/day=10/[filename].parquet
- Monthly – At your request, or when there is a need for periodic reprocessing, a job executes for a closed month of data. In this case, the type is monthly and the date partitioning is year=YYYY/month=MM. For example, .../reports/accounting_events/type=monthly/year=2020/month=1/[filename].parquet
- Full – You can generate a complete file, taking into account all past job data without a date filter. In this case, the type is full and the partitioning corresponds to the file's generation date. For example, .../reports/accounting_events/type=full/year=2020/month=1/day=10/[filename].parquet
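To check which files a job wrote for a given partition, you can build the prefix for the type you want and list the objects under it. The sketch below is an illustration only, assuming boto3 is installed and AWS credentials are configured; the bucket and job names are placeholders, and the month and day values are written without zero-padding, matching the example paths above.

```python
# Illustrative sketch: build the partition prefix for each report type and
# list the Parquet files under it. The bucket and job names are placeholders.
from datetime import date
import boto3

BUCKET = "pismo-dataplatform-tn-55317847-57cd-45a3-8aed-a8dadd63cc6b"  # your bucket
JOB_NAME = "accounting_events"

def report_prefix(job_name: str, report_type: str, d: date) -> str:
    """Build the S3 prefix for a daily, monthly, or full report partition."""
    base = f"reports/{job_name}/type={report_type}/year={d.year}/month={d.month}"
    if report_type == "monthly":
        return f"{base}/"
    # Daily and full partitions also carry a day value.
    return f"{base}/day={d.day}/"

s3 = boto3.client("s3")
prefix = report_prefix(JOB_NAME, "daily", date(2020, 1, 10))
response = s3.list_objects_v2(Bucket=BUCKET, Prefix=prefix)
for obj in response.get("Contents", []):
    print(obj["Key"])
```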
The date partition values in the report correspond to the last available data date, not to the date when the job was executed. For example, for a daily job that runs on March 11, the date partition values correspond to March 10.
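Continuing the daily example, here is a minimal sketch of how a run date maps to the partition it writes (the job name and year are placeholders).

```python
# Illustrative: the partition a daily job writes corresponds to the day before
# the run date (the last available data date), not the run date itself.
from datetime import date, timedelta

run_date = date(2020, 3, 11)              # day the job executes (5:00 AM UTC)
data_date = run_date - timedelta(days=1)  # last available data date
prefix = (f"reports/accounting_events/type=daily/"
          f"year={data_date.year}/month={data_date.month}/day={data_date.day}/")
print(prefix)  # reports/accounting_events/type=daily/year=2020/month=3/day=10/
```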