yaml_exec: Execute all agent and informant YAML tasks

Description

The yaml_exec() function finds all relevant pointblank YAML files in a directory and executes them: YAML agents are interrogated and YAML informants are incorporated. Under the hood, it calls yaml_agent_interrogate() and yaml_informant_incorporate(), then uses x_write_disk() to save the processed objects to an output directory so that fresh results can be accessed.
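The two steps described above can be approximated by hand for a single agent file. This is only a sketch and not the function's actual implementation: the scratch directory, the file names, and the use of the small_table dataset bundled with pointblank are all assumptions made for illustration.

```r
library(pointblank)

# Hypothetical scratch directory for this sketch
project_dir <- file.path(tempdir(), "pb_manual_demo")
out_dir <- file.path(project_dir, "output")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Build a small agent whose table is given as a formula (so it can be
# re-read from YAML later) and write it out as a YAML agent file
agent <- create_agent(
  tbl = ~ pointblank::small_table,
  tbl_name = "small_table",
  label = "Manual yaml_exec-style run"
)
agent <- col_vals_gt(agent, columns = vars(a), value = 0)
yaml_write(agent, filename = "agent-small_table.yml", path = project_dir)

# Step 1: read the YAML agent and interrogate it
interrogated <- yaml_agent_interrogate(
  filename = "agent-small_table.yml",
  path = project_dir
)

# Step 2: serialize the interrogated agent to an RDS file
# (the output filename here is an arbitrary choice for the sketch)
x_write_disk(
  interrogated,
  filename = "agent-small_table.rds",
  path = out_dir
)
```

yaml_exec() repeats this kind of cycle for every agent and informant file it finds, and adds the date-time to each output filename.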

The output RDS files are named according to the object type processed, the target table, and the date-time of processing. For convenience and modularity, this setup works best when a table store YAML file (typically named "tbl_store.yml" and produced via the tbl_store() and yaml_write() workflow) is available in the directory and table-prep formulas are accessed by name through tbl_source().

A typical directory of files set up for execution in this way might have the following contents:

a "tbl_store.yml" file for holding table-prep formulas (created with tbl_store() and written to YAML with yaml_write())
one or more YAML agent files to validate tables (ideally using tbl_source())
one or more YAML informant files to provide refreshed metadata on tables (again, using tbl_source() to reference table preparations is ideal)
an output folder (default is "output") to save serialized versions of processed agents and informants
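The first item in this list might be produced as follows. This is a minimal sketch: the directory location and the single table-prep formula (which points at the small_table dataset bundled with pointblank) are assumptions, and it assumes a recent pointblank version in which yaml_write() accepts a table store as its first argument.

```r
library(pointblank)

# Hypothetical project directory; any writable folder would do
project_dir <- file.path(tempdir(), "pb_store_demo")
dir.create(
  file.path(project_dir, "output"),
  recursive = TRUE,
  showWarnings = FALSE
)

# A table store holding one named table-prep formula
store <- tbl_store(
  small_table ~ pointblank::small_table
)

# Write the store to "tbl_store.yml" inside the project directory
yaml_write(store, filename = "tbl_store.yml", path = project_dir)
```

Agent and informant YAML files placed alongside "tbl_store.yml" can then refer to the table preparation by name, for example with tbl_source("small_table", "tbl_store.yml").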

Usage

yaml_exec(
  path = NULL,
  files = NULL,
  write_to_disk = TRUE,
  output_path = NULL,
  keep_tbl = FALSE,
  keep_extracts = FALSE
)

Arguments

path

The path that contains the YAML files for agents and informants.

files

A vector of YAML files to use. By default, yaml_exec() attempts to process every valid YAML file; supplying a vector here limits the scope to the specified files.

write_to_disk

Should the processing include a step that writes output files to disk? If so, x_write_disk() is used to write RDS files, with each output filename built from the base filename of the agent/informant YAML file plus the date-time.

output_path

The output path for any generated output files. By default, this will be a subdirectory of the provided path called "output".

keep_tbl, keep_extracts

For agents, the table may be kept if it is a data frame object, and extracts (collections of table rows that failed a validation step) may also be stored. Both options are FALSE by default.
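To show how these arguments fit together, here is a hedged end-to-end sketch. The scratch directory, the agent definition, and the file names are hypothetical; only the yaml_exec() arguments themselves come from the Usage section above.

```r
library(pointblank)

# Hypothetical scratch directory holding one YAML agent file
project_dir <- file.path(tempdir(), "pb_exec_demo")
out_dir <- file.path(project_dir, "output")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Create a simple agent and write it to YAML so yaml_exec() can find it
agent <- create_agent(
  tbl = ~ pointblank::small_table,
  tbl_name = "small_table",
  label = "yaml_exec demo"
)
agent <- col_vals_gt(agent, columns = vars(a), value = 0)
yaml_write(agent, filename = "agent-small_table.yml", path = project_dir)

# Execute every pointblank YAML file in the directory, keeping extracts
# and writing the serialized results to the output directory
yaml_exec(
  path = project_dir,
  write_to_disk = TRUE,
  output_path = out_dir,
  keep_extracts = TRUE
)

# Inspect what was written and read the first result back in
written <- list.files(out_dir)
restored <- x_read_disk(filename = written[1], path = out_dir)
```

Because each run stamps the output filename with the date-time, repeated calls accumulate files in the output directory rather than overwriting earlier results.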

Function ID

11-8
