CausalQueries

https://integrated-inferences.github.io/CausalQueries/

CausalQueries is a package that lets you declare binary causal models, update beliefs about causal types given data, and calculate arbitrary estimands. Model definition makes use of dagitty functionality. Updating is implemented in Stan.

  • See vignettes for a guide to getting started.

  • See here for a guide to using CausalQueries, along with many examples of causal models.

Installation

To install the latest stable release of CausalQueries:

install.packages("CausalQueries")

To install the latest development release:

install.packages("devtools")
devtools::install_github("integrated-inferences/CausalQueries")

Causal models

Causal models are defined by:

  • A directed acyclic graph (DAG), which provides the set of variables, a causal ordering between them, and a set of assumptions regarding conditional independence. If there is no arrow from A to B then a change in A never induces a change in B.
  • Functional forms. Functional forms describe the causal relationships between nodes. Specifying a functional form often requires strong assumptions; fortunately, if variables are categorical we do not need functional forms in the usual sense. The DAG implies a set of "causal types." Units can be classed together as being of the same causal type if they respond in the same way to other variables. For instance, a type might be the set of units for which X=1 and for which Y=1 if and only if X=1. The set of causal types grows rapidly with the number of nodes and the number of nodes pointing into any given node. In this setting, imposing functional forms is the same as placing restrictions on causal types: such restrictions reduce complexity but require substantive assumptions. An example of a restriction might be "Y is monotonic in X."
  • Priors. In the standard case, the DAG plus any restrictions imply a set of parameters that combine to form causal types. These are the parameters we want to learn about. To learn about them we first provide priors over the parameters. With priors specified, the causal model is complete (it is a "probabilistic causal model") and we are ready for inference. Priors are set using the set_priors function; many examples can be seen by typing ?set_priors.
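As a rough sketch of how these three ingredients fit together, the snippet below counts nodal types, declares a small model, imposes a monotonicity restriction and sets priors. The chain X -> M -> Y, the particular restriction and the choice of Jeffreys priors are assumptions made for illustration. The helper n_nodal_types is hypothetical and not part of the package. Argument details for make_model, set_restrictions and set_priors may differ across package versions, so check the help pages for the installed release.

```r
library(CausalQueries)

# Hypothetical helper (not part of the package): a binary node with k binary
# parents has 2^(2^k) nodal types, so the type space grows quickly.
n_nodal_types <- function(k) 2^(2^k)
n_nodal_types(0:3)  # 2 4 16 256

# DAG: declare nodes and arrows (illustrative chain)
model <- make_model("X -> M -> Y")

# Restriction: drop nodal types in which M has a negative effect on Y,
# i.e. assume Y is monotonic (non-decreasing) in M
model <- set_restrictions(model, statement = "Y[M=1] < Y[M=0]")

# Priors: Jeffreys priors over the remaining parameters
model <- set_priors(model, distribution = "jeffreys")
```

The restriction is written as a causal statement over potential outcomes; statement helpers such as decreasing, listed in the function index, appear to generate statements of this kind.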

A wrinkle:

  • It is possible that nodes are related in ways not captured by the DAG. In such cases dotted curves are sometimes placed between nodes on a graph. Possible unobserved confounding of this kind can be specified in the causal model. Doing so has implications for the parameter space.
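A minimal sketch of declaring such confounding, assuming a two-node model X -> Y and that a bidirected edge X <-> Y is how the confound is marked. set_confound is listed in the function index, but the argument form used here is an assumption and may vary by version.

```r
library(CausalQueries)

# Baseline model without confounding
model <- make_model("X -> Y")

# Allow unobserved confounding between X and Y. The distribution of Y's
# nodal types may now depend on X, which enlarges the parameter space.
confounded <- set_confound(model, list("X <-> Y"))
```

Comparing the parameters of the two models is one way to see the effect on the parameter space.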

Inference

Our goal is to form beliefs over parameters but also over more substantive estimands:

  • With a causal model in hand and data available about some or all of the nodes, it is possible to make use of a generic Stan model that generates posteriors over the parameter vector.

  • Given updated (or prior) beliefs about parameters, it is possible to calculate causal estimands from a causal model. For example: "What is the probability that X was the cause of Y given X=1, Y=1 and Z=1?"
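The two steps above might look as follows in code. This is a sketch under assumptions: the model X -> Y <- Z and the simulated data are invented for illustration, and the argument names (n, queries, given, using) for make_data, update_model and query_model should be checked against the help pages for the installed version.

```r
library(CausalQueries)

# Illustrative model: X and Z both affect Y
model <- make_model("X -> Y <- Z")

# Simulated data, for illustration only (not real observations)
data <- make_data(model, n = 100)

# Update beliefs over parameters with the generic Stan model
posterior <- update_model(model, data)

# Probability that X caused Y among units with X=1, Y=1 and Z=1:
# the share of such units for which Y would have been 0 had X been 0
result <- query_model(
  posterior,
  queries = "Y[X=1] > Y[X=0]",
  given   = "X==1 & Y==1 & Z==1",
  using   = "posteriors"
)
result
```

Passing using = "priors" instead would answer the same query from prior beliefs, without any updating.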

Credits etc

The approach used in CausalQueries is developed in Humphreys and Jacobs (2023), drawing on work on probabilistic causal models described in Pearl's Causality (Pearl, 2009). We thank Ben Goodrich, who provided generous insights on using Stan for this project. We thank Alan M Jacobs for key work developing the framework underlying the package. Our thanks to Jasper Cooper for contributions generating a generic function to create Stan code, to Clara Bicalho, who helped figure out the syntax for causal statements, to Julio S. Solís Arce, who made many key contributions figuring out how to simplify the specification of priors, and to Merlin Heidemanns, who figured out the rstantools integration and made myriad code improvements.

Version: 1.1.0

License: MIT + file LICENSE

Maintainer: Till Tietz

Last Published: April 10th, 2024

Functions in CausalQueries (1.1.0)

  • collapse_data: Make compact data with data strategies
  • get_all_data_types: Get all data types
  • democracy_data: Development and Democratization: Data for replication of analysis in *Integrated Inferences*
  • default_stan_control: default_stan_control
  • draw_causal_type: Draw a single causal type given a parameter vector
  • get_data_families: get_data_families
  • get_nodal_types: Get list of types for nodes in a DAG
  • get_param_dist: Get a distribution of model parameters
  • drop_empty_families: Drop empty families
  • construct_commands_other_args: make_par_values
  • expand_data: Expand compact data object to data frame
  • decreasing: Make monotonicity statement (negative)
  • expand_wildcard: Expand wildcard
  • get_ambiguities_matrix: Get ambiguities matrix
  • get_event_probabilities: Draw event probabilities
  • find_rounding_threshold: helper to find rounding thresholds for print methods
  • get_type_names: Get type names
  • get_estimands: helper to get estimands
  • data_to_data: helper to generate a matrix mapping from names of M to names of A
  • get_causal_types: Get causal types
  • get_query_types: Look up query types
  • get_prior_distribution: Get a prior distribution from model
  • get_posterior_distribution: Get the posterior distribution from a model
  • get_parameter_matrix: Get parameter matrix
  • expand_nodal_expression: Helper to expand nodal expression
  • get_parameter_names: Get parameter names
  • get_type_prob_c: generates one draw from type probability distribution for each type in P
  • get_type_prob_multiple: Draw matrix of type probabilities, before or after estimation
  • lipids_data: Lipids: Data for Chickering and Pearl replication
  • make_par_values_stops: make_par_values_stops
  • make_par_values: make_par_values
  • make_data: Make data
  • get_type_prob: Get type probabilities
  • make_model: Make a model
  • perm: Produces the possible permutations of a set of nodes
  • make_ambiguities_matrix: Make ambiguities matrix
  • get_type_distributions: helper to get type distributions
  • list_non_parents: Returns a list with the nodes that are not directly pointing into a node
  • make_parameter_matrix: Make parameter matrix
  • interacts: Make statement for any interaction
  • make_nodal_types: Make nodal types
  • institutions_data: Institutions and growth: Data for replication of analysis in *Integrated Inferences*
  • observe_data: Observe data, given a strategy
  • parameter_setting: Setting parameters
  • parents_to_int: Helper to turn parents_list into a list of data_realizations column positions
  • make_parameters_df: function to make a parameters_df from nodal types
  • print.nodes: Print a short summary for a causal_model nodes
  • get_parents: Get list of parents of all nodes in a model
  • get_parmap: Get parmap: a matrix mapping from parameters to data types
  • get_type_prob_multiple_c: generates n draws from type probability distribution for each type in P
  • query_to_expression: Helper to turn query into a data expression
  • query_model: Generate estimands dataframe
  • grab: Grab
  • interpret_type: Interpret or find position in nodal type
  • is_a_model: Check whether argument is a model
  • make_parmap: Make parmap: a matrix mapping from parameters to data types
  • make_prior_distribution: Make a prior distribution from priors
  • print.event_probabilities: Print a short summary for event probabilities
  • print.stan_summary: Print a short summary for stan fit
  • minimal_data: Creates a data frame for case with no data
  • minimal_event_data: Creates a compact data frame for case with no data
  • print.dag: Print a short summary for a causal_model DAG
  • print.posterior_event_probabilities: Print a short summary of posterior_event_probabilities
  • prep_stan_data: Prepare data for 'stan'
  • plot_model: Plots a DAG in ggplot style using a causal model input
  • print.statement: Print a short summary for a causal_model statement
  • queries_to_types: helper to get types from queries
  • n_check: n_check
  • nodes_in_statement: Identify nodes in a statement
  • gsub_many: Recursive substitution
  • increasing: Make monotonicity statement (positive)
  • make_data_single: Generate full dataset
  • print.parameters_prior: Print a short summary for causal_model parameter prior distributions
  • print.parameters: Print a short summary for causal_model parameters
  • print.parents_df: Print a short summary for a causal_model parents data-frame
  • query_distribution: Calculate query distribution
  • print.causal_model: Print a short summary for a causal model
  • set_sampling_args: set_sampling_args From 'rstanarm' (November 1st, 2019)
  • simulate_data: simulate_data is an alias for make_data
  • print.causal_types: Print a short summary for causal_model causal-types
  • print.type_prior: Print a short summary for causal-type prior distributions
  • make_events: Make data in compact form
  • prior_setting: Setting priors
  • summarise_distribution: helper to compute mean and sd of a distribution data.frame
  • non_decreasing: Make monotonicity statement (non negative)
  • non_increasing: Make monotonicity statement (non positive)
  • print.type_posterior: Print a short summary for causal-type posterior distributions
  • update_causal_types: Update causal types based on nodal types
  • st_within: Get string between two regular expression patterns
  • substitutes: Make statement for substitutes
  • set_confound: Set confound
  • set_ambiguities_matrix: Set ambiguity matrix
  • set_prior_distribution: Add prior distribution draws
  • restrict_by_labels: Reduce nodal types using labels
  • summary.causal_model: Summarizing causal models
  • set_restrictions: Restrict a model
  • realise_outcomes: Realise outcomes
  • te: Make treatment effect statement (positive)
  • update_model: Fit causal model using 'stan'
  • print.model_query: Print a tightened summary of model queries
  • print.nodal_types: Print a short summary for causal_model nodal-types
  • print.parameters_posterior: Print a short summary for causal_model parameter posterior distributions
  • print.parameters_df: Print a short summary for a causal_model parameters data-frame
  • reveal_outcomes: Reveal outcomes
  • restrict_by_query: Reduce nodal types using statement
  • set_parameter_matrix: Set parameter matrix
  • set_parmap: Set parmap: a matrix mapping from parameters to data types
  • uncollapse_nodal_types: uncollapse nodal types
  • unpack_wildcard: Unpack a wild card
  • type_matrix: Generate type matrix
  • add_dots: Helper to fill in missing do operators in causal expression
  • CausalQueries-package: 'CausalQueries'
  • data_type_names: Data type names
  • complements: Make statement for complements
  • construct_commands_alter_at: make_par_values
  • clean_params: Check parameters sum to 1 in param_set; normalize if needed; add names if needed
  • CausalQueries_internal_inherit_params: Create parameter documentation to inherit
  • collapse_nodal_types: collapse nodal types
  • check_query: Warn about improper query specification and apply fixes
  • add_wildcard: Adds a wildcard for every missing parent
  • clean_condition: Clean condition
  • causal_type_names: Names for causal types
  • clean_param_vector: Clean parameter vector
  • check_args: helper to check arguments
  • construct_commands_param_names: make_par_values
  • check_string_input: Check string_input