dlairflow documentation

Reusable code components for building Apache Airflow® DAGs.

Requirements

Apache Airflow

This package is intended to work with Apache Airflow®. If you install dlairflow with pip, it should install Airflow for you.

Felis

This package is intended to work with Felis. If you install dlairflow with pip, it should install Felis for you.

fits2db

fits2db converts FITS files into data that can be streamed (piped) directly into a database. fits2db requires a C compiler, development libraries for PostgreSQL, MySQL and SQLite, and the cfitsio library. See the fits2db README file for compile instructions. Once compiled, ensure that fits2db is present in PATH.

PostgreSQL

Clients

fits2db is often used in conjunction with psql, the PostgreSQL command-line client. psql must be installed on the system and present in PATH. There are PostgreSQL client packages available for most standard Linux and macOS systems.

In addition to psql, dlairflow requires pg_dump and pg_restore. These are usually all included together in the same client package.

Airflow support

The package apache-airflow-providers-postgres must be installed. If you install dlairflow with pip, it should install apache-airflow-providers-postgres automatically.
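To confirm that pip actually pulled in the Python dependencies described in this section, a small check along the following lines may help. This is only a sketch: installed_versions() is a hypothetical helper, not part of dlairflow, and the distribution name "lsst-felis" for Felis is an assumption that should be verified against the package index.

```python
from importlib import metadata

# Distribution names to look up. "lsst-felis" is an ASSUMPTION about how
# Felis is published; adjust it if your installation uses another name.
REQUIRED_DISTRIBUTIONS = (
    "apache-airflow",
    "apache-airflow-providers-postgres",
    "lsst-felis",
)


def installed_versions(names=REQUIRED_DISTRIBUTIONS):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

The check reads installed package metadata only, so it does not import Airflow itself and should run quickly.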

Scratch space

Some dlairflow functions, and the tasks they return, need to create intermediate files. The location of these files is referred to as “scratch” space.

Optional packages

fitsverify

fitsverify checks FITS files for compliance with the FITS Standard. fitsverify requires a C compiler and the cfitsio library. Once compiled, ensure that fitsverify is present in PATH.

Environment Variables

DLAIRFLOW_SCRATCH_ROOT

Used to specify per-user scratch space. See dlairflow.util.user_scratch() for further details.
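As a rough illustration of what "per-user scratch space" can mean, the sketch below builds a directory path from DLAIRFLOW_SCRATCH_ROOT and the current user name. It is not the implementation of dlairflow.util.user_scratch(): the function name example_user_scratch(), the root/user layout, and the fallback to the system temporary directory are all assumptions made for the example.

```python
import getpass
import os
import tempfile


def example_user_scratch(create=False):
    """Hypothetical helper: return <scratch root>/<user name>.

    This is NOT dlairflow.util.user_scratch(); consult that function's
    documentation for the real behavior.
    """
    # ASSUMPTION: fall back to the system temp directory when the
    # environment variable is not set.
    root = os.environ.get("DLAIRFLOW_SCRATCH_ROOT", tempfile.gettempdir())
    path = os.path.join(root, getpass.getuser())
    if create:
        # Create the directory (and any parents) if it does not exist yet.
        os.makedirs(path, exist_ok=True)
    return path


if __name__ == "__main__":
    print(example_user_scratch())
```

Keeping directory creation optional lets the same helper be used both to report the expected location and to prepare it before a task runs.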

PATH

This is the standard shell PATH variable. The command-line utilities described above (fits2db, psql, pg_dump and pg_restore, plus the optional fitsverify) must be present in PATH.
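One way to verify the PATH requirement before running any DAGs is to look up each utility with the standard library. The following is a minimal sketch; missing_tools() and the two tuples are hypothetical names introduced for this example, not part of dlairflow.

```python
import shutil

# Command-line utilities named in this document.
REQUIRED_TOOLS = ("fits2db", "psql", "pg_dump", "pg_restore")
OPTIONAL_TOOLS = ("fitsverify",)


def missing_tools(tools, path=None):
    """Return the names in `tools` that cannot be found on PATH.

    `path` overrides the search path; None means use the PATH variable.
    """
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


if __name__ == "__main__":
    for label, tools in (("required", REQUIRED_TOOLS), ("optional", OPTIONAL_TOOLS)):
        missing = missing_tools(tools)
        status = ", ".join(missing) if missing else "none"
        print(f"Missing {label} tools: {status}")
```

Note that Airflow workers may run with a different PATH than an interactive shell, so the check is most informative when executed in the same environment as the tasks.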