Simplify Airflow DAG Creation and Maintenance with Hamilton in 8 minutes
<p>This post walks you through the benefits of running two open source projects, <a href="https://github.com/dagworks-inc/hamilton" rel="noopener ugc nofollow" target="_blank">Hamilton</a> and <a href="https://airflow.apache.org/" rel="noopener ugc nofollow" target="_blank">Airflow</a>, and their <a href="https://en.wikipedia.org/wiki/Directed_acyclic_graph" rel="noopener ugc nofollow" target="_blank">directed acyclic graphs</a> (DAGs) in tandem. At a high level, Airflow is responsible for orchestration (think macro), while Hamilton helps you author clean and maintainable data transformations (think micro).</p>
<p>For those unfamiliar with Hamilton, we point you to an interactive overview on <a href="http://www.tryhamilton.dev/" rel="noopener ugc nofollow" target="_blank">tryhamilton.dev</a>, or to our other posts, such as this <a href="https://towardsdatascience.com/functions-dags-introducing-hamilton-a-microframework-for-dataframe-generation-more-8e34b84efc1d" rel="noopener" target="_blank">one</a>. Otherwise, we will talk about Hamilton at a high level and point to reference documentation for more details. For reference, I’m one of the co-creators of Hamilton.</p>
<p>For those still trying to picture how the two can run together: Hamilton is just a library with a small dependency footprint, so you can add it to your Airflow setup in no time!</p>
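<p>To make the macro/micro split concrete, here is a minimal sketch in plain Python. It does not use Hamilton's or Airflow's actual APIs. It only mimics the idea that each function is a node whose parameter names declare its dependencies, and that the whole micro DAG runs inside one callable that an orchestrator task could invoke. Every name in it (the two transform functions, <code>resolve</code>, <code>transform_task</code>) is hypothetical and chosen for illustration.</p>

```python
import inspect


# Micro level: each function is one node in the transformation DAG.
# The function name is the output; the parameter names are its dependencies.
def spend_per_signup(spend: float, signups: int) -> float:
    return spend / signups


def spend_is_high(spend_per_signup: float, threshold: float) -> bool:
    return spend_per_signup > threshold


def resolve(name, nodes, values):
    """Compute `name` by first computing every parameter its function asks for."""
    if name not in values:
        fn = nodes[name]
        params = inspect.signature(fn).parameters
        kwargs = {p: resolve(p, nodes, values) for p in params}
        values[name] = fn(**kwargs)  # cache so each node runs at most once
    return values[name]


# Macro level: a stand-in for the callable that a single orchestrator task
# would run. The orchestrator sees one step; the micro DAG lives inside it.
def transform_task(inputs: dict, outputs: list) -> dict:
    nodes = {f.__name__: f for f in (spend_per_signup, spend_is_high)}
    values = dict(inputs)  # start from the raw inputs
    return {name: resolve(name, nodes, values) for name in outputs}


result = transform_task(
    {"spend": 100.0, "signups": 4, "threshold": 20.0},
    ["spend_per_signup", "spend_is_high"],
)
```

<p>In a real setup, the library would do the resolution work that <code>resolve</code> fakes here, and the orchestrator would schedule <code>transform_task</code> as one step among many.</p>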
<p>To recap, Airflow is the industry standard for orchestrating data pipelines. It powers all sorts of data initiatives, including ETL, ML pipelines, and BI. Since Airflow's inception in 2014, its users have faced certain rough edges with regard to authoring and maintaining data pipelines:</p>
<p><a href="https://towardsdatascience.com/simplify-airflow-dag-creation-and-maintenance-with-hamilton-in-8-minutes-e6e48c9c2cb0"><strong>Read More</strong></a></p>