My previous two posts involved installing a Postgres server on a cheap, spare Raspberry Pi 3B+. The motivation was to save my time-series data for longer than 30 days, since my free InfluxDB Cloud account only has a 30-day retention policy. I was successful in installing a fresh OS, configuring the rpi to run headless, scanning the local IPs to find the rpi, ssh'ing into the single-board computer, installing a PostgreSQL instance, and finally writing a script to query my InfluxDB instance, transform the data, and push the data to my Postgres instance. I used a cronjob to execute the script, with a comment about accomplishing …

So today, I am back to offload the scheduling from the cron daemon to Airflow, and I am also throwing a dbt incremental materialization into the mix. You might say that the cronjob was working just fine, and if it isn't broken, why fix it? I even commented during the writing that using Airflow felt like bird hunting with a scud missile. Despite my comment, I have a simple and good reason to make use of Airflow: my readings contain outliers, more so when my daughter moves, the sensor is not attached well, or … Whatever the case may be, outliers are not uncommon. Here is a screenshot to illustrate my point.
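The post describes the sync script but does not show it. Here is a minimal sketch of what it could look like; the bucket name, Flux query, Postgres connection string, table layout, and outlier thresholds are all my assumptions, not details from the post:

```python
"""Sketch of the InfluxDB -> Postgres sync script described in the post.

All names (bucket, table, host, thresholds) are hypothetical placeholders.
"""

def filter_outliers(rows, lo=-40.0, hi=85.0):
    """Drop readings outside a plausible sensor range.

    `rows` is a list of (timestamp, value) pairs; values outside
    [lo, hi] are treated as outliers (loose sensor, movement, etc.).
    """
    return [(ts, v) for ts, v in rows if lo <= v <= hi]

def main():
    # Third-party imports live inside main() so the pure transform logic
    # above stays importable without these packages installed.
    from influxdb_client import InfluxDBClient   # pip install influxdb-client
    import psycopg2                              # pip install psycopg2-binary

    with InfluxDBClient(url="https://us-east-1-1.aws.cloud2.influxdata.com",
                        token="...", org="...") as client:
        # Pull the last day of readings (bucket name is a placeholder).
        tables = client.query_api().query(
            'from(bucket:"sensors") |> range(start: -1d)')
        rows = [(r.get_time(), r.get_value())
                for t in tables for r in t.records]

    rows = filter_outliers(rows)

    # Push the cleaned rows to the Postgres instance on the rpi.
    with psycopg2.connect("dbname=metrics host=raspberrypi user=pi") as conn:
        with conn.cursor() as cur:
            cur.executemany(
                "INSERT INTO readings (ts, value) VALUES (%s, %s)"
                " ON CONFLICT DO NOTHING",
                rows)

# Call main() from the scheduler (cron or an Airflow task) to run the sync.
```

Keeping `filter_outliers` a pure function makes the outlier logic easy to test independently of the two databases.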
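Offloading the scheduling from cron to Airflow could look like the following DAG sketch. The `dag_id`, file paths, and hourly schedule are my assumptions; the post does not state its actual cadence or layout:

```python
# Sketch of an Airflow DAG replacing the cronjob described above.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="influx_to_postgres",       # hypothetical name
    start_date=datetime(2022, 1, 1),
    schedule_interval="@hourly",       # the cronjob's cadence is not stated
    catchup=False,
) as dag:
    # Step 1: run the extract/transform/load script from the earlier posts.
    extract_load = BashOperator(
        task_id="extract_load",
        bash_command="python /home/pi/influx_to_postgres.py",
    )
    # Step 2: run dbt; an incremental materialization only processes rows
    # added since the previous run instead of rebuilding the whole table.
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="cd /home/pi/dbt_project && dbt run",
    )
    extract_load >> dbt_run
```

Unlike cron, Airflow gives you retries, task-level logs, and an easy way to rerun just the failed step when an outlier-laden batch needs reprocessing.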
The dbt Live: Expert Series features Solutions Architects from dbt Labs taking live audience questions and covering topics like how to design a deployment workflow, how to refactor stored procedures for dbt, or how to split dev and prod environments across databases. In the latest session, Sung Won Chung, Senior Solutions Architect at dbt Labs, addressed a question he hears often on the job: How does dbt differ from Airflow, and how (or why) might some teams use both? Read Sung's full blog post, from which the session and recap below are adapted.

## Guides to setting up dbt Core or Cloud + Airflow

Airflow and dbt share the same high-level purpose: to help teams deliver reliable data to the people they work with, using a common interface to collaborate on that work. But the two tools handle different parts of that workflow:

- Airflow helps orchestrate jobs that extract data, load it into a warehouse, and handle machine-learning processes.
- dbt hones in on a subset of those jobs: enabling team members who use SQL to transform data that has already landed in the warehouse.

With a combination of dbt and Airflow, each member of a data team can focus on what they do best, with clarity across analysts and engineers on who needs to dig in (and where to start) when data pipeline issues come up.

TIP: Scrub to ~8:03 in the video to see what this might look like, and stay until ~13:37 to see Sung demo a "smart" rerun.

## The Right Path for Your Team

Consider the skills and resources on your team versus what is needed to support each path. For those who are ready to move on to configuration, below are guides to each approach:

### Airflow + dbt Cloud

- Install the dbt Cloud Provider, which enables you to orchestrate and monitor dbt jobs in Airflow without needing to configure an API.

### Airflow + dbt Core

- Code examples for a quick start in your local environment.
- Considerations for using the dbt CLI + BashOperator, or using the KubernetesPodOperator for each dbt job.

Further reading from the community:

- Shopify Engineering recently shared lessons learned from running Apache Airflow at scale, to much discussion from others in the data-engineering community.
- GitLab open-sources their data engineering infrastructure, including comprehensive docs and examples of how they use dbt Core with Airflow.

After the demo, Sung and fellow Solutions Architect Matt Cutini answered a range of attendee questions. To hear all of the Q&A, replay the video (starting ~19:30) and visit the #events-dbt-live-expert-series Slack channel to see topics raised in the chat.
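For the "Airflow + dbt Cloud" path above, the dbt Cloud provider package (`apache-airflow-providers-dbt-cloud`) ships an operator that triggers a dbt Cloud job and polls it to completion, so you do not have to call the dbt Cloud API yourself. A minimal sketch, where the connection id, `job_id`, and schedule are placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.dbt.cloud.operators.dbt import DbtCloudRunJobOperator

with DAG(
    dag_id="dbt_cloud_job",            # hypothetical name
    start_date=datetime(2022, 1, 1),
    schedule_interval="@daily",        # placeholder schedule
    catchup=False,
) as dag:
    run_dbt = DbtCloudRunJobOperator(
        task_id="run_dbt_job",
        dbt_cloud_conn_id="dbt_cloud_default",  # configured in the Airflow UI
        job_id=12345,                           # placeholder dbt Cloud job id
        check_interval=60,                      # poll the job every minute
        timeout=3600,                           # fail the task after an hour
    )
```

The operator handles authentication through the Airflow connection and surfaces the dbt Cloud run status in the Airflow UI, which is the "orchestrate and monitor without configuring an API" benefit mentioned above.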