Welcome to this week's edition of the newsletter! It's packed with opportunities to learn and grow, whether you want to expand your skillset, stay up to date on industry trends, or connect with other data professionals. Dive in!
Featured tools of the week
- Datafold: a data reliability platform that helps data teams improve the speed and reliability of their data pipelines by automating testing.
Datafold has raised a total of $22.2M in funding over 4 rounds. Its latest round, a Series A, was raised on Nov 9, 2021.
- Great Expectations: a widely used open-source tool for validating, documenting, and profiling your data to maintain quality and improve communication between teams.
Featured data stack of the week
- Veed: an online video editing platform that makes creating videos easy and accessible to everyone. Millions of creators around the world use Veed to tell stories, create content, and grow their audience.
Good reads and resources
- Data Quality Monitoring in Apache Airflow with whylogs: This article discusses how to use the whylogs tool within an Apache Airflow workflow to monitor the quality of data in a data pipeline. whylogs is an open standard for data logging that allows users to verify the quality of their data by running a constraints validation suite and generating drift reports. The article includes instructions for installing the whylogs Airflow provider and using it to extend a DAG (Directed Acyclic Graph) with whylogs operators to validate data constraints and generate a summary drift report. The article also provides an example of using whylogs within an Airflow DAG to monitor the quality of data in a wine classification dataset. By Murilo Mendonça
- Technology Abstraction in the Data Mesh: This article discusses the importance of considering technology abstraction when implementing a Data Mesh, a paradigm for managing data that aims to create business value in the long term. The author argues that implementing a Data Mesh with a single technology or data platform may not be sustainable in the long term due to the risk of technological change and the uncertainty of the data vendor market. The article suggests that adopting a multi-technology approach and using open standards can help to ensure the persistence and interoperability of a Data Mesh implementation. The author also discusses the importance of considering the business context and user needs when choosing technologies for a Data Mesh implementation. By P Platter
Upcoming data events, webinars, and summits
- Join the meet-up on Tuesday, 24 January at 6:00 pm at Mesh-AI HQ in Liverpool Street to get started with some of the hottest trends in Data for 2023. There will be two talk sessions.
Talk 1: Andrew Jones from GoCardless will be giving a talk on data contracts.
Talk 2: Steve Goodman from Tide will be giving a talk on machine learning (ML) explainability.
Register for the event here
Data startup funding news of the week
- Husprey raises €3M in a seed round to replace dashboards with data notebooks. Husprey's data notebooks streamline the process of data analysis and decision-making, helping companies make better use of their data and drive growth. Read the full story here.
- Fabulous is hiring a Senior Analytics Engineer
Stack: GitHub, GitHub Actions, dbt, BigQuery, Fivetran, Metabase, Looker Studio
- Fivetran is hiring an Analytics Engineer - Architecture
Location: Oakland, California, United States
- Firefly Health is hiring a Clinical Data Analyst
Stack: Snowflake, dbt, Looker
What do you think about our weekly newsletter?
If you have any suggestions, want us to feature an article, or would like to list a data engineering job, hit us up! We would love to include it in our next edition 😎
About Moderndatastack.xyz - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)