MDS Newsletter #27
Happy Wednesday (the weekend is just 2 days away😉)! Until then, enjoy this new edition of the MDS Newsletter, where we bring you all the latest happenings in the data space so that you don't miss out on any😃
Let's dive right into it!👇
This week's question: What are the inconvenient/harsh truths about data jobs?
You can send answers by replying to the email or writing to us at [email protected]
Last week's question: What should be the mission/purpose of a data team?
The purpose / mission of the data team is to help organizations make more intelligent decisions using insights from data. However, the tricky part is how to not get stuck doing data pulls, English-to-SQL translations and one-time-use dashboards. This requires strong leadership and a seat on the executive board.
Ergest Xheblati, Senior Technical Product Manager at EverQuote
Featured tools of the week
- Ataccama is a global software company delivering a unified platform for automated data quality, MDM, and metadata management. It specializes in complex enterprise data governance solutions that provide sustainable, long-term value.
Category: Data Discovery and Data Cataloging
Founded in 2007, Ataccama acquired Tellstory on Feb 25, 2021.
- Nexla makes it simple for anyone to create scalable data flows. Teams working with data get a no/low-code unified experience to integrate, transform, provision, and monitor data for any use case.
Category: Data Mesh and Reverse ETL
Nexla has raised a total of $15.5M in funding over 2 rounds. Its latest funding was raised on Oct 13, 2021, in a Series A round.
Good reads and resources
- Our data stacks aren’t built for change: In the last decade, the data industry has seen two major waves of data tools. The first wave brought tools like Airflow and Hadoop that helped unlock Big Data, and the second brought tools built with better interfaces: the modern data stack tools. But now it's time to take a step back and wonder if the problem is not our tools but the abstraction we're betting the farm on.
In this article, Nick Handel discusses how DAGs, the center of today's paradigm, are static representations of organizations that are ever-evolving. He argues that if data professionals want to focus on higher-value work, they need a revolution in their stack along the lines of what tools like Terraform did for DevOps engineers.
- Data Lakehouse vs. Data Lake: Data lake and data warehouse are established terms when it comes to storing Big Data, but the two are not synonymous. A data lake is a large pool of raw data, whereas a data warehouse is a repository for structured, filtered, purpose-built data. Data warehouses are being replaced by modern, often cloud-based systems such as data lakes, which has caused certain problems because data in data lakes is very raw and unstructured. This is where the data lakehouse comes into the picture.
Read this article by Christian Lauer to understand what exactly a Data Lakehouse is and how it is based on Data Lakes.
- Why Analytics Sucks: Autonomy, mastery, and purpose are three key ingredients to feel motivated and fulfilled about the work one does. But all three are often absent in analytics. Analysts are generally looped in at the end of decision-making processes, which confines their autonomy and development of mastery to a narrow scope, and purpose is often just half a sentence sent as a Slack request. The role of analysts is reduced to data vending machines: request in, data out.
Read this article by Robert Yi, in which he discusses in detail the main problem with the work of an analyst, why most of them leave the industry, and the main culprit behind this problem.
- What is Headless BI?: With the explosive growth of cloud data warehouses, an entire ecosystem of new data tools has emerged: transformation, testing, quality, and observability, to name a few. On top of this modern data stack, we’ve seen a proliferation of new applications, including dashboards, embedded analytics, automation tools, and vertical-specific reporting tools. This boom in the modern data stack necessitates a new generation of business intelligence: Headless BI.
In this article, Artyom Keydunov discusses Headless BI and its 4 major components, and argues why it should be open source.
- Getting Rid of Raw Data with Jens Larsson: In this new episode of The Data Engineering Show, Jens explains how he and his team killed the notion of raw data at Tink and walks us through the data stacks at Google, Spotify, and Ark Kapital.
Upcoming data events and summits
- Continual is organising a webinar on 'The New AI Dream Teams: Empowering Data Teams to Deliver AI Solutions at Scale' on April 6, 2022.
Join the panel discussion about the growing demand for operational AI workloads on the modern data stack and the fluidity and evolution of data professionals’ skills and development practices, including the rise of analytics engineering: what it is and how it’s helping organizations become data-driven.
- TDWI is hosting a virtual event on 'Governing Data and Analytics' from April 6–7, 2022.
The speakers will bring you up to speed on the latest trends in the governance of data and analytics, share best practices, and show you how to drive a successful governance program across your business.
Data startup funding and acquisition news
- Astronomer raised $213M in a Series C round of funding and acquired Datakin.
Astronomer is a modern data orchestration platform powered by Airflow that enables data professionals to build, run, and observe pipelines as code.
This round of funding was led by Insight Partners, with participation from Meritech Capital, Salesforce VC, J.P. Morgan, K5 Global, Venrock, and Sierra Ventures. With this round, the total funding for Astronomer stands at $282.9M.
Data jobs
- Postman is hiring a ‘Senior Business Intelligence Analyst’
Data Stack- dbt, Looker, Redshift, Power BI
- Nord Security is hiring a ‘Data Analyst’
Data Stack- dbt, BigQuery, Airflow, Airbyte, Looker
- Zivver is hiring an ‘Analytics Engineer’
Location- Amsterdam, Netherlands
Data Stack- dbt, Looker, Snowflake, Snowplow
- Tucows is hiring a ‘Data Engineering Manager’
Location- Remote, Canada & USA
Data Stack- Stitch, dbt, Snowflake, Airflow, Fivetran, Looker
- Gravie is hiring a ‘Data Engineer’
Data Stack- Airflow, dbt, Redshift, Tableau
If you like this newsletter (I know you do😉), share it with your friends. It will take you 10 seconds to share, but it took us 10 hours to prepare. Send us some love 💖
Do you have any suggestions, or want us to feature an article or list a data engineering job? Hit us up! We would love to include it in our next edition😎
About Moderndatastack.xyz - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)