It's Wednesday and we are back with a new edition of the MDS newsletter. In this week's newsletter, read about the challenges and opportunities with open source, the definitive guide to Reverse ETL, the Data + AI Summit 2022 keynote talks, and the unsolved problem of the Modern Data Stack. Also, don't forget to share your feedback!
Let's dive into this week's edition👇
Amazing People in Data
Meet Kelly: She served in the U.S. Air Force as a Linguistic Instructor, later changed careers, and went on to build data teams across different organisations. Now she works at Wellthy, scaling their data team, and as an Adjunct Professor at Creighton University. You’ll be blown away by her experience because we sure were! In this interview, read about how she stumbled into data science, her advice to new data leaders, and how she built not just data teams but the whole data architecture at the organisations she worked with. Read the full conversation here.
Featured tools of the week
- Upsolver is the only SQL pipeline platform for cloud data lakes. It empowers any data practitioner to design pipelines that deliver continuous analytics-ready data in days, not the months required when hand-coding and orchestrating Spark.
Upsolver has raised a total of $42M in funding over 4 rounds. Their latest funding was raised on Apr 6, 2021 from a Series B round.
- Rockset is a real-time analytics database service for serving low-latency, high-concurrency analytical queries at scale. It builds a Converged Index™ on structured and semi-structured data from OLTP databases, streams, and lakes in real time and exposes a RESTful SQL interface.
Rockset has raised a total of $61.5M in funding over 3 rounds. Their latest funding was raised on Oct 27, 2020 from a Series B round.
Featured data stack of the week
Ruby Labs is a consumer-led tech company disrupting self-help and wellness markets. Its mission is to empower its customers to make self-care a way of life. Here's how they have organised their data stack.
Want us to feature your data stack? Add it here.
Good reads and resources
- Accelerating Open Source growth: the ever expanding network effect? 3 key takeaways from fast growing companies and inputs from their founders: Open source leaders have historically been able to generate tremendous growth and adoption. Emmanuel Cassimatis asked four fast-growth open source companies (Airbyte, dbt, Jina, Prefect) about the challenges and opportunities with open source. He covers the origin of each company, how open source helped them, and what strategy they used to grow their community, and then draws out three takeaways for accelerating open-source growth. According to him, both open and closed-source software seem to have bright days ahead, open source especially, as network effects and community involvement become faster and more prevalent.
- What is Reverse ETL? The Definitive Guide: Data warehouses are here to stay, but the problem is that they are only accessible to technical users who know how to write SQL, so the platform you purchased to eliminate data silos has inevitably become a data silo itself. This is exactly why Reverse ETL is so important. In this guide, Tejas Manohar and Luke Kline cover what Reverse ETL is, why it is needed, how it fits into the modern data stack, its various use cases, how to choose a Reverse ETL tool, and much more.
- Data + AI Summit 2022: Recapping 11 Major Announcements across 4 Keynotes: On Monday, June 27, Databricks kicked off the Data + AI Summit 2022 with 5,000 people attending in San Francisco and 60,000 joining virtually. In this article, Prukalpa recaps the DAIS 2022 keynote talks, covering everything from Spark Connect and Unity Catalog to MLflow and DBSQL. Some of the major announcements were: Databricks launched Spark Connect, so users will now be able to access Spark from any device; Delta Lake 2.0 is now fully open-sourced; Databricks announced Unity Catalog, a unified governance layer for all data and AI assets; and the launch of Services, a full end-to-end deployment of ML models inside a lakehouse. Read the blog to learn more about the new developments happening in the data space.
Data startup funding news
- Tecton today announced that it raised $100 million in a Series C round that brings the company’s total raised to $160 million.
This round was led by Kleiner Perkins, with participation from Databricks, Snowflake, Andreessen Horowitz, Sequoia Capital, Bain Capital Ventures and Tiger Global. Read the full story here.
Upcoming data events and summits
- Data Mash vol. 3 is going to be held on Thursday, July 14, 2022 at 9:00 PM PDT.
This is the 3rd edition of our monthly meetup + talk series for data practitioners, founders, makers, and all-around cool folks building exciting things in the data ecosystem. Short lightning talks, long discussions.
- Monte Carlo Data announced its annual summit, IMPACT 2022: The Data Observability Summit, which will be held on October 25-26, 2022. This is going to be a hybrid event.
Apply to speak at IMPACT 2022: Selected speakers will join the stage with pioneers and visionaries responsible for some of the most impactful data movements to date, from cloud data lakes and warehouses to the mainstream evangelism of data as a profession. The last date to apply is 31st Aug 2022.
- Ramp is hiring a 'Software Engineer - Data Platform'
Location- New York, Miami, Remote
Data Stack- Airflow, Spark, Kafka, Snowflake, AWS
- Favor is hiring a 'Director, Head of Business Analytics'
Data Stack- Looker, Tableau, or Domo
- TextNow is hiring a 'Senior Data Engineer'
Location- Waterloo, Ontario
Data Stack- Redshift, Snowflake, Airflow, Spark
If you are enjoying this newsletter series, please consider forwarding it to a friend! If a friend sent you this, get the next newsletter by signing up here.
What do you think about our weekly newsletter?
If you have any suggestions, want us to feature an article, or list a data engineering job, hit us up! We would love to include it in our next edition😎
About Moderndatastack.xyz - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)