MDS Newsletter #23

Hey all👋

I hope you had a great start to this week.

Last month we launched the inaugural edition of the MDS Rocketship Awards to honor the movers & shakers of the modern data space. We are announcing one category winner each day. So, if you haven't already, go take a look at the winners and find out which tool is leading which category. Here are the links to the announcements: Twitter thread & MDS Awards page.

Let's dive into this week's edition!

Community Speaks

Last week's question: What's the one piece of advice you'd give to companies that have just started building their data teams?

Here are some really thoughtful answers that we received:

Companies just getting started should focus on the quality of their data and developing a shared language before they jump at building models. There is a huge difference in having collected the data and having organized it. Short version, hire data engineers first!
Ameen Kazerouni, Chief Data & Analytics Officer, Orangetheory Fitness

Based on this excellent blog, I advise colleagues to avoid 'doing data science' and focus intently on solving business problems.
Often this starts with not hiring a data scientist (first, or only). Instead there must be an investment in strong foundations, so hiring in data/analytics engineers and data analysts to work on the stack and business concept understanding first will get you to value much faster.
Dina Mohammad-Laity, VP Data, Feeld

Our advice to companies just getting started building their data teams is to learn more about the different skill sets of data specialists, and know what kind of profile they're looking for.
Team, Weld

Don’t boil the ocean. If you are starting a data team at an existing company, there are practically an infinite number of ways to start. The important part is to start and to find ways to deliver value on the way to building out your stack.
Jacob Matson, VP of Finance & Operations, Simetric, Inc.

Bring in senior and modern data engineers before hiring juniors or even thinking about Data Science
Jitse-Jan V., Head of Data, Lend Invest

Get to know how Data Scientists, Analysts, and Engineers work. Then, figure out how Software and DevOps Engineers can collaborate with them. By doing so, you'll leverage the best that each one can do for your company, from business to technology.
Ricardo Mendes, Head of Data, CI&T

This week's question: With Women's Day around the corner, we are celebrating and recognizing all the wonderful women around us. So, who are the women in data who have inspired you through their work and achievements?

You can send answers by replying to the newsletter email or using the 'contact us' section on our website.

The need to automate workflows has been around for a long time; originally, 'cron' was the go-to tool for automation. But it's not viable anymore! Why?

Increased data volume and complexity.

This has led to a new generation of tooling for the MDS: workflow orchestration.

A workflow refers to any repeated software process; these processes may be defined in code or be entirely manual. Workflow orchestration then is the act of managing and coordinating the configuration and state of such automated processes.
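
To make that definition a little more concrete, here is a minimal Python sketch of what "coordinating the state" of a workflow can look like. It is an illustration only, not how Prefect or any other orchestrator is implemented. The names `State`, `run_workflow`, and the `extract`/`transform`/`load` tasks are hypothetical, and the sketch assumes every upstream name is also a key in `tasks`.

```python
from enum import Enum
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, Set


class State(Enum):
    PENDING = "pending"
    SUCCESS = "success"
    FAILED = "failed"
    SKIPPED = "skipped"


def run_workflow(
    tasks: Dict[str, Callable[[], None]],
    upstream: Dict[str, Set[str]],
) -> Dict[str, State]:
    """Run tasks in dependency order and record the final state of each one.

    Assumption: every name listed in `upstream` is also a key in `tasks`.
    """
    states = {name: State.PENDING for name in tasks}
    # Map each task to the set of tasks that must finish before it.
    graph = {name: upstream.get(name, set()) for name in tasks}
    # static_order() yields each task only after all of its upstream tasks.
    for name in TopologicalSorter(graph).static_order():
        # Skip a task if anything it depends on did not succeed.
        if any(states[dep] is not State.SUCCESS for dep in graph[name]):
            states[name] = State.SKIPPED
            continue
        try:
            tasks[name]()
            states[name] = State.SUCCESS
        except Exception:
            states[name] = State.FAILED
    return states


# Hypothetical three-step pipeline in which the middle step fails.
def extract() -> None:
    pass


def transform() -> None:
    raise ValueError("bad row")


def load() -> None:
    pass


example_states = run_workflow(
    {"extract": extract, "transform": transform, "load": load},
    {"transform": {"extract"}, "load": {"transform"}},
)
```

A plain cron schedule would fire `load` at its scheduled time whether or not `transform` had succeeded; tracking state per task is what lets the sketch skip it instead.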

Read an amazing article and Twitter thread on 'Workflow Orchestration' by Chris White, CTO of Prefect.

  • Databand helps data engineers scale their infrastructure while maintaining data health standards so their organization can build better data products.

    Category: Data Quality Monitoring

    Databand has raised a total of $14.5M in funding over 3 rounds. Their latest funding was raised on 1st Dec from a Series A round.
  • Correlated is a product-led revenue platform that uses insights from the people using your product to alert your sales and revenue team and trigger the next best actions, helping you exceed your expansion quota every quarter.

    Category: PLG CRM

    Correlated has raised a total of $8.3M in funding over 2 rounds. Their latest funding was raised on 4th Aug 2021 from a Seed round.

Good reads and resources

  • Data & Analytics Trends to Watch in 2022: 2021 was a good year for the data industry. A lot of new trends, such as data mesh and headless BI, led to hot discussions among data practitioners, and many of them are still going strong in 2022. This year is looking promising too: in just one and a half months we already have active discussions on newer concepts like “bundling” of the modern data stack (there were a few around unbundling as well). So what trends are going to make a buzz in 2022?

    In this article, Taylor Brownlow has shared her predicted 10 trends for Data & Analytics in 2022. She has discussed each one of her predicted trends in detail.
  • Analytics Stacks for Startups: Every day, thousands of startups pop up around the world. Few survive the harsh realities of the market, and the rest shut down pretty soon. The hardship doesn’t stop there for those that survive: startups that survive and scale face a lot of problems due to increased data volume, small teams, and whatnot.

    As a result, there are some questions that you might have a hard time answering from your data, such as “What are the margins of our products after returns and taking vouchers into account?” or “What influences the fulfillment rate of customer orders in our marketplace?” The reason can be messy data, or data spread across sources that are not integrated with each other. Or maybe someone has to manually repeat this time-consuming and error-prone analysis every other week because you have no way to automate it.

    Increased data volume poses a real challenge for growing startups: managing and making proper use of data while scaling.

    In this article, Jan Katins has talked about a Data Warehouse (DWH) tech stack, which will enable your startup to make data-driven decisions, allow the data team to iterate fast and without unnecessarily straining any engineering resources, and still be reasonably future-proof.
  • Why Becoming a Data-Driven Organization Is So Hard: Being data-driven has been a priority for companies for decades, but many have seen mixed results. Why? Based on a new survey of executives, company culture is a harder hurdle to clear than any technical problem. On top of that, the continuing explosion of the amount of data and growing concerns over privacy and data ownership keep making the task harder. In this article, Randy Bean has shared 3 principles to help companies overcome the barriers to becoming data-driven.
  • Reflections On Designing A Data Platform From Scratch: Building a data platform is a complex journey that requires a significant amount of planning to do well. It requires knowledge of the available technologies, the requirements of the operating environment, and the expectations of the stakeholders. In this episode of Data Engineering Podcast, Tobias Macey, the host of the show, reflects on his plans for building a data platform and what he has learned from running the podcast that is influencing his choices.
  • How to get more power from your data analytics engine: With the value of business data growing, we need to think about where it is worth investing to get the most out of it. Maybe it's not where you would expect. In this amazing writeup, Petr Janda discusses how data has changed the way businesses operate and why extracting value from data is top of mind for every business leader. He shares some great points on how to go about extracting more value from data using your data analytics engine: The path towards data-driven business, The hardship of the Data Team, & Data beyond ‘a data team’.
  • Building a Data Engineering Center of Excellence: In the current business environment where the data-driven approach to conducting business is getting more important each year, the need for skilled data engineers has never been greater. In this article, Richie Bachala has discussed the essential components of a functioning data engineering practice and why data engineering is becoming increasingly critical for businesses today, and how you can build your very own Data Engineering Center of Excellence!

Upcoming data events

  • Acceldata is organizing a session, 'Fine Wine and Optimizing Data Pipelines', on March 4th, 2022.
    Theme: What’s the big deal about data observability?

    This session will focus on how, when, and where enterprises can apply data observability to successfully architect, operate and optimize complex data systems at scale.

    Register here.

Call For Speakers

  • Airflow is organising 'Airflow Summit 2022' from May 23 to May 27, 2022.
    It will be a free, hybrid event where members of the Airflow community will watch sessions and network.

    Apache Airflow Summit is looking for great speakers who will provide engaging and interesting content for the event.

    Click here to register yourself as a speaker.
    Last day of submission: 14 Mar 2022.
  • Apache Beam is organising 'Beam Summit 2022' from July 18th to 20th.

    The Beam Summit brings together experts and the community to share the exciting ways they are using, changing, and advancing Apache Beam and the world of data and stream processing.

    Submit your CFP here
    Last day of submission: 15 Mar 2022.

Data Startup funding news

  • Orkes raised $9.3 million in funding!

    Orkes is a cloud-native microservices and workflow orchestration platform powered by Netflix Conductor.

    This round of funding was led by Battery Ventures and Vertex Ventures, with participation from angel investors including Mahendra Ramsinghani and Gokul Rajaram, as well as seasoned executives at Fortune 100 companies.

    Read here
  • dbt raised $222 million in a Series D round at a $4.2 billion valuation!

    dbt is a development framework that combines modular SQL with software engineering best practices to make data transformation reliable, fast, and fun.

    This funding round was led by existing investor Altimeter with participation from Databricks, GV, Salesforce Ventures, and Snowflake.

    Read here
  • Redpanda Data raised $50 million in Series B funding!

    Redpanda Data is a modern streaming data platform for developers.

    This round of funding was led by GV, along with existing investors Lightspeed and Haystack Venture Capital.

    Read here

MDS Jobs

  • Shipyard is hiring a 'Data Community Advocate'
    Location: Remote/US
    Apply here
  • Bol.com is hiring an 'Analytics Engineer'
    Location: Analytics Engineer
    Apply here
    Check out Bol.com data stack here
  • Convoy is hiring a 'Business Intelligence Engineer'
    Location: Seattle, Washington
    Apply here
    Check out Convoy's data stack here
  • Workable is hiring a 'Data Analytics Engineer'
    Location: Barcelona, Catalonia, Spain
    Apply here

What's 🔥 on Twitter

Just for fun

If you like this newsletter (I know you do😉 ), share it with your friends. It will take you 10 seconds to share, but it took us 10 hours to prepare. Send us some love 💖

Do you have a suggestion, want us to feature an article, or want to list a data engineering job? Hit us up! We would love to include it in our next edition😎

About Moderndatastack.xyz

We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)