MDS Newsletter #44

Hey Community Members,👋

Last month, we launched a new series, “Amazing People in Data”. Our main goal with this series is to celebrate the great minds in the data space by sharing their data journeys with you. We have learned a lot from these interviews, and here are our top 5 learnings from 5 Amazing People in Data.

  • Cube Dev is headless business intelligence for building data applications. It helps provide access to data, organize it, and deliver it to every tool so developers can build powerful, fast, and consistent data applications.

    Cube Dev has raised a total of $21.7M in funding over 2 rounds. Their latest funding was raised on Jul 19, 2021 from a Series A round.
  • Alteryx unleashes the power of data analytics to help people everywhere solve business and societal problems. It unifies analytics, data science, and business process automation in one, end-to-end platform to accelerate digital transformation.
  • Clearbit is the marketing data engine for customer interactions. Clearbit helps businesses grow by providing tools that help them deeply understand their customers, identify future prospects, and highly personalize every single marketing and sales interaction. Here are the data tools they are using:

Good reads and resources

  • Why rising cloud costs are the silent killers of data platforms: Modern Data Stack tools add a tremendous amount of value and remove much of the complexity of building data platforms. But as you move up the value chain, these tools often become the biggest cost drivers in a data platform, and this is what concerns Kris Peeters most. He believes rising cloud costs are the silent killers of data platforms. His advice for cutting cloud costs as far as possible: invest in a stellar data team that can focus on building the platform, and make sure you can switch between DIY, open-source tooling, and managed services.
  • Spark Streaming as a Service: As the company grew, Riskified ran into many issues managing its old Spark Streaming setup, so the team built a self-serve Spark Streaming infrastructure. In this article, Or Sagiv describes how the team at Riskified solved each of these issues and went on to serve ~15 teams and 80+ individual streams by users alone. He walks through every step they implemented to deal with bureaucracy, stream integrations, downtime, and load on the Spark driver.

Journals

  • The Modern Data Stack Ecosystem: Spring 2022 Edition: In this journal, Jordan Volz lays out the most crucial components of the MDS and the main tools and vendors in each component. Looking ahead, he writes about the need for MDS vendors to innovate at breakneck speed and provide integrations and capabilities that far exceed what legacy systems currently offer. He also sees real-time use cases as another area the MDS is poised to move into.
  • What's the Difference Between Data Wrangling vs Data Cleansing vs Data Transformations: Data wrangling is the process of restructuring, cleaning, and enriching raw data into a desired format, whereas data cleansing is the process of prepping data for analysis by amending or removing incorrect data within a dataset. As the amount of data rapidly increases, so does the importance of both. Manually wrangling and cleansing data takes a lot of work, so JD Prater wrote a journal about how you can automate these manual processes without writing a line of code.

Community Highlights


What would dashboards 2.0 look like? (They are far from dead)
Of late, many posts and articles (deathofdashboards.com, https://go.thoughtspot.com/e-book-dashboards-are-dead.html) have declared dashboards dead. But many of us in the data industry believe this is far from the truth. So what would new-age dashboards look like?

Here's what the MDS Community had to say about it👇

If you also want to be a part of these interesting conversations, sign up here to become a member of the MDS Community!

Upcoming data events and summits

  • Operational Analytics Club, a community by Census, is hosting Summer Community Days on July 28 & 29.

    Summer Community Days is a free, virtual & practitioner-first conference hosted by the Operational Analytics Club featuring community expert sessions, keynotes, data hackathons, and local IRL happy hours in major US cities.

    Apply here to be a part of this event.

Data startup funding and acquisition news

MDS Jobs

  • Procore is hiring a ‘Senior Business Intelligence Analyst’
    Location- United States
    Data Stack- dbt, Snowflake, Airflow
    Apply here
  • Chief is hiring a ‘Senior Analytics Engineer’
    Location- USA
    Data Stack- Segment, Redshift, dbt, Looker
    Apply here
  • NutriSense is hiring a ‘Senior Data Engineer’
    Location- Remote
    Data Stack- dbt, Postgres, AWS
    Apply here

🔥On Twitter

Just for fun😄

If you are enjoying this newsletter series, please consider forwarding it to a friend! If a friend sent you this, get the next newsletter by signing up here.

What do you think about our weekly Newsletter?

Love it | It's great | Good | Okay-ish | Meh

If you have any suggestions, want us to feature an article, or list a data engineering job, hit us up! We would love to include it in our next edition😎


About Moderndatastack.xyz - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)