
MDS Newsletter #45

Hey Folks,

It's Wednesday, and we are here in your inbox with some amazing updates from the data space. In this week's edition, learn how to build a modern data team by bridging the gap between analytics and AI, pick up a few customizable SQL snippets, and find a roundtable for all the data practitioners reading this.

Out of 530 data companies listed on MDS.xyz, Tecton leads with 46 likes 🚀

Go and upvote your favorite MDS company! 🧡

  • StarTree is the real-time analytics platform that brings together the scale, freshness, speed, and ease of use necessary for any company to make that vision a reality. Founded by the creators of Apache Pinot™, StarTree’s technology has been proven at scale at leading companies such as LinkedIn, Uber, Stripe, and Walmart.

    StarTree has raised a total of $28M in funding over 2 rounds. Their latest funding, a Series A round, was raised on May 5, 2021.
  • PipeRider is a data quality toolkit for data professionals. With PipeRider you can profile your data sources, create highly customizable data quality assertions, and generate insightful reports.

    PipeRider allows you to define the shape of your data once, and then use the data checking functionality to alert you to changes in your data quality.
  • HealthJoy simplifies the healthcare experience. Our platform connects your employees with the right benefits at the right time and makes staying healthy easy. It brings together everything employees need to navigate their care journey, including live support.

    Here's how they have organised their data stack to deliver that healthcare experience.

Journals

A special shoutout to Benedetta Cittadin for being the Top Contributor to MDS Journal

  • Building a Modern Data Team: From Analytics to AI: Enterprise companies often segment their data teams, and each team tends to select its own tools and processes. The result is complex pipelines that slow down getting results back to the business. Each team may have a solid process for its own scope of practice, but what about the handoffs they must negotiate? Do these tools and processes aid inter-team collaboration? Jordan Volz has an answer for you. He wrote a journal on how to bridge the gap between the analytics team and the machine learning team, and suggests steps for building better data processes. As he rightly points out, having different platforms for different groups on a data team makes the production process many times more challenging, a problem that is often overlooked.
  • Data Governance - A Thought Leader's Perspective: Many issues related to automated discrimination exist because nobody thought about using different data to train their model. Organizations should adopt the mindset that quality is not something you tack on at the end; you need to bring it into your lifecycle very early on. Read this journal by Benedetta Cittadin, where she writes about her discussion with Dan Power on implementing Data Quality for a successful Data Governance Strategy. According to Dan, end-user analysts need to be able to use data without being programmers. Over the past four or five years, people have finally realized that data governance is no longer a nice-to-have but a need-to-have.

Good reads and resources

  • The 5 Hardest Things to Do in SQL: The switch to cloud data warehouses increases the utility of complex SQL versus Python. But using SQL has its own set of downsides. Josh Berry describes the specific transformations that are the most painful to learn and perform in SQL and provides the actual SQL needed. If you ever run into trouble with date spines, pivot tables, or time-series aggregations in SQL, the article also includes customizable SQL syntax you can adapt.

Community Speaks

What are your thoughts on this? Let us know here

Upcoming data events and summits

  • Continual.ai is organising the 'Data Practitioner Roundtable: Operationalizing ML' on August 16, 2022.

    The Data Practitioner Roundtable is a periodic event hosted by Continual where data practitioners share their experience building, maintaining, and using data systems. Learn directly from these practitioners and extract valuable insights from their expertise.

    Register for the event here.

MDS Jobs

  • Loop Returns is hiring an 'Analytics Manager'
    Location- Remote
    Check out Loop Returns data stack here
    Apply here
  • Newfront is hiring a 'Senior Data Engineer'
    Location- Remote
    Data Stack- BigQuery, dbt, Airflow, Fivetran, Tableau
    Apply here
  • Nextbite is hiring a 'Data Analyst'
    Location- Remote
    Data Stack- Snowflake, dbt, Fivetran, Tableau
    Apply here

🔥 On Twitter

Just for fun 😀

If you are enjoying this newsletter series, please consider forwarding it to a friend! If a friend sent you this, get the next newsletter by signing up here.

What do you think about our weekly Newsletter?

Love it | It's great | Good | Okay-ish | Meh

If you have any suggestions, want us to feature an article, or want to list a data engineering job, hit us up! We would love to include it in our next edition 😎


About Moderndatastack.xyz - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)