MDS Newsletter #110

Want to be a part of an amazing conference of forward-thinkers and doers in the modern data stack space? Does the idea of seamlessly orchestrating data across different cloud-based platforms and storage systems ignite your curiosity? If yes, this edition is for you! Within these digital pages, you'll find links to two upcoming events, insightful reading resources, interesting data tools and a lot more! Dive in and let your data-driven aspirations take flight!

  • Grouparoo: Grouparoo is an open-source framework that specializes in data management, optimization, and analysis. It helps move data smoothly between cloud-based tools and data storage systems, fostering a harmonious workflow among Data, Product, and operational teams including Marketing, Sales, and Support. The platform is known for its extensive set of integrations, which include Salesforce, MySQL, Postgres, and Snowflake, among others.
  • Husprey: Husprey is an all-in-one tool for data analytics. It aims to give data teams an integrated workspace where they can best accomplish their mission. The platform provides data notebooks that are accessible with a link and organized in workspaces where they can be easily found, so that businesses and data teams can collect business requirements, gather technical elements, build knowledge around the data, and make decisions based on it.
  • Trackingplan: Trackingplan is an all-in-one automated observability and quality assurance solution for data, analytics, and marketing teams. It ensures uninterrupted tracking and attribution, empowering users to create alerts for critical aspects, troubleshoot tracking issues, document data flows, and maintain data accuracy. It offers real-time issue detection, email and Slack integration, replaces outdated spreadsheets, and establishes a central source of truth for seamless teamwork and problem resolution. This streamlines quality assurance for analytics, improving coordination between data introduction, maintenance, and consumption teams.

    Here is the data stack of Trackingplan:

Good reads and resources

  • Simplifying Data Transformation in Redshift: An Approach with DBT and Airflow: Using dbt and Airflow makes transforming data in Redshift easier and faster, turning tough tasks into simple ones. Dive deep into the article by Cicero Moura, who outlines how data engineers routinely need to extract, transform, and load data but find this difficult with Amazon Redshift because of the complexity of SQL scripts and their dependencies. The combination of dbt (Data Build Tool) and Airflow makes this smoother: he explains that it works alongside Redshift to make transformations controllable and testable, with benefits such as simplified syntax, easy extension with macros, automated documentation, and comprehensive logs. This approach helps teams save operational resources, speed up data delivery, and improve the quality of data products.
  • ETL vs ELT for Analytics Backend: ETL is like carefully organizing your books before putting them on the shelf, while ELT is like chucking them on the shelf first and then moving them around as you need! Read on as Jacques Sham discusses the differences between ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes for an analytics backend. Sham outlines that ETL is the more traditional, structured approach, in which data is extracted, transformed to fit a predefined schema, and then loaded into a data warehouse for analysis. ELT, a newer approach, first loads the raw data into the system and then applies transformations on an as-needed basis. ELT takes advantage of modern cloud storage and computing capabilities, allowing far more flexibility and scalability while efficiently dealing with unstructured data. However, Jacques notes that this approach requires robust data governance processes to avoid chaotic and non-compliant data management.
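
To make the ETL vs ELT distinction concrete, here is a minimal sketch in Python. It is not taken from the article: it uses an in-memory SQLite database as a stand-in for a warehouse, and the sample records, table names, and function names are all hypothetical. The ETL path cleans rows in application code before loading them into a fixed schema; the ELT path loads the raw strings untouched and applies the same cleanup later in SQL.

```python
import sqlite3

# Hypothetical raw order records, as they might arrive from a source system.
RAW_ORDERS = [
    ("1001", " Alice ", "19.99"),
    ("1002", "bob", "5.00"),
    ("1003", "ALICE", "30.01"),
]


def run_etl(conn):
    """ETL: clean the rows in application code, then load into a fixed schema."""
    conn.execute(
        "CREATE TABLE orders_etl (order_id INTEGER, customer TEXT, amount REAL)"
    )
    cleaned = [
        (int(order_id), name.strip().lower(), float(amount))
        for order_id, name, amount in RAW_ORDERS
    ]
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?, ?)", cleaned)


def run_elt(conn):
    """ELT: load the raw strings untouched, then transform with SQL in the database."""
    conn.execute(
        "CREATE TABLE raw_orders (order_id TEXT, customer TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", RAW_ORDERS)
    # The transformation lives in a view, so it can be changed later
    # without reloading the raw data.
    conn.execute(
        """
        CREATE VIEW orders_elt AS
        SELECT CAST(order_id AS INTEGER) AS order_id,
               LOWER(TRIM(customer)) AS customer,
               CAST(amount AS REAL) AS amount
        FROM raw_orders
        """
    )


def totals_by_customer(conn, table):
    """Return the total order amount per customer, rounded to cents."""
    query = (
        f"SELECT customer, ROUND(SUM(amount), 2) FROM {table} "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
run_etl(conn)
run_elt(conn)
etl_totals = totals_by_customer(conn, "orders_etl")
elt_totals = totals_by_customer(conn, "orders_elt")
print(etl_totals)
print(elt_totals)
```

Both paths should report the same totals. The practical difference is that the ELT path still has the original, uncleaned rows in raw_orders, which gives flexibility to redefine the transformation but also means someone has to govern that raw table.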

Upcoming data events, summits and webinars

  • The Open Source Analytics Conference (OSA CON) is a must-attend event for anyone passionate about open-source analytics. Details:

    📅  December 12-14, 2023
    📍  Virtual

    OSA CON offers a unique opportunity to immerse oneself in the world of cutting-edge data solutions. At OSA CON, attendees can gain invaluable insights, connect with industry experts, and discover the latest trends in data ingestion, orchestration, databases, infrastructure, governance, visualization, and artificial intelligence. This conference is a gathering of forward-thinkers and doers in the modern data stack, making it the premier destination for those eager to embrace the future of analytics. Register here
  • Join the data revolution at revAlation, powered by Alation, in Sydney 2023. Details:

    📅  November 9, 2023
    📍  Museum of Contemporary Art, Sydney

    This event promises to transform data users into data radicals. At revAlation Sydney 2023, participants will delve into the world of data like never before. The event provides a unique opportunity to connect with a diverse community of data professionals from across the Asia Pacific region, fostering growth, collaboration, and the exchange of valuable knowledge and experiences. Register here

MDS Jobs

  • Mammoth Growth is hiring Senior Analytics Engineer
    Location: USA, Canada
    Stack: SQL, dbt, Snowflake, BigQuery, Redshift, Fivetran, Hightouch, Looker, Tableau, Sigma
    Apply here
  • EnergyHub is hiring Data Engineering Manager
    Location: US
    Stack: Snowflake, dbt, Tableau, Python, Airflow, Fivetran, Metaplane
    Apply here
  • Dollar Shave Club is hiring Senior Analytics Engineer
    Location: US
    Stack: Tableau, dbt, Redshift, Fivetran
    Apply here

Just for fun 😀

Are you always hungry for more information and updates about the ever-evolving world of data?

Well, you're in luck! By following us on LinkedIn and Twitter, you'll gain access to all the latest and greatest data content!

But wait, there's more! We want to hear from you - rate us here and let us know how we're doing.

Love it | It's great | Good | Okay-ish | Meh

We welcome any suggestions, articles you would like us to showcase, or data engineering job listings that you may have. Don't hesitate to get in touch with us at [email protected] and we would be delighted to incorporate your input into our next edition.

About Moderndatastack.xyz - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)