MDS Newsletter #88

We've just wrapped up the epic MDS Rocketship Awards 2023, and we're bursting with excitement! 🏆 We've had an incredible time over the past 30 days!
So, here's a big shoutout to all the winners! You absolutely rocked it. And of course, a massive THANKS goes out to our awesome jury members, whose votes determined the best tools! You all shine bright like diamonds! 🌟
Take a look at all the Winners, Categories and Jury members:

Modern Data Show S02 E14

S02 E14: Transforming Data Pipelines for the Future: An Interview with Sean Knapp, CEO of Ascend.io: Uncover the secret to turning data engineering into a superpower! Sean Knapp, the CEO and founder of Ascend.io, joined us to discuss the value of depth and breadth in capturing the entire data value chain, emphasizing the need for an automation layer that adapts to the evolving data landscape. Ascend's platform enables intelligent data pipeline creation and management, with a dynamic control plane that detects and responds to changes in real time across extensive pipeline networks. Sean also explored the potential of generative AI in data engineering and shared his optimism about the future of the modern data stack, foreseeing consolidation and the emergence of new parallel spaces in the data ecosystem.

You can listen to this episode on Spotify, Google Podcasts, YouTube, Apple Podcasts, and Amazon Music.

  • Osmos: Osmos is a data ingestion platform that helps businesses streamline their data handling. It aims to cut the time required for data ingestion from months to minutes. Implementation and operations teams can ingest clean data without relying on developers, removing a barrier typically associated with data ingestion.
  • StarRocks: An open-source, high-performance analytical database, StarRocks offers fast, fresh, and flexible analytics without compromising on quality. Regardless of the scenario, users can expect extremely fast query performance, surpassing other popular solutions by 3 to 10 times when running queries against single or multiple tables, and querying local tables or data stored in data lakes. With StarRocks, real-time analytics are guaranteed, ensuring the latest data for up-to-date insights. The database's flexibility empowers users to unleash the full potential of their data, adapting to various use cases and easily scaling their business on demand.
  • BukuWarung: BukuWarung is a digital platform that focuses on building the digital infrastructure for Micro, Small, and Medium Enterprises (MSMEs), catering to the growing demand for financial services and productivity tools. With a large user base, BukuWarung aims to provide a wide range of services, including micro-loans and payment solutions, to meet the unique needs of MSMEs. As the platform expands and accumulates transactional data, it can effectively manage risks and potentially evolve into a digital MSME neo-bank, offering comprehensive financial services from deposits to insurance.

    Here are the data tools of BukuWarung:

Good reads and resources

  • A Comprehensive Guide to Data Warehouse Architecture in 2023: Components, Design, and Best Practices: Data warehouses are crucial for businesses today, providing a centralized repository for data storage, management, and analysis. In this comprehensive guide written by Dr. Nilimesh Halder, data warehouse architecture is explored, covering its components, design principles, and best practices. He delves into the layers of data sources, data integration and ETL, data storage, metadata, and data access and analytical tools. He emphasizes the importance of clear objectives, data integration processes, optimized data storage, data security, governance, performance monitoring, scalability, and future growth. By following these guidelines, organizations can build robust and effective data warehouses that enable data-driven decision-making and provide valuable insights in the data-driven economy.
  • Airflow 2.6: A New Milestone in Data Engineering: Apache Airflow 2.6 has been released. Airflow is a tool that lets data engineers build, schedule, and monitor complex data pipelines, and the new version introduces several features and improvements. Maxime Haegeman, the author of this article, explains that the Graph View in Grid View makes it easier to navigate through large Directed Acyclic Graphs (DAGs). Additionally, the @continuous schedule allows users to trigger DAGs continuously without the need for complicated scheduling. Notifiers in Airflow 2.6 reduce the need for writing custom callbacks, enabling users to receive notifications through channels like Slack and email. The new `airflow connections test` command helps users avoid connection failures by letting them test their connections proactively. Other enhancements include managing policies at scale, limiting task concurrency at the DAG Run level, and efficiently sharing data frames between tasks using the Pandas Serializer. Maxime concludes by acknowledging the contributions of the Airflow community and invites readers to explore the possibilities of Airflow 2.6.
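
Curious what the @continuous schedule from the Airflow 2.6 item above looks like in practice? Here is a minimal sketch, not taken from the article. The `dag_id`, `start_date`, and the `poll_source` task are hypothetical names chosen for illustration. It assumes apache-airflow 2.6 or later is installed, and it relies on Airflow's rule that a continuously scheduled DAG must set `max_active_runs=1`.

```python
from datetime import datetime

# Keyword arguments for a continuously scheduled DAG. The dag_id and
# start_date are hypothetical; "@continuous" is the Airflow 2.6 preset
# named in the article, and Airflow requires max_active_runs=1 with it.
CONTINUOUS_DAG_KWARGS = {
    "dag_id": "continuous_example",
    "start_date": datetime(2023, 1, 1),
    "schedule": "@continuous",
    "max_active_runs": 1,
    "catchup": False,
}


def build_continuous_dag():
    """Build the DAG. Needs apache-airflow >= 2.6 to be installed."""
    # Imported lazily so this module can be imported without Airflow.
    from airflow import DAG
    from airflow.decorators import task

    with DAG(**CONTINUOUS_DAG_KWARGS) as dag:

        @task
        def poll_source():
            # Placeholder for real work, e.g. checking for new records.
            print("checking for new data")

        poll_source()

    return dag
```

In a real DAG file you would likely call `dag = build_continuous_dag()` at module level so the scheduler can discover it. A new run then starts as soon as the previous one finishes.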

Test your Data Analytics Skills

Guess what? We stumbled upon this super awesome Discord channel packed with data enthusiasts. And let us tell you, it's a treasure trove of mind-bending puzzles!

Picture this: there's a mysterious data puzzle, a government agency, and even some lost computers in the mix. Now, here's the exciting part: if you can crack the code and unlock the missing data from the dataset, you could win yourself a shiny custom-made computer. How cool is that?
So, join the Discord channel to uncover all the juicy details and get ready to put your data analytics skills to the test: https://discord.gg/kGhgX4Wwvn

Upcoming data events, summits and webinars

  • Introducing Generation AI, the premier event for the global data community, taking place at
    📍 Moscone Center in San Francisco  
    📅 June 26–29, 2023.
    Join thousands of data engineers, scientists, analysts, and leaders as they converge to explore the latest innovations in data and AI.
    Engage with a diverse network of data professionals from around the world, and unlock the full potential of data and AI at the Data + AI Summit. Don't miss out - register now!
  • Unlock the Power of Data for Business Growth at Datatechvibe's flagship Data and Analytics Summit, Velocity. With the Data & Analytics Market booming, this summit is a must-attend event for business leaders looking to adapt and thrive in the digital age. Join them on
    📅 June 6th-7th, 2023
    📍 Dubai Marina, UAE
    Over 30 expert speakers will share their insights and expertise across 20 industries. The summit offers thought-provoking discussions and hyper-focused sessions on topics such as data monetization, predictive analytics, data ethics, governance, real-time data visualization, data modeling languages, and scaling with data in the era of IoT and edge computing. Register here!

MDS Jobs

  • MoonPay is hiring Senior Data Analyst - Finance
    Location: Remote, United States
    Stack: dbt, Google BigQuery, and Looker
    Apply Here
  • Cameo is hiring Analytics Engineer
    Location: Remote, United States
    Stack: Snowflake, BigQuery, Redshift, dbt, Tableau
    Apply Here
  • ACLU is hiring Director of Analytics Engineering
    Location: Remote, United States
    Stack: AWS Redshift, Python, dbt, SQL
    Apply Here

Just for fun 😀

Are you always hungry for more information and updates about the ever-evolving world of data?

Well, you're in luck! By following us on LinkedIn and Twitter, you'll gain access to all the latest and greatest data content!

But wait, there's more! We want to hear from you - rate us here and let us know how we're doing.

Love it | It's great | Good | Okay-ish | Meh

We welcome any suggestions, articles you would like us to showcase, or data engineering job listings you may have. Don't hesitate to get in touch with us; we would be delighted to incorporate your input into our next edition.

About Moderndatastack.xyz - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)