MDS Newsletter #81

Greetings, Data Folks! Brace yourselves for a fresh batch of resources on all things MDS. We're back with a bang, whether you're ready for us or not! This week's newsletter is packed with two exciting data summits, an article that breaks down the differences between batch and real-time data ingestion, and much more! Plus, you won't want to miss our latest podcast episode, which walks through the process of building a company's data stack!

So, what are you waiting for? Dive into this week's newsletter and stay ahead of the curve in the world of data!

Modern Data Show S02 E08

S02 E08 Data-Driven Fitness: An Inside Look into Urban Sports Club's Innovative Data Platform with Artur Yatsenko, Head of Data Platform at Urban Sports Club: In the latest episode of the Modern Data Show, we are joined by Artur Yatsenko to discuss the company's platform, its evolving data stack, and the challenges faced while building it. Artur shares insights on adopting open-source software and tools for data management and implementing a data-as-a-product strategy.

You can listen to this episode on Spotify, YouTube, Google Podcasts, Apple Podcasts and Amazon Music.

  • Octopai: Octopai is an automated data intelligence platform that empowers data teams with multilayered data lineage, data discovery, and a data catalogue, enabling them to trace their assets, understand the data flow in the organization, and trust their resources. Octopai can instantly navigate through some of the most complex data landscapes, whether entirely on-prem, in the cloud, or a combination of the two.
  • MOSTLY AI: MOSTLY AI’s synthetic data sets look just as real as a company’s original customer data with just as many details, but without the original personal data points – thus helping companies comply with privacy protection regulations such as GDPR, and ensuring models are fair and unbiased.

    MOSTLY AI has raised a total of $31.1M in funding over 3 rounds. Their latest funding was raised on Jan 11, 2022, from a Series B round.
  • Shuddl: Shuddl is a logistics company that uses predictive AI and geo-aware booking logic to connect shippers with the nearest available freight space on demand. The company offers 100% last-on, first-off delivery sequence, automated notifications, and machine learning facilities to ensure on-time delivery and driver performance. The company also offers effortless integration with an open API and a low-cost EDI partner. With constant visibility, accurate pick-up notifications, and prompt arrival times, shippers can track their cargo and ship with confidence.

    Here are the data tools of Shuddl:


Good reads and resources

  • AWS Services for Data Engineers | Part #1 — Data Ingestion: Get ready to embark on a journey of discovery into the world of AWS data services. Part #1 explores the art of data ingestion and uncovers the top tools for data engineers. Gaurav Thalpati breaks down the differences between batch and real-time data ingestion and reveals the essential AWS services to use for each, including AWS Glue, AWS Database Migration Service, AWS Schema Conversion Tool, Amazon Kinesis Data Streams, and Amazon Managed Streaming for Apache Kafka. With personal anecdotes and expert insights, he helps readers gain a better understanding of AWS data services. He also shares key considerations and mistakes to avoid, so readers can become AWS data ingestion pros in no time!
  • How to use dbt-expectations to detect data quality issues: In a world where data is everything, data quality has become a top priority for organizations. Madison Schott, the author of this article, delves into the importance of testing data sources and models to avoid downstream data quality issues. She introduces readers to dbt-expectations, a popular dbt package inspired by Great Expectations that provides tests for sources, models, columns, and seeds within a dbt project. She explains that dbt-expectations tests are configured in YAML and implemented with SQL, Jinja, and dbt macros, making the package a powerful tool for ensuring data quality. The article also compares dbt-expectations to other dbt testing packages, such as dbt-utils and dbt-audit-helper, and provides instructions for installing dbt-expectations. Overall, Schott's article provides a comprehensive guide to using dbt-expectations to ensure data quality.
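
For readers new to expectation-style testing, here is a minimal plain-Python sketch of the idea behind the article above: declare what a column should look like, then collect the rows that break the rule. This is only an illustration and not the dbt-expectations API (the package itself is configured in YAML inside a dbt project). The function names borrow the "expect column values..." naming style, and the `sample_rows` data, the `amount` column, and the 0 to 1000 range are all made-up assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

Row = Dict[str, Any]


@dataclass
class ExpectationResult:
    """Outcome of one expectation: its label, pass/fail flag, and offending rows."""
    name: str
    passed: bool
    failing_rows: List[Row] = field(default_factory=list)


def expect_column_values_to_not_be_null(rows: List[Row], column: str) -> ExpectationResult:
    # A row fails when the column is missing or explicitly None.
    failing = [r for r in rows if r.get(column) is None]
    return ExpectationResult(f"not_null({column})", not failing, failing)


def expect_column_values_to_be_between(
    rows: List[Row], column: str, min_value: float, max_value: float
) -> ExpectationResult:
    # In this sketch, nulls are skipped here and left to the not-null check.
    failing = [
        r for r in rows
        if r.get(column) is not None and not (min_value <= r[column] <= max_value)
    ]
    name = f"between({column}, {min_value}, {max_value})"
    return ExpectationResult(name, not failing, failing)


# Hypothetical sample data standing in for a source table.
sample_rows: List[Row] = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": -5.0},   # violates the range expectation
    {"order_id": 3, "amount": None},   # violates the not-null expectation
]

if __name__ == "__main__":
    checks = [
        expect_column_values_to_not_be_null(sample_rows, "amount"),
        expect_column_values_to_be_between(sample_rows, "amount", 0, 1000),
    ]
    for check in checks:
        status = "PASS" if check.passed else "FAIL"
        print(f"{status} {check.name}: {len(check.failing_rows)} failing row(s)")
```

In a real dbt project the same intent would be declared as tests on a source or model and run by dbt against the warehouse. The sketch only shows why catching bad rows at the source keeps them from reaching downstream models.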

Upcoming data events, webinars and summits

  • "The Industrial Data Summit", brought to you by The Manufacturer magazine, is the premier event for manufacturing data professionals. Now approaching its sixth year, this gathering promises to deliver valuable insights into the role of data and analytics in business operations. The summit is the perfect opportunity for data-minded professionals to come together and learn from one another, ensuring they are equipped with the knowledge and skills to stay competitive in the industry. The event is scheduled for April 26, 2023, in Birmingham.

    Register for the event here
  • Secure your place at the in-person "Data & Analytics Insight Summit", running from April 18 to 20, 2023, in America, and join the conversation to drive the industry forward. This summit is a must-attend event for those looking to build new connections with like-minded leaders, gain new insights to de-risk projects, and establish new partnerships to accelerate growth.

    Register for the event here

MDS Jobs

  • Gladly is hiring a Senior Analytics Engineer
    Location: USA (remote)
    Stack: dbt, Snowflake, Looker
    Apply here
  • Elastic is hiring a Cloud Data Engineer
    Location: USA (remote)
    Stack: dbt, Fivetran, Tableau
    Apply here
  • Discord is hiring a Senior Data Engineer
    Location: USA (remote)
    Stack: BigQuery SQL, Looker, Airflow, dbt
    Apply here

Just for fun 😀

Are you always hungry for more information and updates about the ever-evolving world of data?

Well, you're in luck! By following us on LinkedIn and Twitter, you'll gain access to all the latest and greatest data content!

But wait, there's more! We want to hear from you - rate us here and let us know how we're doing.

Love it | It's great |  Good | Okay-ish | Meh

We welcome any suggestions, articles you would like us to showcase, or data engineering job listings that you may have. Don't hesitate to get in touch with us, and we would be delighted to incorporate your input into our next edition.

About Moderndatastack.xyz - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)