MDS Newsletter #90

Guess what! We've scoured the web to bring you the latest and greatest resources in the world of data. So what are you waiting for? Dive right in and explore all that this edition has to offer.

  • Min.io: MinIO is an open-source object storage server with an Amazon S3-compatible API. It lets you build cloud-native applications that are portable across all major public and private clouds, and it protects data against hardware failures using erasure coding and bitrot detection.

    MinIO has raised a total of $126.3M in funding over 3 rounds. Their latest funding was raised on Jan 26, 2022, from a Series B round.
  • Arcion: It offers a real-time, enterprise database replication platform with Change Data Capture (CDC) technology for automatic schema conversion, end-to-end replication, and flexible deployment. Their highly distributed, highly-parallel architecture supports 10x faster data replication and guarantees zero data loss and end-to-end data consistency. Arcion's zero-maintenance data pipelines reduce the total cost of ownership through log-based CDC, efficient data compression, and Read Once, Write Multiple technologies.

    Arcion has raised a total of $18.2M in funding over 3 rounds. Their latest funding was raised on Feb 17, 2022, from a Series A round.
  • TrovaTrip: TrovaTrip makes it possible for topic experts, creatives, and entrepreneurs to host trips around the globe with their communities. Travel and learn from yogis, photographers, bloggers, outdoor enthusiasts and more.

    Here are the data tools of TrovaTrip:

Good reads and resources

  • How to Manage Schemas and Handle Standardization: "Schema management and data standardization can be a tough nut to crack," says Tomer Peleg, as he discusses the practical considerations and solutions for handling communication protocols and data standardization in an event-driven architecture. Tomer explains that the main concept of event-driven architecture is to decouple services by using events to trigger and communicate. Schemas are the APIs of event-driven services: to ensure data reliability and consistency, they serve as a contract that allows downstream consumers to adapt to changes and process data seamlessly. Tomer's team at Riskified chose Avro as its data serialization format and manages all schemas in a centralized GitHub repository organized by domain. He also discusses data standardization and how to support and align the wide range of programming languages and frameworks used by development teams.
  • When GitHub Actions Get Painful to Troubleshoot, Try This Instead: GitHub Actions workflows can be difficult to troubleshoot, but what if you could write, validate, and run your CI/CD workflow locally before pushing it to GitHub? In this article, Anna Geller discusses a solution: building a custom workflow in an event-driven workflow system like Kestra, which runs when a GitHub webhook emits a push event to the default Git branch. Anna explains that this custom workflow can be triggered locally before it is committed to GitHub, which brings several benefits, including syntax validation, autocompletion, and a helpful topology view for detecting issues early on.
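
    To make the "schema as a contract" idea from the first article a little more concrete, here is a minimal Python sketch of a backward-compatibility check between two versions of an event schema. It is not taken from the article: the OrderCreated event, its fields, and the backward_compatibility_errors helper are all hypothetical, and the rule it applies is a simplified version of Avro's schema resolution (top-level fields only, with no type promotion, aliases, or unions).

```python
import json

# Hypothetical "OrderCreated" event, written in Avro's JSON record format.
OLD_SCHEMA = json.loads("""
{"type": "record", "name": "OrderCreated", "fields": [
  {"name": "order_id", "type": "string"},
  {"name": "amount", "type": "double"}
]}
""")

# New version: adds a field WITH a default, so old events stay readable.
NEW_SCHEMA = json.loads("""
{"type": "record", "name": "OrderCreated", "fields": [
  {"name": "order_id", "type": "string"},
  {"name": "amount", "type": "double"},
  {"name": "currency", "type": "string", "default": "USD"}
]}
""")

# Broken version: adds the same field WITHOUT a default.
BROKEN_SCHEMA = json.loads("""
{"type": "record", "name": "OrderCreated", "fields": [
  {"name": "order_id", "type": "string"},
  {"name": "amount", "type": "double"},
  {"name": "currency", "type": "string"}
]}
""")


def backward_compatibility_errors(reader, writer):
    """Return the reasons why `reader` cannot read data written with `writer`.

    Simplified rule (an assumption of this sketch, not full Avro resolution):
    every reader field must either exist in the writer schema with the same
    type, or declare a default value.
    """
    writer_fields = {field["name"]: field for field in writer["fields"]}
    errors = []
    for field in reader["fields"]:
        written = writer_fields.get(field["name"])
        if written is None:
            if "default" not in field:
                errors.append(f"field '{field['name']}' is new and has no default")
        elif written["type"] != field["type"]:
            errors.append(
                f"field '{field['name']}' changed type from "
                f"{written['type']!r} to {field['type']!r}"
            )
    return errors


if __name__ == "__main__":
    print(backward_compatibility_errors(NEW_SCHEMA, OLD_SCHEMA))
    print(backward_compatibility_errors(BROKEN_SCHEMA, OLD_SCHEMA))
```

    A check along these lines could run in CI on the central schema repository, so that a change which would break downstream consumers is caught before it is merged.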

Upcoming data events, summits and webinars

  • Step into the fascinating world of data at TDWI Munich 2023, taking place on June 20-22, 2023 at MOC Munich! As the flagship event of TDWI e.V., TDWI Munich is dedicated to fostering knowledge exchange among data experts from all backgrounds. Attendees can look forward to three days of immersive learning, networking, and professional growth.
    Delve into a wide array of topics, including Advanced Analytics, Data Architecture, Data Strategy, Agile BI, IoT & Digital Twins, AI, Data Science, Cloud, Digitization, and much more. This event offers a unique opportunity to gain valuable insights and become part of a vibrant community. Secure your spot now by clicking here.
  • Join Data on the Rocks with LaunchDarkly, Airbyte, & Castor Doc in Las Vegas on June 27th from 6:00-9:00 pm (PDT). No talks or presentations, just a fantastic chance to network with fellow professionals. Relax, enjoy light bites and drinks, and connect with the teams from Census, LaunchDarkly, Airbyte, & Castor Doc. Don't miss this opportunity to unwind and make valuable connections. Register here.

MDS Jobs

  • Monzo is hiring Senior Data Scientist, Operations
    Location: UK Remote
    Stack: SQL, Python
    Apply here
  • phData is hiring Lead Data Engineer
    Location: India Remote
    Stack: Snowflake, AWS, Azure, GCP, Hadoop, Databricks
    Apply here
  • Ro is hiring Senior Analytics Engineer
    Location: US
    Stack: dbt, Snowflake, Looker
    Apply here

Just for fun 😀

Are you always hungry for more information and updates about the ever-evolving world of data?

Well, you're in luck! By following us on LinkedIn and Twitter, you'll gain access to all the latest and greatest data content!

But wait, there's more! We want to hear from you - rate us here and let us know how we're doing.

Love it | It's great | Good | Okay-ish | Meh

We welcome any suggestions, articles you would like us to showcase, or data engineering job listings that you may have. Don't hesitate to get in touch; we would be delighted to incorporate your input into our next edition.

About Moderndatastack.xyz - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)