Hello, data enthusiasts! Have you had your weekly data dose yet? This week’s newsletter is packed with informative reads, including an insightful article that breaks down the differences between Data Warehouse, Data Lake, and Data Lakehouse architectures. Plus, we've featured a powerful tool that combines scalability, dynamic data generation, and integration with any test environment.
So get your data dose below and expand your knowledge horizons 👇
Featured tools of the week
- GenRocket: GenRocket automates the design of synthetic test data and describes its platform as the only one that combines enterprise-class scalability, dynamic data generation, integration with any test environment, and value for money. The solution reduces the time and cost of creating test data manually, and the platform is highly compatible and customizable.
- Kaskada: Kaskada offers a solution for real-time data analysis, allowing users to understand and react to events in the context of past and present information. The company's product provides a high-level, declarative query language that enables users to work across streams and bulk data sources without sacrificing power or convenience. Kaskada's query language is built on a new abstraction called timelines, which offers the benefits of SQL while retaining the ability to reason about temporal context, time travel, sequencing, and time series. The queries can be used unchanged in both batch and streaming modes.
Featured stack of the week
- L'Oréal: L'Oréal is the world's largest manufacturer of high-quality cosmetics, perfumes, hair care, and skincare products. Its brands are found in over 150 countries and include such well-known names as Lancôme, Maybelline, Garnier, Redken, and Matrix.
Good reads and resources
- Benchmarking database architectures: Data Warehouse, Data Lake and Data Lakehouse: Are you curious about the differences between Data Warehouse, Data Lake, and Data Lakehouse architectures? Data Warehouses are designed to store historical data for advanced queries and analysis, while Data Lakes allow for the ingestion of vast amounts of data without concern for structure. However, both solutions have limitations. The Data Lakehouse architecture combines the strengths of both to offer low-cost storage accessible by multiple data processing engines, raw data access, data manipulation, and flexibility. Read Serigne DIAW's article to learn more about each architecture's strengths, limitations, and key metrics to consider when choosing the right solution for your business.
- In "An Engineering Guide to Data Creation - A Data Contract Perspective - Part 1," Ananth Packkildurai explains why the data creation process matters in successful data-driven organizations. He describes a simplified workflow engine for Uber’s ride-sharing business process and how data engineering captures events at each step to add value to the business. He delves into the architectural patterns for data creation, including Event Sourcing, Change Data Capture (CDC), and the Outbox pattern. Packkildurai also outlines the limitations and drawbacks of each pattern, helping readers understand which architecture suits their organization.
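If you haven't met the Outbox pattern before, here is a minimal sketch of the general idea, not code from Packkildurai's article. It assumes a ride-sharing style example and uses Python's built-in sqlite3 as a stand-in database; the table names (rides, outbox) and the functions request_ride and relay_outbox are hypothetical names picked for illustration. The business row and its event are written in one transaction, and a separate relay step publishes the pending events later.

```python
import json
import sqlite3


def create_schema(conn):
    # Hypothetical tables: one for business state, one for pending events.
    conn.execute(
        "CREATE TABLE rides (id INTEGER PRIMARY KEY, rider TEXT, status TEXT)"
    )
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "event_type TEXT, payload TEXT, published INTEGER DEFAULT 0)"
    )


def request_ride(conn, ride_id, rider):
    # 'with conn' commits both inserts together, or rolls both back on error,
    # so an event is recorded if and only if the ride row is.
    with conn:
        conn.execute(
            "INSERT INTO rides (id, rider, status) VALUES (?, ?, ?)",
            (ride_id, rider, "requested"),
        )
        payload = json.dumps({"ride_id": ride_id, "rider": rider})
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("ride_requested", payload),
        )


def relay_outbox(conn, publish):
    # Read unpublished events in insertion order, hand each one to 'publish'
    # (a stand-in for a message broker client), then mark it as published.
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox "
        "WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute(
                "UPDATE outbox SET published = 1 WHERE id = ?", (row_id,)
            )
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    request_ride(conn, 1, "alice")
    relay_outbox(conn, lambda event_type, event: print(event_type, event))
```

In this sketch an event is marked as published only after publish returns, so a crash between the two steps would resend it on the next run. Consumers of such a relay are usually expected to tolerate duplicates.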
Upcoming data events, summits and webinars
- If you're a website owner or a marketer looking for a reliable alternative to Google Analytics, we've got some exciting news for you. On April 26, 2023, at 10 am PDT / 1 pm EDT, Hex is hosting a webinar that will introduce you to a new and powerful analytics stack.
Say goodbye to Universal Analytics and hello to RudderStack, Snowflake, dbt, and Hex. With this powerful combination of tools, you'll gain unparalleled insight into individual behavior on your website and app. Register now and take the first step towards optimizing your website's performance!
- Get ready to unlock the power of your data like never before! Join QlikWorld on April 17-20, 2023, for a one-of-a-kind experience. This year, QlikWorld is offering the latest insights, the hottest trends, and the most innovative solutions for activating your data. For those unable to attend in person, it is also offering a limited virtual experience, including the QlikWorld General Sessions.
Don't miss out on this incredible event. Register for the QlikWorld Livestream and join in on the celebration of unlocking the power of data.
Data startup funding news
- Cybersyn raises $62.9M in Series A funding round led by Snowflake with participation from Coatue and Sequoia Capital: Cybersyn is a data-as-a-service (DaaS) company focused on making the world’s economic data available to governments and businesses and enabling a new generation of decision-makers. This is the first time Snowflake, the data cloud behemoth, has led a startup funding round, as it works to grow its data marketplace and give enterprise customers access to live, readily available datasets.
Founder of Cybersyn: Alex Izydorczyk
- Eventbrite is hiring Senior Analytics Engineer
Location: Remote, Spain
Stack: Snowflake, Tableau, Airflow
- Jasper is hiring Senior Data Engineer
Location: USA (Remote or Hybrid)
Stack: dbt, Airflow, Kafka
- Capital One is hiring Senior Manager, Data Engineering
Stack: Snowflake, Hive, Kafka
Are you always hungry for more information and updates about the ever-evolving world of data?
But wait, there's more! We want to hear from you - rate us here and let us know how we're doing.
We welcome any suggestions, articles you would like us to showcase, or data engineering job listings that you may have. Don't hesitate to get in touch with us, and we would be delighted to incorporate your input into our next edition.
About Moderndatastack.xyz - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)