Oct 12, 2022 4 min read

MDS Newsletter #55

It's Wednesday and we're back with the MDS Newsletter. We hope you're enjoying our new podcast - The Modern Data Show. If you have any suggestions for an episode or guest that we should invite. Hit us up!😃

Modern Data Show S01 E05

S01 E05: Streaming Data and the Modern Real-Time Data Stack, Lightspeed Ventures: With the modern data stack evolving constantly, the next thing to look forward to is a real-time data stack, where companies are not just producing data in real-time but also consuming it on a real-time basis. In this latest episode of the Modern Data Show, we discuss the same with our guest Nnamdi Iregbulem, who has invested in a lot of modern real-time data stack tools. Listen Now👇

You can also listen to all the episodes on Apple Podcast, Spotify, Google Podcast, YouTube and Amazon Music.

Featured tools of the week

Acceldata has developed the enterprise Data Observability Cloud to help enterprises build and operate great data products. Their Data Observability Cloud ensures the reliability of your pipelines and provides visibility into your data environments to identify, investigate, prevent, and remediate data issues.

Acceldata has raised a total of $45.6M in funding over 3 rounds. Their latest funding was raised on Sep 28, 2021 from a Series B round.
RisingWave Labs is a cloud-native streaming database that uses SQL as the interface language. It is designed to reduce the complexity and cost of building real-time applications. They have raised tens of millions of US Dollars from top-tier VC funds and angels.

Featured data stack of the week

BukuWarung is a startup focused on building bookkeeping, digital payments, and e-commerce solutions for MSMEs in Indonesia. Here's how their data stack looks like

Good reads and resources

Forrester changed the way they think about data catalogs, and here’s what you need to know: The data industry is in the middle of a fundamental shift in how we think about metadata. Gone are the days when IT teams used to create an “inventory of data” that listed its metadata and spent more time implementing and updating these tools than actually using them. With the advent of Data Catalog 2.0 in the early 2010s, the focus shifted towards data stewardship and integrating data with business context. But along with it came a challenge of adoption and companies found that people rarely used their expensive data catalogs. For a while, the data world thought that machine learning was the solution.

Now the industry is witnessing a shift again. This summer, Forrester scrapped its Wave report on “Machine Learning Data Catalogs” to make way for one on “Enterprise Data Catalogs for DataOps". In this article, Prukulpa writes about the latest sign of a major shift in how data space thinks about metadata as a follow-up to Forrester's recent publication
What to Do When Business and Data Teams Are Disconnected: A smooth working and coordination between the organisation's data team and the business team lead to better decision-making by producing more reliable insights and driving business towards being data-driven. However most of the time this isn't the case. There often is a disconnect between data teams and business teams. This could mean a team tracking analytics using a separate external platform, different from your warehouse, or the data team deleting a dashboard they thought was no longer relevant to the business. While this is a difficult problem to solve, it can be done. Madison Schott shares how can businesses bridge the communication gap between data and business teams.

Journal

Why Can’t I Find the Right Data?: There may be many reasons why a company adopts a data catalog like- answering questions about where to find data and what datasets to use, improving data quality for making business decisions and empowering your stakeholders to self-serve the data, and more importantly, the right data. In this post, Pardhu shares his experience of building DataHub at LinkedIn and the learnings from 100+ interviews with data leaders and practitioners at various companies. He shares why data catalogs fall short of overcoming the fragmentation to deliver a fully self-served data discovery experience.

Upcoming data events and summits

Monte Carlo is back with its annual conference, IMPACT: The Data Observability Summit. RSVP to hear from some of the industry’s most prominent voices, as well as the broader community of data leaders and architects paving the way forward for reliable data.

IMPACT will be a hybrid event scheduled from October 25 to 26, 2022. Register here for the event.
Castor and Bigeye are organising event on 'Building Trust in Data Teams' on November 10th. Join the discussion to learn about how data discovery and data observability can help you leverage your data better and establish deep-rooted trust in your data team.

Register here.

MDS Jobs

Takeda is hiring a ' Data Platform Engineer'
Location: Boston, Massachusetts
Stack: Databricks and Tableau
Apply here
EnergyHub is hiring a Data Engineering Manager
Location: Remote (US)
Stack: Python, Airflow, Snowflake, dbt, Tableau
Apply here
Qogita is hiring a 'Business Intelligence Analyst'
Location: UK, Netherlands
Stack: dbt, Looker, Snowflake and Fivetran
Apply here

🔥 on Twitter

Just for fun😃

If you are enjoying this newsletter series please consider forwarding this to a friend! If a friend sent you this, get the next newsletter by signing up here

Love it | It's great | Good | Okay-ish | Meh

If you have any suggestions, want us to feature an article, or list a data engineering job, hit us up! We would love to include it in our next edition😎

About Moderndatastack.xyz‌‌ - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)