Happy Halloween Folks🎃
It's Wednesday and we are here in your inbox with some amazing updates from the data space. In this week's edition, read all about technical and business metadata, testing and monitoring the data platform at scale, and whether small organisations can start their data observability journey.
The Modern Data Show S01 E08
On a software engineering approach to data observability with Shane Murray, Field CTO at Monte Carlo Data: For early-stage startups, bringing in full-fledged data observability can sometimes be overkill. Even when an established organisation starts monitoring its data quality, it's often hard to judge whether an issue is a tech problem or a people problem. In the latest episode of the Modern Data Show, Shane Murray, who went from being a Monte Carlo customer to joining the company as Field CTO, helps us understand these problems and how the Monte Carlo tool, using software engineering principles, addresses the issue of data downtime.
Featured tools of the week
- Census is an operational analytics platform that enables you to sync your trusted analytics data from your hub into operational tools that your business teams use on a daily basis.
Census has raised a total of $80.3M in funding over 3 rounds. Their latest funding was raised on Feb 7, 2022 from a Series B round.
- Bigeye is an automated data quality monitoring platform. Bigeye helps analytics and data engineering teams effortlessly monitor the freshness and quality of data at scale. Instant alerts and a no-code interface help the team rapidly detect and respond to data issues before mission-critical dashboards and machine-learning models are impacted.
Bigeye has raised a total of $66M in funding over 3 rounds. Their latest funding was raised on Sep 16, 2021 from a Series B round.
Featured data stack of the week
- Keyrus is a global consultancy that develops data and digital solutions for performance management. Combining business and technical expertise, Keyrus helps companies uncover the most value possible from data while optimizing digital strategy and customer experience.
Here's their data stack👇
Good reads and resources
- The Challenge of Connecting Technical with Business Metadata: Over the last couple of years, data catalogs have steadily evolved and emerged as one of the 'must-haves' for organisations. Lisa Ehrlinger defines the core functionalities of a data catalog as technical metadata management and business metadata management. Technical metadata is automatically collected from the IT database infrastructure. Beyond technical metadata, a lot of knowledge resides with the people of an organization: the “business context”, or business metadata. The article clarifies the difference between technical and business metadata and why both kinds are required. This distinction is important for CDOs and data leaders assessing the degree of automation that can be achieved when deploying a data catalog.
- Testing & Monitoring the Data Platform at Scale: Operating a data platform in a large, data-intensive organisation is full of challenges. Data teams have to make sure the data they have carefully ingested, modelled & delivered to stakeholders is not erroneous and meets expectations. Testing and monitoring help organisations keep data quality in check. Jacob Holland writes about how Checkout.com is using tools like dbt tests and Monte Carlo to ensure the reliability of its complex data pipelines.
- The diagnostic analytics gap: Dashboards are informative, but they only indicate whether a metric is going down or up. They do not provide insight into WHY a particular change has happened, and only the 'why' behind these changes can drive recommendations and actions. This leads to missed opportunities to drive business impact through data and analytics. Analytics is still primarily descriptive in most organizations (i.e., WHAT’s happening). Few data and business teams have nailed diagnostic analytics (i.e., WHY this happened). In this article, Joao Sousa writes about why the diagnostic analytics gap is relevant and how to close it.
Upcoming data events and summits
- Rivery is hosting a webinar on 'The Data Team’s Journey to Positive ROI: Making Analytics Operational' on 8th November at 8 AM PT.
In this live session, join Taylor McGrath (Rivery’s VP of Data Labs) for answers to the questions data teams are currently facing:
1. How to show positive ROI
2. What data team changes are required
3. Organizational changes to consider
4. How to truly make analytics a source of growth
Reserve your spot here.
- Castor and Bigeye are organising an event on 'Building Trust in Data Teams' on November 10th. Join the discussion to learn about how data discovery and data observability can help you leverage your data better and establish deep-rooted trust in your data team. Register here.
- Paladin is hiring a 'Head of Data & Analytics Engineering'
Location: Remote (USA)
Stack: dbt, Redshift, Segment
- Takeda is hiring a 'Data Platform Engineer'
Location: Boston, Massachusetts
Stack: Databricks and Tableau
- WorkStep is hiring a 'Data Analyst'
Stack: Fivetran, dbt, BigQuery
🔥 on Twitter
Just for fun😃
What do you think about our weekly newsletter?
If you have any suggestions, want us to feature an article, or list a data engineering job, hit us up! We would love to include it in our next edition😎
About Moderndatastack.xyz - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)