MDS Newsletter #40
Hey Folks,
We recently posted a thread on building data products, where we listed some of the best resources for you all. Check out the thread here.
Amazing People in Data
Meet Andrew Engel, Chief Data Scientist at RasgoML, who went from solving math problems as a professor to building ML models that solve real-life business problems. Read his story to learn how he became a full-stack data scientist 👇
If you know anyone we should speak with for this series, do let us know, and stay tuned for more exciting stories!
Featured tools of the week
- Actiondesk is a spreadsheet interface that connects to your SaaS tools and databases so you can work on your data live. With Actiondesk, you can automate reports, such as a weekly sales recap, and deliver them over Slack or email. That way, your whole team stays on top of everything without having to log in to a tool.
Actiondesk raised $3.9 million in a seed round on May 24, 2022.
- Scribble Data is an MLOps product company that provides the foundational blocks on which enterprises build their ML models and analyses. Scribble Data's modular feature store, Enrich, comprises a number of pre-built feature engineering apps that help data teams cut time-to-market for each data science use case, including unified metrics, customer behavioral modeling, and recommendations.
Scribble Data has raised a total of $2.2M in funding over 2 rounds. Its latest funding came from a seed round on Mar 15, 2022.
Featured Stack of the week
Wellthy is a caregiving support service for families with complex, chronic, and ongoing care needs. It is transforming family care through personalized care support that you can control from an online dashboard. See how Wellthy has organized its data stack.
Good reads and resources
- The State of Data Engineering 2022: This blog post walks you through recent developments in the data engineering space, including the data tools that have made this space what it is today. Some of the key takeaways from this blog: 1/ Consolidation in product offerings across several data categories. 2/ Object storage and analytics engine solutions will coexist going forward. 3/ The metastore has yet to receive a better alternative. 4/ Airflow is still the biggest open-source product. 5/ Data catalogs will become standard.
Read this article by Einat Orr for a more in-depth look at what the future holds for data engineering.
- Catalog & Cocktails Podcast: Join Juan and Tim as they explore everything interesting about data and metadata management, DataOps, knowledge graphs, and more. Listen to conversations about data that are both lighthearted and thoughtful, featuring a diverse group of thought leaders in the data space.
Journal
- Survival of the Fittest: ETL vs. ELT: This blog by Benedetta Cittadin aims to give an overview of ETL and ELT processes, describe the difference between them, and outline which tools are now available for organizations. ETL and ELT can serve your data integration purpose in different ways. So, to choose the solution that’s right for you, you need to consider factors like the data you have, the type of storage you use, and the (long-term) needs of your business.
Community Speaks
This week's question: Do you think the cost of a cloud data warehouse is a problem?
You can answer the question here
Last week's question: What’s the one thing you wish dbt had?
I really wish dbt had more support for the semantic (metadata) layer. It already has support to add metadata to models via YAML, e.g. label, description, joins, data type. However, it would be great if there were a way to standardize metadata and make it more "active". For example, the dbt Cloud metadata API could provide hooks to update metadata. Perhaps even a metadata management UI, as I know managing YAML text is a pain point for many people. An improved semantic layer can enable "semantic-free" downstream tools that all speak the same language. Here's an article expanding on this concept: https://towardsdatascience.com/semantic-free-is-the-future-of-business-intelligence-27aae1d11563. As an example, here are docs showing how the FlexIt Analytics BI tool integrates with dbt for "semantic-free" BI: https://learn.flexitanalytics.com/docs/dbt
Andrew Taft, Founder @ FlexIt Analytics
Upcoming data summits and events
- Data Management Summit 2022, Italy edition, will be held on 7th July 2022 as a hybrid event.
An essential summit for CIOs, CTOs, CDOs, System Directors, and Data Scientists who implement emerging technologies to solve new technological challenges and align with new business opportunities.
Learn more about the event and registration here.
MDS Jobs
- Procore is hiring a ‘Senior Business Intelligence Analyst’
Location - Remote, United States
Data Stack - dbt, Snowflake, Tableau, Fivetran, Airflow
Apply here
- QuotaPath is hiring a ‘Senior Analytics Engineer’
Location - Remote, Austin
Data Stack - Snowflake, Redshift, BigQuery, Looker
Apply here
- Velocity Global is hiring a ‘Senior Data Engineer’
Location - Remote, Anywhere
Data Stack - dbt, Snowflake, Tableau, Fivetran, Airflow
Apply here
🔥on Twitter
Just for fun😀
If you have any suggestions, want us to feature an article, or list a data engineering job, hit us up! We would love to include it in our next edition😎
Subscribe to our newsletter, follow us on Twitter and LinkedIn, and never miss a data update again.
About Moderndatastack.xyz - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)