MDS Newsletter #53
In this week's newsletter, read all about building a data analytics team, selecting the right tooling for your business needs, data contracts (currently the most talked-about topic in the data space), and upcoming data events organised by Atlan and Monte Carlo (RSVP now!).
Modern Data Show S01 E03
Selecting the right set of data tools is important because it can have a long-term strategic impact on your business. You can choose between commercial and open-source tooling, or even custom-build tools to fit your needs. In this episode, we discussed the factors to consider when making this decision with our guests, Lucas and Addison from Hudl. We also took a deep dive into self-serve analytics, data governance, observability, and much more. Listen Now 👇
You can also listen to all the episodes on Apple Podcasts, Spotify, Google Podcasts, YouTube, and Amazon Music.
Featured tools of the week
- Streamlit is a faster way to build and deploy data apps. Streamlit's open-source Python library makes it easy to create and share custom web apps for machine learning and data science.
Streamlit has raised a total of $62M in funding over 3 rounds. Their latest funding was raised on Apr 7, 2021 from a Series B round.
- Segment simplifies the process of collecting data and connecting new tools, allowing you to spend more time using your data, and less time trying to collect it. You can use Segment to track events that happen when a user interacts with the interfaces.
Featured data stack of the week
- Branch Energy is an energy supplier that uses modern data science and engineering tools to lower its customers' bills, all with 100% renewable energy and while planting many trees.
If you want us to feature your data stack, publish it here.
Good reads and resources
- Scaling our Data Stack with Kafka and Real-Time Stream Processing: Whatnot, a community marketplace, built a new foundational piece of its data stack: a real-time streaming and processing platform. As the platform grew and the problems it needed to solve became more complex, the team realised that coupling analytics to transactional systems can make analysis much harder than it needs to be. This is when they introduced an “event bus” into their stack, with Apache Kafka as the backbone. This article by Zack Klein gives an overview of the system, the reasons for building it, and some of the strategic decisions Whatnot's engineering team made around testing, event schematization, event serialization, and stream processing.
- How to Build Your Data Analytics Team: The three main pillars for an organisation to become a data-driven business are data strategy, governance, and analytics. In this article, Louise de Leyritz talks in depth about building a strong data analytics team for your organisation. An organisation's ability to leverage its data ultimately depends on the strength of this team and how symbiotic it is with the rest of the business. Louise explores the importance of understanding the data maturity level of your organisation so that you can build a data team suited to your business needs and aligned with your business strategy.
- The Best Data Contract is the Pull Request: Data contracts can help prevent data quality issues by formalizing interactions and handovers between the different systems (and teams) that handle data. With discussion around data contracts on the rise, Gleb explains in this article why the idea of data contracts throughout the stack is extremely powerful but remains largely aspirational for most data teams. And if having data contracts everywhere is not attainable anytime soon, what can we do to prevent data from breaking?
If you have an interesting blog that you would like us to share with the data community, submit it here.
Upcoming data events and summits
- Monte Carlo is back with its annual conference, IMPACT: The Data Observability Summit. RSVP to hear from some of the industry’s most prominent voices, as well as the broader community of data leaders and architects paving the way forward for reliable data.
IMPACT will be a hybrid event scheduled from October 25 to 26, 2022. Register here for the event.
- Atlan is hosting an interactive Masterclass on 'How to Use Information Architecture Principles to Build a Shared Understanding of Your Business': Join Emily Lazio (Data Product Architect, WeWork) this Thursday at 12:30 pm ET. Check out the modules and sign up here.
Data startup funding news
- Unravel Data raises $50 million in Series D funding
The round was led by Third Point Ventures, with participation from Bridge Bank and existing investors that include Menlo Ventures, Point72 Ventures, GGV Capital, and Harmony Capital, bringing the total amount of funding raised by Unravel Data to $107 million.
- Rill Data is hiring a 'Data Engineering Lead, Customer Success'
Stack: Snowflake, BigQuery, dbt
- Paperless Post is hiring a 'Data Analyst'
Location: NYC (hybrid)
Stack: Fivetran, Airflow, dbt
- Flock Freight is hiring a 'Staff/Senior Data Engineer'
Location: US Only (San Diego (HQ), Chicago, or Full Remote)
Stack: Fivetran, dbt, Snowflake
Subscribe to our newsletter, follow us on Twitter and LinkedIn, and never miss a data update again.
If you have any suggestions, want us to feature an article, or list a data engineering job, hit us up! We would love to include it in our next edition 😎
About Moderndatastack.xyz - We're building a platform to bring together people in the data community to learn everything about building and operating a Modern Data Stack. It's pretty cool - do check it out :)