The 2023 Ultimate List: Must-Follow Data-focused Blogs

From data science to data engineering, find the best resources to stay updated in the industry

Long T. Nguyen
5 min read · Oct 1, 2023
Photo by Markus Winkler on Unsplash

I find that a great part of the information I have was acquired by looking up something and finding something else on the way.

— Franklin P. Adams (1881–1960)

If you’re an aspiring data analyst or data scientist like me, you’re probably already aware of the importance of staying current on industry trends and insights. With so much information available, it can be hard to know where to begin, especially when time is limited. This is where following blogs helps. I believe that bookmarking or subscribing to the top blogs gives you access to expert insights and best practices, which can speed up your learning and keep you ahead of the curve.

Data Science

While data science overlaps with machine learning (ML) and artificial intelligence (AI), I’ve listed the following blogs/websites separately because their primary content centers on data science rather than ML/AI.

  • KDnuggets — A well-known platform in the data science community. (Note: KD stands for Knowledge Discovery). Provides articles, tutorials, and news on topics like statistics, data science, ML, and AI, contributed by various experts in the field.
  • Analytics Vidhya — Another resourceful platform for data science enthusiasts and professionals. Features articles, tutorials, and case studies on data science, ML, and related topics, aimed at explaining complex concepts in a digestible manner.
  • Towards Data Science — A Medium publication with a whopping 674K subscribers (as of September 2023). Serves as a platform for individuals to share their insights, findings, and experiences within the data science community.

Machine Learning (ML) and Artificial Intelligence (AI)

This section also covers deep learning.

  • Machine Learning Mastery — Offers practical advice and tutorials for mastering ML concepts. Authored by Dr. Jason Brownlee and other experts, the content is well-structured, making complex topics accessible to both beginners and experienced practitioners.
  • Machine Learning is Fun — An engaging blog that simplifies complex ML concepts into enjoyable and understandable posts.
  • Artificial Intelligence Newsletter — Curated by Andriy Burkov on LinkedIn, it delivers concise updates on AI industry advancements. Burkov also curates another newsletter focused on data science.
  • OpenAI Blog — Showcases recent advancements and projects by OpenAI in AI and ML, spotlighting their primary product, ChatGPT.
  • Google AI Blog — Similar to OpenAI’s blog but written by Google researchers. It offers a glimpse into the cutting-edge work being done at the company, making it a valuable resource for tech enthusiasts.
  • Data Science Central — Despite its name, this is more focused on AI, covering technical, business, and niche AI topics. A community for AI practitioners, it provides discussions, updates, and content on AI and Big Data.
  • Fast.ai — Co-created by Jeremy Howard and Rachel Thomas, two leading AI experts, it offers free courses on deep learning.
  • The Gradient — Founded in 2017 by a group of students and researchers at the Stanford AI Lab. A researcher-led newsletter that aims to simplify AI learning and promote discussion within the AI community.
  • The BAIR Blog — A blog run by the BAIR research group at UC Berkeley. It provides in-depth and academic-oriented posts on a variety of topics within AI and machine learning.
  • Data Phoenix — Covers contemporary subjects like AI for green energy, robotics in education, and generative AI, catering to enthusiasts in these areas.
  • Data Machina — A weekly newsletter updating readers on recent research, projects, and repositories in AI and ML.

Data Engineering

  • Data Patterns — A Substack newsletter written by Ergest Xheblati, a seasoned data architect with 15+ years of designing and building data warehouses. Focuses on data engineering-related topics.
  • Joe Reis — Co-author of the Amazon bestseller “Fundamentals of Data Engineering.” Joe also hosts two excellent podcasts (to be mentioned in another list).
  • EcZachly Data Engineering Newsletter — Authored by Zach Wilson on Substack, it focuses on data engineering topics. As an ex-Meta, ex-Netflix, and ex-Airbnb data engineer, Zach is someone you can trust.
  • SeattleDataGuy — Written by Ben Rogojan, whose YouTube channel, Seattle Data Guy, is also highly regarded for its data engineering content.
  • Towards Data Engineering — A Medium publication offering curated insights into data engineering.
  • Data Engineer Things — A publication on Medium dedicated to data engineering topics.
  • Jesse Anderson — Managed by a data engineer, covering various data engineering topics.
  • David Jayatillake — I enjoy his writings since his posts offer deep insights into data engineering coupled with broader wisdom.
  • Data Engineering Weekly — A Substack newsletter curated by Ananth Packkildurai, highlighting the latest trends, best practices, and developments in the data engineering domain.
  • Data Engineering Central — A Substack newsletter by Daniel Beach that explores database connections, transactions, data architecture, and other aspects of data engineering.
  • AWS Big Data Blog — Covers data engineering topics, particularly in the AWS ecosystem.

General & Other Technologies

  • ByteByteGo — A Substack newsletter by Alex Xu, a popular book author, that provides bite-sized, well-illustrated articles on system design. Even if you’re short on time and only follow Alex’s LinkedIn posts, your future data career will thank you.
  • The Pragmatic Engineer — Acclaimed as the #1 tech newsletter on Substack. Written by Gergely Orosz, a former engineer at Uber and Skype/Microsoft, it offers insider views on Big Tech and startups, actionable advice, market trend analysis, and unbiased insights with a strict no-sponsorship policy.

Inactive Blogs

The following blog authors are either not regularly active or no longer updating their websites:

  • Andrej Karpathy — A founding member of OpenAI and former Sr. Director of AI at Tesla. Last updated in March 2022.
  • Denny Britz — Ex-Google Brain. While his blog isn’t active, he’s a regular contributor on GitHub. His repositories are worth exploring.
  • Tim Dettmers — Last update was in January 2023.
  • Christopher Olah — Co-founded Distill and Anthropic (the AI lab that created Claude).
  • Distill — Highly regarded for its high-quality, peer-reviewed articles featuring engaging visuals and interactive demonstrations. Sadly, it ceased updates in September 2021.
  • Data Council Blog — Offers articles on data engineering and related fields. Last updated in April 2021.
  • Meta Analytics — A Medium publication by the analytics team at Meta. Last updated in April 2023.

Which data blogs do you swear by? Drop your favourites in the comments below.

Happy reading!

Written by Long T. Nguyen

@longnca. Data Specialist experienced in data engineering. Based in Canada. Passionate about coffee, books, and data.