Joseph Machado
1,981 posts
@startdataeng
I do data & AI engineering. Free Data Engineering Fundamentals Course: de101.startdataengineering.com
New York
startdataengineering.com
Joined April 2020
32 Following · 9,382 Followers

  • Pinned
    Joseph Machado
    @startdataeng
    May 25, 2020
    Exercise project for anyone starting in data engineering startdataengineering.com/post/data-engi… #dataengineering #bigdata #ETL #ApacheAirflow #AWS #ApacheSpark
  • Joseph Machado
    @startdataeng
    Apr 19, 2022
    Learning data engineering? Build a pipeline locally.
    1. Python to pull data from an API (e.g. Coincap)
    2. Load data into a local Postgres container
    3. Automate it with cron/task scheduler
    Start small, build, improve, & repeat. #data #dataengineering #pythonlearning #Python
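The three steps in this post can be sketched roughly as below. This is a minimal illustration, not the author's code: the field names `id` and `priceUsd`, the table `asset_price`, and the function names are hypothetical, the API call is replaced by an injected fetcher so the sketch runs offline, and the standard-library `sqlite3` stands in for the local Postgres container.

```python
import sqlite3
from datetime import datetime, timezone


def load(conn, rows, pulled_at):
    """Insert the pulled rows into a (hypothetical) asset_price table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS asset_price "
        "(asset_id TEXT, price_usd REAL, pulled_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO asset_price VALUES (?, ?, ?)",
        [(r["id"], float(r["priceUsd"]), pulled_at) for r in rows],
    )
    conn.commit()


def run(conn, fetch_rows):
    """One pipeline run: pull rows with the injected fetcher, then load them."""
    rows = fetch_rows()
    load(conn, rows, datetime.now(timezone.utc).isoformat())
    return len(rows)


def fake_fetch():
    """Stand-in for the real API call; the row shape is an assumption."""
    return [
        {"id": "bitcoin", "priceUsd": "40000.5"},
        {"id": "ethereum", "priceUsd": "3000.25"},
    ]


# sqlite3 in memory stands in for Postgres so the sketch has no dependencies.
conn = sqlite3.connect(":memory:")
print(run(conn, fake_fetch))
```

To automate it, the script could be saved under a name of your choosing and scheduled with a cron entry that runs it at a fixed interval. Swapping in a real HTTP client and a Postgres driver only changes `fake_fetch` and the connection object.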
  • Joseph Machado
    @startdataeng
    May 6, 2022
    Starting as a DE? 90% of what you will need is SQL (OLAP), Python, & distributed system basics. Don't overcomplicate! #data #dataengineering #SQL #Database #Python
  • Joseph Machado
    @startdataeng
    Apr 20, 2022
    When the data to process is larger than memory, try streaming it with Python generators before jumping to distributed systems! #data #dataengineering #Python #pythonlearning #Generator E.g., stream a file (note the () and not [], so rows are produced lazily instead of being loaded all at once) and get the diff between date columns.
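The example attached to this post did not survive, so here is a minimal sketch of the idea rather than the original code. The column names `order_id`, `order_date`, and `ship_date` are hypothetical, and an in-memory `io.StringIO` stands in for a large file on disk.

```python
import csv
import io
from datetime import date


def date_diffs(lines):
    """Return a generator of (order_id, days between the two date columns)."""
    rows = csv.DictReader(lines)  # reads one row at a time from the handle
    # Parentheses make this a generator expression: nothing is computed until
    # it is iterated, and only one row is held in memory at a time.
    # Square brackets would build the entire list in memory instead.
    return (
        (
            row["order_id"],
            (date.fromisoformat(row["ship_date"])
             - date.fromisoformat(row["order_date"])).days,
        )
        for row in rows
    )


# In practice `lines` would be an open file handle; StringIO keeps this runnable.
sample = io.StringIO(
    "order_id,order_date,ship_date\n"
    "1,2022-04-01,2022-04-05\n"
    "2,2022-04-10,2022-04-20\n"
)
for order_id, days in date_diffs(sample):
    print(order_id, days)
```

Because the result is consumed row by row, memory use stays flat regardless of file size, as long as the consumer does not collect everything into a list.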
  • Joseph Machado
    @startdataeng
    May 3, 2022
    Preparing for SQL interviews? Do the Leetcode SQL hard problems: sort by frequency and do the first 40. #data #dataengineering #Software #SQL
  • Joseph Machado
    @startdataeng
    Mar 23, 2022
    Starting a data project is a lot of work! It can be overwhelming to define the problem, set up systems, and then code! Use this DE project as a blueprint to build your own: startdataengineering.com/post/data-engi… #data #dataengineering #Database #DataAnalytics #dataviz #Python #datapipeline
  • Joseph Machado
    @startdataeng
    Mar 12, 2022
    It can be overwhelming to start learning data engineering. I'd recommend starting with the basics of Python, SQL, and UNIX commands, building a simple data project, and updating your GitHub and LinkedIn. Landing a DE job is 60% learning and 40% marketing. See reply 👇🏽 for helpful links.
  • Joseph Machado
    @startdataeng
    May 10, 2020
    If you are interested in using the "Change Data Capture" pattern for streaming ETL, check out startdataengineering.com/post/change-da… #ETL #changedatacapture #dataengineering #debezium #BigData
  • Joseph Machado
    @startdataeng
    Jul 14, 2023
    If you are looking for an end-to-end streaming tutorial or a project to understand the foundational skills required to build streaming pipelines, this post is for you. You'll learn fundamental streaming concepts: startdataengineering.com/post/data-engi… #dataengineering #datastreaming #dataops
    Data Engineering Project: Stream Edition – Start Data Engineering
    From startdataengineering.com
    26K
  • Joseph Machado
    @startdataeng
    May 2, 2022
    An orchestration tool that I've been impressed with is @dagsterio. Easy setup, powerful features and great docs. Use 👇🏽 to play around with a pipeline on dagster startdataengineering.com/post/data-engi… #data #dataengineering #Python #Database #DataAnalytics
    End-to-end data engineering project - batch edition – Start Data Engineering
    From startdataengineering.com
  • Joseph Machado
    @startdataeng
    Nov 2, 2023
    Are you looking for an end-to-end streaming tutorial or a project to understand the foundational skills required to build streaming pipelines? Then this post is for you. We will use Apache Flink and Apache Kafka for stream processing and queuing. startdataengineering.com/post/data-engi… #data
    Data Engineering Project: Stream Edition – Start Data Engineering
    From startdataengineering.com
    33K
  • Joseph Machado
    @startdataeng
    Dec 6, 2023
    Check out this post that covers topics that can take your SQL skills to the next level and help you become a better data engineer. startdataengineering.com/post/improve-s… #data #dataengineering #database #SQL #dataprocessing #dataanalytics
    How to improve at SQL as a data engineer – Start Data Engineering
    From startdataengineering.com
    18K
  • Joseph Machado
    @startdataeng
    Oct 16, 2023
    Data engineers often work with APIs, but most do not have clear documentation. Knowing the standard REST API design helps make extracting data from them more straightforward. Check out this article that goes over REST API design in detail: learn.microsoft.com/en-us/azure/ar… #data #API
    Microsoft Learn
    Web API Design Best Practices - Azure Architecture Center
    From learn.microsoft.com
    32K
  • Joseph Machado
    @startdataeng
    Nov 20, 2023
    Pulling data from an API is a common data engineering task. Here are a few tips to make your API data pipelines resilient. 🧵 1. Paginate: The dataset may be too large or the API server only sends a max of n rows. #data #dataengineering #datapull #EL
    39K
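
The pagination tip in the last post can be sketched as follows. This is an illustrative sketch under assumptions, not code from the thread: the `offset`/`limit` parameter names and the `fetch_page` callable are hypothetical, and real APIs may paginate with cursors or page numbers instead.

```python
def fetch_all(fetch_page, page_size=100):
    """Yield every row by requesting pages until a short or empty page arrives."""
    offset = 0
    while True:
        page = fetch_page(offset=offset, limit=page_size)
        yield from page
        if len(page) < page_size:  # a short or empty page means no more data
            break
        offset += len(page)


# A fake API that serves slices of an in-memory dataset, for demonstration only.
DATA = list(range(250))


def fake_page(offset, limit):
    return DATA[offset:offset + limit]


rows = list(fetch_all(fake_page, page_size=100))
print(len(rows))
```

Yielding rows as each page arrives keeps memory use bounded and lets downstream steps start before the full dataset has been pulled.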