Introduction to Data Engineering

This notebook summarizes the data engineering lifecycle and emphasizes the components covered in DATA_ENG 300.

Data Engineering Lifecycle

Reis and Housley (2022) describe the data engineering lifecycle as having five main stages:

  • Generation
  • Storage
  • Ingestion
  • Transformation
  • Serving data
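
The five stages above can be sketched as a toy pipeline. This is only an illustrative sketch, not the approach from Reis and Housley (2022) or from the course labs: the sample CSV data, the `orders` table, and the function names `ingest`, `store`, `transform`, and `serve` are all hypothetical, and an in-memory SQLite database stands in for a real storage system.

```python
import csv
import io
import sqlite3

# Generation: a source system emits raw records.
# (Hypothetical sample data, used only for illustration.)
RAW_CSV = "order_id,amount\n1,10.50\n2,4.25\n3,7.00\n"


def ingest(raw_text):
    """Ingestion: read raw records from the source into Python dicts."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def store(conn, rows):
    """Storage: persist the ingested rows in a SQLite table."""
    conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(int(r["order_id"]), float(r["amount"])) for r in rows],
    )


def transform(conn):
    """Transformation: aggregate the stored rows into a summary."""
    count, total = conn.execute(
        "SELECT COUNT(*), SUM(amount) FROM orders"
    ).fetchone()
    return {"order_count": count, "total_amount": total}


def serve(summary):
    """Serving: expose the result to a consumer (here, a formatted string)."""
    return f"{summary['order_count']} orders, total {summary['total_amount']:.2f}"


conn = sqlite3.connect(":memory:")  # in-memory database, assumed for the sketch
store(conn, ingest(RAW_CSV))
report = serve(transform(conn))
print(report)
```

In a real system each stage would typically be a separate tool or service; the point here is only how data moves from one stage to the next.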


Figure: Data engineering lifecycle components (Reis and Housley, 2022).

Preparation for next week

  • Labs are on Tuesdays, 4-6 PM
  • Familiarize yourself with GitHub (if you do not have an account, register for one)
  • Familiarize yourself with Colab / Jupyter Notebook