DP-900 - Azure Data Factory

Azure Data Factory - 9 Cards
Azure Data Factory
Fully managed, serverless data integration service for building ETL and ELT workflows.
Pipeline
Logical group of activities that can be scheduled as a unit. Activities can be chained to run sequentially or left independent to run in parallel. A pipeline can also execute other pipelines.
Activity
Represents a step in a pipeline. Three types - Data movement, Data transformation, Control activities.
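To make the three activity types concrete, here is a minimal sketch of a pipeline definition as a Python dict, shaped like the JSON that Data Factory uses. The pipeline and activity names are hypothetical, the activities are stripped down to name and type, and the type-to-category mapping is a partial illustration rather than a complete list.

```python
# Partial, illustrative mapping from activity "type" values to the three categories.
CATEGORY_BY_TYPE = {
    "Copy": "data movement",
    "ExecuteDataFlow": "data transformation",
    "DatabricksNotebook": "data transformation",
    "ForEach": "control",
    "IfCondition": "control",
    "ExecutePipeline": "control",
    "Wait": "control",
}

# Hypothetical pipeline: names are made up, and each activity is reduced
# to the two fields this sketch needs.
pipeline = {
    "name": "DailySalesPipeline",
    "properties": {
        "activities": [
            {"name": "CopyRawSales", "type": "Copy"},
            {"name": "CleanSales", "type": "ExecuteDataFlow"},
            {"name": "RunReporting", "type": "ExecutePipeline"},
        ]
    },
}


def categorize(pipeline):
    """Return {activity name: category} for every activity in the pipeline."""
    activities = pipeline["properties"]["activities"]
    return {a["name"]: CATEGORY_BY_TYPE.get(a["type"], "unknown") for a in activities}


print(categorize(pipeline))
```

The ExecutePipeline activity is how one pipeline runs another, which is why it falls under control activities.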
Data Flow
Used to create and manage data transformation logic. Data flows form a reusable library of transformation routines and execute on a Spark cluster that is spun up and down automatically.
Control Flow
Orchestrates pipeline activities, for example by running one activity based on the output of another activity.
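Control flow between activities is expressed through each activity's dependsOn list, which names an upstream activity and the outcome it must have. The sketch below is a simplified local model of that rule, not Data Factory's own scheduler: the activity names are hypothetical, and treating several conditions in one dependency entry as alternatives is an assumption of this sketch.

```python
# Hypothetical activities: one copy step, one step that runs on success,
# and one that runs only if the copy fails.
activities = [
    {"name": "CopyRawSales", "type": "Copy"},
    {
        "name": "CleanSales",
        "type": "ExecuteDataFlow",
        "dependsOn": [{"activity": "CopyRawSales", "dependencyConditions": ["Succeeded"]}],
    },
    {
        "name": "NotifyFailure",
        "type": "WebActivity",
        "dependsOn": [{"activity": "CopyRawSales", "dependencyConditions": ["Failed"]}],
    },
]


def condition_met(condition, outcome):
    """'Completed' accepts either final outcome; other conditions must match exactly."""
    if condition == "Completed":
        return outcome in ("Succeeded", "Failed")
    return condition == outcome


def is_ready(activity, outcomes):
    """True when every upstream activity has finished with an accepted outcome."""
    for dep in activity.get("dependsOn", []):
        outcome = outcomes.get(dep["activity"])
        if outcome is None:
            return False  # upstream activity has not finished yet
        # Assumption of this sketch: conditions in one entry are alternatives.
        if not any(condition_met(c, outcome) for c in dep["dependencyConditions"]):
            return False
    return True


outcomes = {"CopyRawSales": "Succeeded"}
ready = [a["name"] for a in activities if a["name"] not in outcomes and is_ready(a, outcomes)]
print(ready)
```

Activities with no dependsOn entry are ready immediately, which is what lets independent activities run in parallel.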
Linked Service
Defines the connection information that Data Factory needs to connect to an external data source, much like a connection string.
Dataset
Represents a data structure within a data store, such as a file or table, that activities use as input or output.
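A dataset does not hold connection details itself; it points at a linked service by name. The sketch below shows that relationship with Python dicts shaped like the corresponding JSON definitions. All names, the container, the file name, and the placeholder connection string are hypothetical, and resolve_linked_service is a helper written for this illustration, not part of any SDK.

```python
# Hypothetical linked service: where the data lives and how to connect to it.
linked_service = {
    "name": "SalesBlobStorage",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {"connectionString": "<placeholder>"},
    },
}

# Hypothetical dataset: which data inside that store, referenced by linked service name.
dataset = {
    "name": "RawSalesCsv",
    "properties": {
        "type": "DelimitedText",
        "linkedServiceName": {
            "referenceName": "SalesBlobStorage",
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "raw",
                "fileName": "sales.csv",
            }
        },
    },
}


def resolve_linked_service(dataset, linked_services):
    """Illustrative helper: find the linked service a dataset refers to."""
    wanted = dataset["properties"]["linkedServiceName"]["referenceName"]
    for service in linked_services:
        if service["name"] == wanted:
            return service
    raise KeyError(f"No linked service named {wanted!r}")


service = resolve_linked_service(dataset, [linked_service])
print(service["properties"]["type"])
```

Keeping the two separate means several datasets (different files or tables) can share one linked service.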
Integration Runtime
Compute infrastructure used by Azure Data Factory
Triggers
Determines when a pipeline run starts: at scheduled times, on a tumbling window, or in response to an event.
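As a rough illustration of a schedule trigger, the sketch below models a trigger definition as a Python dict shaped like its JSON form and computes the first few run times locally. The trigger and pipeline names and the start time are hypothetical, and first_runs is a simplified helper that only handles "Hour" and "Day" frequencies; it is not how the service itself evaluates schedules.

```python
from datetime import datetime, timedelta

# Hypothetical schedule trigger that starts one pipeline once a day.
trigger = {
    "name": "DailyAt6",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Day",
                "interval": 1,
                "startTime": "2024-01-01T06:00:00",
                "timeZone": "UTC",
            }
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "type": "PipelineReference",
                    "referenceName": "DailySalesPipeline",
                }
            }
        ],
    },
}

# Simplification: only two of the possible frequencies are supported here.
STEP_BY_FREQUENCY = {"Hour": timedelta(hours=1), "Day": timedelta(days=1)}


def first_runs(trigger, count):
    """Return the first `count` run times implied by the trigger's recurrence."""
    recurrence = trigger["properties"]["typeProperties"]["recurrence"]
    start = datetime.fromisoformat(recurrence["startTime"])
    step = STEP_BY_FREQUENCY[recurrence["frequency"]] * recurrence["interval"]
    return [start + n * step for n in range(count)]


for run_time in first_runs(trigger, 3):
    print(run_time.isoformat())
```

The pipelines list is what ties the trigger to the work: one trigger can start several pipelines, and one pipeline can be started by several triggers.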