Introduction
Imagine a Data Platform that’s operating smoothly, with data pipelines running flawlessly day in and day out. Then, all of a sudden, new demands emerge that involve integrating various files into the platform. For example:
- A business user urgently approaches you, asking for specific data from a file or another system to conduct tests or validate numbers.
- Your Data Engineering team receives a new assignment: to add hundreds of files to the existing data pipelines on a daily basis.
- Analysts on your team want to compare your internal data with information from an external Excel or CSV file.
Faced with these challenges, you, as a responsible manager, have a dilemma: should you treat each request as a one-off and assign a team of Data Engineers to build individual pipelines? Or should you develop a framework that lets end users ingest data themselves and run analytics on external and partner data?