About me
I’m Hugo Lu — I started my career working in finance before moving to JUUL, a scale-up, and falling into data engineering. I headed up the Data function at London-based Fintech Codat. I’m now CEO at Orchestra, a data release pipeline management platform that helps Data Teams release data into production reliably and efficiently.
Introduction
There’s this notion in maths of a limit. For example, the sum of the series 1/2^n, for n from 0 to infinity, converges to 2. This is helpful when considering what the endgame is for data engineering tools and software.
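As a quick illustration (a minimal Python sketch, not from the original article), the partial sums of this series get arbitrarily close to 2 but never exceed it:

```python
# Partial sums of sum_{n=0}^{N} (1/2)^n for increasing N.
# Each sum gets closer to the limit of 2 as N grows.
for N in (1, 5, 10, 20):
    partial_sum = sum(0.5 ** n for n in range(N + 1))
    print(N, partial_sum)
```

Running this shows the gap to 2 halving with every extra term, which is the sense in which the series "tends to" its limit.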
You can apply this to batch processes, in particular the frequency with which you run them and their corresponding size or throughput. Just as the partial sums of 1/2^n tend to 2, batch processes tend towards “streaming” use cases: treating data as streams and triggering operations as soon as new data becomes available. This is where the Lakehouse comes into play.