What does an ETL developer do at your company?
At Alooma, we develop a data pipeline as a service, helping our users connect their disparate data sources to a data warehouse of their choice. In essence, that makes all of us ETL developers - as ETL is the core of our product!
Following the three stages of ETL, our teams develop three types of components in our product:
Extraction - An ETL developer must develop and manage extraction tools, which pull data from the various data sources the company uses - be it databases, SaaS services, mobile apps, data lakes, etc. These tools will usually wrap the data with some metadata and send it forward in the data pipeline.
Transformations - Raw data from many sources is guaranteed to be dissimilar. There will usually be a processing engine in charge of corrections, transformations and enrichments in this stage. This tool, again, will need ongoing development to account for the new types of data the company adds over time.
Loading - The last piece of this puzzle is a tool that is capable of loading the data into a target, usually a data warehouse or a data lake. It is essentially the opposite of the extraction stage, and it focuses on the ability to quickly and efficiently load bulk data into the target.
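To make the three stages concrete, here is a minimal sketch in Python. It is an illustration only and not how Alooma's product is implemented: the function names, the "users" table, the email field and the use of an in-memory SQLite database as the "warehouse" are all assumptions made for the example.

```python
import sqlite3
import time
from typing import Any, Dict, Iterable, Iterator

Record = Dict[str, Any]


def extract(rows: Iterable[Record], source: str) -> Iterator[Record]:
    """Extraction: wrap each raw record with some metadata and pass it on."""
    for row in rows:
        yield {
            "data": row,
            "metadata": {"source": source, "extracted_at": time.time()},
        }


def transform(events: Iterable[Record]) -> Iterator[Record]:
    """Transformation: apply corrections and enrichments to each event."""
    for event in events:
        data = dict(event["data"])
        # Correction (hypothetical rule): normalize email casing and whitespace.
        if isinstance(data.get("email"), str):
            data["email"] = data["email"].strip().lower()
        # Enrichment: record which source the row came from.
        data["source"] = event["metadata"]["source"]
        yield data


def load(records: Iterable[Record], conn: sqlite3.Connection,
         batch_size: int = 500) -> int:
    """Loading: insert records into the target in batches; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (id INTEGER, email TEXT, source TEXT)"
    )
    batch, loaded = [], 0
    for record in records:
        batch.append((record.get("id"), record.get("email"), record.get("source")))
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO users VALUES (?, ?, ?)", batch)
            loaded += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        conn.executemany("INSERT INTO users VALUES (?, ?, ?)", batch)
        loaded += len(batch)
    conn.commit()
    return loaded


if __name__ == "__main__":
    raw = [{"id": 1, "email": " Ann@Example.com "}, {"id": 2, "email": "bob@example.com"}]
    warehouse = sqlite3.connect(":memory:")  # stand-in for a real data warehouse
    count = load(transform(extract(raw, source="crm")), warehouse)
    print(f"loaded {count} rows")
```

Each stage is a generator feeding the next, so records stream through the pipeline without being held in memory all at once, while the loader batches its inserts because bulk loading is where the target's throughput matters most.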
The data model itself, as well as the schema of the database and of any event data, should be decided, in my opinion, not by the ETL developer but by the data scientist. The ETL developer should be an excellent technical person and focus on building the most robust infrastructure, according to the data team’s needs.
Published at Quora. See Original Question here