Recruiting a Top-Notch ETL Developer

by Rami Amar  
4 min read  • 7 Nov 2018

In order to build a great data team, you need great data engineers. Here's how to hire them.

First, if you are looking for an ETL developer, you should actually be looking for a data engineer. ETL is a term dating back to the 1970s, when data pipelines were mostly file- or batch-oriented and were composed of multiple steps of extraction, transformation, and loading. Fast forward 40 years, and the data landscape has grown so complex that harnessing it requires a broad skillset and years of experience.

Now, developing an ETL process may seem easy to the inexperienced engineer, but it could quickly escalate into a nightmare of spaghetti pipelines and endless edge cases even if you have a good team in place. To break your organization's glass ceiling of data, you need a team of great data engineers, and they're not easy to find and recruit. We are lucky enough to work closely with many amazing data engineers - our users - and we see how they really make a difference in their organizations. This post is about finding the right people who will help your organization put its data to use.

Data Engineer Toolkit

So, what should you look for? First, technical chops. In the past, you would’ve seen technical skills such as Informatica, Tibco, Oracle data warehouse, {My,Postgre,MS}SQL, Perl (pun intended ;) ), Bash, PL/SQL, and {Websphere,Microsoft,Rabbit}MQ. Some ETL developers also call themselves DBAs, and may know a lot about database design and query optimization.

Today’s data engineer resume expands on the above with many more modern technologies and skills:

  • Big data stores: Hadoop, Spark, MongoDB, Cassandra, Elasticsearch, Redshift, BigQuery, Vertica, Snowflake
  • A more extensive set of programming languages: Python (including Pandas and SciPy), R, Scala, Java (including MapReduce)
  • Pipeline orchestration tools (a plus): Luigi, Celery, Airflow
  • Log collection and distribution tools (a nice addition as well): Kafka, Flume, Fluentd, Logstash, Filebeat, ELK, Splunk

Beware: if you see all of these on a single CV, you should be somewhat suspicious! Let’s be honest, not everyone is a unicorn. Select the skills that are right for your data team.

Selecting the right candidate

Once you have filtered candidates based on their CVs, the next step is to meet them for an interview. Besides verifying their actual experience, you should also look for several personality traits or soft skills:

  • Attention to detail and a mild obsession with order - keep in mind that data engineers deal with the bits and bytes of billions of events coming from tens of sources on a daily basis
  • Dedication and willingness to be available off-hours - 24/7 real-time data is crucial for your business, so set expectations with your data engineers from day 0
  • Patience and perseverance with "dirty work" - data pipelines break, and recovering lost data requires countless hours of tracking events and offsets

And of course, make sure they are a culture fit! The key is to find someone who is service oriented, passionate about data, and loves helping people. A great data engineer will dramatically change how your company runs.

Best places to recruit

On a personal note, recruiting and grooming many data engineers here at Alooma has helped us weave these concepts into our R&D culture. It's a crucial asset required in building a platform for data engineering and enabling the greatest data engineers to be 10x more productive.

Where do you find them, you ask? You can look up the "standard sources", like Hacker News or AngelList, or go headhunting at meetups, but _our_ best source is our personal network and engineering brand.

Many professions in the software engineering field have grown to be much more complex. IT specialists are now DevOps engineers. NOC operators are site reliability engineers. Sysadmins are cybersecurity specialists. DBAs, who used to be Oracle magicians, and ETL developers are now data engineers. Interestingly, according to Glassdoor, data engineers make $15K/yr more than ETL developers, and even more interesting is the comparison of companies looking for each profession. In many ways, the skills, personality, and passion of great data engineers are much harder to acquire than those of an ordinary full-stack engineer.
