Data Engineer
Responsibilities
- Create and manage ETL data pipelines using Python, Spark, and Airflow.
- Create and manage real-time pipelines using Kafka.
- Improve and maintain the data lake setup (S3, EMR, Presto).
- Integrate data from third-party APIs (e.g. HubSpot, Facebook).
- Ensure data quality through automated testing.
- Develop and maintain company metrics and dashboards.
- Collaborate with analysts, engineers, and business users to design solutions.
- Research innovative technologies and make continuous improvements.
You may be a good fit if
- You have 3+ years of experience as a data engineer developing and maintaining ETL pipelines.
- You have experience building data lake/warehouse solutions spanning structured and unstructured data.
- You have hands-on experience with big data technologies (e.g. Spark, Hive).
- You have experience writing and optimizing SQL queries.
- You have a good knowledge of Python.
- You have hands-on experience with BI tools (e.g. Looker, Redash).
- You hold a bachelor's degree in a technical field or have equivalent work experience.
- You have experience designing and managing data pipelines and debugging data issues.
- You are familiar with real-time and/or large-scale data.
- You have built data products that have scaled on AWS or another cloud.
- You thrive in nimble, lean, fast-paced startups, like autonomy, and have proven you can push toward a goal by yourself.
- You are coachable: able to own mistakes, reflect, and take feedback with maturity and a willingness to improve.
- You communicate with clarity and precision and can effectively present results.