Data Engineer
About AdPredictive
At AdPredictive, we build smart tools for smart people. Our mission is to empower our clients to understand their customers, reach new ones, and positively impact their business outcomes. Our infrastructure is 100% cloud-based, leveraging a wide array of open source and commercial technologies.
We value…
- Our People: Our people are our greatest asset. We believe in flexibility, true work-life balance, and that a positive work environment is critical to our success. We take care of our people and carry out all facets of our work with respect, empathy, and gratitude.
- Giving Back: Giving back to our communities provides purpose and growth. We commit ourselves to regular community service and keep the good of our communities at the forefront of our business decisions.
- A Growth Mindset: Our personal and professional growth are driven by our own effort, hard work, and drive for continuous learning. We strive for innovation by seeking new challenges as a means for growth and learn from both successes and failures. We are empowered to create, own, and collaborate to unlock our potential.
- Disruption: Unconventional thinking leads to improved results. We question the way things have always been done, educate ourselves on the status quo, and use that knowledge to inform our exploration of new ways to problem-solve. We aren’t afraid to present innovative, data-backed solutions as alternatives to what is already in place.
We have an office in the Crossroads district of Kansas City; half of our team is based in KC, with weekly collaboration days scheduled for those who want in-office time. The other half of the team is fully remote across the East Coast and Midwest, with the option to visit the KC office regularly for team-building and collaboration.
About the Role
As our Data Engineer, you'll be at the forefront of modernizing and maintaining mission-critical data infrastructure that powers enterprise-grade linear TV optimization for our client. This is a unique opportunity to own and transform our entire data engineering landscape, working directly with our data platform built on Dagster and Kubernetes.
What Makes This Role Unique:
● High-autonomy position
● Opportunity to own and shape the entire data engineering function for a major enterprise product
● Work with cutting-edge tools and technologies while modernizing legacy systems
● Direct impact on our client's linear TV optimization capabilities
Responsibilities:
- Infrastructure Management & Optimization: Manage and optimize our Dagster/Kubernetes-based data platform, ensuring scalability, reliability, and efficiency. Oversee and maintain our pixel tracking and reporting infrastructure, ensuring accuracy, reliability, and seamless business operations throughout its lifecycle.
- Data Migration & Pipeline Development: Lead the transition of ~100 production data jobs from legacy Spark/EMR and Pentaho (Kettle/Spoon) systems to a modern Dagster/Kubernetes infrastructure. Build, maintain, and optimize robust ETL/ELT pipelines for TV audience analytics, including real-time audience size estimation, demographic analysis, and data quality assurance.
- Data Ingestion & Integration: Manage the ingestion of large datasets from multiple third-party providers, ensuring efficient, scalable, and reliable data integration processes. Develop and maintain data pipelines using Trino, Postgres, and AWS S3, handling high-volume structured and unstructured data.
- Performance Optimization & Monitoring: Optimize ingestion performance using parallel processing, incremental updates, and streaming architectures where applicable. Monitor and troubleshoot data pipelines, proactively identifying and resolving bottlenecks, failures, or inconsistencies in AWS S3, Trino, and Postgres environments. Implement monitoring, alerting, and optimization strategies to enhance data workflow performance.
- Data Quality & Compliance: Implement data validation, cleansing, and anomaly detection mechanisms to ensure high data quality and compliance with internal and industry standards. Ensure all data operations align with security and privacy best practices, implementing measures to protect sensitive information and maintain compliance with regulatory requirements.
- Technical Architecture & Product Alignment: Design and implement scalable data processing solutions for evolving product features. Collaborate closely with product and engineering teams to architect and develop new data solutions that align with business objectives.
- Cross-Functional Collaboration & Leadership: Work closely with software engineers, product managers, and other stakeholders to integrate machine learning models into applications. Serve as a liaison between technical and non-technical teams, translating business needs into technical requirements.
- Strategic Decision-Making & Stakeholder Communication: Work autonomously while maintaining clear communication with the VP of Data Intelligence and Technology and other stakeholders. Translate business requirements into technical solutions, effectively managing expectations and project timelines.
Qualifications:
● Bachelor's degree in Computer Science, Engineering, or a related field
● 3+ years of professional experience in data engineering, with significant experience in:
○ ETL pipeline development and maintenance
○ Big data processing frameworks (especially Apache Spark)
○ Data warehouse architecture and optimization
● Deep expertise in:
○ Modern orchestration tools (Dagster, Airflow, or similar)
○ Kubernetes and container technologies
○ Python development for data processing
● Proven experience with:
○ Large-scale data pipeline migration projects
○ Enterprise-grade ETL processes (such as Pentaho Kettle/Spoon)
○ AWS services, particularly EMR and other data processing services
● Strong understanding of:
○ Data pipeline monitoring and optimization
○ Data quality management and validation
○ CI/CD practices for data infrastructure
● Experience with:
○ Real-time data processing and analytics
○ Web analytics and pixel tracking systems (desired)
○ SQL and NoSQL databases
● Demonstrated ability to:
○ Work autonomously in a complex technical environment
○ Make architectural decisions with minimal oversight
○ Communicate effectively with both technical and non-technical stakeholders
● Plus:
○ Experience with media/entertainment industry data
○ Background in audience analytics or advertising technology
○ History of successfully modernizing legacy data systems