Position Overview
We are seeking an experienced Data Engineer for a contract position. You will design, build, and maintain the robust data pipelines and infrastructure that underpin our data science and AI initiatives. The role requires a strong background in data engineering, proficiency with big data technologies, and a deep understanding of data governance and security. As part of an integrated, multidisciplinary team, you will ensure data is clean, accessible, and ready for use by machine learning models and applications.
Responsibilities include, but are not limited to:
- Data Pipeline Development: You will design and implement scalable data ingestion pipelines for both real-time streaming and batch processing from diverse internal and external data sources.
- Data Architecture & Management: You will maintain a unified data platform, including a secure and scalable data lake and data warehouse.
- Data Quality & Transformation: You will create automated routines for data cleaning, preprocessing, deduplication, and PII scrubbing.
- Data Governance & Security: You will implement and enforce data governance policies, access controls, and data segregation to ensure compliance and prevent cross-contamination between datasets.
- Collaboration: You will work closely with ML engineers and data scientists to ensure data is properly prepared and delivered for model training and deployment. You will also collaborate with software engineers to integrate data services into applications.
- Documentation: You will create and maintain clear technical documentation for data pipelines, architecture, and governance policies.