Job_Summary:
Yahoo is a technology and media company that serves users through its portfolio of digital platforms, products, and services. They are seeking a Big Data Tools Engineer to build, improve, and maintain their highly scalable Big Data Platform, working on complex data systems and cutting-edge technologies.
Responsibilities:
- Job Monitoring: Overseeing the execution of data jobs, ensuring they meet their SLAs and run without failures.
- Data Orchestration: Utilizing tools like Airflow to manage the scheduling, execution, and monitoring of data workflows across cloud platforms such as AWS and GCP.
- Query Execution and Optimization: Designing and optimizing queries to run efficiently on platforms such as BigQuery, Hive, Pig, and Spark, ensuring high performance and scalability.
- Integration and Support: Collaborating with different teams to integrate data flows, provide support for query executions, and handle credentials for secure data operations.
- Feature Development: Implementing new features to support advanced query capabilities, including federated queries and lineage tracking.
Qualifications:
-Required:
- A Bachelor's or Master's degree in Computer Science, or equivalent work experience.
- Proficiency in Python is essential for scripting and workflow management; experience with Java and C++ is preferred for backend data operations.
- Knowledge of data structures, algorithms, and database technologies such as SQL, HBase, and BigQuery.
- Experience with cloud services, especially AWS (EMR, Glue, S3) and GCP (Dataproc, BigQuery).
- Comfortable working in an Agile environment with regular sprints, planning, and retrospectives.
- Ability to design large-scale, distributed systems that are highly available and resilient.
- Some experience working with Linux/Unix operating systems.
-Preferred:
- Experience with development and deployment on public cloud platforms such as AWS, GCP, Azure, or others.
- Experience developing containerized applications and working with container orchestration services.
- Experience with Apache Hadoop, Presto, Hive, Oozie, Pig, Storm, Spark, and Jupyter.
- Understanding of data structures and algorithms.
- Knowledge of JVM internals and performance tuning.
- Excellent debugging and testing skills, and excellent analytical and problem-solving skills.
- Experience with continuous integration tools such as Jenkins and Hudson.
- Strong verbal and written communication skills to collaborate effectively with cross-functional teams.
Company:
Yahoo is a technology and media company that serves users through its portfolio of digital platforms, products, and services. It is a sub-organization of Verizon Media. Yahoo has a track record of offering H-1B sponsorships.