Senior Distributed Computing Software Engineer (Big Data)

Palo Alto, CA
Tal-Ex is partnering with a media services organization to identify a talented Sr. Distributed Computing Software Engineer to join our client's Machine Learning Platform and Predictive Segments team. In this role, you will be instrumental in transforming research concepts and business requirements into software products. We value attitude, aptitude, communication skills, and coding skills over experience with specific languages and environments.

The Predictive Segments team, part of our client's Forecasting and Optimization organization, is responsible for machine-learning-based advertising targeting products and the supporting machine learning platforms that drive the performance of advertising campaigns managed by One DSP. Leveraging cutting-edge technologies and rich data, the team is building a real-time big data and machine learning platform that offers audience segmentation products. In close partnership with research and product, team members work with very large amounts of data (2.5–3 billion records per day) to help discover and prove advanced data analytics algorithms, surfacing new methods for optimizing the statistical models that enable prediction and personalization analysis (large-scale machine learning and pattern recognition). The end goal is to improve the targeting of online advertising campaigns and thereby maximize revenue. In addition to applying distributed computing technologies and proving newly developed algorithms, the work involves architecture and design to incorporate these new algorithms into new products.

Job Description

The responsibilities of our Sr. Distributed Computing Software Engineer will include:
  • Performing research and iterative prototyping with large-scale distributed computing and distributed database systems architecture
  • Utilizing experience with distributed file systems, database architecture, and data modeling to organize and process large data sets
  • Developing software to support machine learning and data mining projects and contextual analysis, such as crawling, parsing, indexing, and unique content analysis
  • Collaborating with scientists and analytics solution architects to design distributed data storage and processing services that are scalable, reliable, and available
  • Identifying potential performance bottlenecks and scalability issues to justify or critique the design of new algorithms, and assisting researchers with accessing and processing large amounts of data

Qualifications
  • Master’s degree in Computer Science or a related field
  • At least 3 years of software development experience
  • A minimum of 2 years of experience working with distributed systems
  • Knowledge of distributed system design, data pipelining, and implementation
  • Knowledge of machine learning algorithms
  • Knowledge of and experience in building large-scale applications using various software design patterns and OO design principles
  • Experience with Java, Scala, and Python
  • Experience with either distributed computing (Hadoop/Spark/Cloud) or parallel processing (CUDA/threads/MPI)
  • Expertise in design patterns (UML diagrams) and data modeling for large-scale analytic systems
  • Experience researching, analyzing, and converting large amounts of raw collected data and content into structured data sets that preserve data context, enabling the productization of new products
  • Experience with data warehousing and distributed/parallel processing of large data sets on Linux clusters using map/reduce computation (e.g., Hadoop/Cloud technologies, HDFS)
  • Experience with modern development methodologies such as Agile, Scrum, and the SDLC
  • Ability to work in a research-oriented, fast-paced, and highly technical environment
  • A quick thinker and fast learner with a collaborative spirit and excellent communication and interpersonal skills
