• HPC-AI Performance Engineer

    Job Locations: US-CA-Sunnyvale
    Posted Date: 1 month ago (12/20/2018 8:40 AM)
    ID: 2018-4645
    Category: Marketing
    Job Length: Full-Time
  • Overview

    Mellanox Technologies is looking for a talented HPC-AI Performance Engineer to lead HPC and deep learning performance optimization, benchmarking, and profiling on Mellanox products. This individual will work primarily with marketing and engineering to execute application-level benchmarks focused on deep learning applications. In addition, this individual will work closely with hardware and software partners and with customers to benchmark Mellanox products under different system configurations and workloads.

    Responsibilities

    • Measure and analyze the performance and parallel scalability of key HPC and deep learning applications that comprise Mellanox's evolving workload on current and future high-performance computing (HPC) and data-intensive platforms using a hierarchy of benchmark programs.
    • Profile applications to identify architectural and algorithmic bottlenecks, with a particular emphasis on emerging many-core effects.
    • Propose remedies to the identified bottlenecks via software restructuring and/or architectural improvement, with a comprehensive understanding of any trade-offs in design, cost, and software engineering effects.
    • Assess emerging technologies in architecture, algorithms, parallel programming paradigms, and languages to provide input for HPC and AI system procurements and technology roadmaps out past the next decade.
    • Prepare benchmark reports, papers, and presentations describing significant results for dissemination within Mellanox and throughout the broader HPC and AI research community.
    • Contribute performance-related expertise to cross-team Mellanox activities that may involve application performance tuning, interconnects, storage I/O, and data analysis functions.

    Qualifications

    • A BS/MS degree in Computer Science, Computer or Electrical Engineering along with 3-5+ years of relevant experience
    • Experience with TCP/IP networking, InfiniBand (RDMA), and other high-speed interconnect technologies
    • Demonstrated understanding of HPC computer architecture issues including CPU, memory, interconnect, parallel I/O performance tuning, and networking performance
    • Experience with microbenchmarks and ability to write microbenchmarks that exhibit the same performance characteristics as the full application code
    • Experience with performance profiling tools, hardware performance counters and/or code instrumentation systems
    • Detailed understanding of state-of-the-art tools used to program, profile, and debug parallel MPI, PGAS, OpenMP, and hybrid-parallel codes written in C and Fortran 77/90
    • Experience in benchmarking, code instrumentation, and performance analysis of parallel applications with emphasis on emerging multicore and many-core architectures
    • Experience with the use of scripting languages (e.g. Bash, Python)
    • Experience in the Linux operating system environment and writing/maintaining programs using C and/or Fortran
    • Proven record of working effectively in a team, seeing projects through to completion, meeting deadlines, interacting with users, and thoroughly documenting contributions
    • Proven ability to dig into application code and tune it
    • Experience with AI application frameworks such as TensorFlow, Caffe or others
    • Experience with applications written for CPUs and GPUs
    • Experience with NVIDIA CUDA drivers
    • Experience with the Slurm scheduler, compiling parallel applications, and using HPC clusters
    • Experience with Lustre or other parallel storage systems

     

    Advantages  

    • Prior experience with performance analysis and performance tuning of applications for pre-sales activities
    • Experience with different computer system architectures, other than x86_64
    • Familiarity or experience with HPC industry-standard benchmarks and/or user applications
    • Experience with maintaining clusters of systems and storage, and/or experience with troubleshooting hardware issues
    • Demonstrated ability to lead technical efforts with teams of people
    • Competitive attitude and strong work ethic with the ability to enthusiastically represent Mellanox Technologies
    • Ability to handle stressful situations and work under pressure to meet deadlines
    • Strong sense of pace and urgency to ensure work is completed within the expected timelines
    • Reliable and detail-oriented
    • Self-starter; able to manage own tasks
    • Ability to communicate ideas effectively and to assist end users and customers
    • Creative thinker; able to research issues and propose ideas to resolve them

    Company Description

    About Us

    Mellanox Technologies was founded in 1999 and has headquarters in Sunnyvale, CA and Yokneam, Israel. We are a leading supplier of innovative end-to-end InfiniBand and Ethernet connectivity solutions and services for servers and storage. We offer market-leading solutions that include adapter cards, switches, cables and software to support InfiniBand and Ethernet networking technologies. Our products optimize data center performance and deliver industry-leading bandwidth and scalability. In addition, we serve a wide range of markets including high performance computing, enterprise, data centers, cloud computing, big data and Web 2.0. We are constantly reinventing ourselves to stay ahead of the market and bring game-changing products and services to the industry.

     

    About You

    Mellanox is an incubator for talent. We are strong believers in developing our people and giving them the tools to succeed. We have a very competitive compensation package as well as frequent internal product training to keep people updated on new technologies. We are a fast-growing company with a positive energy that comes from our team members' internal drive to develop, market, sell and support cutting-edge products and services. Mellanox often promotes from within and there's a sense of family that comes from the top down. We are committed to the community and donate 1% of our annual profit to charity as well as participate in green initiatives to reduce our carbon footprint.

     

    Benefits

    Mellanox Technologies offers a competitive benefits program including medical, dental and vision insurance, 3 weeks’ vacation, 10 paid holidays, sick leave, Life Insurance/AD&D at 2x annual salary, 401K with company contribution, ESPP, and Stock (RSUs). At Mellanox, the work of each individual makes an impact on the success of our company. If you are looking for a rewarding career, talented colleagues and a great environment where you can challenge yourself, grow and lead, Mellanox is the right place for you.

    Additional Information

    Equal Employment Opportunity

     

    Mellanox is an Equal Opportunity Employer that does not discriminate on the basis of actual or perceived race, color, national origin, ancestry, sex, gender, pregnancy, childbirth or related medical condition, religious creed, physical disability, mental disability, age, medical condition, marital status, sexual orientation, veteran status, genetic characteristics, gender identity/expression, or any other characteristics protected by federal, state or local law. Our management team is dedicated to this policy with respect to recruitment, hiring, placement, promotion, transfer, training, compensation, benefits, employee activities and all other terms and conditions of employment. If you need assistance to perform your job duties because of a physical or mental condition, please let our Human Resources department know.
