Systems Engineer
Description
Graphcore develops next-generation artificial intelligence compute hardware and software, enabling researchers and scientists to build advanced models and power business AI applications. The company is backed by SoftBank Group and focuses on creating high-performance AI solutions.
Responsibilities
- Develop and operate high-performance Ethernet infrastructure on private clouds;
- Support internal users in utilizing the infrastructure;
- Translate end-user and product requirements into deployed services;
- Build automation to collect and analyze metrics from network infrastructure to identify and report issues;
- Collaborate with users to provide information on product-related issues to Engineering and QA departments;
- Maintain, tune, and operate the fleet of AI systems at peak performance in private clouds in collaboration with Datacentre Operations Engineers;
- Integrate third-party switches, servers, and storage solutions into the Cloud Reference Design with a focus on performance, automation, and resilience.
Requirements
- Bachelor's degree or equivalent practical experience in a relevant subject;
- Significant hands-on experience with high-end (100 Gb/s+) Ethernet switch solutions;
- Experience managing on-premises or private-cloud environments;
- Solid software engineering or IT experience with a track record of delivering technical output as an individual contributor;
- Experience working in Agile and Scrum frameworks;
- Strong Linux scripting ability (Bash, Python, awk, sed);
- Strong Linux system administration skills (Ubuntu, RHEL and variants);
- Experience with version control systems (Git) for managing system configuration or automation;
- Experience with CI/CD pipelines using GitLab, GitHub, or similar;
- Solid hands-on understanding of technologies underpinning cloud services (APIs, virtualization of CPUs, IO, systems) and their relation to high-performance networking;
- Experience with IaC automation tools (Terraform/OpenTofu, Ansible);
- Experience with container deployment and management tools (e.g., Docker);
- Experience with monitoring and observability solutions (e.g., Grafana, Prometheus, OpenSearch/Elasticsearch, Loki);
- Good communication and presentation skills, with experience dealing with end-users;
- Ability to work independently on critical infrastructure with minimal oversight;
- Nice to have: experience with OpenStack cloud platforms, experience with High Performance Computing (HPC) environments using Slurm or similar, experience with hardware offloading on RDMA-capable NICs and its integration with virtual networking (Open vSwitch, KVM/QEMU), experience managing production Kubernetes clusters and workloads with automation tools such as Argo CD.
Benefits
- Competitive salary;
- Flexible working;
- Generous annual leave policy;
- Private medical insurance and health cash plan;
- Dental plan;
- Pension scheme (matched up to 5%);
- Life assurance and income protection;
- Generous parental leave policy;
- Employee assistance programme;
- Healthy food and snacks at the Bristol office.
Skills