Results-driven Head of DevOps and Engineering Leader with 15+ years of experience in administering Linux systems and leading high-load projects, alongside 8+ years of expertise in building and managing platform teams. Specializes in system design, distributed systems, incident management, and aligning DevOps practices with business objectives to drive organizational success.
Key Skills and Accomplishments:
Passionate about driving team growth, empowering engineers, and ensuring that infrastructure stability is a cornerstone of business growth. Experienced in ride tech, food tech, and delivery industries, with a focus on creating agile, resilient, and scalable solutions.
Urent is a kick scooter sharing and power bank sharing service.
Managing the DevOps team, SRE team, and Cloud Engineers, with a focus on improving incident management and stability.
Hands-on experience:
- Yandex Cloud (Managed Kubernetes, LB, VPC, s3) managed by terraform
- Kubernetes (helm, helmfile, Argo rollouts, ArgoCD)
- victoriametrics
- ingress-nginx
- redis-sentinel, keydb
- rabbitmq, kafka
- elk / apm
- vault
- gitlab + gitlab-ci
- mongodb
- dwh stack (airflow, dbt, Clickhouse)
Results:
Incident count: decreased 8x year over year (YoY)
Cost efficiency: improved 30% YoY
YouDo is a C2C service.
Managing IT Infrastructure (DevOps & SE) and B2B development, combining two roles: Head of Infrastructure and Head of Engineering. Leading more than 20 technical specialists, with 4 direct reports (managers).
Main activity:
- Strategic planning;
- Building an organizational structure and processes, personal development of team leaders and managers;
- Setting up processes in the engineering team (Agile, incident management, maintenance work, on-call duty, etc.);
- Setting up HR processes in collaboration with the HRD: performance reviews, 1:1 meetings, introducing and supporting the company's value system in the engineering/infrastructure teams;
- Validating value indicators and defining grade requirements (hard and soft skills);
- Setting up OKRs for department managers;
- Managing the annual operating budget;
- Selecting and organizing the work of outstaffed teams (QA, Backend, Frontend, etc.);
- Infrastructure development;
Citymobil is a Russian ride-hailing service, the Russian counterpart of Uber.
Managing the DevOps/SRE and SE departments, Help Desk, and VoIP support. Leading more than 55 technical specialists, with 5 direct reports (managers).
Main activity:
- Strategic planning of infrastructure development and technical support;
- Direct interaction with Heads of engineering;
- Introducing new technologies to the stack and increasing technical expertise in teams;
- Setting up processes in the engineering team (Agile, incident management, maintenance work, on-call duty, etc.);
- Building the organizational structure; leading, supporting, and mentoring Engineering Managers and Tech Leads to become more effective leaders;
- Setting up OKRs for department managers;
- Implementation of the practice of calculating SLO/SLI/SLA;
- Implementation of SRE practices in SE and SWE teams;
- Responsibility for uptime and infrastructure stability;
- Searching for and eliminating points of failure both within the infrastructure and in architectural solutions;
- System troubleshooting and problem-solving across platform and application domains (creating a task force and leading the process), for example: errors in the Redis-client library, application resolution errors, errors in the network stack, post-incident cases;
- Participation in audits (Deloitte);
- Participation in acquisitions of third-party companies: CityDrive (ex-YouDrive), Proil;
- Working independently and with little direct CTO supervision;
- Management of the annual IT infrastructure budget (~$15M).
ATOL is a company that provides online receipt services and retail products.
Managing the Infrastructure Department (DevOps & Infra). Leading more than 12 technical specialists, with 4 direct reports (managers).
Main activity:
I was responsible for the operation of all business units in the Group of Companies:
- Server and network equipment in 4 data centers;
- Implementation of advanced DevOps/SRE/Ops practices;
- Infrastructure as code;
- Information security;
- Forming teams by area of focus;
- CAPEX and OPEX for IT;
- Implementation of modern SRE/DevOps practices and customer-oriented operation in ATOL Group;
Leading more than 10 technical specialists, with 2 direct reports (managers).
Main activity:
- Creating and implementing a roadmap (DevOps, infrastructure, SRE);
- Dividing operations into 3 areas (SRE, infrastructure, DevOps);
- Implementation of Agile processes;
- Enabling services to run across data centers (cross-DC);
- Building a hybrid cloud based on OpenNebula;
- Implementation of ELK, Prometheus, SaltStack;
- Building a Hadoop cluster;
- Implementation of a Docker container orchestrator.
CIAN is a NASDAQ-listed company that helps people find property.
I worked in the following areas within the System Administration group:
- Quarterly planning and formation of tasks for sprints;
- Infrastructure (implementation of the SaltStack orchestrator, improvement of SLI and SLA, assistance in splitting the monolith and migrating it to a microservice architecture);
- Hadoop (Hortonworks Hadoop cluster, improving cluster stability and fault tolerance, monitoring, backup);
- Balancers (fault tolerance of balancers, latency reduction, caching, balancing, working with Anti-DDoS providers);
- Monitoring (implementation of Prometheus components, alerting, deploying ELK and collecting logs from balancers and applications, building visualizations and alerts based on data from ELK);
- Elasticsearch 5 (cluster support, fault tolerance, sharding, monitoring);
- Cassandra (problem solving, migration of cases to small clusters, implementation of flow for rolling migrations, monitoring);
- Docker (Nomad, Consul, Consul-template, cluster support, monitoring, problem solving).
We raised the SLA from 90% to 99.9%.
Rambler is the first search engine company in Russia.
I was in charge of the operations group, with 4 direct reports.
Main activity:
- Training, motivating, and recruiting employees; ensuring interchangeability; the group's members grew from system administrators into DevOps engineers;
- Administration of FreeBSD/Linux services (Home, Search, News, Comments, Payment system, more than 15 projects in total);
- Implementation of SaltStack in all operations groups, justification of the implementation, training;
- Migrated operating systems from FreeBSD to Linux within 2 quarters;
- Resource optimization: actively eliminated legacy systems and freed up more than 200 servers;
- Unification of builds in Docker, implementation of CI/CD processes from scratch based on gitlab-ci + rpm + salt-api;
- Administration of Hadoop clusters (Search, TOP-100) and accompanying pipelines for log delivery to HDFS;
- Defining requirements for services running in the infrastructure;
- Participated in the acquisition of new third-party projects: evaluated integration into the infrastructure and deployed ready-made applications/prototypes;
- Working with development team leaders and developers on solutions for deployment, support, and incident analysis;
- Launched 3 services in production based on Docker.
I built the best operations group at Rambler&Co, which evolved from system administration to DevOps.