Namrita Arya

Bengaluru, Karnataka

Summary

Site Reliability Engineer with 9 years of experience designing and maintaining robust, high-performance systems. Expertise in using Kubernetes, Docker, and AWS to automate operations and enhance CI/CD pipelines, improving efficiency and reliability. Proven track record in proactive monitoring, incident resolution, and infrastructure optimization using Terraform and Ansible, ensuring reliable service delivery that supports business growth. Recognized for strong problem-solving abilities, a collaborative approach, and dedication to building resilient engineering cultures that prioritize efficiency and innovation.

Overview

10 years of professional experience
1 Certification

Work History

Senior Site Reliability Engineer

Goldman Sachs
03.2022 - 03.2024
  • Led SLO/SLA Implementation: Discussed and implemented bi-annual Service Level Objectives (SLOs) and Service Level Agreements (SLAs) between the SRE and development teams, ensuring 100% conflict resolution and improving overall platform stability.
  • Team Leadership and Mentoring: Led a team of 9 engineers, providing extensive system training and mentoring them by sharing technical and domain knowledge.
  • Infrastructure Configuration: Configured 100+ bare-metal RHEL7 servers using Ansible to achieve platform readiness for large-scale application migration.
  • Incident Management and Post-mortem Analysis: Provided critical support, resolved 200+ incidents, and documented the RCAs to build a support runbook.
  • Containerization: Containerized and migrated 10+ pre-production applications to new environments using Docker.
  • Alerting and Monitoring: Enhanced alerting and monitoring of the production environment using custom scripts and Grafana, gaining a 3-hour lead time to stabilize the system.
  • Cross-team Collaboration: Worked effectively with developers and management stakeholders to plan, release and maintain applications in production.
  • Release Management: Managed production releases for OM2 applications in the AMER/NY regions, achieving 99.9% release accuracy and maintaining consistent deployment timelines.
  • Large-Scale Migration: Orchestrated a large-scale migration of applications from RHEL6 to RHEL7 bare-metal servers across NY/AMER regions, improving system performance by 50%.
  • Operational and Business Activities: Carried out BCP activities, such as disaster recovery plan (DRP) testing for 10+ production applications and organizational directory restructuring, resulting in smooth business operations.

Senior Software Engineer - SRE

Saggezza India Pvt. Ltd
11.2019 - 03.2022
  • CI/CD Implementation: Devised GitLab CI/CD pipelines and standardized Git workflows to enhance code integration and release processes for 15+ applications.
  • Automated Monitoring Systems: Developed an automated monitoring tool that eliminated binary version conflicts across 5 key applications, improving production and post-production environment stability.
  • On-Call Rotations & RCA Troubleshooting: Participated in 12x5 on-call rotations, conducting thorough root cause analyses and implementing preventive measures to minimize future incidents.
  • Automated Rule Management System: Developed an automated system that synchronizes rules between Production and Testing environments, ensuring reliability for client testing.
  • Database Purging: Implemented a script to purge old database data, optimizing storage space and server costs.

Technical Analyst

Credit Suisse Services Limited
07.2014 - 04.2019
  • TeamCity CI/CD: Established production CI/CD pipelines for the monolithic repository of the core trading application to create the production deployment package.
  • Alerting and Reporting: Built a web framework using Python and HTML to parse reports generated by the CI/CD pipelines and automatically send them to the team and stakeholders.
  • Test-Driven Development (TDD): Developed a TDD framework for the MERCURY trading application, ensuring 100% test coverage and robust software reliability.
  • Automation & Efficiency: Automated testing processes using GCE, reducing testing time by 3 hours, and optimized deployment pipelines in TeamCity, decreasing failure rates by 40%.

Education

Master of Computer Application

National Institute of Technology
Trichy
07.2014

Bachelor of Science (Computer Application) Honours

Aligarh Muslim University
Aligarh
06.2011

Skills

  • Programming Languages: Python, Shell Scripting
  • Tools & Technologies: Docker, Kubernetes, Helm, Terraform, TeamCity, GitLab CI/CD, UNIX, Ansible
  • Database Management: MySQL, Sybase, InfluxDB
  • Monitoring & Observability: Prometheus, Grafana
  • Cloud Platform: AWS (EC2, IAM, VPC)
  • Task and Code Management: JIRA, Confluence, Git, SVN, CVS
  • Operating Systems: Linux, Windows

Certification

Certified Kubernetes Administrator (CKA)
