
Sudhanshu Sharma

Lead MLOps Engineer
Munich

Summary

Data and cloud professional with over 12 years of experience in applied machine learning, data engineering, product development, and big data technologies. Excellent reputation for resolving problems and improving customer satisfaction.

Overview

10 years of professional experience
4 years of post-secondary education
6 certifications
2 languages

Work History

Lead Machine Learning Platform Engineer

E.ON Digital Group
Munich
04.2022 - Current
  • OPT-IN Verification (Voice-to-Text NLP analytics, live project) - Scope: OPT-IN is the customer's consent to marketing via the opt-in channels landline, SMS, email, and mobile
  • Delivered end-to-end, event-driven, microservices-based MLOps pipelines on Kubernetes for OPT-IN, covering data preprocessing, resampling, STT transcription, customer data enrichment of transcripts, segment search, spaCy processing, Salesforce patching, etc.
  • Preprocessed human transcripts and voice files for training and testing ML models; trained a custom acoustic NLP model on human-generated transcripts and resampled German-language audio
  • Made continued data-preparation efforts that raised overall accuracy from 0.67 to 0.92 and brought the model to production, saving E.ON ~€2M per year in third-party manual verification costs
  • EDG Data Mesh (ongoing) - Scope: provide raw data, Salesforce customer data, and intermediate data generated by different AI projects as a product available across teams
  • Built the concept and designed the architecture for the data mesh
  • Made the OPT-IN transcription available as a data product
  • Customer Happiness Index (Voice-to-Text sentiment analytics, proof of concept) - Scope: the Customer Happiness Index is a successful Microsoft Hackathon project prototype
  • Developed end-to-end event-driven microservices on Azure Functions for MLOps pipelines delivering resampling, STT analytics, opinion mining, and sentiment analytics with Cognitive Services
  • Delivered a UI/UX dashboard for real-time processing demos and a Grafana dashboard for deeper analysis
  • Won the Microsoft Hackathon in Munich and presented the future use case to E.ON's CXO.

Lead Data Architect (Ext.)

E.ON Digital Technology
01.2020 - 01.2022

Smart Meters Analytics (SMA) project (Live)
- Scope: E.ON's smart meter analytics project focused on analyzing customers' smart meter performance metrics and comparing them with real-world environmental data.

E.ON TargetBI project (Live)
- Scope: E.ON has more than one sister company. To create unified MDM (Master Data Management) for account, sales, and BI data, E.ON draws data from Powercloud (Innogy) to make it available for analytical purposes, i.e., Powercloud CDC data migration from AWS to Azure with AWS DMS (Database Migration Service).

LeanIX Integration Pipelines (Live)
- Designed and developed a cost-effective, secure solution with pipelines integrating the LeanIX source with multiple target locations on-prem, in Azure, and across SaaS platforms such as ServiceNow, Coplanner, Symbio, and AxonIvy.

Advanced Consultant, Big Data Architect

Altran Digital Team Altran Deutschland S.A.S. & Co. KG
Munich
05.2019 - 12.2019
  • Autonomous Driving Data Management and KPI (Live) - Designed and developed a data pipeline for exchanging captured driving recordings between the client's NAS storage and the partner's AWS S3 bucket, using an IAM allow-role policy with AWS Secure Token Service for data security
  • Designed catalog management of processed data with Hive and HBase
  • Designed and developed a track-management data model and a centralized ORM application library for MySQL and Oracle
  • Used a Cloudera Hadoop cluster as the backbone for the centralized data lake, DAG workflow management, distributed transformation, and data-sharing services
  • Developed KPI calculation reports on HIL and SIL playlist data with EMR Spark interacting with MongoDB on AWS
  • Designed the architecture of the ADAS application on the AWS stack and Cloudera Hadoop; at the end of the project, worked on refactoring and migrating databases and applications from Cloudera to AWS
  • Vehicle Data Management API (PoC) - Designed the vehicle data schema on AWS DynamoDB
  • Designed and developed AWS Lambda REST APIs for GET, PUT, and UPDATE operations on data stored in DynamoDB
  • Designed CloudFormation templates for provisioning application platforms.

Data Analytics Lead

Landis+Gyr
11.2018 - 04.2019
  • Oncor Electric Delivery, North America project (Live)
  • Project description and responsibilities: Oncor Electric Delivery Company is Texas's largest transmission and distribution electric utility.
  • Worked on incremental loading of event logs, error logs, and electricity meter data into Azure Data Lake
  • Performed predictive breakpoint analysis on smart meter data with Python and Spark on Databricks and Azure ML Service.
  • Performed image classification with computer vision for predictive maintenance of faulty network tower devices.

Senior Technical Consultant

Big Data Science Services, CenturyLink Technology
Noida
04.2016 - 11.2018

CTL-Cloudera Big Data as a Service Project, USA (Live)
Project Description:
Project link: https://www.cloudera.com/about/news-and-blogs/press-releases/2016-09-21-cloudera-centurylink-expand-strategic-alliance-deliver-big-data-service-for-customers.html
The BDaaS project involves a complex architecture spanning product knowledge, Hadoop services, the Cloudera API, and cloud automation, through which CenturyLink initializes and manages secured clusters for data warehousing and big data analytics applications.
Played an integral role as Subject Matter Expert for CenturyLink's Big Data as a Service cloud product. Responsibilities included developing an automation API framework in Java/Python to set up and manage clusters with all services up and running automatically.

Panera Bread Data Science Platform, USA (Live)
Project Description: Panera, LLC is an American chain of bakery-cafe fast-casual restaurants in the United States and Canada. CenturyLink had an SOW with Panera, LLC for capacity planning and production setup. The client required identification of a methodology for tying the online business workload at the order level to the actual utilization of the IT infrastructure, and the building of representative dashboards depicting IT resource utilization per order.


CTL Data Lake Project: Data Ingestion API (PoC)
Project Description: CTL Data Lake is a CenturyLink internal project to create an application for comprehensive data access and management, applying data analytics at scale.
My responsibilities were to develop and test the REST interface for a data pipeline that takes data from the customer, publishes it to a Kafka topic, parses it with a streaming job, and stores it in an HBase table and HDFS.

DLL Financial Solutions Partner, Mumbai, India (Live)
Designed and developed an Azure data pipeline with Azure Data Factory, HDInsight, Azure Databricks, Data Lake Analytics, Azure SQL (stage), Azure DW, Azure ARM, and Power BI

Michaels Stores, USA (Live)
Designed a distributed, high-performance data analysis and business intelligence platform on top of Cloudera Impala, HAProxy, etc.


Senior Software Engineer ( Big Data Technology )

Amdocs, Inc
Gurgaon
11.2014 - 03.2016

AT&T Insight is a module of CMS (Amdocs Customer Management Suite): a low-latency big data and predictive analytics application on top of customer billing data that provides insight-based reports to customer care executives during calls with customers. Responsibilities/deliverables:
• Software development with Hive and Pig scripts for analysis of large volumes of telecom data.
• Developed dynamic MapReduce jobs and UDFs with core Java.
• Built an automatic data ingestion platform for migrating data from Oracle and HBase.
• Worked on distributed GigaSpaces grid clusters for the Insight application.
• Exposure to real-time Spark streaming jobs and batch jobs.
• Developed modules for database connections and structured programming.
• Experience with both the Hortonworks and Cloudera (CDH) distributions of Hadoop.
• Exposure to data manipulation with Hive queries and Pig scripts on HBase data (NoSQL).
• Developed log analysis and real-time monitoring tools for the production application.
• Exposure to ETL job creation, flow diagrams, and job scheduling in DAGs.

For the AT&T Telegence Mobility project, created and managed a set of ETL workflows with pre- and post-data-validation jobs to generate customers' bills from billing data.

Information System Engineer

Monster.com India Pvt Ltd
02.2014 - 11.2014

Automation of the campaign mailing system
The campaign mailing interface was designed for the client relations team to test emails and trigger jobs that send large volumes of targeted emails.
- Worked on distributed 7-node cluster queues, Solr, and the Sphinx framework.
- Gained solid experience with Lucene queries alongside MySQL on a distributed cluster.
- Experience with MySQL Server 5.2, including views, indexes, procedures, optimization, and performance.
- Software development with Java, Python, Perl, JavaScript, and Memcache on live projects.
- Experience in advanced automation programming in Python and Perl for bulk mailing systems and system tests.
- Impact: manual processing of mailer campaigns was replaced by a fully automated data product in which target customers are selected with just a saved query ID, 200k to 2 million emails are sent per execution, and mail-tracking features support demographic analysis of email readers. Awards: the team was recognized by APAC leaders.

Monster online job posts aggregator
Description: a framework built for reusability of the job-scraping process, running 200+ processes at a time.
Impact: 121,000 job posts aggregated with this automation in a single month, saving workforce hours for Monster.com's clients.

Information Technology Engineer

CDOT ( R & D Center for Telematics Govt of India)
New Delhi
11.2012 - 11.2013

Admin Manual of the GulfTrip website: a real-time B2B web application developed specifically to provide accessibility, sales reports, profitability reports, and business rules for the GulfTrip website. Through this application, the client can modify the GulfTrip website, a tourism-module website developed with the company Loop Method.
Work covered development, testing, and support.
Impact: a long-term contract from the client.

Software Trainee

Loop Method Pvt Ltd
03.2011 - 05.2011
  • The role included development, testing, and support of the admin portal for the GulfTrip website: a real-time B2B web application developed specifically to provide accessibility, sales reports, profitability reports, and business rules for the GulfTrip website
  • Through this application, the client can modify the GulfTrip website, a tourism-module website developed with the company Loop Method.

Education

Bachelor of Technology - Computer Science Engineering

IK Gujral Punjab Technical University
India
05.2007 - 05.2011

Skills

Natural Language Processing

Accomplishments

  • Sex Male | Date of birth 10 March 1990 | Nationality Indian
  • EXPERTISE - Highly experienced certified cloud and data professional with strong experience migrating scalable, highly available services on Kubernetes, big data, AWS, and Azure, with data mining, GDPR, data lake, CI/CD DevOps, infrastructure automation, and Linux enablement projects
  • Development and architecture of data models, anonymization, and ELT pipelines for batch and structured event-driven streaming for big data with Kafka and Databricks Spark; projects deployed on AWS, Azure, and Cloudera
  • Data migration and cloud migration for clients on AWS and Azure; applying data-driven strategies with cloud solutions and SQL-on-Hadoop for advanced analytics and data warehousing
  • Plan strategies and automate cloud migration of on-prem applications and backend databases (MySQL, PostgreSQL, Oracle, NoSQL) to Azure and AWS
  • Worked as a machine learning engineer on forecasting with Spark ML, Databricks, Dask, Pandas, TensorFlow, scikit-learn, and D3.js
  • Development and maintenance of cloud platform code for service provisioning and data pipeline flow with GitLab CI/CD, Linux Ansible, and Azure ARM templates
  • Programming language preference: Python 3, Scala, Bash, JavaScript
  • Understanding of presales activities, including collaboration with the customer management team to align business and technology objectives and deliver a pilot platform scope and phased roadmap
  • HCL GmbH, Munich (Germany)
  • Smart Meters Analytics (SMA) project (Live) - Scope: E.ON's smart meter analytics project focused on analyzing customers' smart meter performance metrics and comparing them with real-world environmental data
  • Established an Azure Data Factory pipeline with Azure App Services, Databricks, Azure SQL Server, Key Vault, and monitoring services for different organization tenants such as the UK, Italy, and Germany
  • Set up CI/CD pipelines on Azure DevOps with a Terraform job flow to manage infrastructure operations automatically
  • E.ON TargetBI project (Live) - Scope: E.ON has more than one sister company; to create unified MDM (Master Data Management) for account, sales, and BI data, E.ON draws data from Powercloud (Innogy) to make it available for analytical purposes, i.e., Powercloud CDC data migration from AWS to Azure with AWS DMS (Database Migration Service)
  • Established a secure VPN connection between E.ON's Azure tenant and an AWS account
  • Established an AWS SQS notifier to serve data-publish requests, carrying the table's metadata and schema in message attributes and the AWS S3 path of the data file location
  • Developed code for the different data lake stages: transient, raw, GDPR, DWH, and DLC zones; developed Prod and Test pipelines with Azure DevOps
  • Developed Data Vault 2.0 for TargetBI data
  • Set up platform code for the Kafka source connector for S3 and the Kafka sink connector for Snowflake EDW and Azure Data Lake Gen2 storage
  • Established an on-demand pipeline for Confluent Kafka on Kubernetes with persistent storage
  • Established an on-demand backup-and-restore pipeline and a DR site with the Velero backup and migration tool for K8s
  • LeanIX Integration Pipelines (Live) - Designed and developed cost-effective, secure pipelines integrating the LeanIX source with multiple target locations on-prem, in Azure, and across SaaS platforms such as ServiceNow, Coplanner, Symbio, and AxonIvy; built the platform as code with Terraform
  • Used the Azure Data Factory integration runtime for data sharing with on-prem applications
  • GDPR Data Lake on Azure cloud (PoC) - Designed and developed an EU GDPR-compliant centralized data pipeline with pseudonymisation of users' private data
  • Built platform-as-code for Confluent Kafka on Azure Kubernetes Service, maintained in GitLab CI/CD pipelines
  • Integrated Azure Data Lake Gen2 as the centralized data lake, Confluent Kafka as the backbone of the data workflow pipeline with a 10-day retention policy, and Dask- and Pandas-DataFrame-enabled scalable API services running on Kubernetes as the backbone for data ingestion and consumption services for distributed transformation and analysis
  • Designed event triggers on ADL for automatic bulk data ingestion into the data pipeline
  • Designed the data model for token management for anonymised data
  • Participation in extracurricular activities such as debates, presentations, and group discussions
  • Awards: Awarded as a Data Hero for the VTA OPT-IN verification project at E.ON
  • Received the Perseverance Award for leading the BDaaS cloud service product at CenturyLink Business; served as Data-to-Decision workshop SME at Amdocs and CenturyLink Business; delivered excellent results that made a difference in the Panera machine learning project
  • A recognized key member of product development
  • Certifications: Cloudera Certified Spark and Hadoop Developer (CCA-175); AWS Certified Solutions Architect (launched Feb 2018); Microsoft Certified Azure Data Factory Developer in Big Data
  • Big Data and Applications in the Cloud, University of Illinois Urbana-Champaign; Machine Learning with Big Data, University of California, San Diego
  • Developing Big Data Solutions with Azure Machine Learning; Microsoft Azure for AWS Experts; Architecting Microsoft Azure Solutions

Certification

AWS Certified Solutions Architect - Associate

Honors & Awards

Data Hero - E.ON Deutschland, Jun 2022
Awarded the Data Hero award for the VTA OPT-IN verification project at E.ON.

CenturyLink Perseverance Award - CenturyLink Business for Enterprise, Sep 2018
The highest award at CenturyLink, given for consistent improvement of the product in a continuous feedback loop with customers.

Timeline

Lead Machine Learning Platform Engineer

E.ON Digital Group
04.2022 - Current

AWS Security Best Practices for Developers

01-2021

Lead Data Architect (Ext.)

E.ON Digital Technology
01.2020 - 01.2022

Architecting Microsoft Azure Solutions

12-2019

Big Data Solutions with Azure Machine Learning

11-2019

Advanced Consultant, Big Data Architect

Altran Digital Team Altran Deutschland S.A.S. & Co. KG
05.2019 - 12.2019

Data Analytics Lead

Landis+Gyr
11.2018 - 04.2019

AWS Certified Solutions Architect - Associate

07-2018

Cloudera Hadoop and Spark Developer - CCA175

08-2016

Senior Technical Consultant

Big Data Science Services, CenturyLink Technology
04.2016 - 11.2018

Senior Software Engineer ( Big Data Technology )

Amdocs, Inc
11.2014 - 03.2016

Information System Engineer

Monster.com India Pvt Ltd
02.2014 - 11.2014

Information Technology Engineer

CDOT ( R & D Center for Telematics Govt of India)
11.2012 - 11.2013

Software Trainee

Loop Method Pvt Ltd
03.2011 - 05.2011

Bachelor of Technology - Computer Science Engineering

IK Gujral Punjab Technical University
05.2007 - 05.2011

Machine Learning With Big Data - UC San Diego
