MSc student in Data Science with a strong foundation in computer science, data engineering, and statistical modeling. Proficient in designing and implementing ETL pipelines and optimizing database structures, with hands-on experience managing large, complex datasets. Committed to leveraging data-driven methodologies to enhance research quality, ensure compliance, and support informed decision-making. Eager to contribute analytical expertise to innovative projects that drive impactful results.
Implemented Kolmogorov-Smirnov and Chi-square goodness-of-fit tests in Python to validate survey distribution. Ran Monte Carlo simulations to measure Type I/II error rates and automated exact p-value estimation.
Demonstrated ability to validate large datasets against benchmarks, mirroring clinical trial data checks.
Designed and implemented ETL pipelines (SSIS, Python) to ingest and clean traffic incident records. Built OLAP cubes (SSAS) for fast, multidimensional analysis and reporting. Applied data quality checks to ensure consistency across large, heterogeneous datasets.
Preprocessed and explored 1.6 million records of accident data; engineered features for analysis. Built predictive models (logistic regression, random forests) and applied clustering to uncover hidden patterns. Collaborated on translating raw data into actionable insights, similar to preparing datasets for clinical research.
Conducted audience segmentation for a nonprofit's Google Ad Grants campaigns. Designed performance dashboards (Power BI, Python) to track and optimize campaign outcomes. Showcased ability to present data clearly to non-technical stakeholders.
Programming & Data Engineering
Data Quality & Validation
Analytics & Visualization
Collaboration & Processes
Other Technical Tools