Work Experience

Data Analyst Intern

Eversource Energy

Currently, I'm working at Eversource Energy as a data analyst intern, analysing sensor data from the newly modified power grid systems across Massachusetts to estimate system performance and quantify customer energy savings for the Grid Modernization program. I developed time series models with exogenous variables to statistically demonstrate the benefits of the Volt-Var optimization algorithm, reducing the forecast error (MAPE) from 6.1% to 4.19%. I also built Python scripts to automate data anomaly detection, which minimized manual processing of large data files. On a weekly basis, I lead calls with company leaders to present model results and visualizations and to address data quality issues.
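A minimal sketch of this kind of exogenous-variable time series model and its MAPE evaluation, using statsmodels' SARIMAX (the file and column names below are illustrative assumptions, not the actual project data):

```python
# Minimal sketch: SARIMAX with an exogenous regressor, evaluated by MAPE.
# "feeder_readings.csv", "load_kwh" and "temperature" are assumed names.
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

df = pd.read_csv("feeder_readings.csv", parse_dates=["timestamp"], index_col="timestamp")
train, test = df.iloc[:-30], df.iloc[-30:]

model = SARIMAX(
    train["load_kwh"],
    exog=train[["temperature"]],      # exogenous variable
    order=(1, 1, 1),
    seasonal_order=(1, 0, 1, 24),     # daily seasonality for hourly readings
).fit(disp=False)

pred = model.forecast(steps=len(test), exog=test[["temperature"]])
mape = np.mean(np.abs((test["load_kwh"].values - pred.values) / test["load_kwh"].values)) * 100
print(f"MAPE: {mape:.2f}%")
```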

Data Analyst Intern

Healthedge

During the summer of 2022, I worked at Healthedge as a data analyst intern, collaborating with the DevOps team to convert clients' server configuration data into insightful dashboards. I learnt the PowerShell scripting language within three days and optimised PowerShell scripts that loaded large datasets into Azure by adding filter parameters and creating pipelines. I automated the ETL process between Azure and Power BI using the Power BI Gateway, and worked with DAX, data blending, hierarchies, cross filtering, slicers and drill-through filters to create detailed reports. These dashboards made client-server configurations and changes accessible in one place in real time, reducing the manual effort and time spent by developers.

Chatbot Engineer

Quantiphi Analytics Solution Pvt. Ltd.

Before coming to WPI, I worked as a Conversational Bot Engineer at Quantiphi, where I created chatbots using Google's Dialogflow and the Google Cloud Platform. I researched intent detection and customizable named entity recognition approaches for chatbots in the educational, healthcare and financial domains. The chatbots were also converted into voice-automated calls and integrated with BigQuery to store millions of conversations. The conversations were reviewed and iteratively improved, which reduced the false positive rate from 35% to 3%. I mentored and trained interns in chatbot engineering on a monthly basis. Apart from building chatbots, I also worked on a post-call analysis tool for call centers: a RoBERTa model was used to identify and tag sentiments detected in the call logs, which brought down the time and cost required to manually analyze the large volume of calls.
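A minimal sketch of the sentiment-tagging step, using a public RoBERTa checkpoint from Hugging Face as a stand-in (the project's actual model and label set may differ):

```python
# Minimal sketch of RoBERTa-based sentiment tagging for call-log segments.
# The checkpoint below is a public RoBERTa sentiment model used as a stand-in.
from transformers import pipeline

sentiment = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-roberta-base-sentiment-latest",
)

call_segments = [
    "I have been waiting on hold for forty minutes, this is unacceptable.",
    "Thanks, that fixed my issue right away.",
]

for segment, result in zip(call_segments, sentiment(call_segments)):
    print(f"{result['label']:>8}  {result['score']:.2f}  {segment}")
```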

Research Intern

99Yrs Network LLP

Prior to this, I worked as a Research Intern at 99Yrs Network, where I partnered with the executive team to analyse data for future sales prediction, apply content-based filtering, perform customer segmentation, and decide marketing strategies for 20+ multinational eCommerce companies. I built the data reporting infrastructure from the ground up, using AWS for data storage and Tableau, Python and SQL to provide insights into business KPIs, the marketing funnel and revenue generation.

Predicting Future Sales

In this project I worked on a time series dataset of daily sales data, where the goal was to predict the subsequent month's sales. I performed the ADF test and seasonality and trend analysis on the dataset, and used time series models such as ARIMA, SARIMAX, FBProphet and XGBoost to predict the sales for the next two months.
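A minimal sketch of the stationarity check and a two-month Prophet forecast (the file and column names are assumptions):

```python
# Minimal sketch: ADF stationarity test, then a ~two-month forecast with Prophet.
# "daily_sales.csv", "date" and "sales" are assumed names; older installs use
# `from fbprophet import Prophet` instead.
import pandas as pd
from statsmodels.tsa.stattools import adfuller
from prophet import Prophet

daily = pd.read_csv("daily_sales.csv", parse_dates=["date"])

# ADF test: a p-value above 0.05 suggests the series is non-stationary
# and may need differencing before ARIMA-style modelling.
adf_stat, p_value, *_ = adfuller(daily["sales"])
print(f"ADF statistic={adf_stat:.3f}, p-value={p_value:.3f}")

# Prophet expects the columns to be named "ds" and "y".
model = Prophet()
model.fit(daily.rename(columns={"date": "ds", "sales": "y"}))
future = model.make_future_dataframe(periods=60)      # roughly two months ahead
forecast = model.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```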

Indoor Localization using WiFi Fingerprinting

This is a machine learning project to determine the position of a user inside a building, using the wireless access point signal strength values in the UJIIndoorLoc dataset as features and the floor and building ID as the target variables.
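A minimal sketch of the classification setup on UJIIndoorLoc, predicting building and floor from the WAP signal strengths (the classifier choice and preprocessing are assumptions):

```python
# Minimal sketch: predict building and floor from the 520 WAP columns
# in the UJIIndoorLoc training file.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

data = pd.read_csv("trainingData.csv")
waps = [c for c in data.columns if c.startswith("WAP")]

X = data[waps].replace(100, -105)        # 100 marks "no signal" in this dataset
# Combine building and floor into a single target label, e.g. "2-3".
y = data["BUILDINGID"].astype(str) + "-" + data["FLOOR"].astype(str)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=42)
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```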

Olympics Dashboard

This is a DBMS project that provides an informative dashboard based on the historical Olympics dataset. Views and triggers were implemented to reduce the execution time of complex queries and to constrain invalid data, respectively. I brought the import time for the large dataset down from 4 hours to 3 seconds.
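A minimal sketch of the view and trigger idea, using sqlite3 as a stand-in DBMS with hypothetical table and column names:

```python
# Minimal sketch: a view for a pre-aggregated dashboard query, a trigger that
# rejects invalid rows, and a bulk insert inside one transaction.
import sqlite3

conn = sqlite3.connect("olympics.db")
cur = conn.cursor()

cur.execute("""
CREATE TABLE IF NOT EXISTS results (
    athlete TEXT, country TEXT, year INTEGER, medal TEXT
)""")

# A view pre-aggregates a complex query so the dashboard can read it directly.
cur.execute("""
CREATE VIEW IF NOT EXISTS medals_per_country AS
SELECT country, COUNT(*) AS medals
FROM results
WHERE medal IS NOT NULL
GROUP BY country
""")

# A trigger rejects rows with an invalid year instead of silently storing them.
cur.execute("""
CREATE TRIGGER IF NOT EXISTS reject_bad_year
BEFORE INSERT ON results
WHEN NEW.year < 1896
BEGIN
    SELECT RAISE(ABORT, 'invalid year');
END
""")

# Bulk inserts committed once are what turn an hours-long import into seconds,
# compared with committing row by row.
rows = [("A. Athlete", "USA", 2016, "Gold"), ("B. Athlete", "IND", 2020, None)]
cur.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", rows)
conn.commit()

print(cur.execute("SELECT * FROM medals_per_country").fetchall())
```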


Blight ticket analysis

The more blight tickets a city issues, the less safe its neighbourhoods become to reside in. To help the governing bodies, I analysed two years of blight ticket data from Detroit and extracted key insights such as the top violators, the most common violations and the most non-compliant violators. I also built a machine learning model to predict whether a violator found guilty would be compliant or non-compliant.
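A minimal sketch of the compliance classifier (the file name and feature columns are illustrative assumptions):

```python
# Minimal sketch: predict whether a guilty violator will be compliant.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

tickets = pd.read_csv("detroit_blight_tickets.csv")
tickets = tickets[tickets["compliance"].notna()]      # keep adjudicated, guilty tickets

features = ["fine_amount", "late_fee", "discount_amount", "judgment_amount"]
X, y = tickets[features].fillna(0), tickets["compliance"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = GradientBoostingClassifier().fit(X_train, y_train)
print("ROC AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```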

Netflix recommender system

We have all wasted time scrolling through Netflix's endless list of TV shows and movies, so I decided to build a content-based filtering recommender system with the help of sklearn's TF-IDF vectorizer. I cleaned and vectorized the Netflix data, examined patterns, and created word clouds, count plots, line plots and other visualizations for better data representation.
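A minimal sketch of the content-based recommender using sklearn's TF-IDF vectorizer and cosine similarity (the column names follow the public Netflix titles dataset and are assumptions here):

```python
# Minimal sketch: TF-IDF over title descriptions, ranked by cosine similarity.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

titles = pd.read_csv("netflix_titles.csv").dropna(subset=["description"]).reset_index(drop=True)

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(titles["description"])
similarity = cosine_similarity(tfidf)

def recommend(title, n=5):
    """Return the n titles whose descriptions are most similar to `title`."""
    idx = titles.index[titles["title"] == title][0]
    top = similarity[idx].argsort()[::-1][1:n + 1]    # skip the title itself
    return titles["title"].iloc[top].tolist()

print(recommend("Stranger Things"))
```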

Reddit Comment Analysis

This project extracts data from a subreddit where the opinions shared by users can be treated as unbiased, since the users are assumed to be anonymous. For this, the data was pulled from Reddit using PRAW to collect the comments from a data science subreddit.
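A minimal sketch of the comment collection with PRAW (the credentials are placeholders and r/datascience is an assumed subreddit):

```python
# Minimal sketch: collect comments from a subreddit with PRAW.
import praw

reddit = praw.Reddit(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
    user_agent="subreddit-comment-analysis",
)

comments = []
for submission in reddit.subreddit("datascience").hot(limit=50):
    submission.comments.replace_more(limit=0)     # drop "load more comments" stubs
    for comment in submission.comments.list():
        comments.append(comment.body)

print(f"collected {len(comments)} comments")
```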



Entity Extraction for Clinical data

Deep Learning & NLP

This project is about extracting and tagging entities such as alcohol, tobacco and drug abuse from free text in clinical notes. Along with detecting these three main entities, I worked on extracting sub-entities such as the type of substance, status, quit history, dosage amount and frequency of substance intake. Identifying these entities helped healthcare organizations make better decisions and will support an automated pipeline for extracting entities from free text.
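A minimal rule-based sketch of this kind of entity tagging using spaCy's EntityRuler; the patterns are illustrative only, and the project's own pipeline may use a trained deep learning model instead:

```python
# Minimal sketch: tag substance-abuse entities in clinical text with rules.
import spacy

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "TOBACCO", "pattern": [{"LOWER": {"IN": ["smokes", "smoker", "tobacco"]}}]},
    {"label": "ALCOHOL", "pattern": [{"LOWER": {"IN": ["alcohol", "drinks", "etoh"]}}]},
    {"label": "STATUS",  "pattern": [{"LOWER": "quit"}]},
    {"label": "AMOUNT",  "pattern": [{"LIKE_NUM": True}, {"LOWER": {"IN": ["packs", "drinks"]}}]},
])

note = "Patient smokes 2 packs per day, quit alcohol three years ago."
for ent in nlp(note).ents:
    print(ent.label_, "->", ent.text)
```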


News Inspector

Information Retrieval, SEO & NLP

By "News inspector", I mean building a plagiarism checker and a search engine for retrieving relevant documents from a huge corpus. This corupus contains around a million news articles that were published by top 10 American new publication houses. I also came up with my own page ranking algorithm to order the search engine results and using deep learning algorithms for building the plagiarism checker

My Certifications


Associate Cloud Engineer

GOOGLE CLOUD CERTIFIED

On Feb 18th, 2021, I was officially granted the Google Certified Associate Cloud Engineer badge for passing the ACE exam with a good score. To receive this badge, I had to learn how GCP works and study its services such as BigQuery, Compute Engine, AutoML, Kubernetes and others.


Certified Developer - Associate

AMAZON WEB SERVICES CERTIFIED

After taking the GCP ACE exam, I was interested in learning about the cloud services that other providers had built, and the next best cloud platform after GCP is Amazon Web Services. I took the Developer - Associate exam on March 16th, 2021 and scored 927/1000 points!