About me

Passionate about creating impactful solutions for end users, I excel at transforming complex problems into sustainable, data-driven insights. My expertise lies in pattern recognition, anomaly detection, predictive modelling and analytical finesse.

I am also passionate about fintech and am steadily building my skills in analyzing trading patterns, company fundamentals, and portfolios, applying them to the world of stocks and investments. Beyond fintech, I have a deep interest in robotics, which I pursue as a personal hobby.

I am committed to harnessing data and technology for a more innovative and meaningful tomorrow.

What I'm doing

Professionally

  • Anomaly Detection

    Uncovering hidden patterns and ensuring operational reliability


    Specialized in fraud detection, applying models such as GBT, XGBoost, TabNet, DevNet and ConNet to identify abusive data points. Alongside the models, built multiple rule-based pipelines to catch predefined fraud patterns identified through EDA.

    Also worked on identifying group fraud using graph theory. Modelled the entities as a graph, connected them through various KPIs, and applied community detection algorithms and models such as the Louvain algorithm, the Leiden algorithm, Particle Swarm Optimization (PSO), Graph Convolutional Networks and Deep Modularity Networks.

  • Propensity Modelling

    Predicting customer behavior and driving strategic decisions


    Dived deep into predicting customer behaviour on a platform, experimenting with custom-built neural network architectures such as MLP, SIMO and MIMO with modified loss functions. Also applied tree-based models built in TensorFlow using the TensorFlow Decision Forests (TF-DF) module.

    Built both classification and regression models that predict customers' risk scores and lifetime value.

  • Location Intelligence

    Transforming geospatial data into actionable insights


    Leveraged GPS ping data to generate suggestions and rectify incorrectly logged locations in the system. Explored multiple techniques for generating suggestions, such as Geometric Median, Clustering, Outlier Filtering, etc.

    Also working on a pipeline for Facility Location Network Optimization, which determines the optimal facility locations to cater to demand using Linear Programming with tools like Gurobi and PuLP.

  • Generative AI

    Revolutionizing content creation and transforming ideas into reality


    Worked on a Stable Diffusion pipeline that enhances dish images by expanding the visuals and adding text overlays. Also contributed to a RAG system that turns plain-language questions into SQL queries by narrowing down the needed information step by step.

  • Operational Excellence

    Driving optimizations and efficiency through data-driven strategies


    Implemented storage and compute optimizations, saving thousands of dollars per month in tech costs by analyzing and optimizing the usage of Redis, DynamoDB, Kafka, and Yack.

    Designed and implemented automations to reduce human dependency and enable self-serve pipelines, utilizing APIs from Databricks, GSuite, GCP, AWS, and Slack.

  • Data Analysis

    Unlocking the potential of data with precise and insightful analysis


    Conducted a range of analyses, from Exploratory Data Analysis (EDA) to Total Addressable Market (TAM) sizing, to extract insights and assess the impact of data-driven actions.

    With experience in Anomaly Detection, Anecdotal Analysis, and Root Cause Analysis, I can effectively explain model behavior in specific cases and reach unbiased conclusions on what should have happened.
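As an illustration of the Geometric Median technique mentioned under Location Intelligence, here is a minimal sketch, assuming plain 2D coordinates and Weiszfeld's iteration. It is not the production pipeline: the function name, the tolerance values and the sample points are hypothetical, and real GPS pings would first need projecting to a planar coordinate system.

```python
import math


def geometric_median(points, tol=1e-9, max_iter=1000):
    """Approximate the geometric median of 2D points using Weiszfeld's iteration.

    The geometric median minimises the sum of Euclidean distances to all
    points, which makes it far less sensitive to stray pings than the mean.
    """
    # Start from the centroid (the plain average of the points).
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)

    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            if d < 1e-12:
                # Simplification (assumption): skip a point that coincides
                # with the current estimate to avoid dividing by zero.
                continue
            w = 1.0 / d  # closer points get larger weights
            num_x += w * px
            num_y += w * py
            denom += w
        if denom == 0.0:
            break  # every point coincides with the estimate
        new_x, new_y = num_x / denom, num_y / denom
        moved = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if moved < tol:
            break
    return x, y


# Hypothetical pings: four near the true location and one far-off outlier.
pings = [(0, 0), (0, 1), (1, 0), (1, 1), (100, 100)]
print(geometric_median(pings))
```

For these sample points the centroid is dragged out to (20.4, 20.4) by the single outlier, while the geometric median stays inside the cluster at roughly (0.79, 0.79). That robustness is what makes it a reasonable candidate for suggesting a corrected location from noisy pings.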

Personally

  • SaaS Applications

    Building innovative and robust SaaS platforms


    I actively seek out day-to-day pain points that can be addressed through tech platforms and work to make them a reality. Some platforms I have built in the past include:

    Bolt - an AI-infused swing trading platform that takes charge of the entire trading process, from strategically selecting stocks to optimal investment allocation and timely execution on the exchange.

    inDex - an open data community platform where researchers, organizations, contributors, consumers, etc. can come together and help each other build datasets that do not yet exist, for crafting new revolutionary products.

    PixelRides - a cab booking platform with bargaining capability enabled for customers & drivers to reach a satisfactory price before booking the ride.

  • Trading & Investment

    Crafting efficient strategies for dynamic market conditions


    Always on the hunt for an ever more accurate strategy that can shortlist potential stocks and provide precise entry and exit points.

    I play around with a hierarchical structure of strategies, running from the longest time frame down to the shortest like a waterfall model, so that only the strong choices remain at the end.

  • Reading

    Constantly expanding my knowledge from articles, blogs, and books


    I like to keep myself updated on the latest things happening around me, so I get my quick bites from articles and blogs. When I want to learn something new or bigger, I take up books or courses.

  • Retro Gaming & Shows

    Exploring the charm and nostalgia to spark creativity and relaxation


    Tekken-3, Mario, Sonic, Contra and Adventure Island are some of my favourite retro games.

Testimonials

  • Pradeep Janardhanan
    CEO @Vsualthree60

    Masihullah is reliable, dedicated and eternally upbeat. Masihullah multitasks effectively and is able to handle a high-volume workload. He consistently met and surpassed all our expectations on delivering gold standard technology solutions. Masihullah has a team-player mindset, an enthusiastic embrace of change, the ability to work with minimal supervision and an unwavering commitment to exceeding expectations.


    Organized and diligent, Masihullah quickly learned technology systems and software that were unfamiliar to him when he first started with Vsualthree60. Masihullah is a hardworking, top-performing professional and has my highest recommendation.

  • Jose Mathew
    AI/ML Leader @Project44

    I had the pleasure of working with Masi, who initially joined Swiggy as an intern (and was later converted to a DS1 role) in the Trust and Safety charter. From the outset, Masi demonstrated exceptional industriousness and a keen curiosity to learn. He quickly proved that he could get things done with minimal supervision, making him an invaluable asset to the team.


    One of Masi's major contributions was the development and implementation of a robust model for identifying fraud rings, where a network of bad actors connected by edges (like payment ID, device ID, etc.) engages in fraudulent activities. His eye for detail and strong analytical skills significantly enhanced our ability to detect and mitigate fraud.


    In addition to his technical expertise, Masi is also a fantastic teammate. His collaborative spirit and willingness to share knowledge made him a pleasure to work with. I have no doubt that he will continue to achieve great things.

  • Akash Deep
    DS III @Nykaa

    Thorough understanding of the business, engineering systems and the problem statement, in-depth research and the ability to implement complex solutions in a time bound manner - these skills make Masih an indispensable asset to any team. It was great working with him across multiple projects at Swiggy.

  • Shubhankar Shukla
    Product Manager @Swiggy

    I have worked closely with Masih at Swiggy, where he consistently demonstrated exceptional data science and product-backward thinking skills. One of his key strengths is his ability to collaborate seamlessly with non-tech stakeholders, understand complex problems, and apply first-principle thinking to develop effective solutions. His deep understanding of data, combined with strong technical expertise, allowed him to deliver impactful results, improving precision in fraud detection and enhancing customer engagement. I worked with him on multiple projects and was consistently impressed by his ability to transition smoothly between projects, delivering high-quality results.

Resume

Education

  1. Indian Institute of Information Technology Sri City, India

    2018 — 2022

    BTech (Honors) in Computer Science and Engineering
    CGPA - 9.41 / 10

    Honors Research (2020-2022): Exploring Collaborative Strategies for PTZ Cameras Network

    • Created real-world traffic simulations using Agent-based Modelling and studied various collaborative strategies for an array of PTZ traffic cameras. The cameras communicate with each other by learning a graph to improve the effectiveness of target tracking.
    • Publication: Masihullah S, and Subu K. "A Decentralized Collaborative Strategy for PTZ Camera Network Tracking System using Graph Learning." Published in AMMS 2022, Paris, France.

Experience

  1. Data Scientist II @Swiggy

    Oct 2023 — Present

    • Customer Lifetime Value: Formulated a unified predictive customer score based on 200+ user attributes by estimating it as a function of lifetime value and potential risk. Trained a multi-output neural network to rank 5M new users with minimal order data, addressing information sparsity and cold-start issues. Analyzed 15+ model ablations using model and business metrics such as MAE, NDCG, Spearman R, Repeat Rate, Average Order Value, etc.
    • Text-to-SQL using Gen AI: Contributed to an in-house RAG-based Text-to-SQL workflow, used by 1,000+ employees, that lets users ask questions in natural language and receive SQL queries. Performed a related-work industry analysis and anecdotal analysis, and proposed new layers in the workflow to improve accuracy.
    • Restaurant Location Correction: Designed and implemented a framework to rectify incorrectly logged restaurant locations using delivery partners' historical GPS data and the Geometric Median technique. Achieved 92% accuracy, correcting over 13K locations and improving delivery efficiency for thousands of daily orders.
    • Facility Location Network Optimization: Developing a pipeline to generate an optimal dark store network for Swiggy Instamart using Linear Programming with constraints like facility count, serviceability, demand satisfaction, etc. Also estimating operational metrics such as order count, delivery time, revenue, and profit.
  2. Data Scientist I @Swiggy

    Jun 2022 — Sep 2023

    • Fraud Rings Detection: Expanded on the intern work to implement community-based group fraud detection pipelines in production. Constructed graphs with 4M nodes and 3M edges, incorporating various node and edge types with parameterized rule tuning and edge-case handling, achieving over 90% precision.
    • Refund Fraud Detection: Developed an E2E fraud claims detection pipeline using an ensemble of TabNet (classification) and DevNet (anomaly detection). Handled limited labeled data (5%) and class imbalance (10% positives) with weak and semi-supervised methods. Engineered over 250 features to detect 70+ fraud patterns with 85% precision.
      • Publication: Piyush N, Rutvik V, Masihullah S, Meghana N, & Jose M. "Utilizing DevNet with Variational Loss for Fraud Detection in Hyperlocal Food Delivery." Published in CODS-COMAD 2024, Bangalore, India.
    • Menu Enhancement: Given a dish name, predict whether it is veg or non-veg and recommend a dish category. Built a multi-head attention neural network using N-grams and Word2Vec embeddings, achieving 96% precision and reducing manual correction tickets by 75% (~10K per day).
    • New User Churn Prediction: Engineered 130+ features for new users with limited historical data while developing and iterating on churn prediction models, including GBT, AE anomaly detection, tree models in TF-DF, and their ensembles. Achieved over 90% precision, 1K new orders and a 33% increase in COD availability for Swiggy Instamart.
  3. Data Science Intern @Swiggy

    Jun 2021 — Jan 2022

    • Fraud Rings Detection: Designed a graph-based anomaly detection framework using the Leiden community detection algorithm, enabling real-time identification of fraud groups. Improved F1 score by 9.92% compared to existing SOTA approaches such as GCN and DMON.
      • Publication: Masihullah S, Meghana N, Jose M, & Jairaj S. "Identifying Fraud Rings Using Domain Aware Weighted Community Detection." Published in CD-MAKE 2022, Vienna, Austria.
  4. GRM Research Intern @IBM

    Jan 2020 — Jun 2020

    • Road and Pothole Segmentation: Integrated Attention-based Refinement and Feature Fusion modules into the DeepLabV3+ architecture to enhance spatial representation and global context. Achieved an mIoU of 93.6% for road and 73.83% for pothole segmentation, with an 81% reduction in run-time.
      • Publication: Masihullah S, Ritu G, Prerana M, Anupama R. "Attention Based Coupled Framework for Road and Pothole Segmentation." Published in ICPR 2020, Milan, Italy.
  5. Machine Learning Intern @Vsualthree60

    Jul 2019 — Jan 2021

    • AI SaaS Platforms: Developed platforms for Crowd Surveillance, Image Docs Parsing, Appointments Chatbot, and Virtual Clothing Try-On. Used deep learning models and tools such as VGG-Face, YOLO, Tesseract, OpenCV, and Rasa for age, gender, and ethnicity prediction, text detection in Arabic and English, and image/text processing.

Skills

  • Programming Languages: Python, SQL, Bash, JavaScript, HTML, CSS, Java, C, Dart, Julia
  • DS Tools: TensorFlow, Keras, PyTorch, OpenCV, scikit-learn, NumPy, Pandas, PySpark
  • Ops Tools: Databricks, Snowflake, Kafka, AWS, GCP, Git
  • Others: Django, Flask, Flutter, Robot Operating System

Publications

Portfolio