Prerna Singh

Specialist in
Gen.AI, NLP, ML & RL

Senior
Applied Scientist
Microsoft, USA

About

Five years’ experience in AI (as of October 2024):

Three with Microsoft, USA, following a Master’s specialisation in Machine Learning from Carnegie Mellon University, USA; and

Two at TCS Research & Innovation Labs, India, as a Researcher in Reinforcement Learning and Robotics, designing algorithms for robotic target-reaching tasks, following an undergraduate degree in Electronics and Communications Engineering from IIIT Delhi, India.

At Microsoft, I develop custom machine learning models for diverse industries, including finance, sustainability, and energy. 

My published work includes seven research papers, with contributions to IEEE transactions, journals, and conferences.

My interest lies in leveraging data to address challenges in finance, healthcare, robotics, and education. I am committed to developing impactful solutions and regularly share my knowledge through talks at conferences.

My expertise includes Machine Learning, Artificial Intelligence, Reinforcement Learning, Natural Language Processing, Image Processing, and Data Structures & Algorithms. 

I welcome connections and invite you to reach out via the contact form on my site.

Education

Research Papers in AI/ML

Abstract: In this work, we review research studies that combine Reinforcement Learning (RL) and Large Language Models (LLMs), two areas that owe their momentum to the development of Deep Neural Networks (DNNs). We propose a novel taxonomy of three main classes based on the way that the two model types interact with each other. The first class, RL4LLM, includes studies where RL is leveraged to improve the performance of LLMs on tasks related to Natural Language Processing (NLP). RL4LLM is divided into two sub-categories depending on whether RL is used to directly fine-tune an existing LLM or to improve the prompt of the LLM. In the second class, LLM4RL, an LLM assists the training of an RL model that performs a task that is not inherently related to natural language. We further break down LLM4RL based on the component of the RL training framework that the LLM assists or replaces, namely reward shaping, goal generation, and policy function. Finally, in the third class, RL+LLM, an LLM and an RL agent are embedded in a common planning framework without either of them contributing to training or fine-tuning of the other. We further branch this class to distinguish between studies with and without natural language feedback. We use this taxonomy to explore the motivations behind the synergy of LLMs and RL and explain the reasons for its success, while pinpointing potential shortcomings and areas where further research is needed, as well as alternative methodologies that serve the same goal.

 

Read the full paper here

Abstract: Electrocardiogram (ECG) signals are known to encode unique signatures based on the geometrical characteristics of the heart. Due to other advantages - such as continuity and accessibility (now via smartwatch technology) - ECG could make for a robust biometric ID system. We show that single-node ECG measurements through an Apple Watch would suffice to identify an individual. Apart from the Apple Watch ECG data, we have also performed analysis on two other ECG datasets from PhysioNet to test the robustness of our methods in two situations: in particular, we tested how it holds up against high volume (across a large number of individuals) and high variability (across different states of activity). We have also compared multiple classifier models in combination with different feature sets to identify the best combination. We observed Equal Error Rate (EER) values that were consistently below 3%. Our results show that ECG proves to be very effective and robust.
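The headline metric above, Equal Error Rate, is the operating point where the false-accept and false-reject rates coincide. A minimal sketch of computing it from raw similarity scores (the threshold sweep below is illustrative, not the paper's implementation):

```python
import numpy as np

def equal_error_rate(genuine_scores, impostor_scores):
    """EER: the threshold where false accept rate (FAR) and false reject
    rate (FRR) are closest; scores are similarities, higher = more likely
    the claimed identity."""
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    best = None
    for t in thresholds:
        far = np.mean(impostor_scores >= t)  # impostors wrongly accepted
        frr = np.mean(genuine_scores < t)    # genuine users wrongly rejected
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2)
    return best[1]
```

A fine-grained sweep (or interpolation of the ROC curve) gives a smoother estimate; this coarse sweep over observed scores is enough to illustrate the idea.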

 

Read the full paper here

Abstract: This paper presents a feature-agnostic and model-free visual servoing (VS) technique using deep reinforcement learning (DRL) which exploits two new architectures of experience replay buffer in deep deterministic policy gradient (DDPG). The proposed architectures are significantly faster and converge in a small number of steps. We use the proposed method to learn an end-to-end VS with eye-in-hand configuration. In traditional DDPG, the experience replay memory is randomly sampled for training the actor-critic network. This results in a loss of useful experiences when the buffer contains very few successful examples. We solve this problem by proposing two new replay buffer architectures: (a) min-heap DDPG (mH-DDPG) and (b) dual replay buffer DDPG (dR-DDPG). The former uses a min-heap data structure to implement the replay buffer whereas the latter uses two buffers to separate “good” examples from the “bad” examples. The training data for the actor-critic network is created as a weighted combination of the two buffers. The proposed algorithms are validated in simulation with the UR5 robotic manipulator model. It is observed that as the number of good experiences increases in the training data, the convergence time decreases. We find 27.25% and 43.25% improvements in the rate of convergence respectively by mH-DDPG and dR-DDPG over state-of-the-art DDPG.
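As a rough illustration of the dual-buffer idea (not the paper's implementation; the capacity, the reward-based success criterion, and the 50/50 mixing ratio below are arbitrary stand-ins), the storage and sampling logic might look like:

```python
import random
from collections import deque

class DualReplayBuffer:
    """Sketch of a dR-DDPG-style replay memory: successful transitions
    are stored apart from the rest, and each mini-batch is a weighted
    mix of the two pools, so rare "good" examples are not drowned out."""

    def __init__(self, capacity=10_000, good_fraction=0.5, success_reward=0.0):
        self.good = deque(maxlen=capacity)   # transitions deemed successful
        self.bad = deque(maxlen=capacity)    # everything else
        self.good_fraction = good_fraction   # share of each batch from `good`
        self.success_reward = success_reward # reward threshold for "good"

    def push(self, state, action, reward, next_state, done):
        pool = self.good if reward >= self.success_reward else self.bad
        pool.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Draw the target share from `good`, fill the rest from `bad`.
        n_good = min(int(batch_size * self.good_fraction), len(self.good))
        n_bad = min(batch_size - n_good, len(self.bad))
        return random.sample(list(self.good), n_good) + \
               random.sample(list(self.bad), n_bad)
```

In actual DDPG training, the sampled batch would feed the actor-critic update; here the buffer alone shows the separation-and-mixing mechanism.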

 

Read the full paper here

Abstract: Video acquisition using dashboard-mounted cameras has recently achieved massive popularity around the world. One of the major developments following the dash-cam’s popularity is that videos captured by them can be used as testimony during scenarios like traffic violations and accidents. The widespread deployment of dash-cams brings new problems, ranging from the compromise of privacy by the uploading of these videos to public websites to the use of videos captured from other cars for making fraudulent claims. Therefore, there is a compelling need to address the problems associated with the usage of dash-cam videos. In this paper, we discuss and highlight the importance of the emerging area of multimedia vehicle forensics. We propose an algorithm for linking a dash-cam video to a specific car. The proposed algorithm is useful for various applications; for example, insurance companies can authenticate the origin of a video before processing the claim. In a different scenario of illegitimate video upload on the web, the video can be traced back to the car it originated from. To this end, we make use of motion blur extracted from dash-cam videos for generating a discriminative feature. We observe that the subtle motion pattern of every vehicle can serve as its unique signature. We extract motion blur from dash-cam videos and use random forest trees for classifying the vehicle correctly. Experimental results on thousands of frames obtained from dash-cam videos of several cars show the effectiveness of our approach. We further investigate the process of forging the signature of a car and propose a counter-forensics method to detect such forgery. Also, we discuss the application of our technique to other potential platforms where the camera can be mounted, for example, on the chest of a person. We believe that ours is the first work that describes this new area of research.
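To make the classification step concrete, here is a toy sketch: per-frame blur statistics (the three features below are hypothetical stand-ins for the paper's motion-blur descriptor) feed a random forest that separates two synthetic "cars":

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical per-frame blur features: (kernel length, dominant angle,
# variance). The premise is that each car's vibration pattern produces a
# characteristic cluster in this feature space.
rng = np.random.default_rng(0)
car_a = rng.normal([5.0, 30.0, 1.0], 0.3, size=(50, 3))  # frames from car A
car_b = rng.normal([8.0, 60.0, 2.0], 0.3, size=(50, 3))  # frames from car B
X = np.vstack([car_a, car_b])
y = np.array([0] * 50 + [1] * 50)

# A random forest learns to attribute a frame's blur signature to a car.
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
```

Real dash-cam frames would require estimating the blur kernel first; the synthetic clusters here only demonstrate the "blur signature as car fingerprint" classification idea.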

 

Read the full paper here

Abstract: In multi-echo imaging, multiple T1/T2-weighted images of the same cross section are acquired. Acquiring multiple scans is time consuming. To accelerate acquisition, compressed sensing based techniques have been proposed. In recent times, it has been observed in several areas of traditional compressed sensing that, instead of using a fixed basis (wavelet, DCT, etc.), considerably better results can be achieved by learning the basis adaptively from the data. Motivated by these studies, we propose to employ such adaptive learning techniques to improve the reconstruction of multi-echo scans. This work is based on two basis learning models: synthesis (better known as dictionary learning) and analysis (known as transform learning). We modify these basic methods by incorporating the structure of the multi-echo scans. Our work shows that we can indeed significantly improve multi-echo imaging over compressed sensing based techniques and other unstructured adaptive sparse recovery methods.

 

Read the full paper here

Abstract: This work addresses the problem of denoising multiple measurement vectors having a common sparse support. Such problems arise in a variety of imaging problems, e.g. color imaging, multi-spectral and hyper-spectral imaging, multi-echo and multi-channel magnetic resonance imaging, etc. For such cases, denoising them piecemeal, one channel at a time, is not optimal, since it does not exploit the full structure (joint sparsity) of the problem. Joint-sparsity based methods have been used for solving such problems when the sparsifying transform is assumed to be fixed. In this work, we learn the sparsifying basis following the dictionary learning paradigm. Results on multi-spectral denoising and multi-echo MRI denoising demonstrate the superiority of our method over existing ones based on KSVD and BM4D.
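The core of joint sparsity can be sketched as row-wise (group) thresholding of transform coefficients: a coefficient row is kept only if it is significant jointly across all channels. The threshold and coefficient values below are purely illustrative:

```python
import numpy as np

def joint_row_threshold(coeffs, tau):
    """Group thresholding for a common sparse support.
    `coeffs` is (atoms x channels); a row survives only if its l2 norm
    across channels exceeds `tau`. This is the joint analogue of zeroing
    small coefficients one channel at a time, and it rescues weak
    coefficients that are consistently present in every channel."""
    row_norms = np.linalg.norm(coeffs, axis=1, keepdims=True)
    return np.where(row_norms >= tau, coeffs, 0.0)
```

In the paper's setting the transform itself is learned (dictionary learning) rather than fixed; this snippet only shows why pooling channels changes which coefficients survive denoising.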

 

Read the full paper here

Abstract: Electric Network Frequency (ENF) analysis is a recently developed technique for the authentication of audio signals. ENF gets embedded in audio signals due to electromagnetic interference from power lines and hence can be used to determine the geographical location and time of recording. Given an audio signal, this paper presents a technique for finding the power grid the audio belongs to. Towards this, we first extract the ENF sinusoid using a very narrow-bandwidth filter centered around the nominal frequency. The filter is designed using a frequency response masking approach. The ENF is estimated from the ENF sinusoid using the Short-Time Fourier Transform (STFT). In order to classify a given audio signal, we first estimate the ENF from the ENF sinusoids obtained from the audio and power signals. Further, we use a matched filter to decide the corresponding power grid of the audio.
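The frame-wise frequency tracking described above can be sketched with a plain FFT peak search inside a narrow band around the nominal grid frequency. The filter-design and matched-filter stages are omitted, and all parameters here are illustrative:

```python
import numpy as np

def estimate_enf(signal, fs, nominal=50.0, frame_len=1.0, band=1.0):
    """Frame-wise ENF estimate: window each frame, FFT it, and pick the
    peak frequency inside [nominal - band, nominal + band] Hz.
    Returns one frequency estimate per non-overlapping frame."""
    n = int(frame_len * fs)
    estimates = []
    for start in range(0, len(signal) - n + 1, n):
        frame = signal[start:start + n] * np.hanning(n)
        spectrum = np.abs(np.fft.rfft(frame))
        freqs = np.fft.rfftfreq(n, d=1.0 / fs)
        in_band = (freqs >= nominal - band) & (freqs <= nominal + band)
        estimates.append(freqs[in_band][np.argmax(spectrum[in_band])])
    return np.array(estimates)
```

The resulting sequence of frequency estimates is the ENF trace that would then be compared (e.g. via a matched filter) against reference traces recorded from candidate power grids.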

 

Read the full paper here

Professional Work Experience

Industry AI Team (July 2022 - Present):

  • Developed a Retrieval-Augmented Generation (RAG) based question-answering system for customers in the finance sector. The system answers questions based on text, tables, and images in documents and is deployed in production.
  • Created a custom visual language model for understanding charts in documents. Developed an end-to-end pipeline for data generation, fine-tuning, and evaluation. This work was accepted and presented at Microsoft's Machine Learning and Data Science Conference 2023.
  • Developed red-teaming pipelines using supervised machine learning and reinforcement learning for a numeric entity extraction task. Also developed a custom model for hallucination detection in a finance-based question-answering task. Presented this work at Microsoft's Machine Learning and Data Science Conference 2023 and at the Global AI Conference 2023.
  • Led a team conducting research on the intersection of Reinforcement Learning and Large Language Models. The team developed a novel taxonomy comprising three main classes based on the way RL and LLMs interact with each other. The work is published in the Journal of Artificial Intelligence Research.
  • Currently working on graph-based RAG pipelines and on fine-tuning small language models for different industry verticals.
  • Leading the Women & Allies initiative within the MCISVAI group at Microsoft.
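The retrieval step of a RAG question-answering system like the one described in the first bullet can be sketched as follows. The bag-of-words embedding and all function names here are toy stand-ins (a production system would use a learned embedding model and an LLM to generate the final answer):

```python
import numpy as np

def embed(text, vocab):
    """Toy bag-of-words embedding, normalised for cosine similarity."""
    words = text.lower().split()
    vec = np.array([words.count(w) for w in vocab], dtype=float)
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(question, chunks, vocab, k=2):
    """Return the k document chunks most similar to the question."""
    q = embed(question, vocab)
    scores = [q @ embed(c, vocab) for c in chunks]
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

def build_prompt(question, chunks, vocab):
    """Assemble retrieved context and the question into a grounded prompt
    that an LLM would answer from, reducing hallucination."""
    context = "\n".join(retrieve(question, chunks, vocab))
    return f"Answer using only this context:\n{context}\nQuestion: {question}"
```

The pattern, retrieve relevant chunks and constrain the model to answer from them, is the same whether chunks come from text, extracted tables, or image captions.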

 

Bing Search Team (June 2021 - June 2022):

  • Developed a deep learning based multi-class classification model to understand documents on the web. The model, trained on 10 million documents, is presently used in Bing Search.

Robustness of ECG as a biometric identifier (Jan 2020 - May 2020)

  • Analyzed the robustness of ECG as a biometric signature under different conditions: subject standing, running, and relaxed.
  • Worked in a team of three; applied different feature extraction and classification techniques to map an ECG to the originating person on three ECG datasets: ECG-ID, the Wrist PPG Exercise dataset, and ECG data manually collected from an Apple Watch.
  • Achieved a classification accuracy of 80%; published in the IEEE International Workshop on Machine Learning for Signal Processing.

Neural Differential Equation for Drug Responses (Mar 2020 - Dec 2020)

  • Implemented the research paper titled ‘Neural Ordinary Differential Equations’ using TensorFlow in Python.
  • Extended this work and designed a neural network to learn an ordinary differential equation for predicting the drug response in an individual; developed code in Python using NumPy, TensorFlow, and ODE solvers.
  • Given an unseen patient’s drug response at 7 time instants, the trained neural network predicts the drug response curve over the entire lifetime of the drug; the algorithm is used by the University of Pittsburgh Medical Center (UPMC).
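At the heart of that extrapolation is numerically integrating a learned derivative function forward from an initial condition. A minimal sketch with forward Euler, using a fixed one-compartment elimination model in place of the trained neural network (the rate constant 0.5 is an arbitrary stand-in):

```python
import numpy as np

def euler_integrate(f, y0, t_grid):
    """Forward-Euler ODE solver: the core operation inside a neural ODE,
    where f(t, y) would be a trained neural network rather than a fixed
    formula. Steps the state across each interval of t_grid."""
    ys = [y0]
    for t0, t1 in zip(t_grid[:-1], t_grid[1:]):
        ys.append(ys[-1] + (t1 - t0) * f(t0, ys[-1]))
    return np.array(ys)

# One-compartment drug elimination (dy/dt = -k * y) stands in for the
# learned dynamics; fitting k (or a network) to a few observed time
# points lets the model extrapolate the whole response curve.
decay = lambda t, y: -0.5 * y
```

In practice one would use an adaptive solver and backpropagate through it to train the derivative network; the fixed-step loop above just shows how a full curve emerges from sparse observations plus learned dynamics.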

Robotics Lab (Oct 2017 - Dec 2019)

  • Developed a deep reinforcement learning framework for training a UR5 robot on target-reaching tasks using Lua, Torch, and CUDA.
  • Designed novel Experience Replay architectures called mH-DDPG and dR-DDPG, which improved convergence rates by 27.25% and 43.25% over the state-of-the-art DDPG algorithm. Published and presented the results at the 28th IEEE RO-MAN Conference.
  • Led a team of 2 researchers and developed a pipeline for transferring the policy learned by an agent in simulation to a real UR5 robot; presented the pipeline at IEEE Translearn 2019 and filed a patent for the algorithm.

SALSA Lab (Jun 2016- Oct 2017)

Guide - Prof. Angshul Majumdar

  • Designed an algorithm in MATLAB for the image denoising problem, where zero-mean white Gaussian noise was to be removed from a given color image. The approach was based on sparse and redundant representations over learned dictionaries. The PSNR and SSIM values of the denoised images were compared with results obtained from the state-of-the-art KSVD algorithm. Results were published at the IEEE GlobalSIP conference.
  • Received the coveted IEEE Signal Processing Travel Grant to travel and present the work at GlobalSIP conference in Montreal, Canada.

 

Multimedia and Vision Lab (Aug 2016 - Oct 2017)

Guide - Prof. A.V. Subramanium

  • Achieved 77% accuracy in classifying a vehicle from its dash-cam video using random forest trees on extracted motion blur; published in IEEE Transactions on Circuits and Systems for Video Technology.

Next Generation Voucher Server Team (Jan 2016 - May 2016)

  • Developed unit tests for the entire Next Generation Voucher Server (NGVS) project using JUnit and Mockito.
  • Extended Ericsson's automation framework “Jive” to perform end-to-end testing of the complete NGVS project with a single command. This framework is currently used by the NGVS team at Ericsson.


Speaking Engagements: Select List

  • Conducted a workshop @ Machine Learning Week 2024

    Conducted a comprehensive, day-long workshop on Deep Learning, covering key concepts such as Recurrent Neural Networks (RNN), GPT models, Responsible AI, and Reinforcement Learning. The workshop featured hands-on coding exercises utilizing Python, PyTorch, and TensorFlow to reinforce these concepts.

  • Guest Speaker @ GlobalAI Community 2023

    Delivered a 30-minute technical presentation on "Designing Custom Classifiers for Hallucination Detection in Large Language Models."

  • Guest Speaker @ GlobalAI Community 2023

    Presented a 30-minute technical talk on "Enhancing Document Q&A with Chart Understanding," attended by an audience of 400.

  • Guest Speaker @ Machine Learning Week 2023

    Led a full-day workshop titled "Deep Learning in Practice: A Hands-On Introduction" at the Machine Learning Week conference in Las Vegas. The workshop encompassed a broad array of topics, including Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), GPT models, Responsible AI, and foundational reinforcement learning algorithms such as multi-arm bandits and Q-learning.

  • Invited Speaker @ IIIT Delhi

    Invited by the Alumni Association of IIIT Delhi as a guest speaker to share my journey of applying to and studying at Carnegie Mellon University. The talk, titled "Applying for Higher Education in the USA," provided insights into the application process and academic experience.

  • Technical Speaker @ MLADS December 2023

    At Microsoft, I created a custom visual language model for understanding charts in documents. I developed an end-to-end pipeline for data generation, fine-tuning, and evaluation. This work was accepted and presented at Microsoft's Machine Learning and Data Science Conference (MLADS) 2023.

  • Presented Research paper @IEEE MLSP 2020 Conference

    Presented my research work titled 'Pulse ID: The Case for Robustness of ECG as a Biometric Identifier' at the conference.

  • Technical Speaker @ MLADS June 2023

    Led a team conducting research on the intersection of Reinforcement Learning and Large Language Models at Microsoft. Gave a talk along with other team members on the topic 'The RL/LLM Taxonomy Tree: Reviewing Synergies Between Reinforcement Learning and Large Language Models' at MLADS 2023.

  • Presented Research paper @ IEEE Ro-Man 2019 Conference

    At TCS Research, I designed novel Experience Replay architectures called mH-DDPG and dR-DDPG, which improved convergence rates by 27.25% and 43.25% over the state-of-the-art DDPG algorithm. I presented the results at the 28th IEEE RO-MAN Conference.

  • Presented Poster @ IEEE Translearn Workshop 2019

    At TCS Research, I led a team of 2 researchers and developed a pipeline for transferring an RL-based policy learned in simulation to a real UR5 robot. Presented the developed algorithm at the IEEE Translearn 2019 workshop.

  • Guest Speaker @ Google Developers Group Conference 2019

    Delivered a talk on 'Reinforcement Learning: Application and Challenges in Robotics'. The talk was attended by 200+ students and professionals.

  • Guest Speaker @ Women in Machine Learning & Data Science

    Delivered a talk on the role of 'Artificial Intelligence for Robotics' at the Women in Machine Learning & Data Science's Delhi chapter. The talk was attended by 80+ students and professionals.

  • Presented Research paper @ IEEE GlobalSIP conference

    Received the coveted IEEE Signal Processing Travel Grant for presenting my research paper titled 'Joint-sparse dictionary learning: Denoising multiple measurement vectors' at the GlobalSIP 2017 conference in Montreal, Canada.

Connect with me

Seattle, United States of America
