यश भळगट | Yash Bhalgat

I am a third-year PhD candidate in the Visual Geometry Group (VGG), co-advised by Andrea Vedaldi, Andrew Zisserman, João Henriques and Iro Laina, and funded by the EPSRC+AWS fellowship with the AIMS CDT.

Previously, I was a Senior Researcher at Qualcomm AI Research, specializing in Computer Vision and Efficient Deep Learning. I've worked on 3D hand-pose estimation [DIR-Net], low-bit quantization [LSQ+, QKD], and structured [StructConv] and unstructured [LTP] pruning of deep networks.

I received my Master's in Computer Science from the University of Michigan, and my Bachelor's in Electrical Engineering (CS minor) from IIT Bombay.

I am open to collaborations; feel free to set up a call to discuss my research or new ideas!

Email  /  CV  /  Scholar  /  Github  /  LinkedIn  /  X

News

  • [02/2024] NeFeS accepted to CVPR 2024! 🎉
  • [01/2024] SiLVR accepted to ICRA 2024! 🎉
  • [01/2024] We are organizing the 2nd Workshop on Learning 3D with Multi-View Supervision at CVPR 2024.
  • [09/2023] Contrastive-Lift paper accepted to NeurIPS 2023 as a Spotlight presentation! 🚀 Code available: github.
  • [03/2023] Epipolar-guided Transformers paper accepted to CVPR 2023! 🎉
  • [01/2023] HashNeRF-Pytorch hits 900+ stars on Github! ⭐
  • [06/2022] Serving as a Website Chair for BMVC 2022.
  • [10/2021] 3D Hand Pose Estimation work with my intern John Yang has been accepted to WACV 2022.
  • [10/2021] Started my DPhil (PhD) at the University of Oxford in the AIMS CDT, funded by the EPSRC+AWS fellowship. 😊
  • [11/2020] I got promoted to Senior Machine Learning Researcher at Qualcomm AI Research. 😊
  • [09/2020] Structured Convolutions paper accepted to NeurIPS 2020!
  • [03/2020] LSQ+ paper accepted to the Efficient Deep Learning in Computer Vision workshop at CVPR 2020
  • [02/2020] Preprint for our work on "Learned Threshold Pruning" available.
  • [11/2019] 3rd position in NeurIPS 2019 MicroNet competition ImageNet track!! - Leaderboard. Code: here and here.

Research
Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion [Code].
Yash Bhalgat, Iro Laina, João Henriques, Andrew Zisserman, Andrea Vedaldi
NeurIPS, 2023   (Spotlight presentation)

We present a novel "slow-fast" contrastive fusion method to lift 2D predictions to 3D for scalable instance segmentation, achieving significant improvements without requiring an upper bound on the number of objects in the scene.

A Light Touch Approach to Teaching Transformers Multi-view Geometry
Yash Bhalgat, João Henriques, Andrew Zisserman
CVPR, 2023

An "Epipolar-guided training" method to incorporate multi-view geometric priors into Transformer models, which can be implemented in 150 lines of code. At test time, the Transformer implicitly estimates the epipolar geometry given two images and uses it for downstream predictions, e.g. for pose-invariant retrieval.

Dynamic Iterative Refinement for Efficient 3D Hand Pose Estimation
John Yang, Yash Bhalgat, Simyung Chang, Fatih Porikli, Nojun Kwak
WACV, 2022

We propose a tiny deep network in which a subset of layers is applied recursively to refine its previous estimates. During these iterative refinements, we employ learned gating criteria to decide whether to exit the weight-sharing loop, allowing per-sample adaptation in our model. We also predict uncertainty estimates and exploit them in the gating mechanism.

Structured Convolutions for Efficient Neural Network Design
Yash Bhalgat, Yizhe Zhang, Jamie Lin, Fatih Porikli
NeurIPS, 2020

We introduce a neat trick to enable the execution of convolution operations in the form of efficient, scaled, sum-pooling components. We present a Structural Regularization loss that enables this decomposition with negligible performance loss. Our method is competitive with other tensor decomposition and structured pruning methods.

Data-driven Weight Initialization with Sylvester Solvers
Debasmit Das, Yash Bhalgat, Fatih Porikli
Practical Machine Learning for Developing Countries Workshop, ICLR, 2021

We propose a data-driven scheme to initialize the parameters of a neural network. The initialization is cast as an optimization problem, which is restructured into the well-known Sylvester equation that has fast and efficient gradient-free solutions. We show that our proposed method is especially effective in few-shot and fine-tuning settings.

LSQ+: Improving low-bit quantization through learnable offsets and better initialization
Yash Bhalgat, Jinwon Lee, Markus Nagel, Tijmen Blankevoort, Nojun Kwak
Efficient Deep Learning in Computer Vision Workshop, CVPR, 2020

We introduce a general asymmetric quantization scheme with trainable scale and offset parameters. LSQ+ shows SOTA results for EfficientNet and MixNet, outperforming LSQ for low-bit quantization.

Learned Threshold Pruning
Kambiz Azarian, Yash Bhalgat, Jinwon Lee, Tijmen Blankevoort
arxiv, 2020

We propose an end-to-end differentiable method for learning layerwise pruning thresholds, which results in SOTA model compression ratios with AlexNet, ResNet and EfficientNet. Our method also generates a trail of checkpoints with different accuracy-efficiency operating points.

QKD: Quantization-aware Knowledge Distillation for Low-bit Quantization
Yash Bhalgat*, Jangho Kim*, Jinwon Lee, Chirag Patel, Nojun Kwak
arxiv, 2020

Low-bit quantization and KD often don't go well together, but both are important approaches to reducing a model's memory footprint. We propose an effective method to combine the two and show results that outperform all existing quantization/KD approaches.

Teacher-Student Learning Paradigm for Tri-training: An Efficient Method for Unlabeled Data Exploitation
Yash Bhalgat, Zhe Liu, Pritam Gundecha, Jalal Mahmud, Amita Misra
KONVENS, 2019

Teacher-student tri-training is a semi-supervised learning method that uses three classifiers operating with adaptive teacher and student thresholds.

Annotation-cost Minimization for Medical Image Segmentation using Suggestive Mixed Supervision Fully Convolutional Networks
Yash Bhalgat*, Meet Shah*, Suyash Awate
Medical Imaging meets NeurIPS workshop, 2019

For medical image segmentation, we present a budget-based cost-minimization framework in a mixed-supervision setting via dense segmentations, bounding boxes, and landmarks.

CATSEYES: Categorizing Seismic structures with tessellated scattering wavelet networks
Yash Bhalgat, Jean Charlety, Laurent Duval
ICASSP, 2018

We use scattering wavelet transforms to extract sparse feature sets from seismic data. We show that this method, combined with simple PCA-based feature selection, leads to promising classification performance in affordable computation time.

Hall of Fame

  • All India Rank 12 in IITJEE-Mains 2013 exam among 1.5 million students
  • All India Rank 155 in IITJEE-Advanced 2013 exam among 0.2 million students
  • Featured in National Top 30 for the International Astronomy Olympiad, 2013
  • Secured All India Rank 60 and awarded the KVPY Scholarship by Govt. of India
  • Among top 300 in India to compete in the Physics, Chemistry and Mathematics olympiads
  • Awarded the Cargill Global Scholarship 2014-15 and selected for the 10-member Indian cohort to represent India at the global seminar in Minneapolis, USA in 2016
  • Received the Undergraduate Research Award (URA02) for my Bachelor's thesis at IIT Bombay

Music

To get away from my exciting yet stressful life as a computer scientist, I immerse myself in music. I spent my entire childhood learning Tabla, an Indian percussion instrument. I also briefly tried to learn the Piano, Drums and Harmonia, but had to give them up for my higher studies. One day, I naturally started beatboxing (check this out), and I continue that passion to this day. Please check out my music channels, which I try to maintain regularly. :)

