| 
            
              
                | 
                    यश भळगट | Yash Bhalgat
                   
                    Final year PhD candidate in the Visual Geometry Group (VGG), Oxford. Co-advised by Andrew
                      Zisserman, Andrea Vedaldi, João Henriques and Iro Laina. Funded by the EPSRC+AWS fellowship with AIMS CDT.
                    In parallel, I also work as an AI consultant for an on-device AI startup and an LLM content
                      moderation company.
                    Previously, I was a Senior Researcher at Qualcomm AI Research. I have also been fortunate to spend time at Voxel51, IBM Research
                    (Bangalore and Almaden Lab), IFPEN (Paris), and TCS Research.
                    
                    Feel free to set up a call if you want to discuss ideas around startups, AI (CV / ML / LLMs), or want to collaborate.
                   
                    Email  / 
                    CV  / 
                    Scholar  / 
                    Github  / 
                    LinkedIn  / 
                    X
                   |   |  
            
              
                | News 
                   
                    [09/2025] Jamais Vu - Our paper on Semantic Correspondence Learning accepted to NeurIPS 2025!
                    [06/2025] Organized a successful 1st Workshop on 3D-LLM/VLA at CVPR 2025.
                    [01/2025] GSLoc accepted to ICLR 2025!
                    [11/2024] Proud mentor moment 😎: Reflecting Reality accepted to 3DV 2025!
                    [09/2024] Our 3D-Aware Egocentric Tracking paper accepted to ACCV 2024! 🎉
                    [07/2024] N2F2 accepted to ECCV 2024! 🎉
                    [02/2024] NeFeS accepted to CVPR 2024! 🎉
                    [01/2024] SiLVR accepted to ICRA 2024! 🎉
                    [01/2024] We are organizing the 2nd Workshop on Learning 3D with Multi-View Supervision at CVPR 2024.
                    [09/2023] Contrastive-Lift paper accepted to NeurIPS 2023 as a Spotlight presentation! 🚀 Code available: github.
                    [03/2023] Epipolar-guided Transformers paper accepted to CVPR 2023! 🎉
                    [01/2023] HashNeRF-Pytorch hits 900+ stars on Github! ⭐
                    [06/2022] Serving as a Website Chair for BMVC 2022.
                    [10/2021] 3D Hand Pose Estimation work with my intern John Yang has been accepted to WACV 2022.
                    [10/2021] Started my DPhil (PhD) at University of Oxford in AIMS CDT, funded by EPSRC+AWS fellowship. 😊
                    [11/2020] I got promoted to Senior Machine Learning Researcher at Qualcomm AI Research. 😊
                    [09/2020] Structured Convolutions paper accepted to NeurIPS 2020!
                    [03/2020] LSQ+ paper accepted to the Efficient Deep Learning in Computer Vision workshop at CVPR 2020
                    [02/2020] Preprint for our work on "Learned Threshold Pruning" available.
                    [11/2019] 3rd position in NeurIPS 2019 MicroNet competition ImageNet track! - Leaderboard. Code: here and here.
                     |  
            
              
                |  | 3D-Aware Instance Segmentation and Tracking in Egocentric Videos Yash Bhalgat*,
                  Vadim Tschernezki*,
                  Iro Laina,
                  João Henriques,
                  Andrea Vedaldi,
                  Andrew Zisserman
 ACCV, 2024
 
 We propose a 3D-aware method for object tracking in long egocentric videos, leveraging scene geometry
                    to handle rapid motion and occlusions. Our approach improves tracking accuracy, reduces ID switches by up to 80%,
                    and enables applications like 3D object reconstruction and amodal segmentation. |  
                |  | N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields Yash Bhalgat,
                  Iro Laina,
                  João Henriques,
                  Andrew Zisserman,
                  Andrea Vedaldi
 ECCV, 2024
 
 We present Nested Neural Feature Fields (N2F2), a hierarchical approach to 3D scene understanding that encodes 
                    multi-scale properties in a unified feature field. Our method outperforms state-of-the-art approaches like LERF 
                    and LangSplat on open-vocabulary 3D tasks, especially for complex queries, while enabling faster inference. |  
                |  | Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion
                    
                   [Code] Yash Bhalgat,
                  Iro Laina,
                  João Henriques,
                  Andrew Zisserman,
                  Andrea Vedaldi
 NeurIPS, 2023   (Spotlight presentation)
 
 We present a novel "slow-fast" contrastive fusion method to lift 2D predictions to 3D for scalable
                    instance segmentation, achieving significant improvements without requiring an upper bound on the
                    number of objects in the scene. |  
                |  | A Light Touch Approach to Teaching Transformers Multi-view Geometry Yash Bhalgat,
                  João Henriques,
                  Andrew Zisserman
 CVPR, 2023
 
 An "Epipolar-guided training" method to incorporate multi-view geometric priors into Transformer
                    models, which can be implemented in 150 lines of code.
                    
                    At test time, the Transformer implicitly estimates the epipolar geometry given 2 images and uses
                    it for downstream predictions, e.g. for pose-invariant retrieval. |  
                |  | Dynamic Iterative Refinement for Efficient 3D Hand Pose Estimation John Yang,
                  Yash Bhalgat,
                  Simyung Chang,
                  Fatih Porikli,
                  Nojun Kwak
 WACV, 2022
 
 We propose a tiny deep network in which a subset of layers is applied recursively to refine its
                    previous estimates. During iterative refinement, learned gating criteria decide
                    when to exit the weight-sharing loop, allowing per-sample adaptation in our model. We also
                    predict uncertainty estimates and exploit them in the gating mechanism. |  
                |  | Structured Convolutions for Efficient Neural Network Design Yash Bhalgat,
                  Yizhe Zhang,
                  Jamie Lin,
                  Fatih Porikli
 NeurIPS, 2020
 
 We introduce a neat trick to enable the execution of convolution operations in the form of
                    efficient, scaled, sum-pooling components. We present a Structural Regularization loss that enables
                    this decomposition with negligible performance loss. Our method is competitive with other tensor
                    decomposition and structured pruning methods. |  
                |  | Data-driven Weight Initialization with Sylvester Solvers Debasmit Das,
                  Yash Bhalgat,
                  Fatih Porikli
 Practical Machine Learning for Developing Countries Workshop, ICLR, 2021
 
 We propose a data-driven scheme to initialize the parameters of a neural network. The
                    initialization is cast as an optimization problem, which is restructured into the well-known
                    Sylvester equation that has fast and efficient gradient-free solutions. We show that our proposed
                    method is especially effective in few-shot and fine-tuning settings. |  
                |  | LSQ+: Improving low-bit quantization through learnable offsets and better initialization Yash Bhalgat,
                  Jinwon Lee,
                  Markus Nagel,
                  Tijmen Blankevoort,
                  Nojun Kwak
 Efficient Deep Learning in Computer Vision Workshop, CVPR, 2020
 
 We introduce a general asymmetric quantization scheme with trainable scale and offset parameters.
                    LSQ+ shows SOTA results for EfficientNet and MixNet, outperforming LSQ for low-bit quantization. |  
                |  | Learned Threshold Pruning Kambiz Azarian,
                  Yash Bhalgat,
                  Jinwon Lee,
                  Tijmen Blankevoort
 arXiv, 2020
 
 We propose an end-to-end differentiable method for learning layerwise pruning thresholds, which
                    results in SOTA model compression ratios with AlexNet, ResNet and EfficientNet. Our method also
                    generates a trail of checkpoints with different accuracy-efficiency operating points. |  
                |  | QKD: Quantization-aware Knowledge Distillation for Low-bit Quantization Yash Bhalgat*,
                  Jangho Kim*,
                  Jinwon Lee,
                  Chirag Patel,
                  Nojun Kwak
 arXiv, 2020
 
 Low-bit quantization and KD often don't go well together, but both are important approaches to
                    reducing a model's memory footprint. We propose an effective method to combine the two and
                    show results that outperform all existing quantization/KD approaches. |  
                |  | Teacher-Student Learning Paradigm for Tri-training: An Efficient
                      Method for Unlabeled Data Exploitation Yash Bhalgat,
                  Zhe Liu,
                  Pritam Gundecha,
                  Jalal Mahmud,
                  Amita Misra
 KONVENS, 2019
 
 Teacher-student tri-training is a semi-supervised learning method that uses three classifiers
                    with adaptive teacher and student thresholds. |  
                |  | Annotation-cost Minimization for Medical Image Segmentation using Suggestive Mixed
                      Supervision Fully Convolutional Networks Yash Bhalgat*,
                  Meet Shah*,
                  Suyash Awate
 Medical Imaging meets NeurIPS workshop, 2019
 
 For medical image segmentation, we present a budget-based cost-minimization framework in a
                    mixed-supervision setting that combines dense segmentations, bounding boxes, and landmarks. |  
                |  | CATSEYES: Categorizing Seismic structures with tessellated scattering wavelet networks Yash Bhalgat,
                  Jean Charlety,
                  Laurent Duval
 ICASSP, 2018
 
 We use scattering wavelet transforms to extract sparse feature sets from seismic data. We show
                    that this method, combined with simple PCA-based feature selection, leads to promising
                    classification performance at an affordable computational cost. |  
            
              
                | Invited Talks 
                   
                    "From Video to Virtual: Object-centric 3D scene understanding from videos" 
                      [Slides]
                      
                    
 "Deep Learning for Computer Vision and applications to the Ghanaian context" [Youtube]
                      
                        @ SPARK Deep Learning workshop at IndabaX Ghana |  
            
              
                | Hall of Fame 
                   
                    All India Rank 12 in IITJEE-Mains 2013 exam among 1.5 million students
                    All India Rank 155 in IITJEE-Advanced 2013 exam among 0.2 million students
                    Featured in National Top 30 for the International Astronomy Olympiad, 2013
                    All India Rank 60 and awarded the KVPY Scholarship by Govt. of India
                    Among top 300 in India to compete in the Physics, Chemistry and Mathematics olympiads
                    Cargill Global Scholarship 2014-15 and selected in the 10-member Indian cohort to represent at the global seminar in Minneapolis, USA in 2016
                    Undergraduate Research Award (URA02) for Bachelors thesis at IIT Bombay |  
            
              
                | Music I play the Tabla, an Indian percussion
                    instrument, and have a Visharad (≈ Bachelor of Music) in Indian Classical music. I've
                    also briefly tried to learn the Piano and Harmonica. I am a natural beatboxer (check this out) and I sometimes post music videos here:
                   |   |  
            
              
                | 
 
                    Website template borrowed from here.
                   |  |