Shyamgopal Karthik

I am a PhD candidate at the Explainable Machine Learning lab of the University of Tuebingen, led by Prof. Zeynep Akata. I am always open to exploring new problems in computer vision and machine learning. In the past, I have worked on self-supervised learning, visual tracking, using class hierarchies to improve classification, audio-visual saliency, and active learning. However, I am a big fan of Occam's Razor and of research that provides simple, clear explanations for complex phenomena. This talk accurately summarizes the kind of research I enjoy the most.

Bio

I completed my Master's and Bachelor's degrees at the International Institute of Information Technology, Hyderabad, in 2021, where I worked at the Center for Visual Information Technology with Prof. Vineet Gandhi. During my Master's, I spent a wonderful six months remotely interning at NAVER LABS Europe, where I worked with Boris Chidlovskii and Jerome Revaud on learning from long-tailed and noisy data.

Updates

  • 19 October 2022. I was recognized as an outstanding reviewer at ECCV 2022.
  • 3 July 2022. Our work on post-hoc uncertainty estimation was accepted to ECCV 2022.
  • 21 May 2022. I was recognized as an outstanding reviewer at CVPR 2022.
  • 3 March 2022. Our work on Compositional Zero-Shot Learning was accepted at CVPR 2022.
  • 24 November 2021. I was recognized as an outstanding reviewer at BMVC 2021.
  • 1 October 2021. Started my PhD at the Explainable Machine Learning group.
  • 11 July 2021. Concluded my internship at NAVER LABS Europe.
  • 7 April 2021. Successfully defended my Master's thesis.

Publications

2022

BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks
Uddeshya Upadhyay*, Shyamgopal Karthik*, Yanbei Chen, Massimiliano Mancini, and Zeynep Akata
ECCV 2022, Tel-Aviv, Israel.
paper code bibtex

We provide "free" uncertainty estimates to your favourite image-translation model. The key idea is that state-of-the-art image translation models are deterministic, whereas probabilistic models are much harder to train. Our idea is to train a probabilistic model in a post-hoc fashion which provides calibrated uncertainty estimates that can be used in downstream tasks like detecting out-of-distribution samples in safety-critical scenarios like depth estimation for autonomous driving.

KG-SP: Knowledge Guided Simple Primitives for Open World Compositional Zero-Shot Learning
Shyamgopal Karthik, Massimiliano Mancini and Zeynep Akata
CVPR 2022, New Orleans, USA.
paper code bibtex

In this work, we looked at the problem of Compositional Zero-Shot Learning, where the goal is to predict (attribute, object) labels for an image and to generalize to unseen (attribute, object) pairs. Recent methods have tried to model attributes and objects jointly using a variety of ideas. Here, we show that predicting attributes and objects independently works quite well for this task. Additionally, we show how a knowledge base can be incorporated to improve the performance of the model. Finally, we introduce a new partially labeled setting and show how our model can be trained in the absence of compositional labels.
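
A minimal sketch of the independent-prediction idea (illustrative names, not the exact KG-SP implementation; the feasibility mask stands in for the knowledge-base priors):

```python
import torch
import torch.nn as nn

class IndependentPrimitives(nn.Module):
    """Score attributes and objects with separate heads, then combine."""
    def __init__(self, feat_dim, n_attrs, n_objs):
        super().__init__()
        self.attr_head = nn.Linear(feat_dim, n_attrs)
        self.obj_head = nn.Linear(feat_dim, n_objs)

    def forward(self, features, feasibility_mask):
        # features: (B, feat_dim); feasibility_mask: (n_attrs, n_objs) in {0, 1}
        p_attr = self.attr_head(features).softmax(dim=-1)   # (B, n_attrs)
        p_obj = self.obj_head(features).softmax(dim=-1)     # (B, n_objs)
        # Joint score under an independence assumption, with infeasible
        # (attribute, object) pairs removed by the mask.
        joint = p_attr.unsqueeze(2) * p_obj.unsqueeze(1)    # (B, n_attrs, n_objs)
        return joint * feasibility_mask
```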

2021

Bringing Generalization to Deep Multi-View Detection
Jeet Vora, Swetanjal Dutta, Kanishk Jain, Shyamgopal Karthik and Vineet Gandhi
arXiv preprint.
paper code bibtex

Here, we tackle the problem of multi-view detection, where we want to generate a bird's-eye-view map of a scene given multiple camera views. The biggest challenge is that capturing training data is quite difficult, since it requires calibrated and synchronized cameras. We sidestep this problem by generating large-scale training data using the GTA-V and Unity graphics engines. We also propose some simple modifications to existing models and show that training on our synthetic dataset generalizes quite well to real data.

Learning from Long-Tailed Data with Noisy Labels
Shyamgopal Karthik, Jerome Revaud and Boris Chidlovskii
ICCV 2021 Workshop on Self-supervised Learning for Next-Generation Industry-level Autonomous Driving, Virtual.
paper bibtex

This work brings together self-supervised learning, long-tailed learning, and learning with noisy labels. An important observation was that long-tailed methods break down when trained with noisy labels, and vice versa. We found that self-supervised pre-training, followed by fine-tuning with a robust loss function that handles both imbalance and label noise, works exceptionally well across several datasets.
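
As a rough illustration of the fine-tuning stage, here is one example of a robust classification loss (symmetric cross-entropy); this is not necessarily the exact loss used in the paper, and the backbone is assumed to be initialized from self-supervised pre-training:

```python
import torch.nn.functional as F

def symmetric_cross_entropy(logits, targets, alpha=0.1, beta=1.0, log_clip=-4.0):
    """Example robust loss: standard CE plus a bounded reverse-CE term
    that is more tolerant of noisy labels."""
    ce = F.cross_entropy(logits, targets)
    probs = logits.softmax(dim=-1)
    one_hot = F.one_hot(targets, num_classes=logits.size(-1)).float()
    # log(0) is clipped to `log_clip`, as in symmetric cross-entropy.
    rce = -(probs * one_hot.clamp(min=1e-7).log().clamp(min=log_clip)).sum(-1).mean()
    return alpha * ce + beta * rce
```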

ViNet: Pushing the limits of Visual Modality for Audio-Visual Saliency Prediction
Samyak Jain, Pradeep Yarlagadda, Shreyank Jyoti, Shyamgopal Karthik, Ramanathan Subramanian and Vineet Gandhi
IROS 2021, Virtual.
paper code bibtex

We started with a 3D-CNN architecture that achieves state-of-the-art performance on various video saliency benchmarks. We then incorporated audio into our model and achieved excellent performance on audio-visual saliency benchmarks as well. However, a curious observation was that the output remains unchanged even if we provide zeros for the audio, indicating that audio is being ignored entirely in various state-of-the-art models.
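
The sanity check itself is simple; a sketch (with a hypothetical `model(video_clip, audio_features)` interface) looks like this:

```python
import torch

def audio_is_ignored(model, video_clip, audio_features, tol=1e-5):
    """Returns True if zeroing out the audio leaves the predicted
    saliency map (numerically) unchanged."""
    model.eval()
    with torch.no_grad():
        saliency_av = model(video_clip, audio_features)
        saliency_silent = model(video_clip, torch.zeros_like(audio_features))
    return (saliency_av - saliency_silent).abs().max().item() < tol
```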

No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks
Shyamgopal Karthik, Ameya Prabhu, Puneet Dokania and Vineet Gandhi
ICLR 2021, Virtual.
paper code bibtex

The main motivation behind this work was to see if we could reduce the severity of mistakes in a classification setting. To do this, we make use of label hierarchies that are readily available through taxonomies like WordNet. We show that a simple algorithm from Duda and Hart's 1973 Pattern Recognition textbook can be effectively applied in a post-hoc manner while retaining the calibration of the base model.
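
The post-hoc rule is Conditional Risk Minimization: keep the base classifier's posteriors and simply pick the class with the lowest expected hierarchical cost. A sketch (variable names are illustrative):

```python
import numpy as np

def crm_predict(posteriors, cost_matrix):
    """posteriors:  (B, K) softmax outputs of the unchanged base classifier.
    cost_matrix: (K, K) where cost_matrix[k, j] is the severity of
    predicting class k when the true class is j (e.g., the height of
    their lowest common ancestor in the WordNet hierarchy)."""
    expected_cost = posteriors @ cost_matrix.T   # (B, K) risk of each prediction
    return expected_cost.argmin(axis=1)
```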

Leveraging Structural Cues for Better Training and Deployment in Computer Vision
Shyamgopal Karthik
Master's Thesis, IIIT Hyderabad
thesis bibtex

My Master's thesis, the culmination of nearly three years of work!

2020

Simple Unsupervised Multi-Object Tracking
Shyamgopal Karthik, Ameya Prabhu and Vineet Gandhi
arXiv preprint.
paper bibtex

This work focuses on the re-identification (ReID) features that are commonly used in multi-object tracking algorithms. In many trackers, this is the only component that requires video-level supervision. We propose a method to train a ReID model in a self-supervised fashion, using pseudo-labels generated by a Kalman-filter-based tracker. The resulting ReID model can be used as a drop-in replacement for the supervised ReID models used in trackers. Alternatively, using these ReID features as a post-processing step in trackers that don't use a ReID model reduces the number of ID switches by 67%.
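
A rough sketch of the training loop (hypothetical interfaces: `reid_net.embed_dim`, the pre-computed detection crops, and the track IDs produced by a motion-only Kalman/IoU tracker are all placeholders):

```python
import torch
import torch.nn as nn

def train_reid_from_pseudo_labels(reid_net, crop_batches, track_id_batches,
                                  n_tracks, epochs=10, lr=3e-4):
    """Treat the track IDs assigned by a motion-only tracker as free
    identity labels and train the ReID embedding with a classifier head."""
    classifier = nn.Linear(reid_net.embed_dim, n_tracks)  # one class per pseudo-identity
    params = list(reid_net.parameters()) + list(classifier.parameters())
    optimizer = torch.optim.Adam(params, lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for crops, ids in zip(crop_batches, track_id_batches):
            embeddings = reid_net(crops)                   # (B, embed_dim)
            loss = criterion(classifier(embeddings), ids)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return reid_net   # drop-in replacement for a supervised ReID model
```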

Exploring 3 R's of Long-term Tracking: Re-detection, Recovery and Reliability
Shyamgopal Karthik, Abhinav Moudgil and Vineet Gandhi
IEEE WACV 2020.
paper bibtex

My first work, which introduced me to the field of computer vision! We found some interesting quirks and shortcomings of existing state-of-the-art tracking algorithms when they are evaluated on long-term visual object tracking datasets, and we did our best to propose novel experiments and evaluation metrics that highlight these quirks.


Website style cloned from this wonderful website.
Last update: July 2022