I am a third-year AcSIR Integrated Ph.D. student at CSIR-CEERI, Pilani, India. My research explores learning-based image reconstruction methods for computational imaging, under the supervision of Dr. Sanjay Singh.
Before joining CSIR-CEERI, I completed my bachelor's degree in Electronics and Communication Engineering from the IRPE, University of Calcutta, Kolkata, India. Beyond my scholarly endeavors, I find great fascination in the fields of philosophy and photography.
My research interests lie at the intersection of computational imaging, computer vision, and machine learning. Presently, my work focuses on computational image formation and reconstruction in the low-data regime. My doctoral advisory committee has approved my Ph.D. research on deep learning techniques for imaging inverse problems.
We address the challenge of gaze vector prediction in extremely low-light conditions by leveraging a novel temporal event-encoding scheme and a dedicated neural network architecture. The encoded temporal frames, paired with our network, yield strong spatial localization and reliable gaze direction predictions.
Our paper shows that it is possible to perform domain-restricted low-shot reconstruction of lensless computational images. We demonstrate an improvement in the reconstruction performance of untrained neural networks by providing very few domain-restricted images for pre-training.
We compare the reconstruction performance with untrained over-parameterized networks having over 30 million parameters. We employ a physics-informed consistency loss function to optimize our network, leading to faster convergence and better reconstruction quality.
This paper introduces a novel event-based dataset named “DVSFall”, incorporating diverse activities of daily living (ADL) and simulated falls. Captured from multiple viewpoints using DVS cameras, the dataset encompasses twenty-one participants across varying age groups. We explored a hybrid 3D-CNN and SNN (NeuCube) approach, and our proposed framework achieved an accuracy of 97.84%.
We proposed specialized GANs for parallel colorization of foreground and background, alongside a dense adaptive fusion network for feature-based integration of intermediate results. Our approach outperformed the state of the art while using only a fraction of the training data. We also identified shortcomings in conventional non-perceptual evaluation metrics and presented a novel human-visual realism test to assess results more effectively.
We used a physics-guided consistency loss function to optimize our model to perform reconstruction and PSF estimation. Our model generates accurate non-blind reconstructions with a PSNR of 24.55 dB.
We gathered a diverse spoof detection dataset, CSDiNE, with varying lighting and backgrounds. Then, we proposed JS-SpoofNet, a jointly supervised network that leverages video sequence cues for effective spoof detection. JS-SpoofNet utilizes a parallel branched architecture with a spoof classification network and an auxiliary network for depth estimation, achieving outstanding performance. Our approach achieved a low average classification error rate (ACER) of 0.94% on the CSDiNE dataset and surpassed existing video-based spoof detection methods on established benchmark datasets.
Gaze Detection using Encoded Retinomorphic Events. Abeer Banerjee, Shyam S Prasad, Naval K Mehta, Himanshu Kumar, Sumeet Saurav, and Sanjay Singh. International Conference on Intelligent Human Computer Interaction (IHCI), 2022 (Oral).
Our framework uses a novel ensembling technique to boost accuracy while drastically reducing the total parameter count, paving the way for real-time implementation. We perform an extensive hyperparameter search using a power-line defect detection dataset and obtain an accuracy of 92.30% on the 5-way 5-shot task.
In this paper, we address privacy-preserving fall detection, a subtask of human action recognition, using the Dynamic Vision Sensor (DVS). We demonstrate the effectiveness of this approach by performing real-time fall detection with a 3D-Convolutional Neural Network (3D-CNN). Our proposed methods achieved an average sensitivity of 99.34% and an average specificity of 100%.
Voluntary Services
I actively participate as a science communicator and technology demonstrator in CSIR outreach programs that promote science and technology in society. CSIR-CEERI regularly organizes these through events such as "Jigyasa", "One Week One Lab (OWOL)", "National Science Day", and "National Technology Day".
I review research papers for SCI journals such as Neurocomputing and Computers and Electrical Engineering, and for ranked conferences such as ICVGIP 2022 and 2023.