I am a PhD student working on learning-based reconstruction methods for computational imaging, under the joint supervision of Dr. Sanjay Singh and Prof. Vasilis Ntziachristos.
My present affiliations are with AcSIR, CSIR-CEERI, Helmholtz Munich, and TUM. I completed my bachelor's in Electronics and Communication Engineering (2016-2020 batch) at IRPE, University of Calcutta. I was born and raised in Durgapur, West Bengal. Outside of research, I am fascinated by philosophy and photography.
My research interests lie at the intersection of computational imaging, computer vision, and machine learning. Presently, my work focuses on computational image formation and reconstruction in the low-data regime. My doctoral advisory committee has approved my Ph.D. research on deep learning techniques for imaging inverse problems.
We address the challenge of gaze vector prediction in extremely low-light conditions by leveraging a novel temporal event-encoding scheme and a dedicated neural network architecture. The encoded temporal frames, paired with our network, deliver strong spatial localization and reliable gaze-direction predictions.
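As a rough illustration of temporal event encoding (a generic sketch with illustrative names, not the exact scheme from the paper), asynchronous DVS events can be binned into a small stack of temporal frames by accumulating event polarities per pixel:

import numpy as np

def encode_events(events, height, width, num_bins=5):
    # events: (N, 4) array with columns (x, y, t, polarity in {-1, +1})
    frames = np.zeros((num_bins, height, width), dtype=np.float32)
    t0, t1 = events[:, 2].min(), events[:, 2].max()
    # Assign each event to a temporal bin based on its timestamp.
    bin_idx = ((events[:, 2] - t0) / (t1 - t0 + 1e-9) * num_bins).astype(int)
    bin_idx = np.clip(bin_idx, 0, num_bins - 1)
    # Accumulate event polarities per pixel within each temporal bin.
    for b, x, y, p in zip(bin_idx, events[:, 0].astype(int), events[:, 1].astype(int), events[:, 3]):
        frames[b, y, x] += p
    return frames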
Our paper shows that domain-restricted low-shot reconstruction of lensless computational images is possible. We demonstrate improved reconstruction performance of untrained neural networks by pre-training on very few domain-restricted images.
We compare the reconstruction performance with untrained over-parameterized networks having over 30 million parameters. We employ a physics-informed consistency loss function to optimize our network, leading to faster convergence and better reconstruction quality.
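As a rough sketch of the physics-informed consistency idea (illustrative PyTorch code under a convolutional forward-model assumption, not our exact implementation), the loss penalizes the mismatch between the captured measurement and the measurement simulated from the current scene estimate:

import torch
import torch.nn.functional as F

def consistency_loss(pred_scene, psf, measurement):
    # pred_scene:  (1, 1, H, W) scene estimate produced by the untrained network
    # psf:         (1, 1, kH, kW) point spread function of the lensless system
    # measurement: (1, 1, H, W) captured sensor image
    simulated = F.conv2d(pred_scene, psf, padding="same")  # forward model: scene convolved with the PSF
    return F.mse_loss(simulated, measurement)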
This paper introduces a novel event-based dataset named “DVSFall”, incorporating diverse activities of daily living (ADL) and simulated falls. Captured from multiple viewpoints using DVS cameras, the dataset encompasses twenty-one participants across varying age groups. We explored a hybrid 3D-CNN and SNN (NeuCube) approach, and our proposed framework achieved an accuracy of 97.84%.
We proposed specialized GANs for parallel colorization of foreground and background, alongside a dense adaptive fusion network for feature-based integration of intermediate results. Our approach achieved superior performance compared to the state of the art while using only a fraction of the training data. We also identified shortcomings in conventional non-perceptual evaluation metrics and presented a novel human-visual realism test to assess results more effectively.
We used a physics-guided consistency loss function to optimize our model for reconstruction and PSF estimation. Our model generates accurate non-blind reconstructions with a PSNR of 24.55 dB.
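A minimal sketch of the underlying idea, assuming a convolutional forward model and plain gradient descent on a pixel-grid scene and PSF (the network-based formulation in the paper is more involved; all names here are illustrative):

import torch
import torch.nn.functional as F

scene = torch.rand(1, 1, 256, 256, requires_grad=True)   # scene (reconstruction) estimate
psf = torch.rand(1, 1, 31, 31, requires_grad=True)        # PSF estimate
measurement = torch.rand(1, 1, 256, 256)                  # captured lensless measurement
optimizer = torch.optim.Adam([scene, psf], lr=1e-2)

for _ in range(500):
    optimizer.zero_grad()
    kernel = psf.abs() / psf.abs().sum()                   # keep the PSF non-negative and normalized
    simulated = F.conv2d(scene, kernel, padding="same")    # physics-guided forward model
    loss = F.mse_loss(simulated, measurement)              # consistency with the measurement
    loss.backward()
    optimizer.step()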
We gathered a diverse spoof detection dataset, CSDiNE, with varying lighting and backgrounds. Then, we proposed JS-SpoofNet, a jointly supervised network that leverages video sequence cues for effective spoof detection. JS-SpoofNet utilizes a parallel branched architecture with a spoof classification network and an auxiliary network for depth estimation, achieving outstanding performance. Our approach achieved a low average classification error rate (ACER) of 0.94% on the CSDiNE dataset and surpassed existing video-based spoof detection methods on established benchmark datasets.
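To illustrate the joint supervision idea (a generic sketch with illustrative layer sizes and names, not the actual JS-SpoofNet architecture), a shared backbone can feed a spoof-classification head and an auxiliary depth-estimation head, trained with a weighted sum of both losses:

import torch.nn as nn
import torch.nn.functional as F

class JointlySupervisedNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                      nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.cls_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 2))
        self.depth_head = nn.Conv2d(32, 1, 1)  # auxiliary per-pixel depth estimate

    def forward(self, x):
        feats = self.backbone(x)
        return self.cls_head(feats), self.depth_head(feats)

def joint_loss(cls_logits, labels, depth_pred, depth_gt, aux_weight=0.5):
    # Spoof-classification loss plus a weighted auxiliary depth-supervision term.
    return F.cross_entropy(cls_logits, labels) + aux_weight * F.mse_loss(depth_pred, depth_gt)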
Gaze Detection using Encoded Retinomorphic Events
Abeer Banerjee, Shyam S Prasad, Naval K Mehta, Himanshu Kumar, Sumeet Saurav, and Sanjay Singh
International Conference on Intelligent Human Computer Interaction (IHCI), 2022 (Oral)
Paper
Our framework uses a novel ensembling technique for boosting the accuracy while drastically decreasing the total parameter count, thus paving the way for real-time implementation. We perform an extensive hyperparameter search using a power-line defect detection dataset and obtain an accuracy of 92.30% for the 5-way 5-shot task.
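As a generic illustration of ensembling lightweight models (the specific ensembling technique from the paper is not reproduced here), predictions from several small classifiers can be combined by averaging their class probabilities:

import torch

def ensemble_predict(models, x):
    # Average the softmax outputs of several lightweight classifiers for a batch x.
    probs = [torch.softmax(m(x), dim=-1) for m in models]
    return torch.stack(probs, dim=0).mean(dim=0)  # (batch_size, num_classes)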
In this paper, we attempt to solve the problem of privacy-preserving fall detection, a subtask of human action recognition, using the Dynamic Vision Sensor (DVS). We demonstrate the effectiveness of this approach by performing real-time fall detection with a 3D-Convolutional Neural Network (3D-CNN). Our proposed methods achieved an average sensitivity and specificity of 99.34% and 100%, respectively.
Voluntary Services
Session Chair for the virtual session "Neural Networks for Image Processing 17" at the IEEE International Joint Conference on Neural Networks (IJCNN), organized as part of the IEEE World Congress on Computational Intelligence (WCCI) 2024.
Reviewer
SCI Journals: Neurocomputing, The Visual Computer, Computers and Electrical Engineering, Expert Systems with Applications.
Ranked conferences: ICVGIP 2022 and 2023.
Science communicator and technology demonstrator in CSIR outreach programs that promote science and technology in society. These programs are regularly organized by CSIR-CEERI through events such as "Jigyasa", "One Week One Lab (OWOL)", "National Science Day", and "National Technology Day".
The source code of this website was borrowed from here.