Machine Learning

Meta Learning

Meta-learning is a branch of machine learning that focuses on learning how to learn: a model is trained across a set of tasks so that the knowledge it acquires can be applied to new tasks and domains. One challenge we study is the long-tailed problem in deep learning, which refers to imbalanced class distributions in datasets, where minority classes have only a few samples. This imbalance can hinder model performance and fairness. Our lab researches techniques to address this issue, aiming to improve accuracy for underrepresented classes. We explore data augmentation, adaptive loss functions, and advanced learning algorithms.
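As a rough illustration of what an adaptive loss function for long-tailed data can look like, the sketch below reweights cross-entropy by inverse class frequency. This is one common baseline and not necessarily the method used in our lab; the function names and the class counts are hypothetical and chosen only for the example.

```python
import math


def inverse_frequency_weights(class_counts):
    """Weight each class by total / (num_classes * count), so rare classes weigh more."""
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [total / (num_classes * count) for count in class_counts]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def weighted_cross_entropy(logits, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    probs = softmax(logits)
    return -weights[label] * math.log(probs[label])


# Hypothetical long-tailed dataset: one head class and two tail classes.
class_counts = [900, 90, 10]
weights = inverse_frequency_weights(class_counts)

# The same prediction error costs more on the rarest class than on the head class.
head_loss = weighted_cross_entropy([0.0, 0.0, 0.0], 0, weights)
tail_loss = weighted_cross_entropy([0.0, 0.0, 0.0], 2, weights)
```

With these weights, a mistake on a tail-class sample contributes more to the total loss than the same mistake on a head-class sample, which pushes training toward the underrepresented classes.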


Vision-Language Model

A Vision-Language Model is an AI model that jointly processes images and written text. It can "see" and "read" simultaneously, analyzing images to recognize objects and scenes while understanding and generating natural language descriptions or answers based on the visual input. These models are trained on paired image and text data, learning the relationships between visuals and their corresponding textual descriptions. Vision-Language Models have applications in image captioning, visual question answering, and assistive tools for individuals with visual impairments, enabling computers to better comprehend and communicate about the visual world.
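One common way to relate images and text is to embed both into a shared vector space and compare them by cosine similarity. The sketch below assumes that setup; the embeddings are hand-written toy vectors rather than the output of a real encoder, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_caption(image_embedding, caption_embeddings):
    """Return the caption whose embedding is most similar to the image embedding."""
    scores = {
        caption: cosine_similarity(image_embedding, embedding)
        for caption, embedding in caption_embeddings.items()
    }
    return max(scores, key=scores.get), scores


# Toy embeddings (assumed values); a trained model would produce these from pixels and tokens.
image_embedding = [0.9, 0.1, 0.2]
caption_embeddings = {
    "a dog on grass": [0.8, 0.2, 0.1],
    "a red car": [0.1, 0.9, 0.3],
    "a bowl of soup": [0.2, 0.1, 0.9],
}

caption, scores = best_caption(image_embedding, caption_embeddings)
```

Training on paired data would aim to make matching image and caption embeddings score high and mismatched pairs score low; the retrieval step shown here is what that training makes possible.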

Medical Image Processing

This project centers on new research in medical and biological image analysis, with a particular focus on applying computer vision, virtual reality, and robotics to biomedical imaging problems. The study team is considering methods that make use of biomedical image datasets at all spatial scales, from molecular and cellular imaging to tissue and organ imaging. The typical biomedical image datasets of interest include, but are not restricted to, those obtained from magnetic resonance, ultrasound, computed tomography, nuclear medicine, X-ray, optical and confocal microscopy, video, and range data images. We concentrate on 3D medical detection and segmentation, as well as classification, detection, segmentation, instance segmentation, and panoptic segmentation of various biomarkers in medical images.

3D Generative Models

A neural field is a continuous neural implicit representation that represents scenes or objects, fully or partially, with a neural network. For each position in 3D space, the neural network maps its related features (e.g., the coordinate) to attributes (e.g., an RGB value). Because the representation is continuous, neural fields can represent 3D scenes or objects at arbitrary resolution and with unknown or complex topology. In addition, compared to 3D generative models built on VAE or GAN representations, only the parameters of the neural network need to be stored, resulting in lower memory consumption.
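The mapping from a 3D coordinate to an RGB value can be sketched as a very small multilayer perceptron. The example below is a minimal, untrained illustration in plain Python; the class name `TinyNeuralField`, the hidden size, and the random initialization are assumptions made for the example and do not describe a specific model.

```python
import math
import random


class TinyNeuralField:
    """Untrained MLP mapping a 3D coordinate (x, y, z) to an RGB value in [0, 1]."""

    def __init__(self, hidden=16, seed=0):
        rng = random.Random(seed)
        # Layer 1: 3 inputs -> hidden units; layer 2: hidden units -> 3 color channels.
        self.w1 = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [[rng.gauss(0, 1) for _ in range(hidden)] for _ in range(3)]
        self.b2 = [0.0] * 3

    def __call__(self, x, y, z):
        point = (x, y, z)
        hidden = [
            math.tanh(sum(w * v for w, v in zip(row, point)) + b)
            for row, b in zip(self.w1, self.b1)
        ]
        out = [
            sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(self.w2, self.b2)
        ]
        # A sigmoid squashes each channel into the [0, 1] color range.
        return tuple(1.0 / (1.0 + math.exp(-o)) for o in out)

    def num_parameters(self):
        """Count every stored weight and bias."""
        return (
            sum(len(row) for row in self.w1) + len(self.b1)
            + sum(len(row) for row in self.w2) + len(self.b2)
        )


field = TinyNeuralField()
# The field can be queried at any continuous position, not only on a fixed grid.
rgb = field(0.25, -0.5, 0.125)
param_count = field.num_parameters()
```

The scene is stored entirely in the network's weights and biases, so the storage cost is fixed by the network size rather than by the resolution at which the field is sampled.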

Virtual Try-on

Virtual try-on is a specialized research area that builds on Generative Adversarial Networks and Diffusion Models. Solving the virtual try-on problem also requires work on several subproblems, such as image segmentation and pose estimation.