๐Ÿ“ New paper on multimodal representation learning accepted to TMLR and on generative models for accessibility at CHI 2026

Mar 9, 2026 · 1 min read

We demonstrate that structural supervision significantly enhances representation learning. Our approach exploits the inherent redundancy of multimodal and multiview data through joint masked reconstruction and cross-view alignment, and integrates auxiliary signals to improve performance. Check out our preprint: Structure is Supervision: Multiview Masked Autoencoders for Radiology.

Additionally, we will present our work on adapting generative models for accessibility, done in collaboration with UNICEF, at CHI 2026. Check out the preprint: Steering Generative Models for Accessibility: EasyRead Image Generation.