Mustafa Shukor
Founding Research Scientist (UMA). CS PhD (Sorbonne).
Prev: FAIR (Meta), LeRobot (Hugging Face), MLR (Apple).
Updates
- 2025-12: Introducing VL-JEPA, a vision-language model trained in the latent space.
- 2025-12: We are launching UMA, a company building general-purpose humanoids.
- 2025-11: VLAb is out! Our codebase for VLA pretraining, including SmolVLA.
- 2025-09: l3m is released: our codebase for large-scale pretraining of AIMv2, CLIP, native VLMs, and LLMs.
- 2025-09: 2 papers accepted at NeurIPS 2025: Scaling laws for optimal data mixtures and Learning to Steer.
- 2025-07: We released our work on Scaling laws for optimal data mixtures.
- 2025-07: Our work on Scaling laws for native multimodal models received an Oral (~top 2%) at ICCV 2025.
- 2025-07: 2 papers accepted at ICCV 2025: Scaling laws for NMMs and Multimodal Steering.
- 2025-06: We released SmolVLA, an efficient foundation model for robotics: paper, code, and blogpost.
- 2025-02: AIMv2 is accepted as a Spotlight paper (~top 5%) at CVPR 2025.
- 2024-11: We released AIMv2, a SoTA large-scale vision encoder.
- 2024-09: Implicit Multimodal Alignment, CoX-LMM, DiffCut, and Skipping Computations are accepted at NeurIPS 2024.
- 2024-04: Beyond task performance is accepted at ICLR 2024.
- 2023-12: UnIVAL: Unified Model for Image, Video, Audio and Language Tasks is accepted at TMLR 2023: paper and code.
- 2023-09: Rewarded soups is accepted at NeurIPS 2023.