Blog posts

2025

MMSA: From Beginner to Proficient

less than 1 minute read

Published:

MMSA is a unified framework for multimodal sentiment analysis developed by Tsinghua University. It supports three datasets (MOSI, MOSEI, and CH-SIMS) and 15 multimodal sentiment analysis models, covering tasks such as sentiment classification and sentiment intensity regression. This column uses MMSA v1.0 as an example to analyze the framework and implement custom modules.

Detectron2 Getting Started Tutorial

less than 1 minute read

Published:

Detectron2 is heavily encapsulated and its syntax can be obscure, so finding the right module during a project often takes time. Easy-to-follow introductory tutorials are also scarce, and the official tutorials lack flexibility. This series works through Detectron2 step by step, starting with installation, custom datasets, custom networks, and printing the validation-set loss.

2024

Mamba: Linear Sequence Modeling

less than 1 minute read

Published:

This blog focuses on improvements to the Mamba model, covering low-level vision tasks, multimodal tasks, and more.

OpenSTL: From Zero to Mastery

less than 1 minute read

Published:

OpenSTL is a third-party library developed by Westlake University for future frame prediction. It integrates multiple SOTA methods such as ConvLSTM and SimVP. This column starts from scratch and analyzes the internal structure of the OpenSTL project and how to customize its modules.

2022

Classic Neural Network

less than 1 minute read

Published:

This blog series introduces classic neural network architecture designs, including convolutional neural networks, Vision Transformers, Mamba, and others.