I'm Siqi Fan (范嗣祺), a researcher at the Institute for AI Industry Research, Tsinghua University (AIR, THU). Previously, I received my M.S. degree from the Institute of Automation, Chinese Academy of Sciences (CASIA) in 2022, and my B.E. degree from Shanghai Jiao Tong University (SJTU) in 2019.
I am broadly interested in Representation Learning in Complex Systems, from the macro physical world to the micro biological world, aiming to create AI agents that perceive like, or beyond, humans. Driven by a dedication to innovation, I aspire to advance the fields of autonomous driving and biomedical discovery, pushing the boundaries of technology to create impactful products.
I love music and visual arts.
") does not match the recommended repository name for your site ("
").
", so that your site can be accessed directly at "http://
".
However, if the current repository name is intended, you can ignore this message by removing "{% include widgets/debug_repo_name.html %}
" in index.html
.
",
which does not match the baseurl
("
") configured in _config.yml
.
baseurl
in _config.yml
to "
".
IEEE (TIP, TCSVT, TVT, TIV, ITSM), IET (CV, CSR), PR, Neurocomputing
CVPR, ICCV, ECCV, ICRA, IROS, ITSC
Traffic scene understanding and simulation testing @ ITSC'22
My exploration of vehicle-side environment perception started with drivable area detection (ITSC'20), followed by a series of perception algorithms, including an RGB 2D object detection approach (FII-CenterNet, T-VT'21), a semi-supervised learning approach for RGB 2D segmentation (CPCL, T-IP'22), an RGB-T segmentation approach for challenging lighting conditions (SpiderMesh, TechReport'23), and a 3D segmentation approach for large-scale point clouds (SCF-Net, CVPR'21).
Compared with well-studied vehicle-side perception, roadside perception faces several specific challenges, and its exploration has been hindered by the lack of data. A calibration-free BEV representation network is proposed to address calibration noise caused by inevitable natural factors (CBR, IROS'23). A semantic-geometry decoupled contrastive learning framework is introduced to improve roadside perception by leveraging vehicle-side data (IROAM, ICRA'25), and the first real-world large-scale dataset for roadside cooperative perception is released with benchmarks to spur research on practical I2I perception (RCooper, CVPR'24).
Cooperative perception can effectively enhance individual perception performance by providing additional viewpoints and expanding the sensing field. A scene-level feature cooperative perception approach is proposed (EMIFF, ICRA'24). To enable interpretable, instance-level, flexible feature interaction, the concept of query cooperation is proposed, along with a cooperative perception framework that lets query streams flow among agents (QUEST, ICRA'24). Besides, motion forecasting can also benefit from learning cooperative trajectory representations (NeurIPS'24). Beyond improving individual modules, a pioneering end-to-end cooperative autonomous driving framework is introduced (UniV2X, AAAI'25).
Compared with representation learning in the physical world, representation learning for biological modalities is more complicated.
Recent advances in LLMs have shed light on the development of knowledgeable and versatile AI research assistants in various scientific domains. Multimodal large language models bridge the semantic gap between natural language and other modalities, including molecules, proteins, and vision. A multimodal large language model is proposed for assisting biomedical research (BioMedGPT, J-BHI'24), and the optical chemical structure understanding task is introduced and explored for molecule-centric scientific discovery (OCSU, TechReport'25).
Multi-agent cooperation is a promising approach to solving complicated scientific research tasks in an autopilot manner. To facilitate this exploration, an agent platform for biomedicine and life science is presented and open-sourced (OpenBioMed).