2025 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

22-24 October 2025, Singapore

Technical Program

Day 2: Thursday, 23 Oct 2025 - Overview
08:00-08:30	D2-0800_IB Registration Location: Island Ballroom
08:30-10:00	D2-0830_IB Active Noise Cancellation Workshop Keynote Location: Island Ballroom
08:30-10:00	D2-0830_L1 Advanced Topics in Audio Understanding of Sound Events, Scenes, and Beyond Location: Lotus I
08:30-10:00	D2-0830_L2 Speech and Language Processing I Location: Lotus II
08:30-10:00	D2-0830_H3 Research Frontiers in Learned Visual Data Coding and Processing Location: Hibiscus III
08:30-10:00	D2-0830_P1 Multimodal AI Location: Peony I
08:30-10:00	D2-0830_P2 Biomedical Signal Processing and Systems I Location: Peony II
10:00-10:30	Break
10:30-12:00	D2-1030_IB Active Noise Cancellation Panel Discussion Location: Island Ballroom
10:30-12:00	D2-1030_L1 Advanced Topics on Music Processing Location: Lotus I
10:30-12:00	D2-1030_L2 Speech and Language Processing II Location: Lotus II
10:30-12:00	D2-1030_H3 Emerging Technologies and Applications of Image Processing and Computer Vision Location: Hibiscus III
10:30-12:00	D2-1030_P1 Biomedical Signal Processing and Systems II Location: Peony I
10:30-12:00	D2-1030_P2 Machine Learning: Algorithms and Application I Location: Peony II
12:00-12:30	Lunch
12:30-13:30	D2-1230_IB Women in APSIPA Forum Location: Island Ballroom
13:30-14:30	D2-1330_IB Keynote 2 by Jane Wang Location: Island Ballroom
14:30-16:00	D2-1430_IB Perspective 3: Neural Speech Assessment and Its Application Location: Island Ballroom
14:30-16:00	D2-1430_L1 Active Noise Control I Location: Lotus I
14:30-16:00	D2-1430_L2 Speech and Language Processing III Location: Lotus II
14:30-16:00	D2-1430_H3 Machine Learning: Information and Medical Applications Location: Hibiscus III
14:30-16:00	D2-1430_P1 Audio Processing Location: Peony I
14:30-16:00	D2-1430_P2 Signal & Information Processing I Location: Peony II
16:00-16:30	Break
16:30-18:00	D2-1630_IB Education Forum Location: Island Ballroom
16:30-18:00	D2-1630_L1 Active Noise Control II Location: Lotus I
16:30-18:00	D2-1630_L2 Speech and Language Processing IV Location: Lotus II
16:30-18:00	D2-1630_H3 Recent Advances in Multimedia Enrichment, Security and Privacy Location: Hibiscus III
16:30-18:00	D2-1630_P2 Advances in Multimodal AI for Multimedia Applications Location: Peony II
19:00-21:30	D2-1900_IB Banquet Location: Island Ballroom

Day 2: Thursday, 23 Oct 2025 - With Papers
08:00-08:30	D2-0800_IB Registration Location: Island Ballroom
08:30-10:00	D2-0830_IB Active Noise Cancellation Workshop Keynote Location: Island Ballroom
08:30-10:00	D2-0830_L1 Advanced Topics in Audio Understanding of Sound Events, Scenes, and Beyond Location: Lotus I D2-0830_L1.1 58 Evaluation of auditory and tactile perception for augmented sound-image enhancement using pre-virtual-leading hypersonic signals Ryota Imanaka, Yuting Geng, Masato Nakayama, Takanobu Nishiura D2-0830_L1.2 103 Improvement in Variance Estimation in Variable-Step-Size Shared-Error NLMS Algorithm for Acoustic Echo and Noise Canceller Kenta Iwai D2-0830_L1.3 117 Hierarchical Sparse Sound Field Reconstruction with Spherical and Linear Microphone Arrays Shunxi Xu, Craig T. Jin D2-0830_L1.4 157 Robust Superdirective Beamforming Using a Uniform Circular Array with Directional Microphones Weilong Huang, Longfei Felix Yan, Emanuël A.P. Habets D2-0830_L1.5 211 Towards Robust Stereo 3-D SELD: A Study of Perceptual Features and Data Augmentation Jun Wei Yeow, Ee-Leng Tan, Santi Peksi, Woon-Seng Gan, Huang Qirui D2-0830_L1.6 258 Pre-training Autoencoder for Acoustic Event Classification via Blinky Xiaoyang Liu, Yuma Kinoshita D2-0830_L1.7 275 Sound source enhancement using power spectral density estimation in beamspace for a dual unmanned aerial vehicle system Mingxue Song, Jin Xuan Teh, Yusuke Hioka, Benjamin Yen, Hiroshi Saruwatari D2-0830_L1.8 328 Three-Dimensional Gradient-Based Tracking of Multiple Sound Sources Shaoheng Xu, Wei-Ting Lai, Yile (Angela) Zhang, Jihui (Aimee) Zhang, Amy Bastine, Prasanga Samarasinghe, Thushara Abhayapala D2-0830_L1.9 389 Retrieval-Augmented Difference Captioning to Explain Unsupervised Anomalous Sound Detection Ryoya Ogura, Tomoya Nishida, Yohei Kawaguchi D2-0830_L1.10 398 An Evaluation of Supervised Virtual Microphone Estimators in Reverberant Sound Fields Kimihiro Hattori, Wen-Chin Huang, Kazuya Takeda, Tomoki Toda D2-0830_L1.11 459 Human-CLAP: Human-perception-based contrastive language–audio pretraining Taisei Takano, Yuki Okamoto, Yusuke Kanamori, Yuki Saito, Ryotaro Nagase, Hiroshi Saruwatari D2-0830_L1.12 461 Training Acoustic Scene Classification Models Robust to Asynchrony in Distributed Microphone Arrays Takao Kawamura, Nobutaka Ono
08:30-10:00	D2-0830_L2 Speech and Language Processing I Location: Lotus II D2-0830_L2.1 95 Neural Speech Separation with Parallel Amplitude and Phase Spectrum Estimation Fei Liu, Yang Ai, Zhen-Hua Ling D2-0830_L2.2 127 Single-Channel Speech Enhancement in Spherical-Mapped Short-Time Spectral Domain Yu Morinaga, Naoto Kotake, Iori Hashimoto, Suehiro Shimauchi, Shigeaki Aoki D2-0830_L2.3 142 Introducing Self-Supervised Learning Models for Spoken Query-Spoken Term Detection Masato Nagase, Kazunori Kojima, Shi-wook Lee, Yoshiaki Itoh D2-0830_L2.4 150 Characterization of Speech Similarity Between Australian Aboriginal and High-Resource Languages: A Case Study on Dharawal Ting Dang, Trini Manoj Jeyaseelan, Eliathamby Ambikairajah, Vidhyasaharan Sethu D2-0830_L2.5 179 Segment Transformer: AI-Generated Music Detection via Music Structural Analysis Yumin Kim, Seonghyeon Go D2-0830_L2.6 213 Dialect Identification Using Resource-Efficient Fine-Tuning Approaches Zirui Lin, Haris Gulzar, Monnika Roslianna Busto, Akiko Masaki, Takeharu Eda, Kazuhiro Nakadai D2-0830_L2.7 236 A High-Quality and Low-Complexity Streamable Neural Speech Codec with Knowledge Distillation En-Wei Zhang, Hui-Peng Du, Xiao-Hang Jiang, Yang Ai, Zhen-Hua Ling D2-0830_L2.8 253 Effectiveness of streaming ASR for real-time laughter and screaming detection Mizuki Kurasawa, Yoshiko Arimoto D2-0830_L2.9 262 Mitigating Data Imbalance in Automated Speaking Assessment Fong-Chun Tsai, Kuan-Tang Huang, Bi-Cheng Yan, Tien-Hong Lo, Berlin Chen D2-0830_L2.10 446 An Information-Theoretic Approach to Data Selection for Generative Topic Modeling Michael Santoso, Bhone Tay Zar Kyaw, Valentinus Roby Hananto, Victor Kryssanov D2-0830_L2.11 571 Collective Learning-based Optimal Transport GAN with Multi-Level Fine-Grained and Global Discriminators for Voice Conversion Sandipan Dhar, Md. Tousin Akhter, Nanda Dulal Jana, Swagatam Das, Monorama Swain, Saurav Chowdhury
08:30-10:00	D2-0830_H3 Research Frontiers in Learned Visual Data Coding and Processing Location: Hibiscus III D2-0830_H3.1 64 Neural Implicit Representations for Object-centric Machine Vision Tasks Yeoneui Kim, Je-Won Kang D2-0830_H3.2 106 GoP-to-Frame Encoder Adaptation for Learned Video Compression Xiaohan Pan, Runsen Feng, Henan Wang, Yixin Gao, Zhibo Chen D2-0830_H3.3 161 Efficient Adversarial Attack and Training on Learned Image Compression Jun Kurihara, Heming Sun D2-0830_H3.4 204 Accelerating VVC Inter-Frame Coding: A Lightweight CNN for Fast QTMT Partitioning Jui-Chen Luo, Jiann-Jone Chen, Tien-Ying Kuo, Yi-Fan Wu, Zhang Kai-Jie D2-0830_H3.5 335 Multimodal Speech Analysis for Early Detection of Mild Cognitive Impairment: A Scalable Approach Muhammad Bilal, Waleed Abdulla, Gary Cheung, Lynette Tippett, Reza Shahamiri D2-0830_H3.6 383 Boundary-Enhanced Attention Network for Breast Mass Segmentation Rong Chen, Karungaru Stephen, Kenji Terada, Linhuang Wang D2-0830_H3.7 465 Scale and Rotation Estimation of Similarity-Transformed Images via Cross-Correlation Maximization Based on Auxiliary Function Method Shinji Yamashita, Yuma Kinoshita, Hitoshi Kiya D2-0830_H3.8 468 Strong Eye Closure Detection in Children with Profound Intellectual and Multiple Disabilities Using Robust Temporal Difference Features Kaito Kosaki, Teppei Nakano, Mari Wakabayashi, Tomomi Sato, Tetsuji Ogawa D2-0830_H3.9 532 A Rate-Quality Model for Learned Video Coding Sang NguyenQuang, Cheng-Wei Chen, Xiem HoangVan, Wen-Hsiao Peng D2-0830_H3.10 539 Low-Light RAW Image Enhancement with Additive Parameterization and State Space Model Shugo Yamashita, Masaaki Ikehara D2-0830_H3.11 621 Synthesizing and Restoring Weather-corrupted Images with Conditional Diffusion Models Youngho Go, Sung-Hak Lee
08:30-10:00	D2-0830_P1 Multimodal AI Location: Peony I D2-0830_P1.1 23 VoxRep: Enhancing 3D Spatial Understanding in 2D Vision-Language Models via Voxel Representation Alan (Gia Tuan Dao) Dao, Norapat Buppodom D2-0830_P1.2 155 Active Multi-Object Tracking for 3D Reconstruction with Hierarchical Reinforcement Learning Heng Li, Cheng Cai D2-0830_P1.3 193 Multimodal Sentiment Analysis with Missing Modality: A Knowledge-Transfer Approach Weide Liu, Huijing Zhan D2-0830_P1.4 299 Modeling Spatiotemporal Multimodal Data With Kernel Graph Regression Models And Copulas Jeffrey Wu, Gareth Peters D2-0830_P1.5 510 CopeCap: A lightweight image captioning model with collaborative prompt learning Xiwei Yu, Guoshun He, Huijing Zhan D2-0830_P1.6 541 Lyric-Aware Karaoke Background Video Selection Using Large Language Models and Moment Retrieval Tomoki Ariga, Jun Taniguchi, Yosuke Higuchi, Sayaka Toma, Kunihiro Abe, Rie Shigyo, Tetsuji Ogawa D2-0830_P1.7 550 Audio-Visual Speech Recognition Based on Cross-Lingual Transfer Learning Fumiya Kondo, Tamura Satoshi D2-0830_P1.8 562 Exploring Machine Learning and Language Models for Multimodal Depression Detection Javier Si Zhao Hong, Timothy Zoe Delaya, Sherwyn Chan Yin Kit, Pai Chet Ng, Xiaoxiao Miao
08:30-10:00	D2-0830_P2 Biomedical Signal Processing and Systems I Location: Peony II D2-0830_P2.1 26 Predicting Problematic Internet Use in Children Using Feature-Rich Structured Data with Ensemble Machine Learning and Bayesian Optimisation Niteesh K R, Pooja T S D2-0830_P2.2 148 Phonocardiogram Signal Analysis for Myocardial Infarction Level Prediction Using Deep Learning Model Ira Puspasari, Tati L.R. Mengko, Agung W. Setiawan, Miftah Pramudyo, Nobuo Watanabe, Trio Adiono D2-0830_P2.3 173 Prediction of Maximum and Minimum Postprandial Blood Glucose Levels in People with Diabetes Kotaro Nagayama, Shota Kato, Kana Eguchi, Masahide Hamaguchi, Hiroyuki Tominaga, Youji Hamaguchi, Michiaki Fukui, Manabu Kano D2-0830_P2.4 214 Towards Telepathic Communication: A Multi-Band EEG Model for Imaginary Speech Decoding Yifan Zhang, Yuting Ding, Fei Chen D2-0830_P2.5 240 Tiny-VRN: A Lightweight Variational Residual Network for EEG-Based Emotion Recognition Sivaraj Nimishan, Selvarajah Thuseethan, Shanmuganathan Vasanthapriyan, Roshan G. Ragel D2-0830_P2.6 241 A Comparison of Solicited and Longitudinal Cough Sounds for Tuberculosis Detection Aprianto Dwi Prasetyo, Bagus Tris Atmaja, Dhany Arifianto, Sakriani Sakti D2-0830_P2.7 336 Detecting Defecation Premonition from the Acoustic Activity of Bowel Sounds Shota Miyagawa, Toshitaka Yamakawa, Masayuki Tanabe, Kazushi Ikeda D2-0830_P2.8 407 EegCNR: A Novel Feature for Attention Estimation from EEG Asif M S, Sagila Gangadharan K, Achutavarrier Prasad Vinod D2-0830_P2.9 413 Lower Limb Calf Muscle Segmentation from Diffusion-Weighted Magnetic Resonance Images Using Deep Learning Eshan Pandey, Xiaomeng Wang, Julian Gan, Ying-Hwey Nai, Derek Hausenloy, Pek Lan Khong, Forest Su Lim Tan, Thiruneepan Selvakulasingam, Ryan Fraser Kirwan, Cheryl Pei Ling Lian D2-0830_P2.10 415 Principal Component Regularization in Iterative Inversion of DBIM for Ultrasound Tomography Nguyen Thi Thu, Tran Quang-Huy, Luong Thi Theu, Duc-Tan Tran D2-0830_P2.11 447 Reasoning Visualization for Critical Care EEG Classification with Prototypical Part Networks Takuma Bingo, Hajime Yano, Taichiro Ashizaki, Kazuma Koda, Masaya Togo, Riki Matsumoto, Tetsuya Takiguchi D2-0830_P2.12 227 Plant Species-Specific Anomaly Detection Based on Electrophysiological Signals Andy Desman Lo, Elvin Nur Furqon, Junaidul Islam, Isack Farady, Kahlil Muchtar, Ronnie Concepcion II, Chih-Yang Lin
10:00-10:30	Break
10:30-12:00	D2-1030_IB Active Noise Cancellation Panel Discussion Location: Island Ballroom
10:30-12:00	D2-1030_L1 Advanced Topics on Music Processing Location: Lotus I D2-1030_L1.1 177 Drum-to-Vocal Percussion Sound Conversion and Its Evaluation Methodology Rinka Nobukawa, Makito Kitamura, Tomohiko Nakamura, Shinnosuke Takamichi, Hiroshi Saruwatari D2-1030_L1.2 283 How do Deaf and Hard of Hearing people listen to Music Instruments? Subjective Evaluation and Acoustic Features Rumi Hiraga, Yuhki Shiraishi, Keiichi Yasu D2-1030_L1.3 298 Quality Assessment of DNN–Based Algorithms for Music Boundary Detection Aneeka Azmat, Li Su, ChengHsin Hsu D2-1030_L1.4 319 Note-level Nonchord-tone Identification with Graph Neural Networks Yui Uehara, Satoshi Tojo D2-1030_L1.5 337 Evaluation Score Prediction for Japanese Songs Based on Melody Fitness to Lyrics Sosuke Nishimura, Eita Nakamura D2-1030_L1.6 349 A Comparative Study of Statistical Features and Deep Learning for Orchestral Texture Classification Zih-Syuan Lin, Jun-You Wang, Li Su D2-1030_L1.7 356 Efficient Transformer-Based Piano Transcription With Sparse Attention Mechanisms Weixing Wei, Kazuyoshi Yoshii D2-1030_L1.8 424 Transformer-Based Unpaired Piano Accompaniment Style Transfer Hsin Ai, Yi-Hsuan Yang D2-1030_L1.9 441 Designing a Music Difficulty Measure for Controllable Automatic Piano Rearrangement Hikari Miyaji, Keito Sawada, Wen-Chin Huang, Tomoki Toda D2-1030_L1.10 453 Vocal onset detection and pitch segmentation in medieval choral music guided by original notational sources Samuel Bellows, Sarabeth Mullins, Brian Katz D2-1030_L1.11 496 MORTM: MoE-Optimized Rhythmic Transformer Model for Symbolic MIDI Generation Takaaki Nagoshi, Tetsuro Kitahara D2-1030_L1.12 517 TAPA-ICL: Taxonomy-Aware Prompt Augmentation for In-Context Learning in Music Understanding Jiahao Zhao, Yunjia Li, Kazuyoshi Yoshii
10:30-12:00	D2-1030_L2 Speech and Language Processing II Location: Lotus II D2-1030_L2.1 97 Beyond Binary Detection: Multi-Etiology Dysarthria Classification with Pre-trained Speech Models Zihan Zhong, Qianli Wang, Satwinder Singh, Clarion Mendes, Mark Hasegawa-Johnson, Waleed Abdulla, Seyed Reza Shahamiri D2-1030_L2.2 118 A Dual-Path Speaker-Independent Acoustic-to-Articulatory Inversion Model Based On Content and Speaker Information Disentanglement Qiang Fang D2-1030_L2.3 120 Dementia Prediction From Speech Signal Using Optimized Prosodic Features Bagus Tris Atmaja, Sakriani Sakti D2-1030_L2.4 154 Speech Emotion Recognition via Entropy-Aware Score Selection ChenYi Chua, JunKai Wong, Chengxin Chen, Xiaoxiao Miao D2-1030_L2.5 264 Improving Exemplar-Based Electrolaryngeal Speech Voice Conversion via Robust Content Representations Fo-Rui Li, Hsin-Te Hwang, Ming-Chi Yen, Men-Tung Lo, Yu Tsao, Hsin-Min Wang D2-1030_L2.6 308 An Efficient Transfer Learning Method Based on Adapter with Local Attributes for Speech Emotion Recognition Haoyu Song, Mcloughlin Ian, Qing Gu, Nan Jiang, Yan Song D2-1030_L2.7 331 ASRQ-VC: ASR-Guided Speech Content Quantization for High-Fidelity Voice Conversion Songting Liu, Deheng Ye, Wei Yang, Haoyang Li, Eng Siong Chng D2-1030_L2.8 338 PUNSER: Large-Scale Pre-trained and Unified Model for Practical Speech Emotion Recognition Yu Hayashizaki, Takashi Nose, Sumiharu Kobayashi, Satoru Fukayama, Akinori Ito D2-1030_L2.9 440 Investigation of the effectiveness of converted speech auditory feedback in low-latency real-time voice conversion Kiseki Niwa, Kazuhiro Kobayashi, Tomoki Toda D2-1030_L2.10 587 Study on Signal Processing Techniques in Protecting Voice Personae Against Speech Synthesis Systems Nopparut Li, Candy Olivia Mawalim, Masashi Unoki D2-1030_L2.11 603 MixedG2P-T5: G2P-free Speech Synthesis for Mixed-script texts using Speech Self-Supervised Learning and Language Model Joonyong Park, Daisuke Saito, Nobuaki Minematsu
10:30-12:00	D2-1030_H3 Emerging Technologies and Applications of Image Processing and Computer Vision Location: Hibiscus III D2-1030_H3.1 54 You Only Touch Once: One-Touch System for Personalized 3D Music Video Generation Kyungjune Lee, Youngjin Shin, Jungwoo Huh, Sanghoon Lee D2-1030_H3.2 143 Single-Image Pupil Localization via Implicit 3D Eye Reconstruction Taejun Roh, Yejin Cho, Duong Hai Nguyen, Chul Lee D2-1030_H3.3 171 Flow-Guided Consistent Video Depth Estimation for Cross-Dataset Generalization Jaeseok Jang, Chang-Su Kim D2-1030_H3.4 196 DCB: An Efficient Approach for Building Long-Range Dependencies in CNNs Tianxiang Lan, Mingyi He, Yuchao Dai D2-1030_H3.5 269 A User-Guided and Local Motion-Adaptive Framework for Virtual Product Placement in Video Tianwen Zhang, Ju-Won Seo, Kang-Min Kim, Keunsoo Ko D2-1030_H3.6 332 Shallow yet Perceptual Decoding for Neural Image Compression through Minimal Nonlinearity JaeKyung Ryu, Nam Ik Cho D2-1030_H3.7 445 SyncScore: A Framework for Synchronization Scoring in Group Sports via Human Pose Estimation Khai Pin Ang, Iven Zi Yin Low, Yumun Hooi, Yuen Peng Loh D2-1030_H3.8 518 Data Augmentation-Driven Segmentation of Ovarian Tumor Ultrasound Images using Vision Mamba Thanh-Phuc Dao, Huyen-Trang To, Hoang-Son Bui, Thi-Lan Le D2-1030_H3.9 548 Optimizing JPEG Decoder for Bitstream-Corrupted Image Restoration Shumin Jiang, Hao Qin, Tianyi Liu, Yi Wang D2-1030_H3.10 568 Semantic Scene Completion from a Single Depth Image with Coarse-Grained Segmentation Jiun Yen Ching, Lai-Kuan Wong, Wai Lee Kung D2-1030_H3.11 572 Pixel-weighted Domain Adaptation for Agricultural Segmentation Shunta Kimura, Handie Shao, Shogo Matsumoto, Daiki Yamada, Toshihiro Kitajima, Hideki Nakayama
10:30-12:00	D2-1030_P1 Biomedical Signal Processing and Systems II Location: Peony I D2-1030_P1.1 454 Freeze and Learn using KAN for Infant Cry Classification Arth Shah, Vishnu Vardhan, Hemant Patil D2-1030_P1.2 466 Investigation of Enhancement Strategies for Recurrent Spiking Neural Network based Brain-Machine Interface Decoding Wilson Tansil, Nur Ahmadi, Timothy Constandinou, Dessi Puji Lestari D2-1030_P1.3 531 Detecting Deceptive Responses Due to Psychological Bias by the Probability Density Function of EEG Content Rate Dynamics During NEO-FFI Answering Yuto Ashikawa, Yosuke Kurihara D2-1030_P1.4 535 A Comparative Analysis of Statistical, Regional CNN, and Sequential Transformer Approaches for Alzheimer's Disease Classification Trí Huynh, Xuan Hoc Pham, Nhu Nguyen, Thi Thu Nguyen, Huong Ha, Lua Ngo D2-1030_P1.5 591 Beyond Speech and More: Investigating the Emergent Ability of Speech Pre-Trained Models for Classifying Physiological Time-Series Signals Orchid Chetia Phukan, Swarup Ranjan Behera, Girish, Mohd Mujtaba Akhtar, Arun Balaji Buduru, Rajesh Sharma D2-1030_P1.6 630 Channel Selection Guided by Layer-wise Relevance Propagation for CNN-Based EEG Classification of Major Depressive Disorder Woo-Seok Ahn, Seung-Hwan Lee, Han-Jeong Hwang D2-1030_P1.7 631 Development of HRV-Based Biomarkers for Predicting Blood Glucose Levels Ju-An Park, Jun-Seok Lee, Na-Ri Kim, Han-Jeong Hwang D2-1030_P1.8 632 Development of 3D Textile Electrodes for Electrocardiography Measurement Sang-Ho Lee, In-Su Park, Han-Jeong Hwang
10:30-12:00	D2-1030_P2 Machine Learning: Algorithms and Application I Location: Peony II D2-1030_P2.1 22 Kernel Ridge Regression for Efficient Learning of High-Capacity Hopfield Networks Akira Tamamori D2-1030_P2.2 47 Enhanced Sliding Discrete Fourier Transform (eSDFT) with Error-Bound Control for Real-Time Parallel Processing Jetsada Arnin, Danial Kahani, Bernard A. Conway D2-1030_P2.3 216 Sparse-Coded Time-Delay DMD with Control for Nonlinear State-Space Modeling on Graphs Ryuto Ito, Hiromu Kanauchi, Hiroyasu Yasuda, Masaaki Nagahara, Shogo Muramatsu D2-1030_P2.4 229 Nonnegative Matrix Factorization Using Dirichlet-Distribution-Based Regularization Haru Ogawa, Daichi Kitamura, Shoma Ayano D2-1030_P2.5 274 Significance of co-occurring biomarkers in localization of epileptic seizure onset zone Nawara Mahmood Broti, Masaki Iwasaki, Yumie Ono D2-1030_P2.6 294 Reinforcement Learning in Portfolio Management: A Survey of Methods and Trends Silan Hu, Yulin Huang, Arjun Agarwal, Tanya Warrier, Yuwen Wang, Haozhe Ma, Zhengding Luo D2-1030_P2.7 313 Large Sparse Covariance Matrix Estimation via Dual Proximal Gradient Method Fengpei Li, Ziping Zhao D2-1030_P2.8 361 An improved method for Image Shadow Removal by Combining Deterministic and Stochastic Models Hongjun Sheng, Lanqing Guo, Xinggan Peng, Zhiping Lin, Bihan Wen D2-1030_P2.9 576 Knowledge-Infused Topic Model for Empathetic Dialogue Response Po-Chuan Chen, Jen-Tzung Chien D2-1030_P2.10 578 Cross-Patient Seizure Onset Zone Classification by Patient-Dependent Weight Xuyang ZHAO, Hidenori Sugano, Toshihisa Tanaka D2-1030_P2.11 609 NOCTUA: A High-Efficiency Reconfigurable NoC-based Transformer Universal Accelerator Kun-Chih Chen, Pin-Ching Shen, Bo-Chun Chen
12:00-12:30	Lunch
12:30-13:30	D2-1230_IB Women in APSIPA Forum Location: Island Ballroom
13:30-14:30	D2-1330_IB Keynote 2 by Jane Wang Location: Island Ballroom
14:30-16:00	D2-1430_IB Perspective 3: Neural Speech Assessment and Its Application Location: Island Ballroom D2-1430_IB.1 637 Progress and Challenges in DNN-based Objective Quality Assessment of Synthesized Speech Erica Cooper D2-1430_IB.2 635 Advancing Speech Quality Assessment Through Scientific Challenges and Open-source Activities Wen-Chin Huang D2-1430_IB.3 646 Non-Intrusive Intelligibility Prediction for Hearing Aids: Recent Advances, Trends, and Challenges Ryandhimas Zezario D2-1430_IB.4 647 From Evaluation to Optimization: Neural Speech Assessment for Downstream Applications Yu Tsao
14:30-16:00	D2-1430_L1 Active Noise Control I Location: Lotus I D2-1430_L1.1 56 Design of speech leakage-suppressed audio-spot based on auditory masking area control with active masker cancellation using parametric array loudspeakers Tomoki Hashida, Yuting Geng, Masato Nakayama, Takanobu Nishiura D2-1430_L1.2 57 Multichannel feedforward active noise control system with optical laser microphone in reverberant environments Maoto Mizutani, Kenta Iwai, Masato Nakayama, Takanobu Nishiura, Yoshiharu Soeta D2-1430_L1.3 72 Frequency-domain online modeling of multiple secondary paths without auxiliary noise for active noise control Siyuan Lian, Xiaofeng Zeng, Ruquan Sun, Jing Lu D2-1430_L1.4 124 Applying Model-Agnostic Meta-Learning with Iterative Dichotomiser 3 for Alternating-Switching Active Noise Control Systems Xiaoyi Shen, Dongyuan Shi, Woon-Seng Gan, Jun Yang D2-1430_L1.5 285 A Robust Proactive Communication Strategy for Distributed Active Noise Control Systems Junwei Ji, Dongyuan Shi, Zhengding Luo, Boxiang Wang, Ziyi Yang, Haowen Li, Woon-Seng Gan D2-1430_L1.6 289 Directional Selective Fixed-Filter Active Noise Control Based on Convolutional Neural Network in Reverberant Environments Boxiang Wang, Zhengding Luo, Haowen Li, Dongyuan Shi, Junwei Ji, Ziyi Yang, Woon-Seng Gan D2-1430_L1.7 301 An Online Secondary Path Modeling Technique in a Hybrid Active Noise Control System Harold Alexis Lao, Cheng-Yuan Chang D2-1430_L1.8 345 A Diffusion Remote Microphone Technique for Distributed Active Noise Control Tianyou Li, Sipei Zhao, Haowen Li, Xiaofeng Zeng, Ruquan Sun, Jing Lu D2-1430_L1.9 511 An Integrated Active Noise Control and Crosstalk Cancellation System Designed Under a Generalized Model-Matching Framework Michael Edy, Chih Yen Wang, Ching En Huang, You Siang Chen, Mingsian R. Bai D2-1430_L1.10 553 Improvement of Noise Reduction in a Panel Combined with Multiple Loudspeakers Using Active Noise Control Tatsuya Murao D2-1430_L1.11 610 Selective Fixed Filter Sub-band Active Noise Control System Based on Reference Signal Power Estimation Shota Toyooka, Ryo Matsuura, Kenta Iwai, Yoshinobu Kajikawa D2-1430_L1.12 626 Performance analysis of active noise control over a spatial region Jihui (Aimee) Zhang, Thushara Abhayapala, Naoki Murata, Prasanga Samarasinghe, Yu Maeno, Yuki Mitsufuji
14:30-16:00	D2-1430_L2 Speech and Language Processing III Location: Lotus II D2-1430_L2.1 27 I^2TTS: Image-indicated Immersive Text-to-speech Synthesis with Spatial Perception Jiawei Zhang, Tian-Hao Zhang, Jun Wang, Jiaran Gao, Ruijie Tao, Xinyuan Qian, Xu-Cheng Yin D2-1430_L2.2 130 Chain-of-Thought Distillation for ASR Error Correction with Multimodal Large Language Models Shaomeng Yang, Jiaming Luo, Jinran Wang, Rongfeng Su, Yongjie Zhou, Lan Wang, Nan Yan D2-1430_L2.3 163 Direction-guided Spatial Attention for Multichannel Speech Enhancement Shuai Nie, Yaran Chen, Shan Liang, Jiaming Xu, Runyu Shi D2-1430_L2.4 168 A Study of Japanese Mixed Emotional Speech Synthesis Based on an End-to-End Emotional Speech Synthesis Model Issei Sakata, Tetsuo Kosaka D2-1430_L2.5 191 EFTTS: Zero-Shot Emotional Speech Synthesis via Conditional Flow Matching and Self-Supervised Representations Haoyu Wang, Jiale Chen, Jiaxun Li, Sizhe Shan, Yuehai Wang D2-1430_L2.6 208 Improving Speech-to-Speech Translation for Low-Resource Languages via Transfer Learning Rui Zhou, Akinori Ito, Takashi Nose D2-1430_L2.7 235 DialoSpeech: Dual-Speaker Dialogue Generation with LLM and Flow Matching Hanke Xie, Dake Guo, Chengyou Wang, Yue Li, Wenjie Tian, Xinfa Zhu, Xinsheng Wang, Xiulin Li, Guanqiong Miao, Bo Liu, Lei Xie D2-1430_L2.8 237 VICNet: FaderNet-Based Voice Impression Conversion with Affective Dimensional Representation Takuya Takahashi, Saki Kugimoto, Toru Nakashika D2-1430_L2.9 248 Strategic Re-weighting of U-Net Components in Diffusion Models for Enhanced Speech Enhancement without Retraining Yuehai Zhang, Yang Li, Yuehao Zhao, Shoji Makino D2-1430_L2.10 261 Fast and Speaker-Independent Utterance Selection for ASR-Free CALL Systems of Minority Languages Takaki Koshikawa, Akinori Ito, Takashi Nose D2-1430_L2.11 414 Speech-Content-Driven Highlighting of Translated Lecture Slides for Foreign Language Lecture Understanding Naoki Muto, Chee Siang Leow, Junichi Hoshino, Takehito Utsuro, Hiromitsu Nishizaki
14:30-16:00	D2-1430_H3 Machine Learning: Information and Medical Applications Location: Hibiscus III D2-1430_H3.1 24 Class Incremental Learning using Continual Backpropagation on Honey Botanic Origin Classification with Hyperspectral Imaging Guyang Zhang, Iman Ardekani, Waleed Abdulla D2-1430_H3.2 260 Multi-strategy improved electric eel foraging optimisation algorithm for UAV path planning Zexin Zhang, Chengbiao Fu, Hongwei Guo, Anhong Tian D2-1430_H3.3 278 A Deep Reinforcement Learning Approach to Roundabout Traffic Signal Control Cheng-Yu Chen, Daniil Buryakov, Valentinus Roby Hananto, Victor Kryssanov D2-1430_H3.4 300 A preliminary study on machine learning to predict circuit exchange in pediatric patients with ECMO Tatsuya Hasegawa, Toshiyuki Nakanishi, Koichi Fujiwara D2-1430_H3.5 303 HasRL Robot: A Heterogeneous Asynchronous Reinforcement Learning System for High-Dimensional Bipedal Control Jingyang Mai, Zechen Guo, Zhengding Luo, Haozhe Ma D2-1430_H3.6 324 A Psychological Strategy Annotation Method Using Multiple LLMs with a Chain of Thought Based on Deductive Reasoning Jinran Wang, Jiaming Luo, Shaomeng Yang, Yongjie Zhou, Xuefang Zhang, Rongfeng Su, Nan Yan, Lan Wang D2-1430_H3.7 431 Outlier Removal in MEG Data for Imagined Speech Classification Koki Nose, Hajime Yano, Tetsuya Takiguchi, Seiji Nakagawa D2-1430_H3.8 558 Performance Evaluation of CHIRPS and ETCCDI Indices for Extreme Rainfall Risk Mapping in Thailand Using XGBoost Vinitar Khettar, Nuntikorn Kitratporn, Sawarin Lerk-u-suke, Jirabhorn Chaiwongsai, Phaisarn Jeefoo, Chanika Sukawattanavijit D2-1430_H3.9 580 Riverbed Estimation Using Locally-Structured Unitary Network Seiyu Hitomi, Hiroyasu Yasuda, Kiyoshi Hayasaka, Shogo Muramatsu D2-1430_H3.10 594 Contrastive Learning of Temporal and Event-Based Behavioral Views for Universal User Embeddings Yuuki Tachioka D2-1430_H3.11 595 Market Forecasting Using LSTM-ARIMA Model with MACD Decomposition Teng-Chih Yu, Jian-Jiun Ding
14:30-16:00	D2-1430_P1 Audio Processing Location: Peony I D2-1430_P1.1 129 Anomalous Sound Detection Based on Derivative Features of Short-Time Holomorphic Fourier Transform Iori Hashimoto, Yu Morinaga, Suehiro Shimauchi, Shigeaki Aoki D2-1430_P1.2 140 Elastic Additive Angular Margin Loss Integrated with Mixup for Anomalous Sound Detection Yihao Zhao, Yichen Yang, Xiao Zhang, Shoji Makino D2-1430_P1.3 174 A Distilled Low-Latency Neural Vocoder with Explicit Amplitude and Phase Prediction Hui-Peng Du, Yang Ai, Zhen-Hua Ling D2-1430_P1.4 408 Directional Filtering of Sound Fields for Emphasizing Specific Directions of Arrival and Its Applications Ryo Murakami, Natsuki Ueno D2-1430_P1.5 409 Sound Field Estimation Method Robust to Microphone Position and Directivity Errors Takumi Koga, Natsuki Ueno D2-1430_P1.6 434 Anomalous Sound Detection Using Time-Frequency Derivative of Instantaneous Phase Features Tran-Quang-Tuan Vo, Quoc-Huy Nguyen, Masashi Unoki D2-1430_P1.7 492 Few-Step Diffusion-Based Voice Conversion Using Consistency Trajectory Models Ryuichi Hatakeyama, Toru Nakashika, Takuya Takahashi D2-1430_P1.8 583 Spatial Audio Signal Enhancement: A Multi-output MVDR Method in The Spherical Harmonic-domain Huawei Zhang, Jihui (Aimee) Zhang, Huiyuan (June) Sun, Prasanga Samarasinghe
14:30-16:00	D2-1430_P2 Signal & Information Processing I Location: Peony II D2-1430_P2.1 46 Generalized Student's t Sparse Kernel Learning for Robust Signal Processing Long Pan, Libiao Peng, Xifeng Li, Dongjie Bi, Yongle Xie D2-1430_P2.2 61 A Hierarchical Attention Model for Local and Global Feature Integration in RCS Classification Yida Wu, Caiyun Wang, Jianing Wang, Xiaofei Li, Ying Nan D2-1430_P2.3 66 A Sliding-Window Range–Bearing Scan STAP for Underwater Active Sonar Target Detection Weisi Hua, Yixin Yang, Yuxuan Chen, Xianghao Hou D2-1430_P2.4 76 TH-LDV: Transformer-based Hybrid method for Signal Detection in Laser Doppler Velocimetry Yue Wang, Ruifeng Li, Changsong Liu, Liangrui Peng, Ning Ding, Gang Yao D2-1430_P2.5 159 Estimating Dynamic Graph Flows with Kernel Models and Hadamard-Structured Riemannian Constraints Duc Thien Nguyen, Konstantinos Slavakis, Dimitris Pados D2-1430_P2.6 202 Period Estimation for Time-Varying Graph Signals and Its Application to Graph Wiener Filter Tsutahiro Fukuhara, Junya Hara, Hiroshi Higashi, Yuichi Tanaka D2-1430_P2.7 244 Computationally Efficient Sparse Signal Recovery by Deep Unfolded-Periodic Sketched ISTA Tatsuki Tokumura, Ayano Nakai-Kasai, Tadashi Wadayama D2-1430_P2.8 259 Fisher Information-based Metrics for Representation Learning Do Nguyen Dang Thi, Le Quoc Anh, Tran Trong Duy, Le Vu Ha, Nguyen Linh Trung D2-1430_P2.9 271 Wave Direction Estimation Based on Local Gradient Techniques from Satellite Imagery for Coastal Dynamics Monitoring Woramet Simrum, Paweena Kanokhong, Chakapat Chokchaisiri, Somrudee Deepaisarn, Kittipisut Chansri, Chanyut Lisawat, Waranrach Viriyavit, Akkharawoot Takhom, Phutphalla Kong, Didin Agustian Permadi, Sharifah Hafizah Syed Ariffin, Surasak Boonkla, Kasorn Galajit, Jessada Karnjana D2-1430_P2.10 318 HIQA-DB: A Benchmark Dataset for Image Quality Assessment in Hospital Surveillance Yujin Han, Taewan Kim D2-1430_P2.11 627 Semantic Neural View Synthesis for Key Content Preservation in Horizontal-to-Vertical Video Conversion Dipanita Chakraborty, Minoru Okada, Kosin Chamnongthai
16:00-16:30	Break
16:30-18:00	D2-1630_IB Education Forum Location: Island Ballroom
16:30-18:00	D2-1630_L1 Active Noise Control II Location: Lotus I D2-1630_L1.1 63 Electro-acoustic component placement optimization for helicopter cabin ANC systems Yuhang Yang, Liquan Shi, Ningyuan Liang, Guoyong Jin D2-1630_L1.2 87 Spatial-Correlation-Based Error Weighting Method for Efficient Application of Filtered Reference Algorithm in Multichannel Active Noise Control Meiling Hu, Jing Lu, Qingyu Ma D2-1630_L1.3 134 An Alternating Mode Strategy for Adaptive Sound Field Control and Acoustic Path Tracking Junqing Zhang, Jingli Xie, Dongyuan Shi, Wen Zhang, Jingdong Chen, Jacob Benesty D2-1630_L1.4 265 DOA Estimation with Lightweight Network on LLM-Aided Simulated Acoustic Scenes Haowen Li, Zhengding Luo, Dongyuan Shi, Boxiang Wang, Junwei Ji, Ziyi Yang, Woon-Seng Gan D2-1630_L1.5 305 Co-forecasting of Time-varying Spatial-frequency Map for Selective Fixed-Filter Multichannel ANC based on Dynamic Factor Graph Xiruo Su, Bin Wu D2-1630_L1.6 310 Unsupervised Spectrogram Enhancement Algorithm Based on Bi-LSTM Hanwen Zhang, Xiruo Su, Zhijuan Zhu, Bin Wu, Lingyun Ye D2-1630_L1.7 330 Continual Learning-Based Selective Fixed-filter Active Noise Control Jingsong Xiao, Qirui Huang D2-1630_L1.8 340 Meta-Learned Regional Initialization of Control Filters for Headphone Active Noise Control Ziyi Yang, Zhengding Luo, Dongyuan Shi, Junwei Ji, Boxiang Wang, Haowen Li, Qirui Huang, Woon-Seng Gan D2-1630_L1.9 458 RAMDC: Room-Aware Multi-Device Clustering for Large Scale Teleconferencing Yile Zhang, Weiting Lai, Amy Bastine, Xingyu Chen, Lachlan Birnie, Thushara Abhayapala, Prasanga Samarasinghe D2-1630_L1.10 490 Multi-channel ANC with Adaptive Kernel Assisted On-line Secondary Path Modeling Hucheng Wang, Tao Liu, Junqing Zhang, Wen Zhang D2-1630_L1.11 497 A Laplace Distribution-Based Variable Step-Size FxlogLMS Algorithm for Active Impulsive Noise Control Aoi Haneda, Yosuke Sugiura, Tetsuya Shimamura D2-1630_L1.12 513 Research Progress on Active Control of Road Noise in Vehicles Wangxiaoxu Chen, Jiancheng Tao, Shuping Wang, Kai Chen, Haishan Zou, Xiaojun Qiu
16:30-18:00	D2-1630_L2 Speech and Language Processing IV Location: Lotus II
D2-1630_L2.1	29	Leveraging Language Information for Target Language Extraction	Mehmet Sinan Yildirim, Ruijie Tao, Wupeng Wang, Junyi Ao, Haizhou Li
D2-1630_L2.2	116	VietLyrics: A Large-Scale Dataset and Models for Vietnamese Automatic Lyrics Transcription	Nguyen Quoc Anh, Bernard Cheng, Kelvin Soh
D2-1630_L2.3	449	Autofocus Neural Beamformer Based on Steering Vector Estimation	Reiya Marukawa, Takeshi Yamada
D2-1630_L2.4	478	Estimating User Sentiment at Sub-exchange Granularity from Exchange-level Annotations	Daichi Yukizawa, Kazunori Komatani, Ryu Takeda, Kenta Yamamoto
D2-1630_L2.5	502	DAU-KDAH Dysarthric Multi-Lingual and Multimodal Speech Corpora for Indic Languages	Arth Shah, Hiya Chaudhari, Kavya Kumar, Arushi Srivastava, Priya Damdar, Ravindrakumar Purohit, Dharmendra Vaghera, Bhavna Singh, Aparna Walanj, Abhishek Srivastava, Hemant Patil
D2-1630_L2.6	503	Gamma-VAE-VC: Voice conversion based on VAE assuming gamma distribution for both latent variables and observation	Nanako Imaichi, Takuya Takahashi, Toru Nakashika
D2-1630_L2.7	514	Zero-shot Context Biasing with Trie-based Decoding using Synthetic Multi-Pronunciation	Changsong Liu, Yizhou Peng, Eng Siong Chng
D2-1630_L2.8	551	Dimension 414 and Minimal Embedding Dimensions for Phonetic Feature Encoding in WavLM	Narthana Sivalingam, Uthayasanker Thayasivam
D2-1630_L2.9	560	Directional Hybrid Optimization of HRTFs for Low-Order Spherical Harmonics Binaural Rendering	Rui Zhang, Yuxuan Ke, Qunping Ni, Ge Yao, Xiaodong Li, Chengshi Zheng
D2-1630_L2.10	577	Speech Enhancement Network With Windowed Cross Attention Using Noise-Reference Microphone	Kota Suzuki, Yosuke Sugiura, Tetsuya Shimamura
D2-1630_L2.11	641	BAANI: A 296M-Parameter Neural Vocoder for End-to-End Punjabi Speech Synthesis	Siddharth Kumar, Nisarg Trivedi, Ravindrakumar Purohit, Hemant Patil
16:30-18:00	D2-1630_H3 Recent Advances in Multimedia Enrichment, Security and Privacy Location: Hibiscus III
D2-1630_H3.1	89	Reversible Data Hiding in EtC Images with Flexible Access Privileges	Yusaku Kato, Shoko Imaizumi
D2-1630_H3.2	109	Robust Ownership Verification of DNN Models Against JPEG Compression via Probability-Controlled Adversarial Attacks	Teruki Sano, Minoru Kuribayashi, Masao Sakai, Shuji Ishobe, Eisuke Koizumi, Zhang Zhang
D2-1630_H3.3	136	Detoxification of Poisoned Recognition Models by Fine-tuning with Out-of-Distribution Samples	Junsuke Takano, Kazuaki Nakamura
D2-1630_H3.4	192	Layer-Wise Weight Statistics for Node Classification and Defense of Federated Large Language Models	Alexander Berns, Reon Akai, Minoru Kuribayashi, Rémi Cogranne
D2-1630_H3.5	210	Robustness evaluation against fine-tuning in associative watermarking method for CNN	Keiichi Mori, Masaki Kawamura
D2-1630_H3.6	218	Lossless Image Processing for OpenEXR Images with Flexible Functions	Anna Yamaguchi, Shoko Imaizumi
D2-1630_H3.7	225	Proposal of a Random Encoding Layer Compatible with Arbitrary Message Lengths for DiffuseTrace	Ou Egami, Masaki Kawamura
D2-1630_H3.8	292	Automatic Dependent Surveillance-Broadcast Preamble Classification for Spoofing Detection	Darren Kah Hou Quek, Guang Hua, Zhiping Lin
D2-1630_H3.9	320	Model Extraction Attack and Its Countermeasure for Denoising Diffusion Implicit Models	Hayato Shoji, Kazuaki Nakamura
D2-1630_H3.10	325	Content-Aware Dominant Color Extraction and Its Application to Multiple-key-Color Image Retrieval	Mei Hashimoto, Michiharu Niimi
D2-1630_H3.11	469	Privacy-Preserving Image Retrieval Scheme Using Combined Features in Cloud Computing	Jing Liang, Yuxuan Wang, Tingting Song, Ce Zheng, Peiya Li
16:30-18:00	D2-1630_P2 Advances in Multimodal AI for Multimedia Applications Location: Peony II
D2-1630_P2.1	59	Efficient Generative Adversarial Networks for Color Document Image Enhancement and Binarization Using Multi-scale Feature Extraction	Rui-Yang Ju, KokSheik Wong, Jen-Shiun Chiang
D2-1630_P2.2	68	Leveraging Large Language Models in Visual Speech Recognition: Model Scaling, Context-Aware Decoding, and Iterative Polishing	Zehua Liu, Xiaolou Li, Li Guo, Lantian Li, Dong Wang
D2-1630_P2.3	96	Computationally-efficient Call Classification of New Zealand Birds using Texture-based Features	Yonghui Tao, Mathis Quere, Yusuke Hioka, Stephen Marsland
D2-1630_P2.4	245	Incorporating Semantic Visual Content into Click-Through Rate Prediction for Video Advertisements	Yoshiaki Tanabe, Shuntaro Masuda, Gakumatsu Ryu, Naoto Tanji, Hiroyuki Seshime, Ling Xiao, Toshihiko Yamasaki
D2-1630_P2.5	249	From Blurry to Brilliant Detection: YOLO-Based Aerial Object Detection with Super Resolution	Ragib Amin Nihal, Benjamin Yen, Takeshi Ashizawa, Katsutoshi Itoyama, Kazuhiro Nakadai
D2-1630_P2.6	252	ATJO: Adaptive three-dimensional joint optimization for remote sensing video super-resolution	Tian Qin, Lijing Bu, Zhengpeng Zhang, Mingjun Deng, Yin Yang, Jingxue Wang, Xinyu Lan, Wenjuan Peng, Yang Hu
D2-1630_P2.7	268	Block-level Lagrange multiplier adaptation based on distortion propagation factors	Hongwei Guo, Yipeng Liu, Lei Luo, Chengbiao Fu, Ce Zhu
D2-1630_P2.8	317	Distributed Compressed Video Sensing with Enhanced Boundary Handling Based on Extended Convolutional Sparse Representation	Ibuki Muta, Yoshimitsu Kuroki
D2-1630_P2.9	354	Joint Modeling of Big Five and HEXACO for Multimodal Apparent Personality-trait Recognition	Ryo Masumura, Shota Orihashi, Mana Ihori, Tomohiro Tanaka, Naoki Makishima, Taiga Yamane, Naotaka Kawata, Satoshi Suzuki, Taichi Katayama
D2-1630_P2.10	359	Foreground-Background Segmentation Based Surveillance Video Coding	Jiyong Yu, Luheng Jia, Yifan Zang, Zhaoyang Yu, Shuyuan Zhu, Li Song, Kebin Jia
D2-1630_P2.11	436	Rain Removal via VAE-Enhanced Transformer with Hierarchical Feature Integration	Yaya Huang, Litong Liu, KokSheik Wong
19:00-21:30	D2-1900_IB Banquet Location: Island Ballroom
