Yong Dai
Software Engineer @ Hithink RoyalFlush Information Network Co., Ltd.

Email: daiyongya@outlook.com

GitHub | Google Scholar
Brief Bio

I'm a Software Engineer at Hithink RoyalFlush Information Network Co., Ltd. in China. Previously, I was on the research team at Tencent. I did my Ph.D. at the University of Electronic Science and Technology of China (UESTC) in Chengdu, where I was fortunate to be advised by Professor Zenglin Xu. My work focuses on applying language models to a variety of downstream tasks, and I have recently been exploring multi-modality tasks.
News


2024.09 | Two papers are accepted to NeurIPS 2024.
2024.09 | One paper is accepted to EMNLP 2024 Findings.
2024.06 | Three papers are accepted to ACL 2024.
2023.12 | We released our evaluation paper "TencentLLMEval: a hierarchical evaluation of Real-World capabilities for human-aligned LLMs".
2023.12 | We released our reward adaptation paper "Everyone deserves a reward: Learning customized human preferences".
2023.06 | Our paper "SkillNet-X: A Multilingual Multitask Model with Sparsely Activated Skills" is accepted to ICASSP 2024.
2022.12 | Our paper "When Federated Learning Meets Pre-trained Language Models' Parameter-Efficient Tuning Methods" is accepted to Findings of ACL 2023.
2022.10 | Our paper "Leveraging Only the Category Name for Aspect Detection through Prompt-based Constrained Clustering" is accepted to Findings of EMNLP 2022.
2022.08 | We released our technical report "Effidit: Your AI Writing Assistant".
2022.05 | We released our multimodal data processing paper "One model, multiple modalities: A sparsely activated approach for text, sound, image, video and code".
2022.08 | We released our paper "Pretraining Chinese BERT for Detecting Word Insertion and Deletion Errors".
2022.03 | We released our paper "MarkBERT: Marking Word Boundaries Improves Chinese BERT".
2022.03 | Our paper '"Is Whole Word Masking Always Better for Chinese BERT?": Probing on Chinese Grammatical Error Correction' is accepted to Findings of ACL 2022.
2022.03 | Our paper "Exploring and Adapting Chinese GPT to Pinyin Input Method" is accepted to ACL 2022.
2022.01 | Our paper "Graph fusion network for text classification" is accepted to KBS.
2021.09 | Our paper "Unsupervised sentiment analysis by transferring multi-source knowledge" is accepted to CC.
2020.10 | Our paper "Contextualize knowledge bases with transformer for end-to-end task-oriented dialogue systems" is accepted to EMNLP 2021 (Oral).
2020.04 | Our paper "Adversarial training based multi-source unsupervised domain adaptation for sentiment analysis" is accepted to AAAI 2020.
Publications --- after ChatGPT


IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation
Fan Lin*, Shuyi Xie*, Yong Dai*, Wenlin Yao, Tianjiao Lang, Zishan Xu, Zhichao Hu, Xiao Xiao, Yuhong Liu, Yu Zhang
NeurIPS 2024, arXiv:2409.18892
Self-playing Adversarial Language Game Enhances LLM Reasoning
Pengyu Cheng, Tianhao Hu, Han Xu, Zhisong Zhang, Yong Dai, Lei Han, Nan Du
NeurIPS 2024, arXiv:2404.10642
On Diversified Preferences of Large Language Model Alignment
Dun Zeng, Yong Dai*, Pengyu Cheng, Tianhao Hu, Wanshun Chen, Nan Du, Zenglin Xu
EMNLP 2024 Findings, arXiv:2312.07401
TencentLLMEval: a hierarchical evaluation of Real-World capabilities for human-aligned LLMs
Shuyi Xie, Wenlin Yao, Yong Dai, Shaobo Wang, Donlin Zhou, Lifeng Jin, Xinhua Feng, Pengzhi Wei, Yujie Lin, Zhichao Hu, Dong Yu, Zhengyou Zhang, Jing Nie, Yuhong Liu
arXiv:2311.05374
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Yong Dai, Hongming Zhang, Zhenzhong Lan, Dong Yu
ACL 2024, arXiv:2401.13919
Adversarial Preference Optimization
Pengyu Cheng, Yifan Yang, Jian Li, Yong Dai, Nan Du
ACL 2024 Findings, arXiv:2311.08045
Everyone deserves a reward: Learning customized human preferences
Pengyu Cheng, Jiawen Xie, Ke Bai, Yong Dai, Nan Du
arXiv:2309.03126
Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers
Jiawen Xie, Pengyu Cheng, Xiao Liang, Yong Dai, Nan Du
ACL 2024, arXiv:2308.13191
Publications --- pretrain techniques


"Is Whole Word Masking Always Better for Chinese BERT?": Probing on Chinese Grammatical Error Correction
Yong Dai*, Linyang Li*, Cong Zhou*, Zhangyin Feng, Enbo Zhao, Xipeng Qiu, Piji Li, Duyu Tang
Findings of ACL 2022
MarkBERT: Marking Word Boundaries Improves Chinese BERT
Linyang Li, Yong Dai, Duyu Tang, Xipeng Qiu, Zenglin Xu, Shuming Shi
NLPCC 2023
Publications --- applications of PLMs


Effidit: Your AI Writing Assistant
Shuming Shi, Enbo Zhao, Duyu Tang, Yan Wang, Piji Li, Wei Bi, Haiyun Jiang, Guoping Huang, Leyang Cui, Xinting Huang, Cong Zhou, Yong Dai, Dongyang Ma
arXiv:2208.01815
One model, multiple modalities: A sparsely activated approach for text, sound, image, video and code
Yong Dai*, Duyu Tang*, Liangxin Liu*, Minghuan Tan*, Cong Zhou*, Jingquan Wang*, Zhangyin Feng, Fan Zhang, Xueyu Hu, Shuming Shi
Under review
SkillNet-X: A Multilingual Multitask Model with Sparsely Activated Skills
Zhangyin Feng, Yong Dai, Fan Zhang, Duyu Tang, Xiaocheng Feng, Shuangzhi Wu, Bing Qin, Yunbo Cao, Shuming Shi
ICASSP 2024
Emage: Non-Autoregressive Text-to-Image Generation
Zhangyin Feng, Runyi Hu, Liangxin Liu, Fan Zhang, Duyu Tang, Yong Dai, Xiaocheng Feng, Jiwei Li, Bing Qin, Shuming Shi
arXiv:2312.14988
Exploring and Adapting Chinese GPT to Pinyin Input Method
Minghuan Tan*, Yong Dai*, Duyu Tang, Zhangyin Feng, Guoping Huang, Jing Jiang, Jiwei Li, Shuming Shi
ACL 2022
Pretraining Chinese BERT for Detecting Word Insertion and Deletion Errors
Cong Zhou, Yong Dai, Duyu Tang, Enbo Zhao, Zhangyin Feng, Li Kuang, Shuming Shi
arXiv:2204.12052
When Federated Learning Meets Pre-trained Language Models' Parameter-Efficient Tuning Methods
Zhuo Zhang, Yuanhang Yang, Yong Dai, Lizhen Qu, Zenglin Xu
Findings of ACL 2023
SkillNet-NLU: A sparsely activated model for general-purpose natural language understanding
Fan Zhang, Duyu Tang, Yong Dai, Cong Zhou, Shuangzhi Wu, Shuming Shi
arXiv:2203.03312
Contextualize knowledge bases with transformer for end-to-end task-oriented dialogue systems
Yanjie Gou, Yinjie Lei, Lingqiao Liu, Yong Dai, Chunxu Shen
EMNLP 2021 (Oral)
Publications --- UDA and Graph learning


Adversarial training based multi-source unsupervised domain adaptation for sentiment analysis
Yong Dai, Jian Liu, Xiancong Ren, Zenglin Xu
AAAI 2020
Unsupervised sentiment analysis by transferring multi-source knowledge
Yong Dai, Jian Liu, Jian Zhang, Hongguang Fu, Zenglin Xu
Cognitive Computation
Graph fusion network for text classification
Yong Dai, Linjun Shou, Ming Gong, Xiaolin Xia, Zhao Kang, Zenglin Xu, Daxin Jiang
Knowledge-Based Systems
Experience


05/21 - Now: Research Intern and Researcher, Tencent AI Lab
11/20 - 04/21: Visiting student, Westlake University
09/19 - 10/20: Research Intern, Microsoft STCA NLP Group
09/18 - 08/19: Project leader, cooperation project with Nuance
Reviewer Services


ACL: 2022, 2023, 2024
EMNLP: 2022, 2023, 2024
COLING: 2022, 2024
AAAI: 2022, 2023, 2024
IJCAI: 2023
ECAI: 2023
ACM MM: 2023, 2024
TASLP
Knowledge-Based Systems
Neurocomputing
Neural Networks

Last updated: 2023-02-06