Competition-solutions

Natural language processing

Large language models (LLM)

kaggle-LLM Science Exam

★★★★★
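Questions draw on Wikipedia knowledge: given a prompt (the question) and five options, predict the correct answer.
The key is RAG optimization. Mainstream approaches answer with either an LLM or a DeBERTa classifier.
Retrieve articles using their title and first sentence, then retrieve relevant sentences from those articles,
or retrieve whole paragraphs directly (separated by '\n').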
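
The two-stage retrieval described above (articles first, then sentences) can be sketched as follows. This is a minimal illustration, not any team's actual pipeline: the TF-IDF cosine scorer stands in for whatever retriever a solution used, and the names `rank`, `retrieve_context`, and the toy `articles` data are hypothetical. Article texts are assumed to be plain English.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, texts, k):
    """Return indices of the k texts most similar to the query (TF-IDF cosine)."""
    counts = [Counter(tokenize(t)) for t in texts]
    n = len(texts)
    doc_freq = Counter(tok for c in counts for tok in c)

    def weigh(c):
        # Smoothed IDF; tokens unseen in the corpus get the largest weight.
        return {tok: tf * (math.log((1 + n) / (1 + doc_freq[tok])) + 1.0)
                for tok, tf in c.items()}

    q = weigh(Counter(tokenize(query)))
    scores = [cosine(q, weigh(c)) for c in counts]
    return sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]


def retrieve_context(question, articles, n_articles=2, n_sentences=2):
    # Stage 1: match the question against title + first sentence only.
    keys = []
    for art in articles:
        sentences = split_sentences(art["text"])
        keys.append(art["title"] + " " + (sentences[0] if sentences else ""))
    top_articles = rank(question, keys, n_articles)
    # Stage 2: match the question against every sentence of the kept articles.
    pool = [s for i in top_articles for s in split_sentences(articles[i]["text"])]
    return [pool[i] for i in rank(question, pool, n_sentences)]


articles = [
    {"title": "Photosynthesis",
     "text": "Photosynthesis converts light into chemical energy. "
             "Chlorophyll is the pigment that absorbs light. Plants release oxygen."},
    {"title": "Mitosis",
     "text": "Mitosis is a type of cell division. Chromosomes separate into two nuclei."},
    {"title": "Black hole",
     "text": "A black hole is a region of spacetime with strong gravity. "
             "Nothing can escape its event horizon."},
]
question = "Which pigment absorbs light during photosynthesis?"
context = retrieve_context(question, articles, n_articles=1, n_sentences=1)
print(context)
```

The retrieved sentences would then be placed in the prompt (or the classifier input) alongside the question and its five options.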

kaggle-LLM-Detect AI Generated Text

★★★★☆

1st code

LMSYS - Chatbot Arena Human Preference Predictions

★★★★★ Input format: prompt+res_a+res_b, or prompt+res_a+prompt+res_b
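
A minimal sketch of the two input layouts named above. The bracketed markers (`[PROMPT]`, `[RESPONSE_A]`, `[RESPONSE_B]`), the separator, and both function names are assumptions made for illustration; the source does not say which delimiters the top solutions used.

```python
def format_single_prompt(prompt, res_a, res_b, sep="\n"):
    """Layout 1: prompt + res_a + res_b (the prompt appears once)."""
    return sep.join([
        f"[PROMPT] {prompt}",
        f"[RESPONSE_A] {res_a}",
        f"[RESPONSE_B] {res_b}",
    ])


def format_repeated_prompt(prompt, res_a, res_b, sep="\n"):
    """Layout 2: prompt + res_a + prompt + res_b (the prompt precedes each response)."""
    return sep.join([
        f"[PROMPT] {prompt}",
        f"[RESPONSE_A] {res_a}",
        f"[PROMPT] {prompt}",
        f"[RESPONSE_B] {res_b}",
    ])


single = format_single_prompt("What is 2+2?", "4", "5")
repeated = format_repeated_prompt("What is 2+2?", "4", "5")
print(single)
print(repeated)
```

The second layout costs more tokens because the prompt is duplicated, but it keeps each response adjacent to the prompt it answers.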

1st code
2nd code
- https://optimi.benjaminwarner.dev/kahan_summation/
3rd discussion
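
The link under the 2nd-place entry concerns Kahan summation. As a generic illustration of the algorithm only (not the linked library's implementation, and in float64 rather than a low-precision training dtype), compensated summation tracks the rounding error of each addition and feeds it back into the next one:

```python
def naive_sum(values):
    """Plain left-to-right accumulation; small addends can be rounded away."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Kahan (compensated) summation."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for v in values:
        y = v - compensation            # re-inject the previously lost part
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was just lost
        total = t
    return total


values = [1.0] + [1e-16] * 100_000
naive = naive_sum(values)
kahan = kahan_sum(values)
print(naive, kahan)
```

Each `1e-16` is below half the spacing between floats near 1.0, so the naive loop never moves away from 1.0, while the compensated loop accumulates the small terms.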

WSDM 2024 Conversational Multi-Doc QA

1st code

KDD-Meta_RAG

Question answering across multiple domains: "finance", "music", "movie", "sports", "open"
reports

code
code
- trained Sequence Classifiers as routers for Domain and Dynamism based on BGE-M3
code
report
- Three stages: (1) Question Answering With LLM Parameterized Knowledge (2) Question Answering With External Sources (3) Final Answer Selection.

KDD-Amazon

code

kaggle-AI Mathematical Olympiad

Tianchi-SMP 2023 ChatGLM Financial LLM Challenge

★★★★★
Covers PDF parsing, SQL generation, intent recognition, RAG retrieval, and prompt-based QA

all top code

2023 Global Intelligent Vehicle AI Challenge

★★★☆☆
PDF parsing, RAG, LLM prompt-based QA

4th code
6th discussion
13th code

WSDM-kaggle

2nd code

ATEC2023 Tech Elite Competition: Knowledge Injection for Large Models

7th code

LLM Fine-Tuning Data Competition

2nd code

CCF AIOps 2024

1st discussion
2nd code

Kaggle-ARC

video

text classification / regression

Feedback Prize - English Language Learning

kaggle-CommonLit Readability Prize

kaggle2022-Jigsaw Rate Severity of Toxic Comments

1st code
14th code discussion

kaggle2020-Jigsaw Multilingual Toxic Comment Classification

kaggle2019-Jigsaw Unintended Bias in Toxicity Classification

3rd code
4th code
7th code
10th code
18th code

kaggle2018-Toxic Comment Classification Challenge

19th code
34th code

CCF-BDCI 2019 Internet News Sentiment Analysis

code

kaggle-Google QUEST Q&A Labeling

1st code discussion

named entity recognition

kaggle-NBME

baseline train inference
baseline train inference

kaggle-feedback

★★★★★

1st code discussion
2nd code discussion
3rd code discussion
4th code discussion
5th code discussion
9th code discussion
11th code discussion
baseline
- https://github.com/abhishekkrthakur/long-text-token-classification
- https://www.kaggle.com/code/cdeotte/pytorch-bigbird-ner-cv-0-615
- https://www.kaggle.com/code/cdeotte/tensorflow-longformer-ner-cv-0-633

AI Technology Innovation Competition: Product Title Entity Recognition

Summary

CCKS2019-Entity Linking for Chinese Short Text

1st code
unk code

Tianchi Traditional Chinese Medicine Instruction Leaflet Entity Recognition Challenge

1st code
4th code

Daguan Cup

CCF Negative Financial Information and Entity Determination

https://github.com/xiong666/ccf_financial_negative
https://github.com/Chevalier1024/CCF-BDCI-ABSA
https://github.com/rebornZH/2019-CCF-BDCI-NLP

text matching

Label propagation / Siamese RNN

kaggle-U.S. Patent Phrase to Phrase Matching

kaggle-TensorFlow 2.0 Question Answering

7th code
collection discussion

CCF-Real Estate Industry Chat QA Matching

1st code discussion
24th code
unk code

Tianchi-COVID-19 Similar Sentence Pair Judgment Competition

1st code
4th code
6th code discussion
8th code

Kaggle: Quora Question Pairs

1st code discussion
4th code
128th code
http://www.wuyuanhao.com/2019/02/25/quora-insincere-questions-classification%e6%80%bb%e7%bb%93/

CCF-Relevance Between Technology Requirements and Technology Achievement Projects

1st code

Tianchi-Xiaobu Assistant Dialogue Short-Text Semantic Matching

1st code
4th code

Sohu-Text Matching

2nd code
https://zhuanlan.zhihu.com/p/533808475

COVID-19 Similar Sentence Pair Judgment Competition

https://github.com/zzy99/epidemic-sentence-pair
https://github.com/daniellibin/nCoV-2019-sentence-similarity
https://mp.weixin.qq.com/s/B267GHm16ZIlKkxhOJmMqg
https://github.com/Rowchen/pytorch-for-Text-Matching

Information retrieval

kaggle-Learning Equality - Curriculum Recommendations

★★★★★
A multilingual text matching problem between content items and topics
Recall: TF-IDF + transformer with ArcFace + rules
Ranking: LightGBM (LGB)
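
One common way to connect several recall sources to a ranker is to take the union of their candidates and record, per candidate, the rank each source gave it. The sketch below shows only that merging step, under assumptions: the source names, the `missing_rank` sentinel, and the function `merge_candidates` are hypothetical, and the resulting feature rows would still need to be passed to a ranker such as LightGBM, which is not shown.

```python
def merge_candidates(recalled, missing_rank=-1):
    """Union candidates from several recall sources.

    recalled: {source_name: [content ids, best first]}
    returns:  {content_id: {source_name: rank within that source, or missing_rank}}
    """
    sources = sorted(recalled)
    features = {}
    for source in sources:
        for position, content_id in enumerate(recalled[source]):
            row = features.setdefault(
                content_id, {s: missing_rank for s in sources}
            )
            row[source] = position
    return features


# Toy recall lists for a single topic (ids are made up).
recalled = {
    "tfidf": ["c1", "c2"],
    "embedding": ["c2", "c3"],
    "rule": ["c1"],
}
candidate_features = merge_candidates(recalled)
for content_id, row in sorted(candidate_features.items()):
    print(content_id, row)
```

Keeping one feature per source lets the ranker learn how much to trust each recall channel instead of relying on a hand-tuned blend.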

EEDI

1st [code](https://github.com/rbiswasfc/eedi-mining-misconceptions)
10th [code](https://github.com/shyoulala/Kaggle_Eedi_2024_sayoulala)

Tianchi-Wentian Engine E-commerce Search Algorithm Competition

★★★☆☆
Chinese text matching in an e-commerce setting. Few open-source solutions are available.

2nd code discussion
13th code

★★★★☆

1st code
2nd code
3rd code
4th code
8th code
9th code

kaggle-AI4code

★★★★★
Given the order of a notebook's code cells, predict the order of its markdown cells

1st code
code
https://github.com/louis-she/ai4code
https://www.kaggle.com/code/nickuzmenkov/ai4code-tensorflow-distilbert-baseline

WSDM2023 Pre-training for Web Search

1st code

KDD Cup 2024-AQA

1st code
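- Done separately across multiple folds. Hard negatives are selected over several iterations, and the final output is the top 20. The initial embedding model recalls 1000/200; the 200 serve as hard negatives for fine-tuning the embedding model, which then recalls 100; those 100 serve as negatives for the ranker, which selects the final 20.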
3rd code
code
7th code
8th code
9th code
code
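
The hard-negative selection described in the 1st-place note above can be sketched as follows. This is a minimal illustration under assumptions: `mine_hard_negatives` is a hypothetical helper, the ids are toy data, and the actual retriever, fine-tuning, and ranking steps are not shown. The idea is that the highest-ranked candidates that are not labeled positives are the most confusing ones, so they make useful negatives for the next training stage.

```python
def mine_hard_negatives(ranked_ids, positive_ids, k):
    """Keep the k highest-ranked candidates that are not labeled positives."""
    positives = set(positive_ids)
    return [doc_id for doc_id in ranked_ids if doc_id not in positives][:k]


# Toy retrieval result for one query, best match first.
ranked = ["p3", "p1", "p7", "p2", "p9", "p4"]
gold = ["p1", "p9"]

# In the described funnel, k would be 200 for embedding fine-tuning
# and 100 for the ranker; small values are used here for readability.
embedding_negatives = mine_hard_negatives(ranked, gold, k=2)
ranker_negatives = mine_hard_negatives(ranked, gold, k=10)
print(embedding_negatives)
print(ranker_negatives)
```

Repeating this step after each round of fine-tuning yields progressively harder negatives, since the refreshed retriever ranks different wrong candidates near the top.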

MRC / information extraction

kaggle-Tweet Sentiment Extraction

1st code discussion
2nd discussion
3rd code
7th code
base code code

Tianchi-Ruijin Hospital MMC AI-Assisted Knowledge Graph Construction

Baidu AI Studio Event Extraction Competition

12th code discussion

CCKS & Baidu 2019 Entity Linking for Chinese Short Text

2019 Zhijiang Cup AI Competition: E-commerce Review Opinion Mining Track

https://github.com/eguilg/OpinioNet
https://github.com/srtianxia/opinion_mining

iFLYTEK 2020 Event Extraction Challenge

https://github.com/WuHuRestaurant/xf_event_extraction2020Top1
https://github.com/xiaoqian19940510/Event-Extraction
https://lonepatient.top/2022/07/12/gaiic_2022_ner_top10.html
https://github.com/luhua-rain/MRC_Competition_Dureader
https://github.com/YingZiqiang/LES-MMRC-Summary
https://github.com/basketballandlearn/MRC_Competition_Dureader

Question answering (QA)

CCKS2019 CKBQA

4th code
9th code

1st code
4th code
5th discussion
36th discussion

Traditional Chinese Medicine Literature Question Generation Challenge

https://tianchi.aliyun.com/forum/postDetail?spm=5176.12586969.1002.3.2db024ddZShYhb&postId=10854

https://github.com/Dikea/Dialog-System-with-Task-Retrieval-and-Seq2seq
https://github.com/xueyouluo/S2S-in-Production

Other

Kaggle sign language recognition

2nd discussion

