Online model serving framework (Augur): Augur supports declarative feature definitions, configuration-driven loading and unloading of models and features, and online inference for mainstream linear models as well as TensorFlow models; on top of Augur it is straightforward to build a full-featured, stateless, distributed model serving layer. To make it easy to feed BERT signals into the ranking model, the Augur team built a Model Stacking capability that cleanly supports BERT as Feature: the upstream model's score is treated as just another feature, and only a feature configuration on the Augur model configuration platform is required, which greatly improves the iteration efficiency of model features. A minimal sketch of this pattern is given below.
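To make the stacking idea concrete, here is a minimal sketch of how an upstream BERT score can be exposed as an ordinary feature of a downstream ranking model. This is not Augur's actual API; the configuration schema, the `Predictor` class, and the `stacked_predict` helper are all hypothetical illustrations of the pattern, under the assumption that model scoring is driven purely by configuration.

```python
# Sketch only: hypothetical configuration and predictor interface illustrating
# "Model Stacking" / BERT as Feature. None of these names come from Augur itself.
from typing import Dict

# Hypothetical feature configuration: the upstream BERT model's score is
# declared as just another feature of the downstream ranking model.
FEATURE_CONFIG = {
    "upstream_models": {
        "bert_relevance": {
            "inputs": ["query", "title"],
            "output_feature": "bert_score",
        }
    },
    "ranking_model": {
        "name": "ltr_main",
        "features": ["ctr_7d", "distance", "bert_score"],  # stacked feature included
    },
}


class Predictor:
    """Placeholder for a model client; a real service would call TF Serving or similar."""

    def __init__(self, name: str):
        self.name = name

    def predict(self, features: Dict[str, float]) -> float:
        # Dummy scoring logic for illustration only.
        return sum(features.values()) / (len(features) or 1)


def stacked_predict(raw: Dict[str, float], query: str, title: str) -> float:
    """Run the upstream BERT model first, inject its score as a feature,
    then score the item with the downstream ranking model."""
    bert_model = Predictor("bert_relevance")
    ranking_model = Predictor("ltr_main")

    # 1. The upstream model's score becomes an ordinary feature value.
    features = dict(raw)
    features["bert_score"] = bert_model.predict(
        {"q_len": float(len(query)), "t_len": float(len(title))}
    )

    # 2. The downstream ranking model consumes it like any other feature,
    #    selected according to the configuration above.
    wanted = FEATURE_CONFIG["ranking_model"]["features"]
    return ranking_model.predict({k: features[k] for k in wanted if k in features})


if __name__ == "__main__":
    print(stacked_predict({"ctr_7d": 0.12, "distance": 0.8}, "hotpot", "Lao Wang Hotpot"))
```

The point of the sketch is the data flow, not the scoring logic: because the stacked score enters the pipeline as a named feature, swapping or adding an upstream model is a configuration change rather than a code change.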