
Advanced Vocabulary and Sentence Patterns Commonly Used in Top CV/Robotics Conference Papers, All in One Article

极市平台 · WeChat official account · Tech startups / tech media · 2024-11-18 22:00

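Summary of main points

This article collects advanced words and sentence patterns that frequently appear in top-conference papers in CV/Robotics. It covers words such as "leverage", "heterogeneous", and "reason", and sentence patterns such as "This end-to-end philosophy has led to significant advances for...". The aim is to help readers write better papers and express their content more clearly.

Key points

Key point 1: Advanced vocabulary

The article lists words common in CV/Robotics papers, such as "leverage", "heterogeneous", and "adverse", and gives a source and a usage example for each.

Key point 2: Sentence patterns

The article presents sentence patterns common in CV papers, such as expressing contrast or stressing the importance of an experimental finding, with example sentences and their sources.

Key point 3: Purpose and value of the article

The article is meant to help readers learn the words and sentence patterns commonly used in CV papers, so that they can write and express their work more effectively.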

Main text


Author | 叶小飞 @ Zhihu (reposted with permission)

Source | https://zhuanlan.zhihu.com/p/415926905

Editor | 极市平台

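Editor's introduction

Are there a few tricks to writing papers as well? This article collects as many as possible of the more advanced words and handy sentence patterns that commonly appear in top-conference papers in CV/Robotics.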

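Anyone new to academia has probably run into this frustration when writing papers: the papers of established researchers read smoothly and elegantly and leave you fired up, but when you sit down to write, the words never quite say what you mean, the sentences come out disorganized, and you keep recycling the same handful of words, as if possessed by 飞哥 (I am 小飞哥, please don't mix us up).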

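I had the same problem when I started my PhD last year; writing used to leave me so stuck that I was scratching my head and pulling my hair out. I later found that reading alone is not enough: when you come across a good word or sentence pattern, write it down, then consult those notes when you do not know how to phrase something in your own paper. This worked very well for me. Before my PhD, the best my first- or second-author papers managed was a workshop or a weaker venue such as ICPR; after adopting this habit, writing suddenly came much more easily, and I went on to have papers accepted at higher-quality conferences and journals such as ITSC, IOTJ, ICRA, and ECCV. So in this article I collect as many as I can of the more advanced words and handy sentence patterns that appear in top CV/Robotics papers, and record the source and original usage of each.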

Advanced CV vocabulary:

leverage: to use, to exploit; extremely common. Example: By leveraging the Vehicle-to-Vehicle (V2V) communication technology, different Connected Automated Vehicles (CAVs) can share their sensing information and thus provide multiple viewpoints for the same obstacle to compensate each other [1].

heterogeneous: diverse, of different kinds; often used in the context of graphs. Example: V2X-ViT is composed of alternating layers of heterogeneous multi-agent self-attention and multi-scale window self-attention, capturing inter-agent interaction and per-agent spatial relationships [2].

adverse: unfavorable. Example: Our experiments show that multi-camera configurations are critical in overcoming adverse conditions in large-scale outdoor scenes [3].

streamline: to simplify, to make leaner. Example: We streamline the training pipeline by viewing object detection as a direct set prediction problem [4].

reason: often used for inferring relations. Example: DETR reasons about the relations of the objects and the global image context to directly output the final set of predictions in parallel [4].

bypass: to skip, to circumvent.

invariant: unchanged (under some transformation). Examples: a point cloud is just a set of points and therefore invariant to permutations of its members (PointNet). Since the transformer architecture is permutation-invariant, ... [4].

omit: to leave out.

auxiliary loss: a supplementary (helper) loss term.

philosophy: guiding idea, principle. Example: This end-to-end philosophy has led to significant advances for ... [4].

fidelity: degree of realism; common in simulation. Example: However, this can be a complex and costly endeavor that requires constructing realistic virtual worlds and developing high-fidelity sensor simulators [5].

full autonomy stack: the full self-driving software stack, comprising perception, localization, planning, and control [5].

gauge: to measure; downstream task/experiment: a downstream task in autonomous driving, commonly motion planning and control. Example: To gauge the usefulness of using our simulations to test motion planning, we conduct downstream experiments with two motion planners [5].

sparse sensor measurements: sparse sensor data; often appears in discussions of detecting occluded objects [1][2].

design ethos: design philosophy. Example: The design ethos of DETR easily extend to more complex tasks [4].

ingredients: essential elements. Example: Two ingredients are essential for direct set predictions in detection [4].

reliance: dependence. Example: We show that this reliance on CNNs is not necessary and a pure transformer applied directly to sequences of image patches can perform very well on image classification tasks [6].

fashion: manner, style. Example: We train the model on image classification in supervised fashion [6].

trump: to beat, to outweigh. Example: We find that large scale training trumps inductive bias [6].

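inductive biases: assumptions built into a model. They can be understood as conclusions drawn from observing real-world phenomena and then imposed on the model as constraints. For example, a CNN preserves spatial relationships: we observe that the positional relations between pixels in an image matter, so CNNs are designed accordingly. Example: Transformers lack some of the inductive biases inherent (another common word, meaning an intrinsic property) to CNNs, such as translation equivariance and locality [6].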

alternating: interleaved, taking turns; often used when several kinds of modules recur in turn within a model. Example: The Transformer encoder consists of alternating layers of multiheaded self-attention and MLP blocks [6].

representation learning capabilities: a model's ability to learn robust features; appears frequently.

appetite: demand, hunger. Example: This appetite for data has been successfully addressed in natural language processing (NLP) by self-supervised pretraining [7].

harness: to bring under control and put to use; often used when a model or method has not yet been fully exploited. Example: However, its full power remains to be harnessed through the advent of new smart technologies, such as self-driving vehicles [8].

indispensable: essential, impossible to do without. Example: Detecting objects such as cars and pedestrians in 3D plays an indispensable role in autonomous driving [9].

alleviate: to mitigate, to reduce. Example: To alleviate the local misalignment, we use a 2D-3D bounding box consistency constraint [10].

account for: to make up (a share of). Example: Consequently, errors in depth estimation account for the major part of the gap between pseudo-lidar and lidar-based detectors [10].

consensus: agreement. Example: Hence, we propose a novel neural reasoning (this word appears again) framework that learns to communicate, to estimate potential errors, and finally, to reach a consensus about those errors [11].

surrogate: stand-in, proxy; common in adversarial attacks. Example: In this setting, the attacker trains a surrogate model that imitates the target model [12].

degrade: to lower, to harm; closely associated with "performance". Common alternatives are damage, decrease, and drop. Example: In addition, we also aim to minimize the intersection-over-union (IoU) of the bounding box proposals to further degrade performance by producing poorly localized objects [12].

hallucinating/hallucination: literally, imagining things. In CV it is often used for amodal detection, where the input does not directly contain an object but the network can infer that the object is there. Example: We dub this problem amodal scene layout estimation, which involves hallucinating scene layout for even parts of the world that are occluded in the image [13].

de facto: actual, in practice. The phrase is Latin, but many papers use it, and it should be set in italics. Example: We thus argue that MetaFormer is our de facto need for vision models that deserved more future research [14].

empirically: by experiment, in practice; usually signals that a claim is supported by experiments. Example: we empirically demonstrate that the success of the transformer model is largely attributed to the MetaFormer architecture [14].

rudimentary: basic, elementary. Example: Currently, rudimentary resizing methods such as nearest neighbor, bilinear, and bicubic are among the top adopted image resizers visual recognition systems [16].

off-the-shelf: ready-made. Example: The proposed resizer, therefore, can be an alternative to the off-the-shelf resizers to effectively reduce the expected drop in the recognition performance [16].

Common CV sentence patterns:

To say that xxx models are popular: xxx have become the model of choice in some area.

To address a problem: To ameliorate these issues

To state a goal: Towards the goal of xxx

To exploit advantages: leveraged the advantages of

To express contrast: whereas

To say that some problems have been addressed while others remain overlooked: While much of the research into xxx (the area in question) have focused on (the problems already addressed), an equally important yet underexplored problem is how to (the problem not yet solved). Example: While much of the research into microscopic traffic simulation have focused on modeling actors’ behaviors, an equally important yet underexplored problem is how to generate realistic snapshots of traffic scenes. (Source: SceneGen: Learning to Generate Realistic Traffic Scenes, CVPR)

To say that experiments demonstrate the importance of an idea: Our experiments emphasize / demonstrate the necessity / importance of xxx [3].

To place work within the setting of xxx: in the context of. Example: In this paper, we consider the collaborative perception in the context of a heterogeneous multi-agent system [15].

To say that some element plays a key role: xxx play a critical role / the success of something is attributed to xxx

References

[1] OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication (ICRA 2022)

[2] V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer (ECCV 2022)

[3] Asynchronous Multi-View SLAM (ICRA 2021)

[4] DETR: End-to-End Object Detection with Transformers (ECCV 2020)

[5] Testing the Safety of Self-driving Vehicles by Simulating Perception and Prediction (ECCV 2020)

[6] An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale (ICLR 2021)

[7] Masked Autoencoders Are Scalable Vision Learners (CVPR 2022)

[8] Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks (ICRA 2019)

[9] Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving (ICLR 2020)

[10] Is Pseudo-Lidar needed for Monocular 3D Object detection? (ICCV 2021)

[11] Learning to Communicate and Correct Pose Errors (CoRL 2020)

[12] Adversarial Attacks On Multi-Agent Communication (ICCV 2021)

[13] MonoLayout: Amodal scene layout from a single image (WACV 2020)

[14] MetaFormer Is Actually What You Need for Vision (CVPR 2022)

[15] Learning Distilled Collaboration Graph for Multi-Agent Perception (NeurIPS 2021)

[16] Learning to Resize Images for Computer Vision Tasks (ICCV 2021)
