Column: 机器学习研究会 (Machine Learning Research Society)

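机器学习研究会 (Machine Learning Research Society) is a student organization under Peking University's Center for Big Data and Machine Learning Innovation. It aims to build a platform for exchange among machine learning practitioners. Besides sharing timely news from the field, the society hosts talks by industry leaders and top academics, salon-style sharing sessions with senior researchers, real-data innovation competitions, and other events.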

[Recommended] (R/Python) A Practical Guide to the t-SNE Clustering Algorithm

机器学习研究会 · WeChat official account · AI · 2017-01-23 20:11

Main text

Summary

Reposted from: 爱可可-爱生活

Introduction

Imagine you are handed a dataset with hundreds of features (variables) and have little understanding of the domain the data belongs to. You are expected to explore and analyze the dataset and identify hidden patterns in it. And not just that: you also have to decide whether any pattern you find is signal or just noise.

Does that thought make you uncomfortable? It made my hands sweat when I came across this situation for the first time. Do you wonder how to explore a multidimensional dataset? It is one of the questions data scientists ask most frequently. In this article, I will take you through a very powerful way to do exactly this.

What about PCA?

By now, some of you will be screaming “I’ll use PCA for dimensionality reduction and visualization”. Well, you are right! PCA is definitely a good choice for dimensionality reduction and visualization on datasets with a large number of features. But what if you could use something more advanced than PCA? (If you don’t know PCA, I strongly recommend reading this article first.)

What if you could easily search for non-linear patterns? In this article, I will tell you about a newer algorithm called t-SNE (2008), which is often much more effective than PCA (1933) at revealing such structure. I will first take you through the basics of the t-SNE algorithm and then explain why t-SNE is a good fit for dimensionality reduction.
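As a rough illustration of that contrast, here is a minimal sketch in Python. It assumes scikit-learn and NumPy are installed. The digits dataset, the 500-sample subset, and the perplexity value are illustrative assumptions, not choices taken from the walkthrough that follows.

```python
# Illustrative sketch only: dataset, sample size, and parameters are assumptions.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# 64-dimensional images of handwritten digits; keep a subset so t-SNE runs quickly.
digits = load_digits()
X = digits.data[:500]
y = digits.target[:500]

# PCA: a linear projection onto the two directions of greatest variance.
X_pca = PCA(n_components=2).fit_transform(X)

# t-SNE: a non-linear embedding that tries to keep neighbouring points close together.
X_tsne = TSNE(n_components=2, perplexity=30.0, init="pca", random_state=0).fit_transform(X)

# Both results are 2-D coordinates, one row per sample, ready for a scatter plot.
print(X_pca.shape, X_tsne.shape)
```

Plotting each pair of columns, colored by `y`, is one way to see how well each method separates the digit classes.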

You will also get hands-on experience using t-SNE in both R and Python.

Read on!

Table of Contents

What is t-SNE?
What is dimensionality reduction?
How does t-SNE fit in the dimensionality reduction algorithm space?
Algorithmic details of t-SNE
    Algorithm
    Time and Space Complexity
What does t-SNE actually do?
Use cases
t-SNE compared to other dimensionality reduction algorithms
Example Implementations
    In R
        Hyperparameter tuning
        Code
        Implementation Time
    In Python
        Hyperparameter tuning
        Code
        Implementation Time
        Interpreting Results
Where and when to use
    Data Scientist
    Machine Learning Competition Enthusiast
    Student
Common fallacies
