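Column: 大数据应用
Data Application Lab was named a 2016 Top Data Camp in North America. It is the most professional one-stop data science consulting organization and your expert adviser for data science job hunting.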

Daily Practice | Data Scientist & Business Analyst & LeetCode Interview Questions 727

大数据应用 · WeChat Official Account · Big Data · 2019-10-25 09:13






Oct. 24

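Since June 15, 2017, Data Application Lab has shared and discussed with you one common interview question from data science (DS) and business analytics (BA) every day.

Since October 4, 2017, we have also shared one LeetCode algorithm problem every day.

If you are actively looking for work in these fields, we hope you will follow our questions each day and think them through with us. We post the answers the following day.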

Day 627

DS Interview Question

How does the KNN algorithm work?

BA Interview Question

Customers Who Never Order


Suppose that a website contains two tables, the Customers table and the Orders table. Write a SQL query to find all customers who never order anything.

Table: Customers.

+----+-------+
| Id | Name  |
+----+-------+
| 1  | Joe   |
| 2  | Henry |
| 3  | Sam   |
| 4  | Max   |
+----+-------+

Table: Orders.

+----+------------+
| Id | CustomerId |
+----+------------+
| 1  | 3          |
| 2  | 1          |
+----+------------+


Using the above tables as an example, return the following:

+-----------+
| Customers |
+-----------+
| Henry     |
| Max       |
+-----------+
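
To try a query against the sample data before the answer is posted, the two tables above can be loaded into an in-memory SQLite database. The following is a minimal sketch in Python; the helper names `build_sample_db` and `run_query` are hypothetical, and SQLite's SQL dialect may differ slightly from the database used in an actual interview.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory SQLite database holding the two sample tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE Orders (Id INTEGER PRIMARY KEY, CustomerId INTEGER);
        INSERT INTO Customers (Id, Name)
            VALUES (1, 'Joe'), (2, 'Henry'), (3, 'Sam'), (4, 'Max');
        INSERT INTO Orders (Id, CustomerId)
            VALUES (1, 3), (2, 1);
    """)
    return conn


def run_query(conn, sql):
    """Run a candidate query and return the first column of each row, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


# The Customers column from the expected result shown above.
EXPECTED = ["Henry", "Max"]
```

A candidate query passes on this sample when `run_query(build_sample_db(), your_sql) == EXPECTED`, where `your_sql` is the query string you want to check.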

LeetCode Question

Add Binary


Description:

Given two binary strings, return their sum (also a binary string).

Input: a = "11", b = "1"

Output: "100"
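
A hand-written solution can be checked against Python's built-in base-2 conversion. The sketch below is only a reference oracle for testing, not the intended interview answer (which would normally add the digits with an explicit carry); the function name `add_binary_reference` is hypothetical.

```python
def add_binary_reference(a: str, b: str) -> str:
    """Add two binary strings by converting through Python integers."""
    # int(s, 2) parses a base-2 string; format(n, "b") renders n in binary
    # without the "0b" prefix.
    return format(int(a, 2) + int(b, 2), "b")


print(add_binary_reference("11", "1"))  # prints 100
```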

Day 626

Answers Revealed

DS Interview Question & Answer

What are methods to make a predictive model robust to outliers?

  • Use a model that is resistant to outliers. Tree-based models are less affected by outliers than linear regression models. For statistical tests, choose a non-parametric test instead of a parametric one.

  • Use a more robust error metric. For instance, use mean absolute error instead of mean squared error to reduce the effect of outliers.

  • Winsorize the data. Cap the data at a chosen threshold.

  • Transform the data. If the data has a pronounced right tail, use a log transform.

  • Remove the outliers. This is appropriate if there are very few outliers and you are certain that they are anomalies not worth predicting.

Reference: https://www.quora.com/What-are-methods-to-make-a-predictive-model-more-robust-to-outliers
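
Three of the methods above (a robust error metric, winsorizing, and a log transform) can be illustrated with a few lines of standard-library Python. This is a minimal sketch: the sample values, the constant prediction, and the capping thresholds of 10 and 15 are all assumptions made up for illustration, not data from the reference.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: each error is squared, so outliers dominate."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: each error contributes only linearly."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def winsorize(values, lower, upper):
    """Cap every value to the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def log_transform(values):
    """Apply log(1 + x) to compress a long right tail (requires x > -1)."""
    return [math.log1p(v) for v in values]


y_true = [10.0, 12.0, 11.0, 13.0, 100.0]  # the last point is an outlier
y_pred = [11.0, 11.0, 11.0, 11.0, 11.0]   # a constant prediction

print(mse(y_true, y_pred))            # driven almost entirely by the outlier
print(mae(y_true, y_pred))            # far less sensitive to the outlier
print(winsorize(y_true, 10.0, 15.0))  # the outlier is capped at 15.0
print(log_transform(y_true))          # the gap to the outlier shrinks
```

In practice the winsorizing thresholds are usually taken from percentiles of the training data rather than fixed by hand.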

BA Interview Question & Answer

Duplicate Emails


Write a SQL query to find all duplicate emails in a table named Person.

+----+---------+
| Id | Email   |
+----+---------+
| 1  | a@b.com |
| 2  | c@d.com |
| 3  | a@b.com |
+----+---------+

For example, your query should return the following for the above table:

+---------+
| Email   |
+---------+
| a@b.com |
+---------+


Note: All emails are in lowercase.


Answer:

select Email
from Person
group by Email
having count(Email) > 1;

Reference:

https://leetcode.com/problems/duplicate-emails/description/
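
The answer can be checked locally by running it against a small Person table in an in-memory SQLite database. This is a minimal sketch in Python; the email addresses are placeholder values assumed for illustration, with rows 1 and 3 sharing the same address.

```python
import sqlite3

# Build a throwaway Person table with one duplicated email address.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Email TEXT)")
conn.executemany(
    "INSERT INTO Person (Id, Email) VALUES (?, ?)",
    [(1, "a@b.com"), (2, "c@d.com"), (3, "a@b.com")],
)

# Group rows by email and keep only the groups with more than one row.
duplicates = [row[0] for row in conn.execute(
    "select Email from Person group by Email having count(Email) > 1"
)]
print(duplicates)  # prints ['a@b.com']
```

Grouping collapses identical addresses into one row each, so every duplicated email is reported once no matter how many times it repeats.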






