
Daily Practice | Data Scientist & Business Analyst & Leetcode Interview Questions 724

大数据应用 · WeChat Official Account · Big Data · 2019-10-22 09:35





Oct. 21

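Since June 15, 2017, Data Application Lab has shared and discussed one common interview question from data science (DS) and business analytics (BA) every day.

Since October 4, 2017, we have also shared one Leetcode algorithm question every day.

If you are actively looking for a job in these fields, we hope you will follow our questions each day and think them through with us. We publish the answers the following day.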

Day 624

DS Interview Question

How does a tree decide where to split?

BA Interview Question

Consecutive Numbers


Write a SQL query to find all numbers that appear at least three times consecutively.
+----+-----+
| Id | Num |
+----+-----+
| 1  |  1  |
| 2  |  1  |
| 3  |  1  |
| 4  |  2  |
| 5  |  1  |
| 6  |  2  |
| 7  |  2  |
+----+-----+


For example, given the above Logs table, 1 is the only number that appears at least three times consecutively.

+-----------------+
| ConsecutiveNums |
+-----------------+
| 1               |
+-----------------+
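
One possible approach, sketched here with Python's built-in sqlite3 module so it can be run as-is: join each row to the two rows that follow it by Id and keep the numbers that match across all three. The sketch assumes that Id values are consecutive with no gaps, which the question does not state, and it is not the column's official answer.

```python
import sqlite3

# Build the example Logs table from the question in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Logs (Id INTEGER PRIMARY KEY, Num INTEGER)")
conn.executemany(
    "INSERT INTO Logs (Id, Num) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 1), (4, 2), (5, 1), (6, 2), (7, 2)],
)

# Join each row to the next two rows by Id (assumes Ids have no gaps).
# A match on all three Num values means the number appears three times in a row.
QUERY = """
SELECT DISTINCT l1.Num AS ConsecutiveNums
FROM Logs l1
JOIN Logs l2 ON l2.Id = l1.Id + 1 AND l2.Num = l1.Num
JOIN Logs l3 ON l3.Id = l1.Id + 2 AND l3.Num = l1.Num
"""

consecutive_nums = [row[0] for row in conn.execute(QUERY)]
print(consecutive_nums)  # [1]
```

DISTINCT keeps a number from being reported more than once when it runs for four or more rows.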

LeetCode Question

Jump Game


Description:

Given an array of non-negative integers, you are initially positioned at the first index of the array.

Each element in the array represents your maximum jump length at that position.

Determine if you are able to reach the last index.

Input: [2,3,1,1,4]

Output: true

Assumptions:

non-negative integers
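
One common way to approach this is a greedy scan that tracks the farthest index reachable so far. The sketch below is one possible solution, not the column's official answer. The function name `can_jump` and the second test input are illustrative and do not come from the question.

```python
from typing import List


def can_jump(nums: List[int]) -> bool:
    """Return True if the last index is reachable from index 0."""
    farthest = 0  # farthest index reachable using the positions seen so far
    for i, step in enumerate(nums):
        if i > farthest:
            # Position i itself cannot be reached, so nothing after it can be.
            return False
        farthest = max(farthest, i + step)
    return True


print(can_jump([2, 3, 1, 1, 4]))  # True (the example from the question)
print(can_jump([3, 2, 1, 0, 4]))  # False (stuck on the 0 at index 3)
```

The scan visits each element once, so it runs in linear time with constant extra space.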

Day 623

Answers Revealed

DS Interview Question & Answer

What are the primary differences and similarities between classification and regression trees?

Regression trees are used when the dependent variable is continuous. Classification trees are used when the dependent variable is categorical.

In a regression tree, the value assigned to a terminal node is the mean response of the training observations that fall in that region. An unseen observation that falls in that region is therefore predicted with that mean.

In a classification tree, the value (class) assigned to a terminal node is the mode of the training observations that fall in that region. An unseen observation that falls in that region is therefore predicted with that mode.
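
The difference between the two leaf predictions can be illustrated with a minimal sketch. The function names and the toy data are made up for illustration and do not come from any particular library.

```python
from collections import Counter
from statistics import mean


def regression_leaf_value(responses):
    """Prediction of a regression-tree leaf: the mean training response."""
    return mean(responses)


def classification_leaf_value(labels):
    """Prediction of a classification-tree leaf: the most common class (mode)."""
    return Counter(labels).most_common(1)[0][0]


# Toy training observations that ended up in one terminal node (made-up data).
regression_leaf = [10.0, 12.0, 14.0]
classification_leaf = ["yes", "no", "yes"]

print(regression_leaf_value(regression_leaf))          # 12.0
print(classification_leaf_value(classification_leaf))  # yes
```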

Both kinds of tree divide the predictor space (the independent variables) into distinct, non-overlapping regions. For simplicity, you can think of these regions as high-dimensional boxes.

Both follow a top-down, greedy approach known as recursive binary splitting. It is called ‘top-down’ because it begins at the top of the tree, when all the observations are in a single region, and successively splits the predictor space into two new branches further down the tree. It is called ‘greedy’ because the algorithm considers only the best split at the current step, rather than looking ahead to a split that might lead to a better tree later on.
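
A single greedy step of this process can be sketched as follows. The text does not name a split criterion, so this sketch assumes a regression tree that minimizes the total sum of squared errors (SSE) of the two resulting regions on one feature. The helper names and the toy data are illustrative.

```python
def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) for the best single binary split on one feature.

    Greedy: only the current split is scored; later splits are not considered.
    Returns None if the feature has fewer than two distinct values.
    """
    best = None
    candidates = sorted(set(xs))
    # Try the midpoint between each pair of adjacent distinct feature values.
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


# Made-up data with an obvious break between x = 3 and x = 10.
xs = [1, 2, 3, 10, 11, 12]
ys = [5.0, 5.0, 5.0, 20.0, 20.0, 20.0]
threshold, total = best_split(xs, ys)
print(threshold, total)  # 6.5 0.0
```

Recursive binary splitting would apply the same step again to each of the two resulting regions until the stopping criterion is met.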

This splitting process continues until a user-defined stopping criterion is reached.

In both cases, the tree keeps growing until the stopping criterion is reached. A fully grown tree, however, is likely to overfit the data, leading to poor accuracy on unseen data.

BA Interview Question & Answer

Rank Scores


Write a SQL query to rank scores. If there is a tie between two scores, both should have the same ranking. Note that after a tie, the next ranking number should be the next consecutive integer value. In other words, there should be no "holes" between ranks.
+----+-------+
| Id | Score |
+----+-------+
| 1  | 3.50  |
| 2  | 3.65  |
| 3  | 4.00  |
| 4  | 3.85  |
| 5  | 4.00  |
| 6  | 3.65  |
+----+-------+

For example, given the above Scores table, your query should generate the following report (order by highest score):
+-------+------+
| Score | Rank |
+-------+------+
| 4.00  | 1    |
| 4.00  | 1    |
| 3.85  | 2    |
| 3.65  | 3    |
| 3.65  | 3    |
| 3.50  | 4    |
+-------+------+

Answer:

-- Rank is a reserved word in MySQL 8.0+, so the alias is quoted.
select s.Score, count(distinct t.Score) as `Rank`
from Scores s join Scores t on s.Score <= t.Score
group by s.Id, s.Score
order by s.Score desc;
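
To check the self-join answer against the example table, it can be run with Python's built-in sqlite3 module. This is a sketch under a few assumptions: SQLite stands in for the database engine, the alias is written with standard double quotes, and Score is added to the GROUP BY so the query does not rely on engine-specific grouping rules.

```python
import sqlite3

# Build the example Scores table from the question in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Scores (Id INTEGER PRIMARY KEY, Score REAL)")
conn.executemany(
    "INSERT INTO Scores (Id, Score) VALUES (?, ?)",
    [(1, 3.50), (2, 3.65), (3, 4.00), (4, 3.85), (5, 4.00), (6, 3.65)],
)

# For each row s, count the distinct scores that are >= s.Score.
# That count is the dense rank: ties share a rank and no rank is skipped.
RANK_QUERY = """
SELECT s.Score, COUNT(DISTINCT t.Score) AS "Rank"
FROM Scores s JOIN Scores t ON s.Score <= t.Score
GROUP BY s.Id, s.Score
ORDER BY s.Score DESC
"""

ranked = conn.execute(RANK_QUERY).fetchall()
for score, rank in ranked:
    print(f"{score:.2f}  {rank}")
```

The printed rows match the expected report: 4.00 and 4.00 at rank 1, 3.85 at rank 2, both 3.65 rows at rank 3, and 3.50 at rank 4.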


Reference:

https://leetcode.com/problems/rank-scores/discuss/53101/Accepted-solution-with-variables






