Daily Practice | Data Scientist & Business Analyst & LeetCode Interview Questions 307

大数据应用 · WeChat official account · Big Data · 2018-03-08 11:01

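Since June 15, 2017, 数据应用学院 has been reviewing common interview questions in data science (DS) and business analytics (BA) together with you. Since October 4, 2017, we have also shared one LeetCode algorithm problem every day.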

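If you are actively looking for work in these fields, we hope you will follow our questions every day and think them through with us. We post the answers the following day.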

Day 207

DS Interview Questions

Why is naive Bayes so ‘naive’?

BA Interview Questions

R language:

Using the following variable:

x=cbind(c(1,2,3,4,9,7,4,3),c(3,1,2,5,3,6,5,3))

Write a for() loop that calculates y = 3 8 18 44 126 140 100 84, such that:

y[1]=x[1,1]*x[1,2]

y[2]=x[2,1]*sum(x[1:2,2])

y[3]=x[3,1]*sum(x[1:3,2])

y[8]=x[8,1]*sum(x[1:8,2])

LeetCode Questions

Description:

Given numRows, generate the first numRows of Pascal’s triangle.

Input: 5
Output: [[1], [1,1], [1,2,1], [1,3,3,1], [1,4,6,4,1]]

Want to know the answers? See the next issue!

Day 206 Answers Revealed

DS Interview Questions

You are given a data set on cancer detection. You’ve built a classification model and achieved an accuracy of 96%. Why shouldn’t you be happy with your model performance? What can you do about it?

If you have worked with enough data sets, you will recognize that cancer detection produces imbalanced data. On an imbalanced data set, accuracy is a poor measure of performance: the 96% (as given) may only reflect correct predictions on the majority class, while the class of interest is the minority class (4%), the people who were actually diagnosed with cancer. To evaluate the model, we should therefore use sensitivity (true positive rate), specificity (true negative rate), and the F-measure, which describe the class-wise performance of the classifier. If performance on the minority class turns out to be poor, we can take the following steps:

We can use undersampling, oversampling, or SMOTE to balance the data.
We can calibrate the predicted probabilities and then adjust the prediction threshold, choosing an optimal value from the ROC curve.
We can assign class weights so that the minority class gets a larger weight.
We can also treat the problem as anomaly detection.
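To make the accuracy argument concrete, here is a minimal Python sketch. It assumes a hypothetical set of 100 patients with 4 cancer cases and a model that always predicts "no cancer"; the function names `confusion_counts` and `class_wise_metrics` are illustrative and do not come from any library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def class_wise_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, specificity, and F1 as a dict."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Each ratio falls back to 0.0 when its denominator is empty.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical data: 4 cancer cases (label 1) among 100 patients.
y_true = [1] * 4 + [0] * 96
# A model that always predicts the majority class ("no cancer").
y_pred = [0] * 100

metrics = class_wise_metrics(y_true, y_pred)
print(metrics)
```

Under these assumptions the accuracy is 96% even though the sensitivity and F1 score are both zero, which is why class-wise metrics are the ones to inspect.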

BA Interview Questions

R language:

Using the following variable:

x=as.Date("10/11/2017","%d/%m/%Y")

# Write a repeat loop that increments x until x equals 31/12/2017.

repeat {
  x <- x + 1                               # advance by one day
  print(x)
  if (x == as.Date("2017-12-31")) break    # stop once the target date is reached
}

LeetCode Questions

Description:

Given two binary trees, write a function to check if they are equal or not.
Two binary trees are considered equal if they are structurally identical and the nodes have the same value.

Input: two identical trees
Output: true
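
One possible way to perform this check is a recursive comparison, sketched below in Python. The `TreeNode` class and the function name `is_same_tree` are assumptions made for illustration; the original post does not specify a node type or a language.

```python
class TreeNode:
    """Minimal binary tree node (hypothetical, for illustration only)."""

    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def is_same_tree(p, q):
    """Return True if both trees have the same shape and node values."""
    if p is None and q is None:
        return True            # both subtrees are empty
    if p is None or q is None:
        return False           # only one subtree is empty: shapes differ
    return (
        p.val == q.val
        and is_same_tree(p.left, q.left)
        and is_same_tree(p.right, q.right)
    )


# Two structurally identical trees with equal values.
a = TreeNode(1, TreeNode(2), TreeNode(3))
b = TreeNode(1, TreeNode(2), TreeNode(3))
# Same values, different structure (left child vs. right child).
c = TreeNode(1, TreeNode(2))
d = TreeNode(1, None, TreeNode(2))

print(is_same_tree(a, b))
print(is_same_tree(c, d))
```

The recursion visits each pair of nodes at most once, so the running time is linear in the size of the smaller tree. For very deep trees an iterative version with an explicit stack would avoid Python's recursion limit.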