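Column: 大数据应用 (Big Data Applications)
数据应用学院 (Data Application Lab) was named a 2016 Top Data Camp in North America. It is a professional one-stop data science consulting service and your expert adviser for data science job hunting!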

Daily Practice | Data Scientist & Business Analyst Interview Questions 147

大数据应用  · WeChat official account  · Big Data  · 2017-08-01 09:01


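Starting June 15, 数据应用学院 (Data Application Lab) will review with you the interview questions commonly asked in data science (DS) and business analytics (BA). If you are actively looking for work in these fields, follow our daily questions and think them through with us; we will post the answers the following day.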

Day 47

DS Interview Questions

Give some examples of applications of Naive Bayes algorithms.

BA Interview Questions

R language:

You have an urn with balls from 1 to 100.

You want to find out how many draws you need to get ball number 55.

This is an experiment with replacement – you put the ball back each time you draw.

Simulate 1000 runs of the experiment to get an accurate estimate of the number of draws required.

Use seed 23 to make the experiment reproducible. Use loops (for, while) for the solution.

Want to know the answers? Find out in the next installment!

Day 46 Answers Revealed

DS Interview Questions

What are the Pros and Cons of Naive Bayes?

Pros:

- It is easy and fast to predict the class of a test data set. It also performs well in multi-class prediction.

- When the independence assumption holds, a Naive Bayes classifier performs better than other models such as logistic regression, and it needs less training data.

- It performs well with categorical input variables compared to numerical ones. For numerical variables, a normal distribution (bell curve) is assumed, which is a strong assumption.


Cons:

- If a categorical variable has a category in the test data set that was not observed in the training data set, the model assigns it a probability of 0 (zero) and cannot make a prediction. This is often known as the “Zero Frequency” problem. To solve it, we can use a smoothing technique; one of the simplest is Laplace estimation.

- Naive Bayes is also known to be a poor probability estimator, so the probability outputs from scikit-learn's predict_proba should not be taken too seriously.

- Another limitation of Naive Bayes is the assumption of independent predictors. In real life, it is almost impossible to get a set of predictors that are completely independent.
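
The zero-frequency problem and Laplace estimation mentioned in the Cons can be illustrated with a small sketch. This is plain Python rather than any library's implementation; the function name category_likelihoods, the parameter alpha, and the toy weather data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def category_likelihoods(values, categories, alpha=0.0):
    """Estimate P(category | class) from the feature values seen in one class.

    alpha = 0 gives the raw relative frequency;
    alpha = 1 gives Laplace (add-one) smoothing.
    """
    counts = Counter(values)
    total = len(values) + alpha * len(categories)
    return {c: (counts[c] + alpha) / total for c in categories}


# Hypothetical toy data: "weather" values observed for one class in training.
train_values = ["sunny", "sunny", "rainy", "sunny"]
# "overcast" shows up only in the test data.
categories = ["sunny", "rainy", "overcast"]

raw = category_likelihoods(train_values, categories, alpha=0.0)
smoothed = category_likelihoods(train_values, categories, alpha=1.0)

print(raw["overcast"])       # 0.0, which zeroes out the product of likelihoods
print(smoothed["overcast"])  # 1/7, small but non-zero
```

With smoothing, every category gets one extra pseudo-count, so an unseen category no longer forces the class probability to zero, and the estimates still sum to 1.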

BA Interview Questions

We use the same while loop as in the last exercise. The loop again prints all numbers up to 35, but this time it skips a whole vector of numbers: 3, 9, 13, 19, 23, 29.
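
The answer code for this exercise does not appear in the text here. As a rough sketch of the loop described, written in Python rather than R, the following assumes the count starts at 1 (the starting value is not stated) and uses a hypothetical function name, numbers_with_skips. In R, the same idea would typically rely on next and %in% inside the while loop.

```python
def numbers_with_skips(limit=35, skip=(3, 9, 13, 19, 23, 29)):
    """Collect 1..limit with a while loop, skipping every value in `skip`."""
    skip_set = set(skip)
    kept = []
    i = 0  # assumption: counting starts at 1, so the counter begins at 0
    while i < limit:
        i += 1
        if i in skip_set:
            continue  # jump to the next iteration without keeping i
        kept.append(i)
    return kept


if __name__ == "__main__":
    for n in numbers_with_skips():
        print(n)
```

Incrementing the counter before the skip check matters: if the increment came after continue, the loop would never advance past the first skipped number.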






