Daily Practice | Data Scientist & Business Analyst & Leetcode Interview Questions 318

大数据应用 (Big Data Applications) · WeChat official account · Big Data · 2018-03-22 09:11

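Since June 15, 2017, 数据应用学院 (Data Application Lab) has been reviewing common interview questions in data science (DS) and business analytics (BA) with you. Since October 4, 2017, we have also shared one Leetcode algorithm problem every day.

If you are actively looking for work in these fields, we hope you will follow our questions each day and think them through with us. We will post the answers the following day.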

Day 218

DS Interview Questions

How is KNN different from k-means clustering?

BA Interview Questions

SQL: Write a query identifying the type of each record in the TRIANGLES table using its three side lengths. Output one of the following statements for each record in the table:

Equilateral: It's a triangle with 3 sides of equal length.
Isosceles: It's a triangle with 2 sides of equal length.
Scalene: It's a triangle with sides of differing lengths.
Not A Triangle: The given values of A, B, and C don't form a triangle.

LeetCode Questions

Description:

Given a 2D board and a word, find if the word exists in the grid.

The word can be constructed from letters of sequentially adjacent cells, where "adjacent" cells are those horizontally or vertically neighboring. The same letter cell may not be used more than once.

Input:
board = [
  ['A','B','C','E'],
  ['S','F','C','S'],
  ['A','D','E','E']
]
word = "ABCCED"

Output: true

Want to know the answers? See the next installment!

Day 217 Answers Revealed

DS Interview Questions

What is latent semantic indexing? What is it used for? What are the specific limitations of the method?

Latent semantic indexing:

Latent semantic indexing (LSI), also known as latent semantic analysis (LSA), is essentially principal component analysis (PCA) applied to document analysis: PCA is applied to the variance-covariance matrix of X, and the principal directions (eigenvectors) define topics. In practice this is usually computed as a truncated singular value decomposition (SVD) of X.
It uses a term-document matrix X that describes the occurrences of terms in documents. Rows correspond to terms (the vocabulary) and columns correspond to documents. Elements of X are typically weights proportional to the number of times a term appears in a document, with rare terms upweighted to reflect their relative importance. The matrix X is usually large and sparse.
LSA finds a low-rank approximation of the original term-document matrix, which merges the dimensions of terms that have similar meanings.

What is it used for:

LSA can be applied to compare documents in the low-dimensional space (document classification), find relations between terms (synonym identification), find matching documents by translating a query of terms into the low-dimensional space (information retrieval), and so on.
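
The low-rank idea and the uses listed above can be sketched on a toy example. This is only an illustration, not part of the original answer: the five terms, the counts in X, the choice k = 2, and the helper names `lsa` and `cosine` are all made up for the demo, and raw counts are used instead of the upweighted weights described above.

```python
import numpy as np

# Toy term-document matrix X: rows are terms, columns are documents.
# The vocabulary and counts are invented for illustration only.
terms = ["car", "automobile", "engine", "flower", "petal"]
X = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 2],  # petal
], dtype=float)


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def lsa(X, k):
    """Rank-k LSA via truncated SVD (hypothetical helper).

    Returns term vectors (one row per term) and document vectors
    (one row per document) in the k-dimensional latent space.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    term_vecs = U[:, :k] * s[:k]
    doc_vecs = Vt[:k, :].T * s[:k]
    return term_vecs, doc_vecs


term_vecs, doc_vecs = lsa(X, k=2)

# "car" and "automobile" never co-occur, so their raw rows are orthogonal,
# but both co-occur with "engine" and end up aligned in the latent space.
raw_term_sim = cosine(X[0], X[1])
latent_term_sim = cosine(term_vecs[0], term_vecs[1])

# Documents 0 and 1 share a topic; documents 0 and 2 do not.
same_topic_sim = cosine(doc_vecs[0], doc_vecs[1])
cross_topic_sim = cosine(doc_vecs[0], doc_vecs[2])
```

In this toy matrix the similarity between "car" and "automobile" is 0 on the raw rows and close to 1 in the two-dimensional space, which is the "merging" of terms with similar meanings. The same sketch also hints at the limitations listed below: each word gets a single vector, and the coordinates may be negative.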

Limitations include:

The resulting dimensions can be difficult to interpret
LSA cannot capture multiple meanings of a word
Documents are treated as unordered bags of words, so word order is lost
Eigenvectors can have negative components

Reference: https://en.wikipedia.org/wiki/Latent_semantic_analysis

BA Interview Questions

SQL: Query all columns for all American cities in CITY with populations larger than 100000. The CountryCode for America is USA.

The CITY table is described as follows:

SELECT *
FROM CITY
WHERE COUNTRYCODE = 'USA'
  AND POPULATION > 100000;
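
One way to sanity-check the query is to run it against a small in-memory SQLite table. The sketch below is only an illustration: the table description is not reproduced in this post, so the column list (ID, NAME, COUNTRYCODE, DISTRICT, POPULATION) is an assumption, and the sample rows are invented.

```python
import sqlite3

# In-memory database. The CITY schema below is an assumption, since the
# table description is not reproduced in the post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CITY (ID INTEGER, NAME TEXT, COUNTRYCODE TEXT, "
    "DISTRICT TEXT, POPULATION INTEGER)"
)

# Invented sample rows covering each case the WHERE clause must handle.
conn.executemany(
    "INSERT INTO CITY VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Alpha", "USA", "X", 250000),  # USA and large: kept
        (2, "Beta", "USA", "Y", 80000),    # USA but too small: dropped
        (3, "Gamma", "JPN", "Z", 900000),  # large but not USA: dropped
        (4, "Delta", "USA", "X", 100000),  # exactly 100000: dropped by strict >
    ],
)

query = """
SELECT *
FROM CITY
WHERE COUNTRYCODE = 'USA'
  AND POPULATION > 100000;
"""
rows = conn.execute(query).fetchall()
```

Only the first sample row satisfies both conditions. The row with a population of exactly 100000 is excluded because the comparison is strict.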

Leetcode Questions

Description:

Given a linked list, swap every two adjacent nodes and return its head.

Input: 1->2->3->4

Output: 2->1->4->3

Assumptions:

Your algorithm should use only constant space.

You may not modify the values in the list; only the nodes themselves may be changed.
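
The answer to this problem is not included in this post. A minimal sketch of one common approach, iterative relinking with a dummy head node, is shown below. The `ListNode` class and the helpers `from_list` and `to_list` are hypothetical names introduced for this illustration.

```python
class ListNode:
    """Singly linked list node (hypothetical definition for this sketch)."""

    def __init__(self, val, next=None):
        self.val = val
        self.next = next


def swap_pairs(head):
    """Swap every two adjacent nodes by relinking; node values are untouched."""
    dummy = ListNode(None, head)  # sentinel so the first pair needs no special case
    prev = dummy
    while prev.next and prev.next.next:
        first = prev.next
        second = first.next
        # Relink prev -> first -> second -> rest as prev -> second -> first -> rest.
        first.next = second.next
        second.next = first
        prev.next = second
        prev = first  # first is now the tail of the swapped pair
    return dummy.next


def from_list(values):
    """Build a linked list from a Python list (helper for the demo)."""
    head = None
    for v in reversed(values):
        head = ListNode(v, head)
    return head


def to_list(head):
    """Collect node values into a Python list (helper for the demo)."""
    out = []
    while head:
        out.append(head.val)
        head = head.next
    return out


result = to_list(swap_pairs(from_list([1, 2, 3, 4])))
```

The loop keeps only a few pointers (`prev`, `first`, `second`), so extra space is constant, and it changes only `next` links, which matches both assumptions. With an odd number of nodes the last node stays in place.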
