Hard Level Questions

Q21. What is data normalization in machine learning?
Data normalization is a vital pre-processing technique that maps and scales values, helping forecasting and prediction models become more accurate. It transforms the current data range into a new, standardized range.
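As an illustrative sketch, min-max normalization is one common normalization technique: it rescales each value of a feature into the [0, 1] range. The sample data below is hypothetical.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the [0, 1] range (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # a constant feature would divide by zero; map it to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

data = [10, 20, 15, 40]
print(min_max_normalize(data))  # smallest value -> 0.0, largest -> 1.0
```

Other normalization schemes (z-score standardization, decimal scaling) follow the same pattern of mapping the original range onto a standardized one.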

Q22. How can pandas be used for data analysis?
Pandas is one of the most widely used Python libraries for data analysis. It provides powerful tools and data structures that are very helpful for analyzing and processing data. Some of the most useful pandas functions for the various tasks involved in data analysis are as follows:

  • Data loading functions: Pandas provides functions to read datasets from various formats; read_csv, read_excel, and read_sql read data from CSV, Excel, and SQL sources respectively into a pandas DataFrame.
  • Data Exploration: Pandas provides functions like head, tail, and sample to rapidly inspect the data after it has been imported. To learn more about the data types, missing values, and summary statistics, use the info and describe functions.
  • Data Cleaning: Pandas offers functions for dealing with missing values (fillna), duplicate rows (drop_duplicates), and incorrect data types (astype) before analysis.
  • Data Transformation: Pandas can be used to modify and transform data. It is simple to select columns, filter rows (loc, iloc), and add new columns. Custom transformations are possible using the apply and map functions.
  • Data Aggregation: With pandas, we can group the data using the groupby function and apply aggregations like sum, mean, count, etc., on specific columns.
  • Time Series Analysis: Pandas offers robust support for time series data. We can easily conduct date-based computations using functions like resample, shift, etc.
  • Merging and Joining: Data from different sources can be combined using the pandas merge and join functions.
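A minimal sketch tying several of these functions together; the inline CSV, column names, and values are hypothetical stand-ins for a real file passed to read_csv:

```python
import pandas as pd
from io import StringIO

# Hypothetical inline CSV standing in for a file on disk.
csv_data = StringIO("name,dept,salary\nAna,IT,50\nBob,IT,60\nCara,HR,40\nBob,IT,60\n")

df = pd.read_csv(csv_data)                    # data loading
print(df.head(2))                             # data exploration
df = df.drop_duplicates()                     # data cleaning: drop the repeated Bob row
it_rows = df.loc[df["dept"] == "IT"]          # data transformation: filter rows
totals = df.groupby("dept")["salary"].sum()   # data aggregation per department
print(totals)
```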

Q23. What is the difference between descriptive and inferential statistics?

| Descriptive Statistics | Inferential Statistics |
| --- | --- |
| It gives information about raw data, describing the data in some manner. | It makes inferences about the population using data drawn from that population. |
| It helps in organizing, analyzing, and presenting data in a meaningful manner. | It allows us to compare data and make hypotheses and predictions. |
| It is used to describe a situation. | It is used to explain the chance of occurrence of an event. |
| It explains already-known data and is limited to a sample or population of small size. | It attempts to reach a conclusion about the population. |
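To make the distinction concrete, here is a small sketch using only Python's standard library: descriptive statistics summarize the sample itself, while inferential statistics use the sample to reason about the wider population. The sample values are hypothetical, and the 1.96 critical value assumes a normal approximation.

```python
import math
import statistics

sample = [4.8, 5.1, 4.9, 5.3, 5.0, 4.7, 5.2, 5.0]

# Descriptive: summarize this sample.
mean = statistics.mean(sample)
sd = statistics.stdev(sample)

# Inferential: an approximate 95% confidence interval for the
# population mean, estimated from the sample alone.
margin = 1.96 * sd / math.sqrt(len(sample))
ci = (mean - margin, mean + margin)
print(f"mean={mean:.2f}, sd={sd:.2f}, 95% CI=({ci[0]:.2f}, {ci[1]:.2f})")
```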

Q24. What is a correlation?
Correlation is a statistical measure of the degree of linear relationship between two or more variables. It estimates how well changes in one variable predict or explain changes in another. Correlation is often used to assess the strength and direction of associations between variables in fields such as statistics and economics.

The correlation between two variables is represented by a correlation coefficient, denoted as “r”. The value of “r” can range between -1 and +1, reflecting the strength of the relationship:

  • Positive correlation (r > 0): As one variable increases, the other tends to increase. The greater the positive correlation, the closer “r” is to +1.
  • Negative correlation (r < 0): As one variable rises, the other tends to fall. The closer “r” is to -1, the greater the negative correlation.
  • No correlation (r = 0): There is little or no linear relationship between the variables.
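A minimal sketch of computing Pearson's “r” directly from its definition (covariance divided by the product of the standard deviations); the input sequences are hypothetical:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly linear: r ≈ +1.0
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # perfectly inverse: r ≈ -1.0
```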

Q25. Topological Sorting
Approach:

  • Create a stack to store the nodes.
  • Initialize the visited array of size N to keep the record of visited nodes.
  • Run a loop from 0 to N-1:
  • If the node is not marked True in the visited array, call the recursive function for topological sort and perform the following steps:
    • Mark the current node as True in the visited array.
    • Run a loop over all the nodes that the current node has a directed edge to (its adjacent nodes):
    • If the adjacent node is not marked True in the visited array:
      • Recursively call the topological sort function on that node.
    • Push the current node onto the stack.
  • Print all the elements in the stack.
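The steps above can be sketched in Python as a recursive DFS; the example graph (adjacency lists) is hypothetical:

```python
def topological_sort(n, adj):
    """DFS-based topological sort of a DAG with nodes 0..n-1.
    adj[u] lists the nodes that u has a directed edge to."""
    visited = [False] * n
    stack = []

    def dfs(u):
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                dfs(v)
        stack.append(u)  # push only after all descendants are placed

    for u in range(n):
        if not visited[u]:
            dfs(u)
    return stack[::-1]  # popping the stack yields a topological order

# Edges: 5->0, 5->2, 4->0, 4->1, 2->3, 3->1
adj = {0: [], 1: [], 2: [3], 3: [1], 4: [0, 1], 5: [0, 2]}
order = topological_sort(6, adj)
print(order)  # every edge u->v has u before v in this list
```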
     
Q26. How do you rotate a square matrix by 90 degrees (clockwise)?
Approach:

  • Transform each row of the original matrix into the corresponding column of the final matrix, using this mapping:
  • first row of original matrix ——> last column of final matrix
  • second row of original matrix ——> second-to-last column of final matrix
  • and so on, until the last row of original matrix ——> first column of final matrix
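A compact Python sketch of this row-to-column mapping (reverse the row order, then transpose):

```python
def rotate_90_clockwise(matrix):
    """Rotate a square matrix 90 degrees clockwise:
    row i of the input becomes column (n-1-i) of the output."""
    return [list(row) for row in zip(*matrix[::-1])]

m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
print(rotate_90_clockwise(m))  # -> [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
```

Note that the first row (1, 2, 3) ends up as the last column, read top to bottom, exactly as the mapping above describes.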

Q27. The 8 Queens problem

Explanation:

  • This approach uses a backtracking algorithm to find a solution to the 8 Queens problem, which consists of placing 8 queens on a chessboard in such a way that no two queens threaten each other.
  • The algorithm starts by placing a queen on the first column, then it proceeds to the next column and places a queen in the first safe row of that column.
  • If the algorithm reaches the 8th column and all queens are placed in a safe position, it prints the board and returns true.
  • If the algorithm is unable to place a queen in a safe position in a certain column, it backtracks to the previous column and tries a different row.
  • The “isSafe” function checks if it is safe to place a queen on a certain row and column by checking if there are any queens in the same row, diagonal or anti-diagonal.
  • It’s worth noting that this is just high-level pseudocode; it may need to be adapted depending on the specific implementation and language you are using.
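One possible Python implementation of this backtracking scheme, placing one queen per column; representing the board as `board[col] = row` is one common convention, not the only one:

```python
N = 8

def is_safe(board, row, col):
    """True if no queen already placed in columns 0..col-1 attacks (row, col).
    board[c] holds the row of the queen placed in column c."""
    for c in range(col):
        r = board[c]
        if r == row or abs(r - row) == abs(c - col):  # same row or diagonal
            return False
    return True

def solve(board, col=0):
    """Place queens column by column, backtracking when a column has no safe row."""
    if col == N:
        return True  # all 8 queens placed safely
    for row in range(N):
        if is_safe(board, row, col):
            board[col] = row
            if solve(board, col + 1):
                return True
            # otherwise backtrack: try the next row in this column
    return False

board = [-1] * N
solve(board)
print(board)  # one valid placement, e.g. [0, 4, 7, 5, 2, 6, 1, 3]
```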

Q28. Find the duplicate elements in an array whose values lie in the range 1 to N

Approach 1 (Using Hashing): The idea behind the following approach is:

Since the numbers will be in the range 1 to N, an array of size N can be maintained to keep a record of the elements present in the given array.
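A sketch of this hashing idea in Python, using a count array of size N+1 to detect values seen more than once; the sample input is hypothetical:

```python
def find_duplicates(arr, n):
    """Return the duplicated values in arr, whose elements lie in the range 1..n.
    count[v] records how many times value v has been seen so far."""
    count = [0] * (n + 1)
    dups = []
    for v in arr:
        count[v] += 1
        if count[v] == 2:  # report each duplicate only once
            dups.append(v)
    return dups

print(find_duplicates([4, 3, 2, 7, 8, 2, 3, 1], 8))  # -> [2, 3]
```

This runs in O(N) time at the cost of O(N) extra space for the count array.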

Q29. What is the difference between the WHERE and HAVING clauses in SQL?

| WHERE Clause | HAVING Clause |
| --- | --- |
| Used to filter records from the table based on the specified condition. | Used to filter records from groups based on the specified condition. |
| Can be used without the GROUP BY clause. | Is generally used together with the GROUP BY clause. |
| Operates on individual rows. | Operates on groups of rows. |
| Cannot contain aggregate functions. | Can contain aggregate functions. |
| Can be used with SELECT, UPDATE, and DELETE statements. | Can only be used with the SELECT statement. |
| Is applied before grouping takes place. | Is applied after grouping takes place. |
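These differences can be demonstrated with an in-memory SQLite database via Python's sqlite3 module; the sales table and its values are hypothetical:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (dept TEXT, amount INTEGER)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("IT", 50), ("IT", 60), ("HR", 40), ("HR", 10)])

# WHERE filters individual rows before any grouping happens.
rows = con.execute("SELECT dept, amount FROM sales WHERE amount > 30").fetchall()
print(rows)  # -> [('IT', 50), ('IT', 60), ('HR', 40)]

# HAVING filters whole groups after GROUP BY, and may use aggregates.
groups = con.execute(
    "SELECT dept, SUM(amount) FROM sales GROUP BY dept HAVING SUM(amount) > 100"
).fetchall()
print(groups)  # only IT (50 + 60 = 110) exceeds 100
```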

Q30. What is the difference between joining and blending in Tableau?
In Tableau, joining and blending are both ways to combine data from multiple tables or data sources. However, they are used in different situations and have several major differences:

| Basis | Joining | Blending |
| --- | --- | --- |
| Data Source Requirement | Joining is used when you have data from the same data source, such as tables in a relational database that are already related through primary and foreign keys. | Blending is used when you have data from different data sources, such as a combination of Excel spreadsheets, CSV files, and databases. These sources may not have predefined relationships. |
| Relationships | The foundation for joins is the use of common fields, like a customer ID or product code, to establish predetermined links between tables. These relationships are defined within the same data source. | There is no need for pre-established links between tables when blending. Instead, you connect the data sources separately and combine them by matching fields with comparable values. |
| Data Combining | When tables are joined, a single unified data source with a merged schema is produced: one table containing all relevant fields from the combined tables. | Data blending maintains the separation of the data sources. At query time, Tableau gathers and combines data from the several sources to produce a temporary, in-memory blend for visualization. |
| Data Transformation | Joining supports data transformation, aggregations, and calculations on the combined data; calculated fields can use information from any of the joined tables. | Blending is more limited for data transformation and calculations: you cannot create calculated fields that involve data from different blended data sources. |


Tiger Analytics Interview Questions and Answers for Technical Profiles

Think globally, and impact millions. That’s the driving force behind Tiger Analytics, a data-driven powerhouse leading the AI and analytics consulting world. Tiger Analytics tackles challenges that resonate across the globe, shaping the lives of millions through innovative data-driven solutions. More than just a company, Tiger Analytics fosters a culture of expertise and respect, where collaboration remains supreme. With headquarters in Silicon Valley and delivery centers scattered across the globe, including India’s bustling hubs of Chennai and Hyderabad, Tiger Analytics offers a dynamic environment catering to both in-person and remote teams.

