Business Data Analyst Interview Questions and Answers

1. Why do you want to be a Data Analyst?

Answer:

You should list why you love data, research what the company do with data, explain how you can create insights from data.

2. Where do you see yourself in 5years?

Answer:

Your answer should be how you can use your acquired experience to influence businesses or solve complex issues by creating positive impact for the company. Do not list job titles or dream companies you want to work for.

3. What are Clustered and Non- clustered Indexes in SQL Server? Explain the difference between the two.

Answer:

A Clustered Index deals with data stored in tables with key values, while Non – Clustered Index have data different from tables, usually unrelated data.

Differences:

Clustered

  • Always has an index ID of 0
  • Data is stored in index order
  • Data is faster to read
  • Only 1 per table

Non-Clustered

  • Index ID > 0
  • Does not order data, only columns based on index key specified when the index is created.
  • Faster for inserting and updating data
  • Contains between 249 and 999+ per table.

4. How do you Validate Data?

Data Validation is the process of ensuring data quality and accuracy.

Data Screening: for screening inaccuracy of data

Data Verification:  used to verify data sources and use-case scenarios.

5. List the various steps in an Analytics project.

  • Problem definition
  • Data exploration
  • Data preparation
  • Modelling
  • Validation of data
  • Implementation and tracking

6. What is the difference between Data mining and Data profiling?

Data mining is the act of collecting data from different sources and checking patterns/relationship within the data. These could be unusual records, dependencies, sequence discovery etc.

Data Profiling simply means categorizing data either by data types, values, frequency and so on.

7. What is Logistic Regression?

Logistic Regression is used for classification problems by estimating probabilities using logistic function.

8. What is Linear Regression?

Linear regression is used for finding out the relationship between variables and forecasting.

9. What is Decision tree?

A Decision tree is a schematic description of different decisions followed by chances of occurrence. In statistics it predicts category and continuous response variable.

List some best tools that can be used for data analysis?

Tableau

KNIME

Google Search

Solver

OpenRefine

RapidMiner

NodeXL

Io

Google Fusion tables

Wolfram Alpha’s

11. List out some common problems faced by Data Analyst?

Some of the problems data analysts face with data are:

Missing values

Duplicate entries

Varying value representation

Identifying overlapping data Common misspelling

12. What is an outlier?

An outlier is a term used by analysts to refer values between results with far difference in values. For example, 2450 is far from 9800 for result needed from same data.

There are two types of outliers:

  Univariate

  Multivariate

  13. What is Hierarchical Clustering?

Hierarchical clustering algorithm combines and divides existing groups, creating a hierarchical structure that shows the order in which groups are divided or merged.

14. What is K-mean Algorithm?

K mean is a famous partitioning method. Objects can be classified as belonging to K groups,

In K-mean,

The clusters are spherical:  the data points in a cluster are centered around that cluster.

The variance/spread of the cluster is similar: Each data point belongs to the closest cluster.

15. Mention the key skills required for Data Analysts?

  A Business Data Analyst must have the following skills:

  • Database knowledge
  • Business domain knowledge
  • Communication skill
  • Analytical skill
  • Querying skill
  • Basic Descriptive Analysis
  • Report design
  • Data Visualization skill
  • Insight presentation
  • Data Manipulation
  • Predictive Analysis

16. What are the tools used in Big-data?

       Tools used in Big-data include:

  • Hadoop
  • Hive
  • Pig
  • Flume
  • Mahout
  • Sqoop

17. What is KPI, design of experiments and 80/20 rule?

       KPI: stands for key performance indicator, it consists of any      

       combination of spreadsheets, reports or chats about business process.

       Design of experiments: The initial process used to split data, sample and

       Setup up data for statistical analysis.

       80/20 rule: Means 80 percent of your income comes from 20 percent of

        your clients.

18. What is Map Reduce?

        Map Reduce is a framework to process large data sets, splitting them

        into subsets, processing each subset on a different server and then

        blending results obtained on each.

19. What is Clustering? What are the properties for clustering algorithms?

       Clustering is a data classification method. Clustering algorithm divides a

       data set into natural groups or clusters.

       Properties for clustering algorithm are:

  • Hierarchical or flat
  • Iterative
  • Hard and soft
  • Disjunctive

20. List Statistical methods used that are useful to a Data Analyst?

       Statistical methods useful to Data Analysts, Data Scientists and so on:

  • Bayesian method
  • Markov process
  • Spatial and Cluster processes
  • Imputation techniques
  • Rank Statistics, percentile, outlier’s detection
  • Simplex algorithm
  • Mathematical optimization

21. What is Time Series Analysis?

Time series is usually found in two domains, frequency domain and time domain. Time series usual output process is based on forecast by analyzing the previous data by the help of various methods like exponential smoothing, log-linear regression method, etc.

22. What is a Hash table?

     In computing, a Hash table is a map of keys to values. Also, a data

     structure used to implement an associate array. Hash table is used

     to compute an index into an array of slot, from which desired value can

be fetched.

23. Mention the steps of a data analysis project?

  •    Understanding business requirements
  •    Identifying reliable and verified relevant data sources for business    requirement.
  •    Exploring data sets, manipulating, cleaning the data
  •    Validating Data
  •    Visualizing data to help make business decisions and solve problems

24. What problems do Data Analysts encounter during data analysis?

       Data Analysts face a lot of challenges when analyzing data sets:

  • Duplicate entries of data
  • Poor quality data from unreliable sources
  • Incomplete data

25. What are processes of Data Analysis?

       Data Analysis is the process of collecting, cleansing, interpreting,

       transforming and modelling data to get insights and generate reports for business decisions.

  • Collect data
  • Analyze data
  • Create reports

26. When should a model be retrained?

       A data model should be trained when business data keeps changing on a regular basis. Also be informed that data format does not change.

27. What problems do Data Analysts encounter while performing data

Analysis?

  • Duplicate data entries.
  • Poor quality data acquired from unreliable sources.
  • Data extracted from multiple sources may vary in representation. Once the collected data is combined after being cleansed and organized, the variations in data representation may cause a delay in the analysis process.
  • Incomplete data is another major issue in data analysis process.

28.   What are the characteristics of a good data model?

          For a data model to be considered good and developed, it must have

          these characteristics:

  • Must have predictable performance that estimates accurately.
  • Data should be adaptive and responsive to changes so that it can accommodate the growing business needs from time to time.
  • Data should be capable of scaling in proportion to the changes in data
  • It should be consumable to allow clients reap profitable results.

29.  What is the difference between Data Mining and Data analysis?

        DATA MINING

  • Used to recognize patterns in stored data stored.
  • Results extracted from data mining are not easy to interpret.
  • Mining is performed on clean and well-documented data.

DATA ANALYSIS

  • Used to order and organize a meaningful manner.
  • Results extracted from data analysis are easy to interpret.
  • The analysis of data involves data cleaning. So, data is present in a well-documented format.

30. What is the process of Data Analysis?

          Data Analysis process involves the collecting, cleansing, Interpreting,

          transforming and modeling data to gather insights and generate

reports to gain business profits. 

31. How can you highlight cells with negative values in Excel?

       You can highlight the negative values in Excel by using the

conditional formatting.

       Below are steps that you can follow:

  • Select the cells which you want to highlight with the negative values.
  • Go to the Home tab and click on the conditional formatting option
  • Go to the Highlight cell rules and click on the Less than option.
  • In the dialogue box of Less than, specify the values as 0.

         

32. What is a Pivot table and list the different functions of a Pivot Table?

       A Pivot table is a simple feature in Microsoft Excel which allows you to

       Quickly summarize huge data sets. As an easy to use tool, it requires

       dragging and dropping rows/column headers to create reports.

       A Pivot table is made up of four different sections:

  • Value areas: Values are reported in this area
  • Rows area: The headings which are present on the left of the values.
  • Column area: The headings at the top of the value area makes the columns area.
  • Filter area: This is an optional filter used to drill down in the data set.

33. Can you make a Pivot table from multiple tables?

         Yes, we can create one pivot from multiple tables when there is a

         Connection between these tables.

 34. What are the most common questions you should ask a client before

         creating a dashboard?

        Answer depends on case-to-case basis. Here are some common

       questions to ask when creating a dashboard in excel

  • Purpose for building the dashboard
  • Different data sources
  • Usage of the excel dashboard
  • The frequency at which the dashboard needs to be updated.
  • The version of Microsoft office the client uses.

35. What steps can you take to handle slow Excel workbook?

        There are many ways to do this. Some steps have been listed below:

  • Maintain all referenced data in a single sheet.
  • Try using manual calculation mode.
  • Try to avoid using entire rows or columns in references.
  • Often use excel tables and named ranges.
  • Convert all the used formulas to values.

36. What is Interleaving in SAS?

       Interleaving in SAS means combining individual sorted SAS data sets

        into one sorted data set. You can interleave data sets using a SET

         statement along with a BY statement.

 37. What is the basic syntax for writing code in SAS?

       The basic style of writing code in SAS is as follows:

  • Write the DATA statement which will basically name the dataset.
  • Write the INPUT statement to name the variables in the data set.
  • All the statements should end with a semi-colon.
  • There should be a proper space between word and a statement.

38.  What are the different types of joins?

Inner Join: used to return all rows from multiple tables with matching

values.

 Left Join: used to return all records from the left table with matching  

 records from the right table.

 Right Join: used to return all records from the right table with matching

 records from the left table.

  Full Join: Returns all records with a match in all tables

39. List the core competencies of a Business Data Analyst.

  •  Analytical thinking and decision making
  •  Business and industry knowledge
  •  Business process management
  •  Technical and soft skills
  •   Problem solving and negotiating skills

  40. What is a Feasibility study?

          The requirements and problems of a business/project should be

          studied and understood by a Business Data Analyst, and he or she

          should set a scope for the business problem. Feasibility study is

          identifying the possibility (success rate) of the proposed idea for a

          business problem. It helps identify new opportunities and focus on the

          project.

          A career path in Data Analytics is very rewarding, if you need training

         in your data science technical track career, check our contact us page           to learn more about DataCatchup career track now.

Please follow and like us:

Leave a Reply

Your email address will not be published. Required fields are marked *