Adobe – Hadoop Interview Questions
Here is a list of Hadoop interview questions recently asked at Adobe. The questions are suitable for both freshers and experienced professionals.
1. What are a fact table and a dimension table? (Asked after I said I was familiar with data warehouse concepts.)
A fact table works together with dimension tables. A fact table holds the data to be analyzed, and a dimension table stores data about the ways in which the data in the fact table can be analyzed. Thus, the fact table consists of two types of columns: foreign keys that reference the dimension tables, and measures that hold the values being analyzed.
2. What type of data should we store in a fact table and in a dimension table?
A fact table is defined by its grain, that is, its most atomic level, whereas a dimension table should be wordy, descriptive, complete, and quality assured. Dimension tables store the report labels, whereas the fact table contains the detailed data (the measurements).
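3. A Hive column contains a string. How will you find the count of a character? For example, if the string is “hdfstutorial”, how do you count the number of ‘t’ characters?
The INSTR function in Apache Hive finds the position of a substring in a string. It returns only the first occurrence of the given input, returns NULL if either argument is NULL, and returns 0 if the substring is not found in the string. INSTR alone therefore does not give a count. A common way to count a character is to compare the string length before and after removing it, for example length(col) - length(regexp_replace(col, 't', '')), which gives 2 for “hdfstutorial”.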
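One way to count a character is to remove it from the string and compare the lengths. The following is a minimal Python sketch of that idea, not Hive code; the function name count_char is hypothetical, and in Hive the same idea is usually written with the built-in length and regexp_replace functions.

```python
def count_char(text: str, ch: str) -> int:
    """Count occurrences of ch by comparing lengths before and after removing it."""
    if not ch:
        raise ValueError("ch must not be empty")
    stripped = text.replace(ch, "")
    # Each removed occurrence shortens the string by len(ch) characters.
    return (len(text) - len(stripped)) // len(ch)


count = count_char("hdfstutorial", "t")
print(count)  # prints 2
```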
4. A Hive table has the columns student id, score and year. Find the top 3 students by score in each year.
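A query like this is typically answered in Hive with a window function such as ROW_NUMBER() partitioned by year and ordered by score descending, keeping the rows numbered 1 to 3. The Python sketch below only illustrates that grouping and ranking logic; the sample rows and the function name top_n_per_year are hypothetical, and ties are broken by input order here, whereas ROW_NUMBER breaks ties arbitrarily.

```python
from collections import defaultdict

# Hypothetical sample rows: (student_id, score, year)
rows = [
    (1, 90, 2020), (2, 85, 2020), (3, 70, 2020), (4, 95, 2020),
    (5, 60, 2021), (6, 88, 2021),
]


def top_n_per_year(rows, n=3):
    """Group rows by year, sort each group by score descending, keep the first n."""
    by_year = defaultdict(list)
    for student_id, score, year in rows:
        by_year[year].append((student_id, score))
    return {
        year: sorted(students, key=lambda s: s[1], reverse=True)[:n]
        for year, students in by_year.items()
    }


top3 = top_n_per_year(rows)
print(top3[2020])  # prints [(4, 95), (1, 90), (2, 85)]
```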
5. A table has 500 million records. You want to copy its data into another table. Which approach would you choose as the best?
6. You have 10 tables with certain join conditions to apply, and the result needs to be updated in another table. How will you do it, and what best practices will you follow?
Many join combinations are possible between the 10 tables. If we make the restriction that each table may appear at most once, there are 2^10 - 1 = 1023 non-empty combinations of tables, and the number becomes much bigger once the relations between the tables are also considered.
7. Which analytical functions have you used in Hive?
Hive provides the following analytical functions:
- RANK
- DENSE_RANK
- ROW_NUMBER
- PERCENT_RANK
- CUME_DIST
- NTILE
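The difference between ROW_NUMBER, RANK and DENSE_RANK shows up only when there are ties. The Python sketch below mimics their behaviour over a list of scores ordered descending; the function name ranking and the sample scores are hypothetical, and this is an illustration of the semantics rather than how Hive computes them.

```python
def ranking(scores):
    """Return (score, row_number, rank, dense_rank) tuples, ordered by score descending."""
    ordered = sorted(scores, reverse=True)
    out = []
    rank = 0
    dense = 0
    prev = None
    for row_number, score in enumerate(ordered, start=1):
        if score != prev:
            rank = row_number  # RANK jumps to the current row number after a tie
            dense += 1         # DENSE_RANK increases by one for each new value
            prev = score
        out.append((score, row_number, rank, dense))
    return out


result = ranking([85, 90, 70, 85])
for row in result:
    print(row)
# (90, 1, 1, 1)
# (85, 2, 2, 2)
# (85, 3, 2, 2)
# (70, 4, 4, 3)
```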
8. Why do we use bucketing?
Bucketing in Hive is useful when dealing with large datasets that need to be segregated into clusters for more efficient management and for join queries against other large datasets. The primary use case is joining two large datasets under resource constraints such as memory limits.
9. What actually happens in bucketing, and when do we apply it?
In bucketing, Hive hashes the value of the bucketing column for each row and takes the result modulo the number of buckets to decide which bucket the row goes into, so rows with the same column value always land in the same bucket. We apply it when dealing with large datasets that need to be segregated into clusters for more efficient management, most commonly when joining two large datasets under resource constraints such as memory limits.
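Bucketing assigns each row to a bucket by hashing the bucketing column and taking the remainder after dividing by the number of buckets. The Python sketch below illustrates that assignment under a simplifying assumption: the key is an integer whose hash is the value itself. The function name bucket_for and the sample ids are hypothetical.

```python
from collections import defaultdict


def bucket_for(key: int, num_buckets: int) -> int:
    """Map an integer key to a bucket index in the range [0, num_buckets)."""
    # Assumption: the hash of an integer key is the integer itself.
    # The mask keeps the value non-negative before taking the remainder.
    return (key & 0x7FFFFFFF) % num_buckets


# Hypothetical student ids distributed over 4 buckets.
buckets = defaultdict(list)
for student_id in [101, 102, 103, 104, 105]:
    buckets[bucket_for(student_id, 4)].append(student_id)

print(dict(buckets))  # prints {1: [101, 105], 2: [102], 3: [103], 0: [104]}
```

Because the same key always maps to the same bucket, two tables bucketed on the join column with compatible bucket counts can be joined bucket by bucket instead of all at once.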
10. How is bucketing different from partitioning, and why do we use it?
Bucketing decomposes data into more manageable, roughly equal parts. With partitioning, you can end up creating many small partitions, one for each distinct value of the partition column. With bucketing, you restrict the data to a fixed number of buckets. This number is defined in the table creation script.
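The contrast can be seen by counting how many groups each scheme produces for the same column. The Python sketch below is illustrative only: the user ids and the bucket count of 8 are hypothetical, and the bucket index is computed as a simple remainder, assuming integer keys.

```python
from collections import Counter

user_ids = list(range(1, 1001))  # hypothetical column with 1000 distinct values
num_buckets = 8                  # fixed when the table is created

# Partitioning on this column creates one partition per distinct value.
partitions = set(user_ids)

# Bucketing maps every value into one of num_buckets buckets.
bucket_sizes = Counter(uid % num_buckets for uid in user_ids)

print(len(partitions))    # prints 1000
print(len(bucket_sizes))  # prints 8
```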
11. If you have a bucketed table, can you load records into it directly with Sqoop?
Not directly. You would have to import the data into an intermediate table first and then insert from it into the bucketed table.