Credo Systemz

Infosys – Hadoop Interview Questions

Here is a list of Hadoop interview questions that were recently asked at Infosys. The questions are suitable for both freshers and experienced professionals.


1. Why Hadoop? (Compare to RDBMS)

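Hadoop is more flexible than a traditional RDBMS in how it stores, processes, and manages data. An RDBMS expects structured data that fits a predefined schema, whereas Hadoop can store structured, semi-structured, and unstructured data and scales out across clusters of commodity machines. Unlike traditional systems, Hadoop also allows multiple analytical processes to run on the same data at the same time.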

2. What would happen if NameNode failed? How do you bring it up?

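If the NameNode fails, the whole Hadoop cluster stops working (assuming no standby NameNode is configured). No data is lost, because the blocks stay on the DataNodes; only cluster operations shut down. The NameNode is the single point of contact for all the DataNodes, so when it fails, all communication stops. To bring it up, restart the NameNode so that it reloads the latest FsImage and edits log; if those files are lost, the metadata can be recovered from the most recent checkpoint kept by the Secondary NameNode.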

3. What details are in the “fsimage” file?

FsImage is a file stored on the NameNode's local OS filesystem that contains the complete HDFS namespace: the directory structure, file metadata such as permissions and replication factor, and the mapping of each file to its blocks. It does not record which DataNode holds which block; the NameNode rebuilds that information from the block reports the DataNodes send. The NameNode reads this file when it starts.

4. What is SecondaryNameNode?

The NameNode keeps the HDFS metadata in two files:

  • FsImage file: a snapshot of the HDFS metadata at a certain point in time.
  • Edits Log file: a record of the changes that have been made to the HDFS namespace since that snapshot.

The main function of the Secondary NameNode is to periodically merge the Edits Log into the FsImage (a checkpoint) and to store the latest copy of both files. It is a helper for the NameNode, not a hot standby.
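Merging an edits log into an FsImage snapshot (a checkpoint) can be sketched in a few lines of Python. This is a toy model, not HDFS code: the dictionary namespace, the operation names `mkdir`, `create` and `delete`, and the function `checkpoint` are all hypothetical and chosen only for illustration.

```python
# Toy model of a checkpoint: replay the edits log on top of an FsImage snapshot.
# The data layout and operation names are illustrative assumptions, not HDFS internals.

def checkpoint(fsimage, edits):
    """Return a new snapshot with every logged change applied in order."""
    merged = dict(fsimage)  # copy, so the old snapshot stays untouched
    for op, path in edits:
        if op == "mkdir":
            merged[path] = "dir"
        elif op == "create":
            merged[path] = "file"
        elif op == "delete":
            merged.pop(path, None)
    return merged

fsimage = {"/": "dir", "/logs": "dir"}
edits = [("create", "/logs/a.txt"), ("mkdir", "/tmp"), ("delete", "/logs/a.txt")]

new_fsimage = checkpoint(fsimage, edits)
print(new_fsimage)  # merged snapshot; after a checkpoint the edits log can start empty
```

The longer the edits log grows, the longer a NameNode restart takes, which is why the merge is done periodically.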

5. Explain the MapReduce processing framework? (start to end)

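Hadoop MapReduce is a software framework for the distributed processing of large data sets on computing clusters. It is a sub-project of the Apache Hadoop project. In layman's terms, MapReduce splits the input data set into a number of parts and runs a program on all the parts in parallel. From start to end: the input is divided into splits, each map task turns its split into intermediate key-value pairs, the framework shuffles and sorts those pairs so that all values for one key reach the same reducer, and each reduce task aggregates its keys and writes the final output to HDFS.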
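The map, shuffle, and reduce stages can be imitated on a single machine. The following is a minimal sketch in plain Python, not Hadoop API code; the function names `map_phase`, `shuffle` and `reduce_phase` and the sample input are assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle and sort: group all values that share a key, ordered by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))

def reduce_phase(key, values):
    """Reducer: combine the values for one key into a single result."""
    return key, sum(values)

splits = ["big data big cluster", "data node"]  # stands in for two input splits
mapped = [pair for line in splits for pair in map_phase(line)]
result = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(result)
```

In a real cluster each split is mapped on a different node and the shuffle moves data across the network; here everything runs in one process.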

6. What is Combiner? Where does it fit and give an example? Preferably from your project.

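A combiner is an optional "mini-reducer" that runs on the output of each map task before the shuffle, so that less data has to be sent across the network to the reducers. A classic example of a combiner in MapReduce is the Word Count program, where each map task tokenizes every line in the input file and emits (word, 1) pairs for each word in the line. The reduce() method simply sums the integer counts associated with each word. Because addition is associative and commutative, the same summing logic can also be used as the combiner, which collapses the repeated (word, 1) pairs from one mapper into a single (word, n) pair.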
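The effect of a combiner, which pre-aggregates one mapper's output locally before it is sent to the reducers, can be sketched as follows. This is plain Python rather than the Hadoop API, and the names `mapper` and `combine` are hypothetical.

```python
from collections import Counter

def mapper(line):
    """Emit a (word, 1) pair for every word in the line."""
    return [(word, 1) for word in line.split()]

def combine(pairs):
    """Combiner: sum the counts per word locally, on the map side."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())

map_output = mapper("to be or not to be")
combined = combine(map_output)
print(len(map_output), "records before combining,", len(combined), "after")
```

Fewer records leave the mapper, and the reducer still computes the same totals because it only has to add the partial sums.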

7. What is Partitioner? Why do you need it and give an example? Preferably from your project.

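The Partitioner in MapReduce controls how the keys of the intermediate mapper output are partitioned. A hash function is applied to the key, or to a subset of the key, to derive the partition. The total number of partitions equals the number of reduce tasks, so the partitioner decides which reducer receives each key, and all records with the same key go to the same reducer.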
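Hash partitioning can be sketched as "hash the key, then take the result modulo the number of reducers". The example below is an illustration in Python, not Hadoop's own code; the function name `partition` and the use of `zlib.crc32` as the hash are assumptions made so that the result is stable between runs.

```python
import zlib

def partition(key, num_reducers):
    """Pick a reducer for a key: hash the key, then take it modulo the reducer count."""
    # crc32 is used because Python's built-in hash() for strings
    # changes from one process to the next.
    return zlib.crc32(key.encode("utf-8")) % num_reducers

keys = ["apple", "banana", "apple", "cherry"]
assignments = [(k, partition(k, 3)) for k in keys]
print(assignments)
```

Both occurrences of "apple" get the same partition number, which is what guarantees that one reducer sees all the values for a key.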

8. Oozie – What are the nodes?

An Oozie workflow contains two kinds of nodes. Control nodes define the job chronology, setting the rules for beginning and ending a workflow (start, end, and kill), and control the execution path with decision, fork, and join nodes. Action nodes trigger the execution of tasks. Oozie triggers workflow actions, but Hadoop MapReduce executes them.

9. What are the actions in Action Node?

Action nodes are the mechanism by which a workflow triggers the execution of a computation or processing task. Oozie supports different types of actions: Hadoop MapReduce, Hadoop file system, Pig, SSH, HTTP, email, and Oozie sub-workflow.

10. Explain your Pig project?

Pig is a high-level scripting language that is used with Apache Hadoop. Pig enables data workers to write complex data transformations without knowing Java. Pig's simple SQL-like scripting language is called Pig Latin, and it appeals to developers who are already familiar with scripting languages and SQL.

11. What log file loaders did you use in Pig?

Piggybank, a repository of user-submitted UDFs, contains a custom loader function, CommonLogLoader, that loads files in Apache's Common Log Format into Pig.
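To show what a loader for the Common Log Format has to extract from each line, here is a small Python sketch. It is not Piggybank's implementation; the regular expression, the field names, and the function `parse_clf` are assumptions made for illustration.

```python
import re

# One Common Log Format line: host ident user [time] "request" status size
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

def parse_clf(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None

sample = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
record = parse_clf(sample)
print(record["host"], record["status"], record["request"])
```

A loader function does the same job inside Pig, handing each parsed line to the script as a tuple of fields.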

12. Hive Joining? What did you join?

The Hive JOIN clause combines records from two or more tables by using values that are common to each of them. It works much like an SQL JOIN: rows from the tables are matched on the join columns, and specific fields from the matching rows are returned together.
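The matching that an inner join performs can be sketched in Python. The tables, the column names, and the function `inner_join` below are hypothetical; Hive itself runs the join as distributed jobs rather than in memory like this.

```python
def inner_join(left, right, key):
    """Return merged rows for every pair of rows whose `key` values match."""
    index = {}
    for row in right:  # build a lookup on the join key
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

customers = [{"cust_id": 1, "name": "Asha"}, {"cust_id": 2, "name": "Ravi"}]
orders = [
    {"cust_id": 1, "amount": 250},
    {"cust_id": 1, "amount": 90},
    {"cust_id": 3, "amount": 40},
]

joined = inner_join(customers, orders, "cust_id")
print(joined)
```

Customer 2 has no orders and order 3 has no customer, so neither appears in the result, which is the defining behaviour of an inner join.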

13. Explain Partitioning & Bucketing (based on your project)?

Partitioning divides a table into different parts based on the values of the partition keys. The partition keys determine how the data is stored in the table, with each partition kept in its own directory. Bucketing is a technique in which tables or partitions are further subdivided into a fixed number of buckets, based on a hash of a column, for better structure of the data and more efficient querying.
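The difference between the two can be sketched as follows: the partition is chosen by the value of a column, the bucket by a hash of a column modulo the number of buckets. The table, the columns, and the helper names `partition_dir` and `bucket_id` are assumptions for illustration, not Hive code.

```python
def partition_dir(table, row, partition_key):
    """Partitioning: one directory per distinct partition-key value."""
    return f"{table}/{partition_key}={row[partition_key]}"

def bucket_id(row, bucket_key, num_buckets):
    """Bucketing: hash the bucket column into a fixed number of buckets."""
    # For an integer column the value itself serves as the hash in this sketch.
    return row[bucket_key] % num_buckets

rows = [
    {"order_id": 101, "country": "IN"},
    {"order_id": 102, "country": "US"},
    {"order_id": 105, "country": "IN"},
]
layout = [(partition_dir("orders", r, "country"), bucket_id(r, "order_id", 4)) for r in rows]
print(layout)
```

A query that filters on `country` only has to read one directory, while the bucket number narrows down which file inside that directory holds a given `order_id`.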

14. Why do we need bucketing?

Bucketing in Hive is useful when dealing with large datasets that need to be segregated into clusters for more efficient management and for join queries against other large datasets. The primary use case is joining two large datasets under resource constraints such as memory limits: when both tables are bucketed on the join key, each bucket of one table only has to be joined with the matching bucket of the other.
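Why bucketing helps with joins under memory limits can be sketched in Python: if both datasets are bucketed on the join key into the same number of buckets, bucket i of one only needs to be joined with bucket i of the other. The names `bucketize` and `bucket_join` and the sample data are hypothetical, and the integer key is used directly as its own hash.

```python
def bucketize(rows, key, num_buckets):
    """Spread rows over a fixed number of buckets by key modulo bucket count."""
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[row[key] % num_buckets].append(row)
    return buckets

def bucket_join(left_buckets, right_buckets, key):
    """Join bucket i of one table only with bucket i of the other."""
    joined = []
    for left, right in zip(left_buckets, right_buckets):
        lookup = {}
        for r in right:  # only one small bucket is held in memory at a time
            lookup.setdefault(r[key], []).append(r)
        joined += [{**l, **r} for l in left for r in lookup.get(l[key], [])]
    return joined

users = [{"uid": u, "name": f"user{u}"} for u in range(1, 7)]
clicks = [{"uid": 2, "page": "/home"}, {"uid": 5, "page": "/cart"}, {"uid": 9, "page": "/help"}]

result = bucket_join(bucketize(users, "uid", 3), bucketize(clicks, "uid", 3), "uid")
print(result)
```

Rows with the same key always land in the same bucket number on both sides, so no matches are missed even though the buckets are processed one pair at a time.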

15. Did you write any Hive UDFs?

User Defined Functions (UDFs) allow you to create custom functions that process records or groups of records. Hive comes with a comprehensive library of built-in functions. There are, however, some omissions and some specific cases for which UDFs are the solution.

16. Filter – What did you filter out?

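A filter keeps only the records that satisfy a condition and leaves out the rest, so that you can focus on the data you want to see. In Pig, for example, the FILTER operator selects the tuples of a relation that match a condition.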

17. HBase?

HBase is a column-oriented non-relational database management system that runs on top of Hadoop Distributed File System (HDFS). HBase provides a fault-tolerant way of storing sparse data sets, which are common in many big data use cases.

18. Flume?

Apache Flume is an open-source, powerful, reliable, and flexible system used to collect, aggregate, and move large amounts of unstructured data, such as log data, from multiple data sources into HDFS or HBase (for example) in a distributed fashion, through its tight integration with the Hadoop cluster.

19. Sqoop?

Apache Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and external datastores such as relational databases and enterprise data warehouses. Sqoop is used to import data from external datastores into the Hadoop Distributed File System or related Hadoop ecosystem components such as Hive and HBase, and to export data from Hadoop back to those datastores.

20. Zookeeper?

Apache ZooKeeper provides operational services for a Hadoop cluster: a distributed configuration service, a synchronization service, and a naming registry for distributed systems. Distributed applications use ZooKeeper to store and mediate updates to important configuration information.


TOP MNC's HADOOP INTERVIEW QUESTIONS & ANSWERS

Here we list the Hadoop interview questions and answers that are asked at top MNCs. We update this page periodically with recently asked questions, so visit it often to stay up to date on Hadoop.

Accenture
Cognizant
Adobe
Wipro
Standard Chartered
Barclays
Amazon
IBM
Cloudera
Infosys
Paypal
Capgemini
Robert Bosch
MindTree
Tech Mahindra
FIS


Copyright 2022 CREDO SYSTEMZ | All Rights Reserved.