Distributed Cache is a facility provided by the Hadoop MapReduce framework. It caches files when...

HDFS: Read & Write Commands using Java API
The Hadoop Distributed File System is the most important component of the Hadoop ecosystem. HDFS is...

Hadoop: OOZIE
Oozie is a workflow scheduler system used to manage Apache Hadoop jobs. It combines multiple jobs...

Hadoop MapReduce: Counters & Joins
MapReduce is the core component of Hadoop that provides data processing. MapReduce works by...

Hadoop PIG
Hadoop Pig is an abstraction over MapReduce. When it comes to analyzing large sets of...

Hadoop PIG: Installation
Before we start with the actual installation process, change the user to 'hduser' (the user used for...

Hadoop Setup - Installation & Configuration
Requirements: Ubuntu installed and running, Java installed. Perform the following steps: 1)...

Hadoop: Features, Components, Cluster & Topology
Apache HADOOP is a framework used to develop data processing applications which are executed in a...

Hadoop: What is Sqoop and Flume?
Sqoop is a tool designed to transfer data between Hadoop and relational database servers. Sqoop...

Introduction to BIG DATA
'Big Data' is data that is huge in size. It is also described as a collection of data that is huge...

Limitations of Hadoop
The various limitations of Apache Hadoop are given below along with their solutions: 1. Issues with...

Understanding Hadoop High Availability Feature
Objective: This blog describes the Hadoop HDFS High Availability feature. In...

What is MapReduce? How it Works?
MapReduce is the processing layer of Hadoop. The MapReduce programming model is designed for...
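The map/shuffle/reduce phases behind the MapReduce programming model can be sketched in plain Java, without a Hadoop cluster, using an in-memory word count. This is only an illustrative sketch of the model; the class and method names are our own, not part of the Hadoop API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch (not Hadoop API): word count showing the
// map -> shuffle -> reduce phases of the MapReduce model in memory.
public class WordCountSketch {
    public static Map<String, Long> wordCount(List<String> lines) {
        return lines.stream()
                // Map phase: split each input line into individual words,
                // conceptually emitting a (word, 1) record per word.
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(w -> !w.isEmpty())
                // Shuffle + reduce phases: group records by key (the word)
                // and sum the counts for each key.
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counts = wordCount(
                Arrays.asList("hadoop stores data", "hadoop processes data"));
        System.out.println(counts.get("hadoop")); // 2
        System.out.println(counts.get("data"));   // 2
    }
}
```

In a real Hadoop job the same three phases run distributed across the cluster: mappers emit key/value pairs, the framework shuffles them by key, and reducers aggregate each key's values.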