YARN

YARN (Yet Another Resource Negotiator) is the resource management layer introduced with the second generation of Hadoop MapReduce, which is why it is also called MapReduce v2 (MRv2). YARN was developed to address several limitations of the original MapReduce, in particular the way hardware resources are managed for an application. In the first version, a single JobTracker handled both resource management and job monitoring, and each node's capacity was divided into fixed map and reduce slots, so it was hard to share one cluster efficiently between several applications or jobs. With YARN, several applications/jobs can run at the same time on the same cluster: it provides the resource management tools to allocate the hardware resources of each node to a specific application.

YARN architecture

YARN splits the responsibilities of the JobTracker into separate entities: one for job scheduling (resource management) and one for monitoring task progress.

As you can see in the figure “YARN architecture”, there is a resource manager that manages the use of resources across the cluster, and an application master that manages one running application on the cluster. This architecture also defines the container: it represents a specific quantity of memory that can be used to execute an application. The main idea for managing cluster resources better is that the application master negotiates a number of containers with the resource manager. The containers are under the supervision of the node manager, a daemon on each node that ensures that no container uses more resources than it has been granted.
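To make the roles described above more concrete, here is a minimal toy model in Python. It is only a sketch and does not use the real YARN API: the class names mirror the components described above, but every method (`allocate`, `request`, `within_limit`), the node names, and the memory sizes are assumptions invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Container:
    """A fixed memory allowance granted on one node (toy model)."""
    node: str
    memory_mb: int


@dataclass
class NodeManager:
    """Per-node daemon: tracks free memory and checks container usage."""
    name: str
    free_mb: int

    def within_limit(self, container: Container, used_mb: int) -> bool:
        # A container may not use more memory than it was granted.
        return used_mb <= container.memory_mb


@dataclass
class ResourceManager:
    """Cluster-wide arbiter: hands out containers from the nodes' free memory."""
    nodes: list

    def allocate(self, count: int, memory_mb: int) -> list:
        granted = []
        for node in self.nodes:
            while len(granted) < count and node.free_mb >= memory_mb:
                node.free_mb -= memory_mb
                granted.append(Container(node.name, memory_mb))
        # The list may be shorter than `count` if the cluster is full.
        return granted


class ApplicationMaster:
    """Per-application coordinator: negotiates containers for its tasks."""

    def __init__(self, rm: ResourceManager):
        self.rm = rm
        self.containers = []

    def request(self, count: int, memory_mb: int) -> list:
        self.containers = self.rm.allocate(count, memory_mb)
        return self.containers


# Hypothetical two-node cluster shared by two applications.
nodes = [NodeManager("node1", 4096), NodeManager("node2", 2048)]
rm = ResourceManager(nodes)
app_a = ApplicationMaster(rm)
app_b = ApplicationMaster(rm)

got_a = app_a.request(3, 1024)  # fits entirely on node1
got_b = app_b.request(4, 1024)  # only 3 containers are still available

print(len(got_a), len(got_b))  # 3 3
print(nodes[0].within_limit(got_a[0], 1500))  # False: over the 1024 MB grant
```

In this sketch both applications hold containers on the same cluster at the same time, and the second one simply receives fewer containers than it asked for once the memory runs out. How a real resource manager queues, prioritizes, or delays such requests depends on the configured scheduler and is not modeled here.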