Hadoop Study (3)

    Tech · 2022-06-26

    Hadoop Core Server Configuration

    Default Shared File System URI and NameNode Location for HDFS
    The default value is file:///, which instructs the framework to use the local file system. An example of an HDFS URI is hdfs://NamenodeHost[:8020]/, which informs the framework to use the shared file system (HDFS).

    JobTracker Host and Port
    The URI specified in this parameter informs the Hadoop Core framework of the JobTracker's location. The default value is local, which indicates that no JobTracker server is to be run and all tasks will run in a single JVM. JobtrackerHost is the host on which the JobTracker server process will run. This value may be altered by individual jobs.

    Maximum Concurrent Map Tasks per TaskTracker
    The mapred.tasktracker.map.tasks.maximum parameter sets the maximum number of map tasks that may be run by a TaskTracker server process on a host at one time. A TaskTracker runs each map task separately, but a single map task may in turn use many threads; the thread count may be set via:

        JobConf.set("mapred.map.multithreadedrunner.threads", threadCount);

    Maximum Concurrent Reduce Tasks per TaskTracker
    Reduce tasks tend to be I/O bound, and it is not uncommon to have the per-machine maximum reduce task value set to 1 or 2.

    JVM Options for the Task Virtual Machines
    During the run phase of a job, there may be up to mapred.tasktracker.map.tasks.maximum map tasks and mapred.tasktracker.reduce.tasks.maximum reduce tasks running simultaneously on each TaskTracker node, in addition to the TaskTracker JVM itself.

    Enable Job Control Options on the Web Interfaces
    Both the JobTracker and the NameNode provide a web interface for monitoring and control. By default, the JobTracker serves its web interface at http://JobtrackerHost:50030 and the NameNode serves its web interface at http://NamenodeHost:50070.
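As a sketch, the parameters discussed above would typically be set in the cluster's site configuration file (hadoop-site.xml in the classic 0.x-era layout). The hostnames, port numbers, and values below are illustrative assumptions, not recommendations:

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch only: NamenodeHost and JobtrackerHost are placeholders. -->
<configuration>
  <!-- Default shared file system: an HDFS URI instead of the file:/// default. -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://NamenodeHost:8020/</value>
  </property>
  <!-- JobTracker location; the default "local" means no JobTracker server,
       with all tasks run in a single JVM. -->
  <property>
    <name>mapred.job.tracker</name>
    <value>JobtrackerHost:9001</value>
  </property>
  <!-- Maximum concurrent map tasks per TaskTracker (example value). -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <!-- Reduce tasks tend to be I/O bound, so a small value such as 1 or 2 is common. -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
  <!-- JVM options for the per-task child virtual machines (example heap size). -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>
  </property>
</configuration>
```

Note that the per-TaskTracker maximums must be sized together with the child JVM heap so that the node's memory is not oversubscribed.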

    Interprocess Communications (IPC)

    Configuration Requirements: Network Requirements
    Hadoop Core uses Secure Shell (SSH) to launch the server processes on the slave nodes. Hadoop Core requires that passwordless SSH work between the master machines and all of the slave and secondary machines.

    Advanced Networking: Support for Multihomed Machines
    dfs.datanode.dns.interface: If set, this parameter is the name of the network interface to be used for HDFS transactions to the DataNode. The IP address of this interface will be advertised by the DataNode as its contact address.
    dfs.datanode.dns.nameserver: If set, this parameter is the hostname or IP address of a machine to use to perform a reverse host lookup on the IP address associated with the specified network interface.

    rsync is a Unix remote-synchronization command; it can be used to push the configuration files out to the other nodes.
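As an operational sketch of the passwordless-SSH requirement and the rsync step above, the commands might look like the following. The hostname slave01, the user's key location, and the /opt/hadoop path are all hypothetical placeholders:

```shell
# Sketch only: "slave01" and /opt/hadoop are assumed names, not defaults.

# 1. On the master, generate an SSH key pair with an empty passphrase:
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# 2. Install the public key on each slave so SSH no longer prompts:
ssh-copy-id slave01

# 3. Verify that passwordless login works (should print the slave's hostname
#    without asking for a password):
ssh slave01 hostname

# 4. Push the master's Hadoop configuration directory to the slave:
rsync -av /opt/hadoop/conf/ slave01:/opt/hadoop/conf/
```

Repeating step 4 for every slave (or looping over a host list) keeps all nodes' configuration files in sync with the master's copy.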

