java - Accessing files from other filesystems along with HDFS files in a Hadoop MapReduce application
I know that we can run a MapReduce job like a normal Java application. In my case, my MapReduce jobs have to deal with files on HDFS as well as files on other filesystems. Is it possible to use files from other filesystems while simultaneously using files on HDFS?
So basically my intention is this: I have one large file that I want to put in HDFS for parallel computing, and then compare the blocks of this file with some other files, which I do not want to put in HDFS because they need to be used each time as a whole, full-length file.
Yes, this is possible. To access a file from the local filesystem in your map/reduce tasks, add those files to the DistributedCache when you set up your job configuration. The MapReduce framework will distribute those files to the nodes and make sure they are accessible to your mappers, and it cleans them up when the job is done. Open and read the files in the configure() method, not in map(), because map() will be called many times.
    JobConf job = new JobConf();
    DistributedCache.addCacheFile(new URI("/myapp/lookup.dat#lookup.dat"), job);
    public void configure(JobConf job) {
        try {
            // cached archives/files
            Path[] localFiles = DistributedCache.getLocalCacheFiles(job);
            // open, read and store for use in the map phase
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
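To make those fragments concrete, below is a minimal, self-contained sketch of a mapper using the old org.apache.hadoop.mapred API that the snippets above imply. The lookup-file contents, the HashSet, and the exact comparison in map() are illustrative assumptions, not part of the original answer; adapt them to your own comparison logic.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.HashSet;
    import java.util.Set;

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class LookupMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {

        // Contents of the cached lookup file, loaded once per task in configure().
        private final Set<String> lookup = new HashSet<String>();

        @Override
        public void configure(JobConf job) {
            try {
                // Local paths of the copies the framework placed on this node.
                Path[] localFiles = DistributedCache.getLocalCacheFiles(job);
                BufferedReader reader =
                        new BufferedReader(new FileReader(localFiles[0].toString()));
                String line;
                while ((line = reader.readLine()) != null) {
                    lookup.add(line);
                }
                reader.close();
            } catch (IOException e) {
                throw new RuntimeException("Failed to read cached file", e);
            }
        }

        @Override
        public void map(LongWritable key, Text value,
                        OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            // Compare each input record from HDFS against the local lookup data.
            if (lookup.contains(value.toString())) {
                output.collect(value, new Text("match"));
            }
        }
    }

In the driver, the lookup file is registered before the job is submitted, as in the snippet above; using DistributedCache.addCacheFile(new Path("/myapp/lookup.dat").toUri(), job) avoids the checked URISyntaxException that new URI(...) can throw.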