I have spent many hours trying to get Hive running locally on macOS, but I couldn't make it work. Last time I got to the very end of this tutorial except for the last step. This time I got a little further, but bugs keep popping up:
$ hdfs dfs -mkdir /user
Cannot create directory /user. Name node is in safe mode.
$ hdfs dfsadmin -safemode leave
Safe mode is OFF
$ hdfs dfsadmin -safemode get
Safe mode is ON
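From what I read, the NameNode falls back into safe mode when blocks are under-replicated or the disk is nearly full, so forcing it off with `leave` does not stick. A few diagnostic commands I would try first (standard Hadoop 2.x tools; the output depends entirely on your cluster, so treat this as a sketch):

```shell
# Check why the NameNode keeps re-entering safe mode:
hdfs dfsadmin -report        # configured/remaining capacity per DataNode
hdfs fsck /                  # reports missing or under-replicated blocks

# If the report looks healthy, try leaving safe mode again and re-check
hdfs dfsadmin -safemode leave
hdfs dfsadmin -safemode get  # should now print "Safe mode is OFF"
```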
Anyway, I will try to record every step of my journey.
According to Quora, the minimum requirement for a local machine is 500 GB. This may be the reason I failed.
Download tar files from respective official sites:
- Oracle Java SE
- hadoop: http://hadoop.apache.org/releases.html
- hive
- derby
Bash command-line refresher
export varname=value # export a variable to the environment
env # display all environment variables; note that different shells have different default variables
cat .bash_profile # print a file to the terminal
less .bash_profile # page through a file, less overwhelming
echo $varname # display a variable's value, note the dollar sign
eval $fun # evaluate the string stored in $fun as a command
history # display command history
hash # display remembered command locations (hit counts and full paths)
pwd # same as echo $PWD, which is a built-in variable
let arg1=2 # assign a value; spaces around = are forbidden
let arg2=$arg1**3
echo $arg2
printf "result=%d\n" $arg2
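As a side note, modern bash style prefers `$(( ))` arithmetic expansion over `let`; the same cube computation can be written as (plain bash, nothing Hadoop-specific assumed):

```shell
#!/usr/bin/env bash
arg1=2                  # plain assignment; still no spaces around =
arg2=$(( arg1 ** 3 ))   # arithmetic expansion; the $ is optional inside (( ))
echo "$arg2"            # prints 8
printf "result=%d\n" "$arg2"
```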
Most user-installed compilers and commands live in
/usr/local/bin
PATH setup (add to ~/.bash_profile)
# setup environment for hadoop
export HADOOP_HOME=/usr/local/hadoop-2.8.0
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
# setup environment for hive
export HIVE_HOME=/usr/local/apache-hive-2.1.1-bin
export PATH=$PATH:$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:/usr/local/hadoop-2.8.0/lib/*:.
export CLASSPATH=$CLASSPATH:/usr/local/apache-hive-2.1.1-bin/lib/*:.
# setup environment for Derby
export DERBY_HOME=/usr/local/db-derby-10.13.1.1-bin
export PATH=$PATH:$DERBY_HOME/bin:$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:$DERBY_HOME/lib/derby.jar:$DERBY_HOME/lib/derbytools.jar
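After editing `.bash_profile`, reload it and sanity-check that the variables really landed in the environment. A minimal check (the paths are the ones assumed above; adjust to your install, and note the PATH test is a rough sketch):

```shell
# Reload the profile (skip silently if it does not exist on this machine)
[ -f ~/.bash_profile ] && source ~/.bash_profile
echo "HADOOP_HOME=$HADOOP_HOME"   # expect /usr/local/hadoop-2.8.0
echo "HIVE_HOME=$HIVE_HOME"       # expect /usr/local/apache-hive-2.1.1-bin
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "hadoop is on PATH" ;;
  *)                      echo "hadoop is NOT on PATH" ;;
esac
```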
Hadoop initialization and commands
cd /usr/local/hadoop-2.8.0/
hdfs namenode -format
sbin/start-dfs.sh # start Hadoop file system
# open http://localhost:50070/
sbin/start-yarn.sh
# open http://localhost:8088/
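To confirm the daemons actually came up, `jps` (it ships with the JDK) is handy; for a single-node setup I would expect to see NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager. A sketch, assuming the JDK's bin directory is on your PATH:

```shell
jps                               # lists running JVMs by name, one per line
# If NameNode is missing, dig into its log for the actual error:
ls $HADOOP_HOME/logs/             # find the *namenode*.log file
tail -50 $HADOOP_HOME/logs/*namenode*.log
```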
hadoop fs -mkdir /tmp
hadoop fs -mkdir -p ~/hive/warehouse # -p also creates parent dirs
hadoop fs -chmod 777 /user # change permission of file or folder
hdfs dfs -mkdir /user/hadoop # make folder
hdfs dfs -put a.csv /user/hadoop/a.csv # move from local to HDFS
hdfs dfs -ls /user/hadoop # list content of a folder
hdfs dfs -du /user/hadoop/ # display utilization (size)
hdfs dfs -get /user/hadoop/ /home/ # get from HDFS to local
hdfs dfs -cp /user/hadoop/folderA /user/hadoop/folderB # copy
hdfs dfs -rm -r <directory> # remove a directory recursively
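Tying a few of these together, a small round trip I use as a smoke test (requires HDFS started as above; `a.csv` is just a stand-in file name):

```shell
echo "id,name" > a.csv                  # make a tiny local file
hdfs dfs -mkdir -p /user/hadoop         # -p: create parents as needed
hdfs dfs -put a.csv /user/hadoop/a.csv  # copy local -> HDFS
hdfs dfs -cat /user/hadoop/a.csv        # should print: id,name
hdfs dfs -rm /user/hadoop/a.csv         # clean up
```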
Hive metastore_db initialization
schematool -initSchema -dbType derby # may fail
mv metastore_db metastore_db.tmp # move the stale metastore aside
schematool -initSchema -dbType derby # rerun
hive
show tables;
create table myGod (name string);
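If `create table` succeeds, the metastore is probably healthy. A non-interactive way to smoke-test it (`hive -e` runs a single statement and exits; `t_smoke` is a hypothetical throwaway table name):

```shell
hive -e 'show tables;'                   # list tables in the default database
hive -e 'create table t_smoke (x int);'  # should succeed if the metastore works
hive -e 'drop table t_smoke;'            # clean up the throwaway table
```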
Hive metastore configuration
Add the following to hive-site.xml:
<property>
<name>system:java.io.tmpdir</name>
<value>/usr/local/apache-hive-2.1.1-bin/iotmp</value>
<description/>
</property>
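For completeness, Hive 2.x configs commonly pair the tmpdir property above with a `system:user.name` property, since the stock hive-site.xml interpolates both; this is an assumption based on that common pairing, and it is harmless if unused:

```xml
<property>
<name>system:user.name</name>
<value>${user.name}</value>
<description/>
</property>
```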
Hive uses the Derby database by default. You may change it to MySQL by following the above link.