Tuesday, April 2, 2019

More big data terminology from the Apache Hadoop space

  1. Apache Parquet is a columnar data storage format: values are stored column by column rather than row by row.
  2. Apache Oozie (sounds like Uzi) is a workflow scheduler for Hadoop jobs.
  3. Apache Sqoop (sounds like scoop) is a command-line tool for moving data between relational (SQL) databases and Hadoop. The name comes from SQL-to-Hadoop.
  4. HDFS is the Hadoop Distributed File System, sometimes just called the Hadoop File System.
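
To make the "columns" idea behind Parquet a bit more concrete, here is a minimal sketch in plain Python. It is not Parquet's actual file format (real Parquet files also have row groups, encodings, compression, and schema metadata); it only contrasts a row-oriented layout with a column-oriented one. The sample records and the `to_columns` helper are made up for illustration.

```python
# Hypothetical sample records, invented for illustration only.
rows = [
    {"id": 1, "city": "Austin", "sales": 120},
    {"id": 2, "city": "Dallas", "sales": 95},
    {"id": 3, "city": "Austin", "sales": 210},
]


def to_columns(records):
    """Pivot row-oriented records into a column-oriented layout."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: summing sales means visiting every whole record.
total_from_rows = sum(record["sales"] for record in rows)

# Column layout: the same query reads just one list of values.
total_from_columns = sum(columns["sales"])

print(columns["city"])
print(total_from_rows, total_from_columns)
```

A query that needs only one field, like the sales total here, can skip the other columns entirely in the column layout, which is roughly why columnar formats tend to suit analytic workloads.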
