Monday, August 10, 2015

Reading Parquet files using parquet-tools

// Build parquet-tools (Maven writes the jar to the target/ directory)
git clone https://github.com/Parquet/parquet-mr.git
cd parquet-mr/parquet-tools/
mvn clean package -Plocal
// Print the schema of the Parquet file
java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar schema sample.parquet
// Print the full contents of the Parquet file
java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar cat sample.parquet
// Print only the first 5 records of the Parquet file
java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar head -n5 sample.parquet
// Print the metadata of the Parquet file
java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar meta sample.parquet
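
Since every command above repeats the same jar name, a small shell wrapper can cut down on typing. This is only a sketch: the function name pq and the variables PARQUET_TOOLS_JAR and JAVA_BIN are made up for this example and are not part of parquet-tools. The default jar name assumes the same build as in the commands above, so adjust it if your build produced a different version or lives in another directory.

```shell
# Hypothetical helper, not part of parquet-tools.
# PARQUET_TOOLS_JAR: path to the jar built above (assumed name, override as needed).
PARQUET_TOOLS_JAR="${PARQUET_TOOLS_JAR:-parquet-tools-1.6.0rc3-SNAPSHOT.jar}"
# JAVA_BIN: the Java launcher to use; defaults to the java found on PATH.
JAVA_BIN="${JAVA_BIN:-java}"

# pq <subcommand> [args...] forwards everything to the parquet-tools jar.
pq() {
  "$JAVA_BIN" -jar "$PARQUET_TOOLS_JAR" "$@"
}
```

With this loaded in the current shell, pq schema sample.parquet and pq head -n5 sample.parquet run the same commands as the longer forms above.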
