parquet()

Description:

Retrieve data from a Parquet file.

Syntax:

f.parquet([col,…];[filter];[n])

Note:

External library function (see External Library Guide).

The function retrieves data from a local Parquet file or a Parquet file in HDFS.

Parameter:

f

A file object.

col

The fields to retrieve; all fields are returned by default.

filter

A filter condition built from operators such as >, >=, <, <=, =, !=, not, in, and like. This parameter does not take effect when the @v option is used.

n

A positive integer specifying the number of records to retrieve; all records are returned when this parameter is absent. This parameter does not take effect when the @c option is used.

Option:

@c

Return a cursor.

@m

Retrieve data using multiple threads. When the @c option is also present, retrieve data through multiple cursors.

@v

Retrieve data column-wise, which is more efficient for large volumes of data; the default is row-wise retrieval. Column-wise retrieval does not support compound data types.
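
The interaction between the options and parameters above can be sketched as follows. This grid is illustrative only and is not one of the original examples: the file path is borrowed from the Example section, the field names id and product are assumed to exist in that file, and the sketch assumes that col and filter still apply under @c, which the parameter notes imply but the examples do not demonstrate.

```
     A
1    =file("F:/tmp/mytest.parquet")           Open a Parquet file (fields id and product are assumed).
2    =A1.parquet@c("id","product";"id < 20")  Return a cursor over the two fields, filtered by id < 20.
3    =A2.fetch(10)                            Fetch the first 10 records from the cursor.
```

Because n does not take effect under @c, the record count is limited with fetch() on the cursor rather than with the third parameter. The filter is omitted from any @v call for the same reason: it does not take effect there.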

Return value:

Table sequence, or a cursor when the @c option is used

Example:

|    | A | Description |
|----|---|-------------|
| 1  | =file("F:/tmp/mytest.parquet") | Open a local Parquet file. |
| 2  | =A1.parquet() | Retrieve data from A1’s file and return all fields. |
| 3  | =A1.parquet@v() | Retrieve data column-wise. |
| 4  | =file("hdfs://localhost:9000/user/hive/warehouse/test1.parquet") | Open a Parquet file in HDFS. |
| 5  | =A4.parquet("id","product","store";"id < 20";10) | Retrieve the specified fields and return only the first 10 records that meet the filter condition. |
| 6  | =hive_open("hdfs://localhost:9000","thrift://localhost:9083","hive","asus") | Connect to the Hive database. |
| 7  | =hive_table@p(A6) | Find all Parquet tables. |
| 8  | =A7.select(tableName=="myParquet") | Select the myParquet table. |
| 9  | =file(A8.location) | Open the table’s file in HDFS. |
| 10 | =A9.parquet(;;10) | Retrieve data from A9’s file and return only the first 10 records. |
| 11 | =A9.parquet@cm() | Retrieve data using multiple cursors. |
| 12 | =A11.fetch(10) | Fetch the first 10 records from the cursor; same result as A10. |
| 13 | =hive_close(A6) | Close the Hive connection. |