Description:
Reads a file stored in a local Spark database or on a Spark standalone cluster.
Syntax:
spark_read(con,sfile,k:v,...)
Note:
External library function (See External Library Guide).
The function reads the content of a file stored in a local Spark database or on a Spark standalone cluster, and returns the result as a table sequence.
Parameter:
con |
Database connection string, which can be a local connection or a connection to a Spark standalone cluster. |
sfile |
File name. |
k:v |
Separator setting for txt and csv files. For example, to set "#" as the separator of a txt file, write the parameter as "sep":"#". By default, a comma is used as the separator for a txt file and a semicolon for a csv file. |
Option:
@c |
Read the content of a file and return the result as a cursor. |
@t |
Read the first row of a text file as field names; by default, the automatically generated field names _c0, _c1… are used. |
@x |
Close Spark database connection. |
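Assuming options compose as they do elsewhere in esProc, a hypothetical cell (file name invented for illustration) could pair @t and @c to read the first row as field names and return a cursor:

=spark_read@tc(A1,"D:/emp.csv","sep":",")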
Return value:
Table sequence/Cursor
Example:
|
A |
|
1 |
=spark_open() |
Connect to a local Spark database. |
2 |
=spark_read(A1,"D:/people.txt","sep":" ") |
Read a txt file that uses a space as the separator. |
3 |
=spark_read@c(A1,"D:/student.csv","sep":",") |
Read a comma-separated csv file and return a cursor. |
4 |
=spark_read@t(A1,"D:/score.txt","sep":"\t") |
Read a tab-separated txt file and use the first row as field names. |
5 |
=spark_read(A1,"D:/people.json") |
Read the content of the file people.json; no separator setting is needed. |
6 |
>spark_close(A1) |
Close the Spark database connection. |
7 |
=spark_open("spark.properties") |
Connect to a Spark database. |
8 |
=spark_read@x(A7,"hdfs://localhost:9000/user/hive/warehouse/people.csv") |
Read people.csv stored in the Spark database and then close the connection. |