Spark DataFrame Examples: Reading and Writing DataFrames

WIP Alert: This is a work in progress. The current information is correct, but more content may be added in the future.

Read local table

TODO

Write to local table

TODO

Read parquet from S3

To read a Parquet file on S3 into a Spark DataFrame, use spark.read.parquet:

val df = spark.read.parquet("s3://path/to/df.parquet")
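For context, here is a minimal sketch of the same read as a standalone application. It assumes the S3 connector and credentials are already configured on the cluster; the object name and app name are hypothetical, and the path is the placeholder from the example above. In spark-shell or a notebook, `spark` already exists and the builder call can be skipped.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical object name, for illustration only.
object ReadParquetFromS3 {
  def main(args: Array[String]): Unit = {
    // Create (or reuse) a session; the app name is an arbitrary label.
    val spark = SparkSession.builder()
      .appName("read-parquet-from-s3")
      .getOrCreate()

    // Placeholder path from the example above; replace with a real bucket and key.
    val df: DataFrame = spark.read.parquet("s3://path/to/df.parquet")

    df.printSchema() // the schema comes from the Parquet metadata
    df.show(5)       // print the first 5 rows

    spark.stop()
  }
}
```

The s3:// scheme is the one typically used on Amazon EMR. On a plain Apache Hadoop setup, the s3a:// scheme (from the hadoop-aws module) is the usual choice, so the prefix may need to change depending on the environment.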

Write parquet to S3

To save a DataFrame as Parquet on S3, use df.write.parquet():

val df = ....

df.write.parquet("s3://path/to/df.parquet")
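The write can be sketched end to end as follows. The sample rows and column names are hypothetical and stand in for the elided DataFrame; the path is the placeholder used above, and the S3 connector is assumed to be configured already.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name, for illustration only.
object WriteParquetToS3 {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("write-parquet-to-s3")
      .getOrCreate()
    import spark.implicits._ // enables .toDF on local collections

    // Hypothetical sample data standing in for the elided DataFrame.
    val df = Seq((1, "alice"), (2, "bob")).toDF("id", "name")

    // No mode is set, so the default (error if the path exists) applies:
    // the write fails if something is already at this path.
    df.write.parquet("s3://path/to/df.parquet")

    spark.stop()
  }
}
```

Note that Spark writes a directory at the given path containing one or more part files, rather than a single file named df.parquet.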

Using overwrite:

Using mode("overwrite") can cause unexpected errors or stale results, because some old partitions may not get overwritten:

val df = ...

df.write.mode("overwrite").parquet("s3://path/to/df.parquet")
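One setting that affects which partitions an overwrite replaces is spark.sql.sources.partitionOverwriteMode (available since Spark 2.3). The sketch below shows where it fits in a partitioned write. It is only an assumption that this setting relates to the problems described above; the sample data and the partition column `day` are hypothetical.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// Hypothetical object name, for illustration only.
object OverwriteParquetOnS3 {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("overwrite-parquet-on-s3")
      .getOrCreate()
    import spark.implicits._

    // "static" (the default): existing data under the path is removed before writing.
    // "dynamic": only partitions that receive new rows are replaced; others are kept.
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "static")

    // Hypothetical sample data with a partition column.
    val df = Seq((1, "2020-01-01"), (2, "2020-01-02")).toDF("id", "day")

    df.write
      .mode(SaveMode.Overwrite) // same as mode("overwrite")
      .partitionBy("day")       // one subdirectory per distinct day value
      .parquet("s3://path/to/df.parquet")

    spark.stop()
  }
}
```

If old partitions are expected to disappear after an overwrite, it may be worth checking that this setting has not been switched to "dynamic" somewhere in the cluster or job configuration.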
