Spark SQL: Examples on pyspark
Table of Contents
- SparkContext
- Create SQLContext from SparkContext
- Load json file into DataFrame
- where and select
- join example
- groupBy example
- RDD Operations
WIP Alert: This is a work in progress. The current information is correct, but more content may be added in the future.
SparkContext
Skip this step if sc is already available to you.
from pyspark import SparkContext
sc = SparkContext("local", "Simple App")
Create SQLContext from SparkContext
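The SQLContext is the entry point for Spark SQL operations such as creating DataFrames. Since Spark 2.0, SparkSession is the preferred entry point; SQLContext is kept for backward compatibility.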
from pyspark.sql import SQLContext
# sc is the sparkContext
sqlContext = SQLContext(sc)
Load json file into DataFrame
TODO
where and select
TODO
join example
TODO
groupBy example
TODO
RDD Operations
TODO