GitHub - essien1990/Apache-Spark: Batch Processing using Apache Spark and Python for data exploration

essien1990/Apache-Spark


Apache Spark Using Python3 for data analysis

  • Batch Processing using Apache Spark and Python3 for data exploration
  • The dataset was downloaded from https://www.kaggle.com/
  • Focuses on the PySpark SQL libraries
    • from pyspark.sql.types import BooleanType
    • from pyspark.sql.functions import udf
    • from pyspark.sql import functions as F
    • from pyspark.sql import SparkSession
    • from pyspark.sql import Window

