Read Multiple CSV Files
To read multiple CSV files at once, pass a Python list of file paths (each given as a string) to spark.read.csv().
Python3
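from pyspark.sql import SparkSession

# Create (or reuse) a Spark session
spark = SparkSession.builder.appName('Read Multiple CSV Files').getOrCreate()

# List of CSV file paths to read together
path = ['/content/authors.csv', '/content/book_author.csv']
files = spark.read.csv(path, sep=',', inferSchema=True, header=True)

# Convert to pandas for inspection; display() is available in notebook
# environments such as Jupyter or Colab
df1 = files.toPandas()
display(df1.head())
display(df1.tail())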
Here, we read authors.csv and book_author.csv, both located in the /content directory, using a comma (',') as the delimiter and treating the first row of each file as the header.
PySpark – Read CSV file into DataFrame
In this article, we are going to see how to read CSV files into a DataFrame. For this, we will use PySpark and Python.
Files Used:
- authors
- book_author
- books