Read Multiple CSV Files

To read multiple CSV files at once, pass a Python list of the file paths (as strings) to spark.read.csv.

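from pyspark.sql import SparkSession

# Create (or reuse) a Spark session
spark = SparkSession.builder.appName('Read Multiple CSV Files').getOrCreate()

# List of CSV file paths to read in a single call
paths = ['/content/authors.csv',
         '/content/book_author.csv']

# Read every listed file into one Spark DataFrame
files = spark.read.csv(paths, sep=',',
                       inferSchema=True, header=True)

# Convert to pandas and preview the first and last rows
# (display() is available in notebook environments; use print() elsewhere)
df1 = files.toPandas()
display(df1.head())
display(df1.tail())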


Here, we read authors.csv and book_author.csv, both located in the /content directory, using a comma (',') as the delimiter and treating the first row of each file as the header. Spark loads the rows from both files into a single DataFrame.
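
To try the same pattern without the files used above, the following is a minimal sketch that writes two small CSV files to a temporary directory and reads them with one spark.read.csv call. The file names (part1.csv, part2.csv), the id and name columns, the sample rows, and the helper names write_sample_csvs, read_csv_files and demo are all made up for illustration; they are not part of the original example.

```python
import csv
import tempfile
from pathlib import Path


def write_sample_csvs(directory):
    """Write two small CSV files that share a header; return their paths as strings."""
    # Hypothetical sample data, used only for this illustration.
    samples = {
        "part1.csv": [("id", "name"), (1, "alpha"), (2, "beta")],
        "part2.csv": [("id", "name"), (3, "gamma")],
    }
    paths = []
    for filename, rows in samples.items():
        file_path = Path(directory) / filename
        with open(file_path, "w", newline="") as handle:
            csv.writer(handle).writerows(rows)
        paths.append(str(file_path))
    return paths


def read_csv_files(spark, paths):
    """Read every path in the list into one Spark DataFrame."""
    return spark.read.csv(paths, sep=",", inferSchema=True, header=True)


def demo():
    """Create the sample files, read them together, and print the result."""
    # Imported here so the helpers above can be used without starting Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("Read Multiple CSV Files").getOrCreate()
    with tempfile.TemporaryDirectory() as tmp_dir:
        paths = write_sample_csvs(tmp_dir)
        df = read_csv_files(spark, paths)
        # Run the actions while the temporary files still exist.
        df.show()
        print("rows read:", df.count())
    spark.stop()
```

Calling demo() in a notebook cell or script runs the whole sketch. Both sample files use the same header on purpose: when several files are read into one DataFrame, they are expected to share the same column layout.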

PySpark – Read CSV file into DataFrame

In this article, we will see how to read CSV files into a DataFrame using PySpark and Python.

Files Used:

  • authors
  • book_author
  • books
