Loading Multiple CSV Files

The real value of the make_csv_dataset() function shows when we have to import multiple CSV files into a single dataset. We will use the fonts dataset, which contains data for many different fonts, split across one CSV file per font.

Example: In this example, we use the Keras get_file() utility to download the dataset archive to disk; the cache_dir and cache_subdir arguments define where the extracted files are stored.

Once the files are saved, the file_pattern argument of make_csv_dataset() lets us pass a glob pattern matching all the files to be imported. Create a new file and execute the following code:

Python3




import tensorflow as tf

# Download the fonts dataset archive and extract the CSV files into ./fonts
fonts = tf.keras.utils.get_file('fonts.zip',
    "https://archive.ics.uci.edu/ml/machine-learning-databases/00417/fonts.zip",
    cache_dir='.', cache_subdir='fonts',
    extract=True)
  
# Read every CSV file matching the pattern into a single batched dataset
fonts_data = tf.data.experimental.make_csv_dataset(
    file_pattern="fonts/*.csv",
    batch_size=10, num_epochs=1,
    num_parallel_reads=4,
    shuffle_buffer_size=10000)
  
# Take one batch and print the first few feature names with their values
for features in fonts_data.take(1):
    for i, (name, value) in enumerate(features.items()):
        if i > 15:
            break
        print(f"{name:20s}: {value}")
    print(f"[total: {len(features)} features]")


We display the first few features of the batch along with their values. The total number of features is obtained with the len() function; in this example, there are 412 features in total.

Output:

The batch prints as a dictionary of feature names and their tensor values, ending with the line [total: 412 features].
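The same call can also separate out a label column, which is what you would typically want when feeding the data straight into model training. The snippet below is a minimal sketch, assuming the fonts CSVs contain a 'font' column (the font name) along with columns such as 'fontVariant', 'strength', and 'italic': label_name makes each batch a (features, label) pair, and select_columns restricts parsing to the listed columns.

Python3


import tensorflow as tf

# A minimal sketch, assuming the fonts CSVs contain the columns listed below.
# label_name makes each element a (features, label) pair and select_columns
# limits parsing to just the named columns.
fonts_labeled = tf.data.experimental.make_csv_dataset(
    file_pattern="fonts/*.csv",
    batch_size=10, num_epochs=1,
    label_name='font',
    select_columns=['font', 'fontVariant',
                    'strength', 'italic'],
    num_parallel_reads=4,
    shuffle_buffer_size=10000)

# Each batch is now a feature dictionary plus a matching batch of labels.
for features, labels in fonts_labeled.take(1):
    print("labels:", labels)
    for name, value in features.items():
        print(f"{name:12s}: {value}")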
