Loading Multiple CSV Files
The primary use of the make_csv_dataset() function is importing multiple CSV files into a single dataset. We will use the fonts dataset, which contains features of characters rendered in different fonts.
Example: In this example, we use the Keras get_file utility to download the dataset archive to disk; the cache_dir and cache_subdir arguments define where it is stored.
Once the files are saved, the file_pattern argument of make_csv_dataset lets us specify a glob pattern matching all the files to be imported. Create a new file and execute the following code:
Python3
import tensorflow as tf

fonts = tf.keras.utils.get_file(
    'fonts.zip',
    "https://archive.ics.uci.edu/ml/machine-learning-databases/00417/fonts.zip",
    cache_dir='.',
    cache_subdir='fonts',
    extract=True
)

fonts_data = tf.data.experimental.make_csv_dataset(
    file_pattern="fonts/*.csv",
    batch_size=10,
    num_epochs=1,
    num_parallel_reads=4,
    shuffle_buffer_size=10000
)

for features in fonts_data.take(1):
    for i, (name, value) in enumerate(features.items()):
        if i > 15:
            break
        print(f"{name:20s}: {value}")
    print(f"[total: {len(features)} features]")
We display the first 16 feature columns of the batch (the loop stops once i exceeds 15) along with their values. The total feature count is obtained with the len() function; in this example, the dataset has 412 features in total.
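To see the multi-file behavior of make_csv_dataset without downloading the fonts archive, here is a minimal, self-contained sketch. The file names and column names (part_0.csv, feature_a, feature_b, label) are our own invention for illustration; the point is that a glob passed to file_pattern pulls every matching CSV into one dataset:

```python
import os
import tempfile

import tensorflow as tf

# Write two small CSV files with the same schema (hypothetical data).
csv_dir = tempfile.mkdtemp()
for idx in range(2):
    path = os.path.join(csv_dir, f"part_{idx}.csv")
    with open(path, "w") as f:
        f.write("feature_a,feature_b,label\n")
        for row in range(5):
            f.write(f"{row},{row * 2},{row % 2}\n")

# file_pattern accepts a glob, so both files feed a single dataset.
dataset = tf.data.experimental.make_csv_dataset(
    file_pattern=os.path.join(csv_dir, "*.csv"),
    batch_size=4,
    num_epochs=1,
    label_name="label",  # split this column off as the label
    shuffle=False,
)

# Each element is (features_dict, labels_batch).
features, labels = next(iter(dataset))
print(sorted(features.keys()))  # feature columns, minus the label
print(labels.shape)             # one label per row in the batch
```

Note that label_name splits one column off as a separate label tensor, which is convenient when the dataset feeds directly into model training.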
Output: the printed feature names and values, ending with [total: 412 features].