Listing and Reading Files

Let’s now prepare the document data and read the context in the data.

Python3

# Get a list of student files 
student_file = [file for file in os.listdir() if file.endswith('.txt')] 
  
# Read the content of each student's file 
student_docs = [open(file).read() for file in student_file] 
  
# Print the list of student files and their content 
for filename, document in zip(student_file, student_docs): 
    print(f"File: {filename}") 
    print("Content:") 
    print(document) 
    print("-" * 30)  # Separator between documents 

output:

File: fatma.txt
Content:
Lorem Ipsum is simply dummy text of the printing and typesetting industry.
Lorem Ipsum has been the industry's
standard dummy text ever since the 1500s,
------------------------------
File: john.txt
Content:
t is a long established fact that a reader will be distracted by the readable content of
a page when looking at its layout.
The point of using Lorem Ipsum
------------------------------
File: juma.txt
Content:
t is a long established fact that a reader will
be distracted by the readable content of a
page when looking at its layout. The point of using Lorem Ipsum
------------------------------

Here, in this code, it collects a list of the student text files, reads their content and prints both the file names and their respective content, making it useful for inspecting and working with the content of the files.

Plagiarism Detection using Python

In this article, we are going to learn how to check plagiarism using Python.

Plagiarism: Plagiarism basically refers to cheating. It means stealing someone’s else work, ideas, or information from the resources without providing the necessary credits to the author. For example, copying text from different resources from word to word without mentioning any quotation marks.

Table of Content

What is Plagiarism detection?
Importing Libraries
Listing and Reading Files
TF-IDF Vectorization
Calculating Cosine Similarity
Creating Document-vector Pairs
Checking Plagiarism
Word Cloud Visualization
Conclusion

Listing and Reading Files

Python3

Plagiarism Detection using Python

Table of Content

Similar Reads

Contact Us