Declaring Custom Item Loader Processors
Just like Items, Item Loaders can be declared using class syntax. The declaration looks as follows:
Python3
# Import the Item Loader class
from scrapy.loader import ItemLoader
# Import the processors (older Scrapy releases exposed them
# as scrapy.loader.processors)
from itemloaders.processors import TakeFirst, MapCompose, Join


# Extend the ItemLoader class
class BookLoader(ItemLoader):
    # Default output processor, used by every field that
    # does not declare its own <field>_out processor
    default_output_processor = TakeFirst()
    # Input processor for book name
    book_name_in = MapCompose(str.title)
    # Output processor for book name
    book_name_out = Join()
    # Input processor for book price
    book_price_in = MapCompose(str.strip)

The code can be understood as follows:
- The BookLoader class extends ItemLoader.
- The default_output_processor is a TakeFirst instance, so any field without its own output processor returns the first non-empty value collected for it.
- The book_name_in attribute holds a MapCompose instance with the function str.title, which is applied to every value collected for the book_name field. (In Python 3, str replaces the Python 2 unicode type.)
- The book_name_out attribute is a Join instance, which joins the collected values into a single string.
- The book_price_in attribute holds a MapCompose instance with the function str.strip, which is applied to every value collected for the book_price field.
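To make the effect of these processors concrete, here is a minimal sketch in plain Python that imitates what MapCompose, Join, and TakeFirst do to a list of extracted values. The helpers map_compose, join_values, and take_first are hypothetical stand-ins written for illustration, not Scrapy's own implementation, and the sample strings are invented.

```python
# Simplified stand-ins for the Scrapy processors (illustrative only,
# not the library code).

def map_compose(*functions):
    """Build a processor that applies each function to every value in turn."""
    def process(values):
        for func in functions:
            results = []
            for value in values:
                result = func(value)
                # Values for which a function returns None are dropped
                if result is not None:
                    results.append(result)
            values = results
        return values
    return process


def join_values(values, separator=" "):
    """Join all collected values into one string, like Join()."""
    return separator.join(values)


def take_first(values):
    """Return the first value that is neither None nor empty, like TakeFirst()."""
    for value in values:
        if value is not None and value != "":
            return value
    return None


# book_name: input processor str.title, output processor Join
raw_names = ["the great", "gatsby"]  # invented sample data
book_name = join_values(map_compose(str.title)(raw_names))

# book_price: input processor str.strip, default output processor TakeFirst
raw_prices = ["  ", " 10.99 "]  # invented sample data
book_price = take_first(map_compose(str.strip)(raw_prices))

print(book_name)   # The Great Gatsby
print(book_price)  # 10.99
```

The input processor runs on each extracted value as it is collected, and the output processor runs once on the whole list when the item is loaded. Because book_price declares no output processor of its own, the default TakeFirst behaviour is what picks "10.99" and skips the empty string left after stripping.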
Scrapy - Item Loaders
In this article, we are going to discuss Item Loaders in Scrapy.
Scrapy extracts data using spiders that crawl through a website. The extracted data can then be stored and processed as Scrapy Items. Item Loaders play a significant role in parsing and cleaning that data before it populates the Item fields.