BeautifulSoup CSS selector – Selecting nth child
In this article, we will see how BeautifulSoup can be used to select the nth child of an element. For this, the select() method of the module is used. select() relies on the SoupSieve package to run a CSS selector against the parsed document.
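The nth-child() Selector
Syntax: select("css_selector")
CSS selectors:
- nth-of-type(n): Selects an element that is the nth sibling of its own type. For example, p:nth-of-type(2) matches the second <p> child of a parent.
- nth-child(n): Selects an element that is the nth child of its parent, counting siblings of every type. For example, p:nth-child(2) matches a <p> only if it is the second child overall.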
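The difference between the two selectors is easiest to see when the siblings are of mixed types. The following is a minimal sketch; the markup is a made-up example, not taken from any real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one <h1> followed by two <p> siblings.
html = "<div><h1>Title</h1><p>first</p><p>second</p></div>"
soup = BeautifulSoup(html, "html.parser")

# The 2nd <p> among the <p> siblings (the <h1> is not counted).
by_type = soup.select("p:nth-of-type(2)")

# A <p> that is the 2nd child of its parent (the <h1> is counted).
by_child = soup.select("p:nth-child(2)")

print([p.get_text() for p in by_type])   # ['second']
print([p.get_text() for p in by_child])  # ['first']
```

Because the <h1> occupies the first child position, the two selectors return different paragraphs for the same n.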
Access the nth Child Element in BeautifulSoup
There are various ways to access the nth child element with BeautifulSoup. Here we discuss two commonly used approaches:
- Extracting the 2nd <b> element from an HTML string
- Extracting a specific element from a webpage
In this approach, the first step is to import the required module. Next, the data is obtained, either from an HTML string or by requesting a webpage. The resulting string is then parsed into a BeautifulSoup object so that it can be navigated. To pinpoint a specific element within the HTML structure, the find() function is used; it locates a tag by criteria such as class name, ID, or tag name. The select() method is then called on that element with an nth-of-type or nth-child selector.
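As a rough illustration of the find() step, the sketch below looks up a tag by class name, by ID, and by tag name. The markup, class names, and ID are assumptions made up for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for scraped page content.
html = """
<div id="main" class="wrapper">
  <p class="intro">Hello</p>
  <span>World</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

by_class = soup.find(class_="intro")  # first tag with class "intro"
by_id = soup.find(id="main")          # first tag with id "main"
by_tag = soup.find("span")            # first <span> tag

print(by_class.get_text(), by_id.name, by_tag.get_text())
```

Note that find() returns None when nothing matches, so it is worth checking the result before calling select() on it.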
Extracting the Second Element from HTML
In this example, the Python code uses the BeautifulSoup module to parse HTML markup containing nested elements. It finds the first element with the class "coding", then prints the 2nd <b> element inside it using the nth-of-type selector and the <b> element that is the 2nd child of the parent using the nth-child selector. The result shows that the two selectors can pick different elements from the same HTML structure.
Python3
# importing module
from bs4 import BeautifulSoup

markup = """
<html>
    <head>
        <title>Beginner FOR Beginner EXAMPLE</title>
    </head>
    <body>
        <p class="1"><b>Beginner for Beginner</b></p>
        <p class="coding">A Computer Science portal for Beginner.
            <h1>Heading</h1>
            <b class="gfg">Programming Articles</b>,
            <b class="gfg">Programming Languages</b>,
            <b class="gfg">Quizzes</b>;
        </p>
        <p class="coding">practice</p>
    </body>
</html>
"""

# parsing the string into a BeautifulSoup tree
soup = BeautifulSoup(markup, 'html.parser')

# first element with class "coding"
parent = soup.find(class_="coding")

# assign n
n = 2

# print the 2nd <b> of parent
print(parent.select("b:nth-of-type(" + str(n) + ")"))
print()

# print the <b> which is the 2nd child of the parent
print(parent.select("b:nth-child(" + str(n) + ")"))
Output:
[<b class="gfg">Programming Languages</b>]

[<b class="gfg">Programming Articles</b>]
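The same lookup can also be written as a single selector, without a separate find() call. The sketch below assumes a simplified, hypothetical markup (a <div> with class "coding") and uses select_one(), which returns the first match or None.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a heading followed by three <b> siblings.
markup = """
<div class="coding">Intro
  <h1>Heading</h1>
  <b>Articles</b>, <b>Languages</b>, <b>Quizzes</b>
</div>
"""
soup = BeautifulSoup(markup, "html.parser")
n = 2

# "div.coding > b" restricts the match to direct <b> children of the div.
second_b = soup.select_one(f"div.coding > b:nth-of-type({n})")
child_b = soup.select_one(f"div.coding > b:nth-child({n})")

print(second_b.get_text())  # Languages
print(child_b.get_text())   # Articles
```

Building the selector with an f-string keeps n easy to change and avoids the explicit str() conversion.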
Extracting a Specific Element from a Webpage
In this example, the Python code uses the BeautifulSoup library to scrape a w3wiki webpage. It imports the necessary modules, requests the page content, and parses the HTML. The code then finds the first element with the class "wrapper" and, with n = 1, prints the <b> elements inside it that match the nth-of-type and nth-child selectors.
Python3
# importing modules
from bs4 import BeautifulSoup
import requests

# assign website
# (sample_website must hold the URL of the page to scrape)
page = requests.get(sample_website)

# parsing the response content into a BeautifulSoup tree
soup = BeautifulSoup(page.content, 'html.parser')

# first element with class "wrapper"
parent = soup.find(class_="wrapper")

# assign n
n = 1

# print the nth <b> of parent (here the 1st)
print(parent.select("b:nth-of-type(" + str(n) + ")"))
print()

# print the <b> which is the nth child of the parent (here the 1st)
print(parent.select("b:nth-child(" + str(n) + ")"))
Output:
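On a real page the class being searched for may be absent, in which case find() returns None and calling select() on it raises an error. One possible way to guard against that is sketched below. The helper name select_nth and the inline HTML are hypothetical; the HTML stands in for page.content so that the sketch runs without a network request.

```python
from bs4 import BeautifulSoup


def select_nth(html, parent_class, tag, n):
    """Return (nth-of-type matches, nth-child matches) found under the
    first element with class `parent_class`, or ([], []) if none exists."""
    soup = BeautifulSoup(html, "html.parser")
    parent = soup.find(class_=parent_class)
    if parent is None:
        # No such element on the page: nothing to select from.
        return [], []
    return (
        parent.select(f"{tag}:nth-of-type({n})"),
        parent.select(f"{tag}:nth-child({n})"),
    )


# Inline stand-in for page.content (assumption, not a real page).
page_html = '<div class="wrapper"><b>one</b><i>x</i><b>two</b></div>'

of_type, of_child = select_nth(page_html, "wrapper", "b", 1)
missing = select_nth(page_html, "no-such-class", "b", 1)

print([t.get_text() for t in of_type])
print([t.get_text() for t in of_child])
print(missing)
```

With a live page, the first argument would be page.content from the requests call shown earlier.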