How to Scrape NBA starting lineups and create a Pandas DataFrame?

Asked by dhanan_7781 in Python, asked on Nov 1, 2023

I am having trouble parsing the code for the NBA starting lineups and would love some help if possible.

Here is my code so far:

import requests
from bs4 import BeautifulSoup

# url holds the address of the lineups page (not shown in this post)
soup = BeautifulSoup(requests.get(url).text, "html.parser")
lineups = soup.find_all(class_='lineup__player')
print(lineups)

I am looking for the following data: 

  1. Player
  2. Team
  3. Position

I was hoping to scrape the data and then create a Pandas Dataframe from the output.

Here is an example of my desired output:

Player             Team   Position
Dennis Schroder    BOS    PG
Robert Langford    BOS    SG
Jayson Tatum       BOS    SF
Jabari Parker      BOS    PF
Grant Williams     BOS    C

Player             Team   Position
Kyle Lowry         MIA    PG
Duncan Robinson    MIA    SG
Jimmy Butler       MIA    SF
P.J. Tucker        MIA    PF
Bam Adebayo        MIA    C
...                ...    ...

I was able to find the Player data but was unable to successfully parse it. I can see the Player data located inside 'Title'.

Any tips on how to complete this project will be greatly appreciated. Thank you in advance for any help that you may offer.

I am just looking for the 5 starting players... no need to add the bench players. And not sure if there is some way to add a space in between each team like my output above.

Here is an example of the current output that I would like to parse

Answered by Dhananjay Singh

You can scrape this data from the Rotowire NBA lineups page and organize it into a pandas DataFrame with a short Python script. The script uses the Beautiful Soup library to parse the page and locate the information. It begins by importing the required libraries and expects url to hold the address of the lineups page. It then uses BeautifulSoup to find all elements with the class 'lineup__player'.

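For each player, it extracts the player's name, team, and position and stores them in a dictionary. The dictionaries are collected in a list, one per player. Note that find() returns None when no matching element exists, so the class names in the script must match the page's actual markup.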
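The per-player step can be tried offline on a small HTML fragment before pointing it at the live page. The sketch below is only an illustration: the markup in SAMPLE_HTML is hypothetical (it reuses the class names from this answer and the 'title' attribute mentioned in the question), and the helper names text_of and extract_player are made up for this example. It shows one way to avoid an AttributeError when an element is missing.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page may be structured differently.
SAMPLE_HTML = """
<ul>
  <li class="lineup__player">
    <span class="lineup__player-pos">PG</span>
    <a class="lineup__player-name" title="Dennis Schroder">D. Schroder</a>
    <span class="lineup__player-team">BOS</span>
  </li>
  <li class="lineup__player">
    <span class="lineup__player-pos">PG</span>
    <a class="lineup__player-name">K. Lowry</a>
  </li>
</ul>
"""


def text_of(parent, css_class):
    """Return the stripped text of the first child with css_class, or None."""
    node = parent.find(class_=css_class)
    return node.get_text(strip=True) if node is not None else None


def extract_player(tag):
    """Build one row dict from a player element, tolerating missing pieces."""
    name_node = tag.find(class_="lineup__player-name")
    name = None
    if name_node is not None:
        # Prefer the full name in the title attribute, else fall back to the visible text.
        name = name_node.get("title") or name_node.get_text(strip=True)
    return {
        "Player": name,
        "Team": text_of(tag, "lineup__player-team"),
        "Position": text_of(tag, "lineup__player-pos"),
    }


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
rows = [extract_player(tag) for tag in soup.find_all(class_="lineup__player")]
print(rows)
```

Rows with a None value make it easy to spot which class name or attribute does not match the page, instead of the script stopping at the first missing element.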

The list is then converted into a pandas DataFrame with pd.DataFrame(), which makes the data easy to manipulate and analyze. You can adjust the column names and the extracted fields to suit your requirements. Here is the complete script:

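import requests
from bs4 import BeautifulSoup
import pandas as pd

url = "..."  # address of the lineups page (not given in the original post)
soup = BeautifulSoup(requests.get(url).text, "html.parser")

# Each element with this class is treated as one player entry
lineups = soup.find_all(class_='lineup__player')

data = []
for player in lineups:
    # Check these class names against the page's HTML; find() returns None if one is absent
    player_name = player.find(class_='lineup__player-name').get_text(strip=True)
    team = player.find(class_='lineup__player-team').get_text(strip=True)
    position = player.find(class_='lineup__player-pos').get_text(strip=True)
    data.append({'Player': player_name, 'Team': team, 'Position': position})

# One row per player, with the columns Player, Team and Position
df = pd.DataFrame(data)
print(df)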
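The question also asks for only the five starters and for a blank line between teams. A minimal sketch of one way to do that with pandas follows. It assumes the scraped rows arrive in page order with each team's starters listed before its bench players, which should be checked against the actual page. The helper names starters_only and format_by_team, the "BE" position label, and the "Bench Player" rows are placeholders made up for this example.

```python
import pandas as pd

# Sample rows in assumed page order: five starters per team, then bench players.
names = {
    "BOS": ["Dennis Schroder", "Robert Langford", "Jayson Tatum",
            "Jabari Parker", "Grant Williams", "Bench Player A"],
    "MIA": ["Kyle Lowry", "Duncan Robinson", "Jimmy Butler",
            "P.J. Tucker", "Bam Adebayo", "Bench Player B"],
}
positions = ["PG", "SG", "SF", "PF", "C", "BE"]
data = [
    {"Player": name, "Team": team, "Position": pos}
    for team, players in names.items()
    for name, pos in zip(players, positions)
]
df = pd.DataFrame(data)


def starters_only(frame, per_team=5):
    """Keep the first per_team rows of each team, preserving the original order."""
    return frame.groupby("Team", sort=False).head(per_team).reset_index(drop=True)


def format_by_team(frame):
    """Render one text table per team, separated by a blank line."""
    blocks = [
        group.to_string(index=False)
        for _, group in frame.groupby("Team", sort=False)
    ]
    return "\n\n".join(blocks)


starters = starters_only(df)
print(format_by_team(starters))
```

Passing sort=False keeps the teams in the order they appear on the page rather than sorting them alphabetically.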



