How to Fetch the Top 10 Starred Repositories of a User on GitHub?
27-Jun-2018
Prakash nidhi Verma
Prerequisites:
- Python 3
- requests (the article originally mentions urllib2, which is Python 2 only: https://docs.python.org/2/library/urllib2.html)
- BeautifulSoup: https://www.crummy.com/software/BeautifulSoup/bs4/doc/
We write a Python script to make this task easier: it fetches the top 10 starred repositories of any user on GitHub. You just need the GitHub username.
Example: mindstick2010
First build the repositories URL for the user. Example:
username = "mindstick2010", then url = "https://github.com/mindstick2010?tab=repositories"
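The URL construction above can be sketched as a small helper; the `repos_url` function name is ours, and `quote()` is used defensively (typical GitHub usernames need no escaping):

```python
# Sketch: building the repositories-tab URL for a given GitHub username.
from urllib.parse import quote

def repos_url(username):
    # quote() guards against characters that are not URL-safe
    return "https://github.com/" + quote(username) + "?tab=repositories"

print(repos_url("mindstick2010"))
# https://github.com/mindstick2010?tab=repositories
```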
Now scrape that page and extract the star count, repository name, and repository URL using BeautifulSoup. Each page lists 20 repositories, so if the user has more than 20, you need a loop that follows the pagination links with BeautifulSoup.
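Once the names and star counts from every page have been collected into a dictionary, the top 10 can be picked by sorting on the counts. A minimal sketch with sample data standing in for the scraped results:

```python
# Sketch: selecting the top 10 repositories by star count.
# repo_dict is sample data; in the real script it is filled by the scraper.
repo_dict = {"repo-a": 42, "repo-b": 7, "repo-c": 130, "repo-d": 19}

# sort (name, stars) pairs by stars, highest first, and keep at most 10
top10 = sorted(repo_dict.items(), key=lambda kv: kv[1], reverse=True)[:10]
for name, stars in top10:
    print(name, stars)
```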
The code is shown below:

# Python3 script to fetch top 10 starred
# repositories of a user on GitHub
import re
import requests
from bs4 import BeautifulSoup

top_limit = 10

def openWebsite():
    username = str(input("enter GitHub username: "))
    repo_dict = {}
    url = "https://github.com/" + username + "?tab=repositories"
    while True:
        response = requests.get(url)
        soup = BeautifulSoup(response.text, 'html.parser')
        # each repository is one <li> on the repositories tab
        div = soup.find_all('li', {'class': 'col-12 d-block width-full py-4 border-bottom public source'})
        for d in div:
            temp = d.find_all('div', {'class': 'f6 text-gray mt-2'})
            for t in temp:
                # the stargazers link holds both the repo name and the star count
                x = t.find_all('a', attrs={'href': re.compile(r"^/[a-zA-Z0-9\-_.]+/[a-zA-Z0-9.\-_]+/stargazers")})
                if len(x) != 0:
                    name = x[0].get('href')
                    # strip the leading "/username/" and the trailing "/stargazers"
                    name = name[len(username) + 2:-11]
                    repo_dict[name] = int(x[0].text.strip().replace(',', ''))
        # follow the "Next" pagination link until there are no more pages
        nxt = soup.find('a', {'class': 'next_page'})
        if nxt is None:
            break
        url = "https://github.com" + nxt.get('href')
    # sort by star count and print the top 10
    for name, stars in sorted(repo_dict.items(), key=lambda kv: kv[1], reverse=True)[:top_limit]:
        print(name, stars)

if __name__ == "__main__":
    openWebsite()