1 year ago
#315867
Idkwhatnomeis
Parsing Wikipedia raised KeyError: query (Python)
I'm trying to get the Wikipedia backlinks for more than 200 pages. To do this, I:
- look up each page's URL on the Italian Wikipedia and, if that doesn't work, on the English one
- put the URLs in a list
- iterate over this list to get the languages each page is available in on Wikipedia (with bs4)
- append these languages to a list
- iterate over both the languages and the URLs to get page titles and backlinks, which I put in a dictionary whose keys are the languages and whose values are the number of backlinks available in each language
But I get the error KeyError: 'query', and I don't know why.
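import urllib.request

import wikipediaapi
from bs4 import BeautifulSoup

opere = df.label  # works (titles taken from the DataFrame shown below)

# Step 1: resolve each title to a URL, trying Italian first, then English
listaurl = []
for x in opere:
    try:
        wiki_wiki = wikipediaapi.Wikipedia('it')
        p = wiki_wiki.page(x).fullurl
        listaurl.append(p)
        print(p)
    except Exception:
        wiki_wiki = wikipediaapi.Wikipedia('en')
        p = wiki_wiki.page(x).fullurl
        listaurl.append(p)
        print(p)

# Step 2: scrape the interlanguage links of every page to collect language codes
lista = []
for url in listaurl:
    soup = BeautifulSoup(urllib.request.urlopen(url), 'html.parser')
    links = [(el.get('lang'), el.get('href')) for el in soup.select('li.interlanguage-link > a')]
    for language, link in links:
        lista.append(language)
    testo = soup.title.text.replace(" ", "")

# Step 3: for every language, look up every page and count its backlinks
lista2 = []
regex = r"(?<=/wiki/).*$"
dik = {}
for lang in lista:
    wikis = wikipediaapi.Wikipedia(lang)
    for apage in listaurl:
        wikipage = apage.split('/wiki/')[1]  # title part of the URL
        page_py = wikis.page(wikipage)
        print(page_py)
        titles = page_py.title
        print(titles)
        back = page_py.backlinks  # KeyError: 'query' is raised here
        dik[lang] = len(back)  # overwritten for every page, so only the last count is kept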
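A side note on the last loop: `apage.split('/wiki/')[1]` returns the URL form of the title, with underscores and, for non-ASCII characters, percent-escapes. Whether this is related to the error is not established here. The sketch below shows one way to turn such a URL back into a plain title using only the standard library. The helper name `title_from_url` is hypothetical and is not part of `wikipediaapi`.

```python
from urllib.parse import unquote, urlsplit


def title_from_url(url):
    """Return the human-readable page title from a Wikipedia article URL.

    Hypothetical helper: assumes the URL path has the usual /wiki/<title> form.
    """
    path = urlsplit(url).path
    _, sep, raw_title = path.partition("/wiki/")
    if not sep or not raw_title:
        raise ValueError(f"not a /wiki/ article URL: {url!r}")
    # Undo percent-encoding first, then map underscores back to spaces.
    return unquote(raw_title).replace("_", " ")
```

With this, `wikis.page(title_from_url(apage))` would receive a title such as "Il nome della rosa" instead of the raw URL fragment.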
Example input to reproduce (the df):
item,label,authorlabel,authorlabel2,numWikipediaLanguages
http://www.wikidata.org/entity/Q172850,Il nome della rosa,,Umberto Eco,53
http://www.wikidata.org/entity/Q437791,Il pendolo di Foucault,,Umberto Eco,30
http://www.wikidata.org/entity/Q791487,Baudolino,,Umberto Eco,26
Error traceback:
Traceback (most recent call last):
  File "C:....myfile.py", line 43, in <module>
    back = page_py.backlinks
  File "C:\....\wikipediaapi\__init__.py", line 1112, in backlinks
    self._fetch('backlinks')
  File "C:....\wikipediaapi\__init__.py", line 1148, in _fetch
    getattr(self.wiki, call)(self)
  File "C:....wikipediaapi\__init__.py", line 468, in backlinks
    self._common_attributes(raw['query'], page)
KeyError: 'query'
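The traceback shows that the response for at least one (language, page) pair has no 'query' key. A minimal sketch for narrowing this down, assuming the goal is one backlink total per language: wrap the lookup so that failing pairs are recorded instead of aborting the run, and add to the count rather than overwriting it. The names `count_backlinks` and `get_page` are hypothetical; `get_page` stands in for whatever builds the page object, for example `lambda lang, title: wikipediaapi.Wikipedia(lang).page(title)`.

```python
def count_backlinks(languages, titles, get_page):
    """Sum backlinks per language and collect the pairs that raise KeyError.

    get_page(lang, title) is an assumed callable returning an object whose
    .backlinks attribute is a dict-like collection, as in the question's code.
    """
    counts = {}
    failed = []
    # dict.fromkeys drops duplicate language codes while keeping their order.
    for lang in dict.fromkeys(languages):
        for title in titles:
            try:
                n = len(get_page(lang, title).backlinks)
            except KeyError:
                # The same exception type as in the traceback: remember where it happened.
                failed.append((lang, title))
                continue
            counts[lang] = counts.get(lang, 0) + n
    return counts, failed
```

Inspecting `failed` afterwards shows which language codes or titles trigger the error. That does not explain the cause by itself, but it separates the problematic requests from the ones that work.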
python
python-3.x
web-scraping
wikipedia
wikipedia-api