web scraping - Getting the list of ALL topic names from Freebase
According to
, they have 23,407,174 topics. What is the easiest way to get the UI-friendly names of all these topics (basically the 'text' field in the topic JSON; example of a JSON of a topic)? I do not need any other meta information.
    wget -O - http://download.freebase.com/datadumps/latest/freebase-simple-topic-dump.tsv.bz2 | bunzip2 | cut -f 2 > OUTPUT_FILE

Although you may also want the Freebase ID, so you know what each name refers to:

    wget -O - http://download.freebase.com/datadumps/latest/freebase-simple-topic-dump.tsv.bz2 | bunzip2 | cut -f 1,2

Two extra bits of postprocessing are required:
- Tabs are escaped as \t
- The string \N represents a null (non-existent) name
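
Those two post-processing steps could be sketched as a small shell filter. This is only a sketch under a few assumptions: the function name clean_names is made up for illustration, the input is the two-column "ID, name" form produced by cut -f 1,2, and turning an escaped tab into a space (rather than a real tab) is my own choice so that the output stays valid TSV.

```shell
# Hypothetical helper (not part of any Freebase tooling).
# Reads "id<TAB>name" lines on stdin and writes cleaned lines to stdout.
clean_names() {
  awk -F '\t' -v OFS='\t' '
    # Drop topics whose name is the null marker \N.
    $2 == "\\N" { next }
    # Replace the two-character escape sequence \t inside the name with a
    # space (assumption: a space is an acceptable stand-in for a tab).
    { gsub(/\\t/, " ", $2); print $1, $2 }
  '
}

# Possible usage, appended to the pipeline above:
#   ... | bunzip2 | cut -f 1,2 | clean_names
```

If the dump escapes other characters as well, this filter would need extending; it only handles the two cases listed above.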