1 year ago
#280351
Faisal Mirza
How can I extract triples from the Freebase dump?
I would like to collect a large knowledge base of triples in the form subject, object, predicate. I downloaded the Freebase dump from the developers page, which contains the triples in RDF format, and I want to decode it into a readable format. How can I achieve this?
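As a minimal sketch of the kind of result being asked for, the Python below turns one dump line into three short identifiers. It rests on assumptions that are not confirmed anywhere in this post: that each line is tab-separated with a trailing ".", that Freebase terms are wrapped in angle brackets, and that they share the prefix http://rdf.freebase.com/ns/. The names parse_line and shorten and the sample line are made up for illustration.

```python
# Assumed Freebase namespace prefix; adjust it if your dump uses a different one.
FREEBASE_NS = "http://rdf.freebase.com/ns/"


def shorten(term):
    """Drop the angle brackets and the assumed namespace prefix from a URI term."""
    if term.startswith("<") and term.endswith(">"):
        uri = term[1:-1]
        if uri.startswith(FREEBASE_NS):
            return uri[len(FREEBASE_NS):]
        return uri  # a URI from another namespace: keep it, minus the brackets
    return term  # literals such as "Hello"@en are left untouched


def parse_line(line):
    """Split one tab-separated line into (subject, predicate, object), or None."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) < 3:
        return None  # not a well-formed triple line
    return tuple(shorten(term) for term in parts[:3])


# Illustrative line only, not copied from the real dump.
sample = (
    "<http://rdf.freebase.com/ns/m.0100kt2c>\t"
    "<http://rdf.freebase.com/ns/type.object.type>\t"
    "<http://rdf.freebase.com/ns/film.film>\t."
)
print(parse_line(sample))  # ('m.0100kt2c', 'type.object.type', 'film.film')
```

Note that the tuple comes back in the order the terms appear on the line; reordering the fields is a one-line change once the parsing works.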
Currently I am following nchah's GitHub repository and am running the shell script s0-run-parse-extract-triples.sh on Ubuntu in VirtualBox. The script should clean the RDF input by removing the URLs but keeping the IDs. As its argument I am passing freebase-triples.txt, which is a sample of 100 rows from the 30 GB freebase-rdf-latest.gz.
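For reference, a 100-row sample like this can be cut from the compressed dump without unpacking all of it. The following is only a sketch: write_sample is a hypothetical helper name, and it assumes the dump is a gzip-compressed UTF-8 text file with one triple per line.

```python
import gzip
from itertools import islice


def write_sample(gz_path, out_path, n=100):
    """Copy the first n lines of a gzip-compressed text file to out_path.

    The file is read as a stream, so only the start of the archive is
    decompressed. Returns the number of lines written.
    """
    with gzip.open(gz_path, "rt", encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        lines = list(islice(src, n))
        dst.writelines(lines)
    return len(lines)


# Example call, using the file names mentioned in this post:
# write_sample("freebase-rdf-latest.gz", "freebase-triples.txt", n=100)
```

Taking the first rows is the simplest choice; they are not a random sample of the dump, so they may all describe the same few subjects.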
You can find the code here.
Note that I was getting the message "No such file or directory", so I removed line 8 and used $1 instead of $INPUT_FILE in line 17, which took care of that message. In line 21 I also removed the # sign and changed gsed to sed, and I added echo messages to do some tracing.
This is how I am running it: sh s0-run-parse-extract-triples.sh freebase-triples.txt
You can see the error that I am getting here.
I am getting the output file fb-rdf-s01-c01, but it still contains the URLs and is unchanged from my input. I am also getting the other file, fb-rdf-s01-c02, but it is empty.
rdf
freebase
triples
knowledge-graph
n-triples
0 Answers