1 year ago
#158395
noahnathan5
Can you manually change the "node id" values in a graphml object created with igraph in R?
I primarily work in R, but I am trying to switch over to Python to use some of the new tools in the osmnx package, and I have become stuck on getting a .graphml file created by igraph in R to be read properly by osmnx in Python.
tl;dr: the main question is whether it is possible to manually specify the "node id" values when using the write_graph() function from the igraph package in R.
Longer version: I am taking an existing .graphml file "accra-1910.graphml" (downloadable from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/KA5HJ3, within the "ghana-GHA_graphml.zip" file).
I'm reading this .graphml file into R using read_graph() from the igraph package, subsetting to a subgraph of interest, saving a new .graphml file of the subgraph using write_graph() from the igraph package, and then trying to read the resulting new .graphml file into osmnx in Python using osmnx.load_graphml().
So in R, this looks like:
gml <- read_graph("accra-1910.graphml", format="graphml")
sub.t1 <- subgraph(gml, V(gml)[id %in% osmids.t1])
# Note: osmids.t1 is a vector of vertex/node ids to subset to, created earlier in the R script
write_graph(sub.t1, "accra_pre1991.graphml", format="graphml")
Then in Python, when I try to read "accra_pre1991.graphml" using:
G = ox.load_graphml(filepath="accra_pre1991.graphml")
I always get this error message:
node_id = self.node_type(node_xml.get("id"))
ValueError: invalid literal for int() with base 10: 'n0'
In checking the raw .graphml objects, I can see why this happens. The original file "accra-1910.graphml" has nodes indexed in this format:
<node id="30729912">
<data key="d4">5.5784766</data>
<data key="d5">-0.1648661</data>
<data key="d6">3</data>
<data key="d7">51</data>
<data key="d8">51</data>
<data key="d9">51</data>
</node>
<node id="30729918">
<data key="d4">5.5821678</data>
<data key="d5">-0.1666711</data>
<data key="d6">3</data>
<data key="d7">40</data>
<data key="d8">40</data>
<data key="d9">46</data>
</node>
Note that the node ids here are purely numeric values such as "30729912". However, the .graphml file produced by write_graph() from igraph in R always takes this format instead:
<node id="n0">
<data key="v_highway"></data>
<data key="v_elevation_srtm">51</data>
<data key="v_elevation_aster">51</data>
<data key="v_elevation">51</data>
<data key="v_street_count">3</data>
<data key="v_x">-0.1648661</data>
<data key="v_y">5.5784766</data>
<data key="v_id">30729912</data>
</node>
<node id="n1">
<data key="v_highway"></data>
<data key="v_elevation_srtm">46</data>
<data key="v_elevation_aster">40</data>
<data key="v_elevation">40</data>
<data key="v_street_count">3</data>
<data key="v_x">-0.1666711</data>
<data key="v_y">5.5821678</data>
<data key="v_id">30729918</data>
</node>
Here, igraph has automatically created a new index of node ids ("n0", "n1", "n2", and so on) and is storing the old ids as data attributes of each node, as in <data key="v_id">30729918</data>. osmnx in Python then can't parse this new id format and gets stuck on the very first node id value (hence the error "invalid literal for int() with base 10: 'n0'", since it can't handle a non-numeric character in the node id values).
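To make the mismatch concrete, here is a minimal sketch of one possible workaround on the Python side: post-process the igraph-written file so that each generated id ("n0", "n1", ...) is replaced by the original id stored under the "v_id" key, and the edge endpoints are updated to match. It uses only the standard library. The helper name restore_node_ids and the inline sample are hypothetical, and the sketch assumes the file declares the standard GraphML namespace. It has not been checked against osmnx, which may expect other attributes to match as well.

```python
import xml.etree.ElementTree as ET

# Assumption: the file uses the standard GraphML namespace.
GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"
NS = {"g": GRAPHML_NS}

# Serialize the GraphML namespace as the default one (no "ns0:" prefixes).
ET.register_namespace("", GRAPHML_NS)


def restore_node_ids(root, id_key="v_id"):
    """Replace generated node ids with the value stored under id_key (in place)."""
    # Map each generated id ("n0", ...) to the original id kept as node data.
    id_map = {}
    for node in root.iterfind(".//g:node", NS):
        for data in node.findall("g:data", NS):
            if data.get("key") == id_key and data.text:
                id_map[node.get("id")] = data.text.strip()
                break

    # Rewrite node ids, then edge endpoints, leaving unmapped ids untouched.
    for node in root.iterfind(".//g:node", NS):
        node.set("id", id_map.get(node.get("id"), node.get("id")))
    for edge in root.iterfind(".//g:edge", NS):
        for end in ("source", "target"):
            edge.set(end, id_map.get(edge.get(end), edge.get(end)))
    return id_map


# Hypothetical miniature of the igraph output shown above.
sample = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="v_id" for="node" attr.name="id" attr.type="string"/>
  <graph id="G" edgedefault="directed">
    <node id="n0"><data key="v_id">30729912</data></node>
    <node id="n1"><data key="v_id">30729918</data></node>
    <edge source="n0" target="n1"/>
  </graph>
</graphml>"""

root = ET.fromstring(sample)
id_map = restore_node_ids(root)
fixed_text = ET.tostring(root, encoding="unicode")
print(fixed_text)
```

For a file on disk, the same function could be applied to ET.parse("accra_pre1991.graphml").getroot() and the tree written back out with tree.write(...) under a new file name of your choosing. Whether osmnx then loads the result without further changes is not something this sketch verifies.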
Is there any way to get igraph (via write_graph() in R) to export a .graphml file in the same format as the original .graphml file instead?
Apologies if this question isn't clear -- this is my first time posting on Stack Overflow and I'm not sure what all the rules are. Thanks for any help you can provide!
python
r
igraph
osmnx
graphml
0 Answers