1 year ago

#386726

Kairat Other

Neo4j cosine similarity with condition

I want to calculate cosine similarity in Neo4j for a telecom dataset and write the result as a node property. The difficulty I have run into is expressing a specific condition: I need to find the maximum cosine similarity between customers of our telecom and customers of other telecoms.

I am using version 1.8 of the Neo4j GDS library.

Our telecom uses the phone number prefixes 707, 700, 747 and 708.
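A minimal sketch of how that prefix condition might be expressed in Cypher, assuming the number is stored as a string whose first three characters are the operator prefix (as in the sample data in this post). The two literal numbers and the `ours` alias are only for illustration:

```cypher
// Flag each number as "ours" when its first three characters
// match one of our telecom's prefixes.
UNWIND ['7071212121', '7011010101'] AS msisdn
RETURN msisdn,
       substring(msisdn, 0, 3) IN ['707', '700', '747', '708'] AS ours
```

With these two sample values, the first row should be flagged as ours and the second should not.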

Let’s define a graph:

CREATE (a1:Abon {msisdn:'7071212121'})
CREATE (a2:Abon {msisdn:'7071313131'})
CREATE (a3:Abon {msisdn:'7071414141'})
CREATE (b1:Abon {msisdn:'7011010101'})
CREATE (b2:Abon {msisdn:'7012323232'})

CREATE (a1)-[:con {weight: 0.98}]->(a2)
CREATE (a1)-[:con {weight: 0.98}]->(b1)
CREATE (a1)-[:con {weight: 0.98}]->(b2)
CREATE (b1)-[:con {weight: 0.98}]->(a2)
CREATE (b2)-[:con {weight: 0.98}]->(a3)

Now, looking at a1, I need to calculate cosine similarity only against customers of other telecoms:

a1 <> b1 = 0.98

a1 <> b2 = 0.76

Then the maximum similarity, which is 0.98 here, should be taken and saved as a node property of a1.

I have started writing a script, but I can't get it right:

MATCH (a1:Abon), (a2:Abon)
MATCH (a1)-[conn:con]-(a2)
WITH {item:id(a1), weights: collect(coalesce(conn.weight, gds.util.NaN()))} AS abonData
WITH collect(abonData) AS Abons
WITH Abons,
    //  [value in Abons WHERE value.msisdn IN ['7072501005', '7072501006'] | value.item ] AS sourceIds
     [value in Abons WHERE value.msisdn IN ['7071206336', '7013193795'] | value.item ] AS targetIds
CALL gds.alpha.similarity.cosine.write({
 data: Abons,
//  sourceIds: sourceIds,
 targetIds: targetIds,
 topK: 1
})
YIELD item1, item2, similarity
WITH gds.util.asNode(item1) AS from, gds.util.asNode(item2) AS to, similarity
RETURN from.msisdn AS from, to.msisdn AS to, similarity
ORDER BY similarity DESC
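One possible direction, as an untested sketch against GDS 1.8 and not a confirmed solution. It differs from the attempt above in three ways. It uses OPTIONAL MATCH so every node gets a weight vector of the same length. It stores an `ours` flag inside each data map so the source and target ID lists can actually be filtered (the attempt above filters on `value.msisdn`, which is never put into the map). It calls the `.stream` variant, which as far as I know is the one that yields `item1`, `item2`, `similarity`, and then sets the property with a plain SET. The `ours` key and the `maxCosine` property name are made up for illustration, and the sketch assumes the procedure ignores extra keys in the data maps.

```cypher
// Build one weight vector per Abon over all Abon nodes, in a fixed order.
MATCH (a:Abon), (other:Abon)
// OPTIONAL MATCH keeps a row for every pair, so all vectors have equal length.
OPTIONAL MATCH (a)-[conn:con]-(other)
WITH a, other, coalesce(conn.weight, gds.util.NaN()) AS w
ORDER BY id(a), id(other)
WITH a, collect(w) AS weights
// `ours` is a hypothetical helper key: true for our telecom's prefixes.
WITH {item: id(a),
      ours: substring(a.msisdn, 0, 3) IN ['707', '700', '747', '708'],
      weights: weights} AS abonData
WITH collect(abonData) AS abons
WITH abons,
     [v IN abons WHERE v.ours | v.item] AS sourceIds,
     [v IN abons WHERE NOT v.ours | v.item] AS targetIds
// Compare only our customers (sources) against other telecoms (targets),
// keeping the single best match per source.
CALL gds.alpha.similarity.cosine.stream({
  data: abons,
  sourceIds: sourceIds,
  targetIds: targetIds,
  topK: 1
})
YIELD item1, item2, similarity
WITH gds.util.asNode(item1) AS src, gds.util.asNode(item2) AS dst, similarity
// `maxCosine` is a hypothetical property name.
SET src.maxCosine = similarity
RETURN src.msisdn AS from, dst.msisdn AS to, similarity
ORDER BY similarity DESC
```

The similarity values this produces depend on how the vectors are built and on how NaN entries are skipped, so they will not necessarily match the 0.98 and 0.76 used as an illustration above.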

graph

neo4j

0 Answers
