1 year ago

#384969

xenotharm

Edit distance for a four-digit sequential ranking in R? (stringdist)

Right now, I am trying to create scale scores for participants who ranked four job candidates (A, B, C, and D) for a role from best fit to worst fit. The correct order is A, D, C, B, so in my dataframe the correct values for columns A, B, C, and D are 1, 4, 3, 2. Below is a sample of my dataframe, with "Edit_score" representing what I think is the degree of correctness, i.e. how closely the values in Concatted resemble "1432". I used stringdist in the following code to produce this column:

library(stringdist)
data$Edit_score <- stringdist("1432", data$Concatted, method = "jw")

I am not sure whether the Jaro-Winkler method is the most appropriate for this type of variable. Should I be using a different stringdist method? Is stringdist even the right function for this? I want to take both placement and sequence into account, and really just need to assign scores to the Concatted values based on how closely they resemble the sequence "1432".
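For comparison, here is a minimal sketch of how a few other stringdist methods could be run side by side on the two Concatted values from my sample, plus a perfect answer for reference. The names target, concatted, methods, and scores are just illustrative. As far as I understand the stringdist documentation, method = "jw" with the default p = 0 returns the plain Jaro distance rather than Jaro-Winkler, so that is an assumption worth double-checking.

```r
library(stringdist)

target <- "1432"
# The two sample rankings from my dataframe, plus a perfect answer for reference
concatted <- c("4231", "1243", "1432")

# Distance from the target under several methods; smaller means closer
methods <- c("hamming", "osa", "lv", "jw")
scores <- sapply(methods, function(m) stringdist(target, concatted, method = m))
rownames(scores) <- concatted
scores
```

The "jw" column should reproduce the Edit_score values in my sample, while "hamming" only counts positions that differ and so ignores relative order entirely.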

A  B  C  D  Concatted  Edit_score
4  2  3  1  4231       0.33333333
1  2  4  3  1243       0.16666667
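To make "placement and sequence" concrete, here is a rough base R sketch that scores the two sample rows directly from columns A to D instead of from the concatenated string. This is not a stringdist feature, just one possible way to separate the two ideas; placement_score, sequence_score, and sample_rows are hypothetical names I made up for illustration.

```r
# Correct ranks for candidates A, B, C, D (correct order is A, D, C, B)
correct <- c(A = 1, B = 4, C = 3, D = 2)

# Placement: number of candidates given exactly the correct rank (0 to 4)
placement_score <- function(ranks) sum(ranks == correct)

# Sequence: number of candidate pairs ordered the same way as in the
# correct ranking (0 to 6 for four candidates)
sequence_score <- function(ranks) {
  pairs <- combn(length(correct), 2)
  sum(sign(ranks[pairs[1, ]] - ranks[pairs[2, ]]) ==
      sign(correct[pairs[1, ]] - correct[pairs[2, ]]))
}

# The two sample rows from my dataframe
sample_rows <- rbind(c(A = 4, B = 2, C = 3, D = 1),
                     c(A = 1, B = 2, C = 4, D = 3))
result <- data.frame(sample_rows,
                     placement = apply(sample_rows, 1, placement_score),
                     sequence  = apply(sample_rows, 1, sequence_score))
result
```

On these two rows, both get a placement score of 1, but the sequence scores differ (2 for 4231 and 4 for 1243), which is the kind of distinction I would like the final score to capture.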

r

edit-distance

stringdist

0 Answers
