1 year ago
#375654
hallque
Count instances where pairs of words occur within a given distance of each other
I have two lists of words, like so:
LIST1 = ['whisky', 'spirits', 'liqueur']
LIST2 = ['bottle', 'barrel', 'can', 'cup']
I also have a string of text (call the string object TEXT) that I would like to search. The end result should be, for each word in LIST1, a count of the number of times it appears in TEXT within a given distance (e.g., within 10 words) of any word in LIST2. I can imagine complicated ways of doing this by running regular expression searches for every pairing of words from the two lists. But my actual LIST1 and LIST2 are quite long, and the text I am searching is large, so iterating over every pair isn't a good option. When I found NLTK I was hopeful that it might include a purpose-built tool, but unless I am missing something it has no functionality of the type I need. Is there an easy way to accomplish my task?
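To make the intended result concrete, here is a minimal sketch of one possible approach that avoids searching pair by pair. It is not an NLTK feature; `count_near` and `max_dist` are illustrative names, and it assumes every list entry is a single lowercase word and that splitting on `\w+` is an acceptable tokenization. It tokenizes the text once, records the positions of all LIST2 words, and uses a binary search to check whether any of them falls inside the window around each LIST1 word.

```python
import re
from bisect import bisect_left
from collections import Counter

def count_near(text, list1, list2, max_dist=10):
    """Count occurrences of each list1 word within max_dist tokens of any list2 word."""
    # Assumption: simple regex tokenization, case-insensitive matching.
    tokens = re.findall(r"\w+", text.lower())
    set1, set2 = set(list1), set(list2)
    # Token indices of every list2 word; enumerate yields them in ascending order.
    pos2 = [i for i, tok in enumerate(tokens) if tok in set2]
    counts = Counter({w: 0 for w in list1})
    for i, tok in enumerate(tokens):
        if tok in set1:
            # First list2 position at or after the left edge of the window.
            j = bisect_left(pos2, i - max_dist)
            # It is a hit only if that position is also inside the right edge.
            if j < len(pos2) and pos2[j] <= i + max_dist:
                counts[tok] += 1
    return counts

LIST1 = ['whisky', 'spirits', 'liqueur']
LIST2 = ['bottle', 'barrel', 'can', 'cup']
TEXT = "whisky in a barrel " + "x " * 20 + "whisky alone " + "x " * 20 + "cup of liqueur"
print(count_near(TEXT, LIST1, LIST2, max_dist=10))
```

Each LIST1 occurrence is counted at most once, however many LIST2 words are nearby. A word that appears in both lists would count as being near itself, so that case would need separate handling.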
Note: I can't tell for sure, but I think my problem may be similar to the one discussed in this unanswered post.
python
nlp
nltk
0 Answers