Performance of the in operator with lists and sets
I had a use case where I needed to check which elements of one list of words were present in another list of words, so I decided to use the in operator. For future reference, I tried the following:
    # common code for all tests
    base_list = [...]
    query_list = [...]
- Pretty simple method:
    for word in query_list:
        if word in base_list:
            pass  # do something

For a list of 4284 elements against a list of 107, it took 9 seconds. Using plain lists, this method is the most straightforward of all, and also the slowest.
- Sorting things:
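    base_list.sort()
    for word in query_list:
        if word in base_list:
            pass  # do something

After sorting the list, guess what? Yep, nothing changed: the same 9 seconds. The in operator on a list scans it element by element, so it cannot take advantage of the ordering.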
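Sorting only pays off when the lookup itself uses the order, for example through a binary search. Here is a minimal sketch using the standard bisect module. The helper name contains_sorted and the sample words are made up for illustration and were not part of the original test:

```python
from bisect import bisect_left


def contains_sorted(sorted_words, word):
    """Return True if word is in sorted_words, which must already be sorted."""
    i = bisect_left(sorted_words, word)  # leftmost position where word could go
    return i < len(sorted_words) and sorted_words[i] == word


# Sample data for illustration only.
base_list = ["pear", "apple", "fig", "kiwi"]
query_list = ["fig", "plum", "apple"]

base_list.sort()
found = [word for word in query_list if contains_sorted(base_list, word)]
print(found)  # ['fig', 'apple']
```

Each lookup now takes a logarithmic number of comparisons instead of a full scan, at the price of keeping the list sorted.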
- What about sets?
    bs = set(base_list)
    for word in query_list:
        if word in bs:
            pass  # do something

With sets it is another story: 0.6 seconds for the same amount of data. But… if this could be achieved by turning one of the lists into a set, what if…
- Using more sets
    bs = set(base_list)
    qs = set(query_list)
    solution = bs.intersection(qs)
0.02 seconds.
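One thing to keep in mind is that the intersection does not give exactly the same result as the loop. It returns each common word once and in no particular order, while the loop sees every occurrence in query order. A small sketch with made-up sample data shows the difference:

```python
# Sample data for illustration only; "fig" appears twice in the queries.
base_list = ["apple", "fig", "kiwi"]
query_list = ["fig", "plum", "fig", "apple"]

bs = set(base_list)

# Loop version: keeps the query order and repeated words.
matches_loop = [word for word in query_list if word in bs]

# Intersection version: unique words only, in no guaranteed order.
# intersection() accepts any iterable, so query_list can be passed directly.
matches_set = bs.intersection(query_list)

print(matches_loop)         # ['fig', 'fig', 'apple']
print(sorted(matches_set))  # ['apple', 'fig']
```

If only the set of common words matters, the intersection is enough. If order or repetitions matter, the loop over a set is probably the safer choice.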
Well, as you can see, sets are great.
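To repeat the comparison on your own machine, something like the following sketch with the standard timeit module should work. The word lists are randomly generated stand-ins (the original data is not available), the function names are my own, and the absolute timings will differ from the ones quoted above:

```python
import random
import string
import timeit

random.seed(0)  # make the synthetic data reproducible


def random_word(length=8):
    return "".join(random.choices(string.ascii_lowercase, k=length))


# Synthetic stand-ins for the real word lists (an assumption, not the original data).
base_list = [random_word() for _ in range(4284)]
# 50 guaranteed hits plus 57 random words that are almost certainly misses.
query_list = random.sample(base_list, 50) + [random_word() for _ in range(57)]


def with_list():
    return [w for w in query_list if w in base_list]


def with_set():
    bs = set(base_list)  # the conversion cost is included in the timing
    return [w for w in query_list if w in bs]


def with_intersection():
    return set(base_list).intersection(query_list)


for func in (with_list, with_set, with_intersection):
    seconds = timeit.timeit(func, number=10)
    print(f"{func.__name__}: {seconds:.4f} s for 10 runs")
```

Building the set is included in the timed functions on purpose, so the conversion cost counts against the set-based versions.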