diff --git a/blog_data/blog_authors.json b/blog_data/blog_authors.json
index 01b373b85..9c56862d8 100644
--- a/blog_data/blog_authors.json
+++ b/blog_data/blog_authors.json
@@ -2,5 +2,9 @@
   "chalarangelo": {
     "name": "Angelos Chalaris",
     "profile": "https://twitter.com/chalarangelo"
+  },
+  "maciv": {
+    "name": "Isabelle Viktoria Maciohsek",
+    "profile": "https://github.com/Trinityyi"
   }
 }
diff --git a/blog_images/code-anatomy-performant-python.jpg b/blog_images/code-anatomy-performant-python.jpg
new file mode 100644
index 000000000..78efa6753
Binary files /dev/null and b/blog_images/code-anatomy-performant-python.jpg differ
diff --git a/blog_posts/code-anatomy-performant-python.md b/blog_posts/code-anatomy-performant-python.md
new file mode 100644
index 000000000..d8aa4ebeb
--- /dev/null
+++ b/blog_posts/code-anatomy-performant-python.md
@@ -0,0 +1,63 @@
+---
+title: Code Anatomy - Writing high performance Python code
+type: story
+tags: python,list,performance
+authors: maciv,chalarangelo
+cover: blog_images/code-anatomy-performant-python.jpg
+excerpt: Writing short, efficient Python code is not always straightforward. Read how we optimize our list snippets to increase performance using a couple of simple tricks.
+---
+
+Writing short and efficient Python code is not always easy or straightforward. We often see a piece of code without realizing the thought process behind the way it was written. To understand that process, we will take a look at the [difference](/python/s/difference) snippet, which returns the difference between two iterables.
+
+Based on the description of the snippet's functionality, we can naively write it like this:
+
+```py
+def difference(a, b):
+  return [item for item in a if item not in b]
+```
+
+The above implementation may work well enough, but it doesn't account for duplicates in `b`: every `item not in b` check scans the list, so the code takes more time than necessary when the second list contains many duplicates. To solve this issue, we can make use of the built-in `set()` constructor, which keeps only the unique values in the list:
+
+```py
+def difference(a, b):
+  return [item for item in a if item not in set(b)]
+```
+
+This version, while it seems like an improvement, may actually be slower than the previous one. If you look closely, you will see that `set()` is called for every `item` in `a`, causing the result of `set(b)` to be evaluated every time. Here's an example where we wrap `set()` with another function to better showcase the problem:
+
+```py
+def difference(a, b):
+  return [item for item in a if item not in make_set(b)]
+
+def make_set(itr):
+  print('Making set...')  # Printed every time the set is rebuilt
+  return set(itr)
+
+print(difference([1, 2, 3], [1, 2, 4]))
+# Making set...
+# Making set...
+# Making set...
+# [3]
+```
+
+The solution to this issue is to call `set()` once before the list comprehension and store the result to speed up the process:
+
+```py
+def difference(a, b):
+  _b = set(b)
+  return [item for item in a if item not in _b]
+```
+
+Another option worth mentioning when analyzing performance for this snippet is the choice between a list comprehension and something like `filter()` combined with `list()`. Implementing the same code using the latter option would result in something like this:
+
+```py
+def difference(a, b):
+  _b = set(b)
+  return list(filter(lambda item: item not in _b, a))
+```
+
+Using `timeit` to analyze the performance of the last two code examples, we found that the list comprehension can be up to ten times faster than the alternative, as it's a native language feature that works very similarly to a simple `for` loop, without the overhead of the extra function calls (here, one `lambda` call per item). This explains why we prefer it, apart from readability.
+
+The same reasoning applies to most mathematical list operation snippets, such as [difference](/python/s/difference), [symmetric_difference](/python/s/symmetric-difference) and [intersection](/python/s/intersection).
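
To check the list comprehension versus `filter()` comparison on your own machine, a minimal `timeit` sketch might look like the following. The function names `difference_comprehension` and `difference_filter`, the sample data and the `number=1000` repetition count are assumptions made for illustration and are not part of the original snippet. Absolute timings, and the ratio between them, will vary with the Python version and hardware.

```python
import timeit

def difference_comprehension(a, b):
  # Build the set once, then filter with a list comprehension.
  _b = set(b)
  return [item for item in a if item not in _b]

def difference_filter(a, b):
  # Same logic, but using filter() with a lambda and list().
  _b = set(b)
  return list(filter(lambda item: item not in _b, a))

# Hypothetical sample data; the sizes are chosen only for illustration.
a = list(range(1000))
b = list(range(0, 1000, 2))

# Both versions should agree before we compare their speed.
assert difference_comprehension(a, b) == difference_filter(a, b)

# Time 1000 calls of each version; timeit accepts a zero-argument callable.
t_comp = timeit.timeit(lambda: difference_comprehension(a, b), number=1000)
t_filter = timeit.timeit(lambda: difference_filter(a, b), number=1000)

print(f'list comprehension: {t_comp:.4f}s')
print(f'filter() + list():  {t_filter:.4f}s')
```

Both timed calls go through an identical wrapper `lambda`, so that overhead affects the two measurements equally. Trying a few different input sizes gives a better picture than a single run.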
+
+**Image credit:** [Kalen Emsley](https://unsplash.com/@kalenemsley?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText) on [Unsplash](https://unsplash.com/s/photos/code?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText)