In this series, I will explore some of Python's built-in functions, namely map, zip, filter, and reduce (the last one lives in functools in Python 3). These functions provide a concise and readable way to perform operations on iterables and are widely used. We will take a deep dive into them and learn how to use them effectively in your code.
In addition to exploring the basics of these functions, I will also go through advanced topics such as optimization tips, common use cases, and limitations. Whether you are a beginner or an experienced Python developer, this series will provide valuable insights into these functions and help you write more efficient and readable code. So, let’s get started and learn how to master the power of map, zip, filter, and reduce in Python.
This post covers map.
The basics: syntax & functionality
map
As the name suggests, it maps something to something. What in particular?
map(func, iter)
map passes each item of iter to the function func as an argument and yields the results through a new iterator, preserving the order.
Say you have a list of integers and have to square each one. You can do
nums = [...] # specimen/input
squared_nums = list(map(lambda x: x**2, nums))
Without map, you would have to write this:
nums = [...] # specimen/input
squared_nums = []
for n in nums:
    squared_nums.append(n**2)
See? map makes the intent easy to grasp at a glance. But map comes with a cost; read the limitations section.
If by any chance you don’t know what lambda is, you should learn it right now; it is a magical shortcut tool in Python. Learn more 👇
Advanced Python: The lambda
Have you ever come across the keyword, lambda in python? did you know its quickie but ain’t shorty 🤯
Step by Step analysis
nums = [1,3,5,7]
If map(lambda x: x**2, nums) is executed…
map creates a lazy iterator that produces results only when asked. For each item requested by the list() call, it takes the following steps.
nums = [1,3,5,7] --> 1 ==> 1 ** 2 = 1 ==> squared_nums = [1]
nums = [3,5,7] --> 3 ==> 3 ** 2 = 9 ==> squared_nums = [1, 9]
nums = [5,7] --> 5 ==> 5 ** 2 = 25 ==> squared_nums = [1, 9, 25]
nums = [7] --> 7 ==> 7 ** 2 = 49 ==> squared_nums = [1, 9, 25, 49]
Get it 😊? If not, let me know in the comments.
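The same walk-through can be reproduced by hand with next(), which is roughly what list() keeps doing until the iterator runs out. A minimal sketch, using x ** 2 for squaring; the variable names are only for illustration:

```python
nums = [1, 3, 5, 7]
squares = map(lambda x: x ** 2, nums)

first = next(squares)    # 1 ** 2
second = next(squares)   # 3 ** 2
rest = list(squares)     # list() pulls the remaining items: 5 ** 2 and 7 ** 2

print(first, second, rest)
```

Each next() call triggers exactly one call to the lambda, in input order.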
The behavior of map
When you execute an expression with map, it doesn’t do the work right away. It returns a map object, which is an iterator. Like generators, iterators of this kind are lazy by nature: unless you consume the items one by one, nothing happens.
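One way to see the laziness is to record when the function is actually called. This is a small sketch; noisy_square and the calls list are hypothetical helpers made up for the demonstration:

```python
calls = []  # records every value the function was actually called with

def noisy_square(x):
    calls.append(x)
    return x * x

lazy = map(noisy_square, [1, 2, 3])
calls_before = list(calls)   # still empty: map has not called the function yet

results = list(lazy)         # consuming the map object triggers the calls
calls_after = list(calls)

print(calls_before, results, calls_after)
```

Creating the map object costs almost nothing; the work happens only while it is being consumed.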
An Example
Let tempt_f be a list of real numbers representing the temperature readings of sensors in your warehouse. You are tasked with collecting the readings in real time and sending them to an internal agent at HQ. But the agent uses Celsius and the sensors output Fahrenheit. A pretty silly problem, but this happens a lot more than you would imagine in the field. What are you going to do?
At a given time, take the readings of all the sensors, convert them to Celsius in a loop, and send them to HQ.
The normie way…
tempt_f = [...] # somehow aggregated temperature for a given moment
tempt_c = []
for t in tempt_f:
    tempt_c.append((t - 32) * (5/9))
tempt_c # send it to the destination somehow
Now the map way of doing it…
tempt_f = [...] # somehow aggregated temperature for a given moment
tempt_c = [c for c in map(lambda f: (f-32)*5/9, tempt_f)]
tempt_c # send it to the destination somehow
In case you didn’t understand the code, I used a list comprehension and an inline (lambda) function. If you want to learn more, visit these posts 👇
Advanced Python: Comprehensions
Powerful one-liners for generators in python3
Advanced Python: The lambda
Have you ever come across the keyword, lambda in python? did you know its quickie but ain’t shorty 🤯
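If the lambda feels too dense, the conversion also works with a named function. This is just a sketch; fahrenheit_to_celsius and the sample readings are made up for illustration:

```python
def fahrenheit_to_celsius(f):
    """Convert a single Fahrenheit reading to Celsius."""
    return (f - 32) * 5 / 9

tempt_f = [32.0, 212.0, 98.6]  # made-up sample readings
tempt_c = list(map(fahrenheit_to_celsius, tempt_f))

print(tempt_c)
```

A named function is easier to test on its own, and map takes it exactly the way it takes a lambda.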
Some common examples rewritten with map
Square roots
from math import sqrt
from random import randint
start = randint(2003,2007)
stop = randint(4567,7890)
# the map
sqrts = map(sqrt, range(start, stop))
# now sqrts can be consumed as needed
Factorials
from math import factorial
from random import randint
n = randint(19, 2003) # somewhat intensive computations
# the map
factos = map(factorial, range(n+1))
# now consume factos as needed 🆒
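"As needed" can mean pulling only a few results at a time. One possible way, sketched here with itertools.islice and a fixed upper bound instead of the random n:

```python
from itertools import islice
from math import factorial

factos = map(factorial, range(2004))  # nothing is computed yet

first_five = list(islice(factos, 5))  # computes only 0! through 4!
next_one = next(factos)               # picks up where islice stopped: 5!

print(first_five, next_one)
```

The expensive factorials further down the range are never computed unless something asks for them.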
Watermarking images in a directory
def watermark(name):
    global mark, font, isizex, isizey
    # Open the base image
    base_image = Image.open(name)
    # Create an ImageDraw object
    idraw = ImageDraw.Draw(base_image)
    # Calculate the position of the watermark text
    position = (base_image.width - isizex - 10, base_image.height - isizey - 10)
    # Add the watermark text to the image
    idraw.text(position, mark, font=font, fill=(255, 255, 255, 128))
    # Save the result
    base_image.save(name)
def get_img_names(dir):
    exts = "jpg.png.gif".split(".")
    # Lazily produce the full path of every image file in dir
    return (join(dir, f) for f in listdir(dir) if isfile(join(dir, f)) and f[-3:] in exts)
from os import listdir
from os.path import join, isfile
from PIL import Image, ImageDraw, ImageFont
# Specify the font and font size
font = ImageFont.truetype("arial.ttf", 36)
# watermark
mark = "Birnadin E."
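# Get the size of the watermark text from the font's bounding box
# (draw.textsize is no longer available in recent Pillow releases)
left, top, right, bottom = font.getbbox(mark)
isizex, isizey = right - left, bottom - top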
# utils
def get_img_names(dir):
    # ...
    pass
def watermark(name):
    # ...
    pass
# watermarking
dir = "..." # dir that contains the images to be watermarked
_ = list(map(watermark, get_img_names(dir)))
In these examples map may just be replacing a multiline for loop block, but map may come in handy if you want to run these concurrently.
Concurrently watermarking images
import concurrent.futures
# other imports
# refer to the code blocks above
# watermarking
dir = "..."
with concurrent.futures.ThreadPoolExecutor() as executor:
    _ = list(executor.map(watermark, get_img_names(dir)))
It is just a matter of throwing in an executor context object, nothing else. With a conventional for loop, you may have to think about race conditions yourself. At least, I have never faced a problem with this approach in my 7 years of Python. If I am wrong, please let me know in the comments; I don’t want to die thinking wrong.
You can do the same with the other examples too: just swap the usual map(...) call for executor.map(...); that’s it!
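Here is a self-contained sketch of the same idea that runs without any images. slow_double is a hypothetical stand-in for an I/O-bound task such as watermark, and the sleep times are chosen so that later inputs finish first:

```python
import concurrent.futures
import time

def slow_double(n):
    # hypothetical stand-in for an I/O-bound task such as watermark()
    time.sleep(0.01 * (5 - n))  # later inputs finish sooner
    return n * 2

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    # executor.map hands results back in input order, not completion order
    doubled = list(executor.map(slow_double, range(5)))

print(doubled)
```

Even though the tasks complete out of order, the results line up with the inputs, just like the built-in map.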
Limitations
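map unfortunately is not always the best tool. Here are some limitations of map:
Memory footprint is high with large data once the results are materialized, for example with list().
Performance can degrade when large data is processed.
Can be used only once! The first list() call exhausts the map object, and a second one returns an empty list.
Works only with iterables, and the first argument must be a callable, so calling a method on each item needs a lambda or the unbound form (for example str.upper) 😳
If multiple iterables are given, they should be the same length, because map silently stops at the shortest one.
The map object reads its input lazily, so if the underlying data changes after the map is created but before it is consumed, the results may be unexpected.
Can be slower than a plain loop or comprehension on large datasets, especially when func is a lambda.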
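A few of these behaviors are easy to check in a REPL. The snippet below is a small sketch with made-up data, covering single use, iterables of different lengths, and input that changes before the map is consumed:

```python
# 1. A map object can be consumed only once
squares = map(lambda x: x * x, [1, 2, 3])
first_pass = list(squares)
second_pass = list(squares)  # already exhausted, so this is empty

# 2. With several iterables, map stops at the shortest one
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20]))

# 3. Input is read lazily, so later changes to the data leak in
data = [1, 2, 3]
doubled = map(lambda x: x * 2, data)
data.append(4)               # happens before the map is consumed
late = list(doubled)

print(first_pass, second_pass, sums, late)
```

If you need the results more than once, store them in a list first and reuse the list.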
If you know more limitations, please let me know in the comments or reply on the tweet.
Epilogue
With that said, I conclude this post and will see you in the next one, where we will learn zip.
If you are intrigued or interested in this kind of stuff, make sure you follow me on Twitter.
Till then, it’s me the BE, signing off 👋
Cover by Ylanite Koppens.