Writing descriptors in Python 3.6+
Special thanks to Luciano Ramalho. I learned most of what I know about descriptors from his workshop at PyBay 2017.
Have you seen this code, or maybe written code like it?
This code snippet partially comes from the tutorial of a popular ORM package called SQLAlchemy. If you have ever wondered why the attribute names aren’t passed into the __init__ method and bound to the instance as in a regular class, this post is for you.
This post starts by explaining what descriptors are and why to use them, then shows how to write them in previous Python versions (<= 3.5), and finally how to write them in Python 3.6 with the new feature described in PEP 487 – Simpler customisation of class creation.
If you are in a hurry or you just want to know what’s new, scroll all the way down to the bottom of this article. You’ll find the whole code.
What are descriptors
Raymond Hettinger gives a great definition of descriptors in the Descriptor HowTo Guide:
In general, a descriptor is an object attribute with “binding behavior”, one whose attribute access has been overridden by methods in the descriptor protocol. Those methods are __get__(), __set__(), and __delete__(). If any of those methods are defined for an object, it is said to be a descriptor.
There are three ways to access an attribute. Let’s say we have the a attribute on the object obj:
- To look up its value, some_variable = obj.a,
- To change its value, obj.a = 'new value', or
- To delete it, del obj.a.
Python is dynamic and flexible enough to let users intercept each of the above expressions/statements and bind behavior to it.
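A minimal sketch of the three hooks, using a hypothetical Logged descriptor and Thing class (neither is from the original post) that record which protocol method fired for each kind of access:

```python
class Logged:
    """Hypothetical descriptor that records which protocol method fired."""

    def __init__(self):
        self.calls = []

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed on the class itself: return the descriptor.
            return self
        self.calls.append('get')
        return instance.__dict__.get('a')

    def __set__(self, instance, value):
        self.calls.append('set')
        instance.__dict__['a'] = value

    def __delete__(self, instance):
        self.calls.append('delete')
        instance.__dict__.pop('a', None)


class Thing:
    a = Logged()


obj = Thing()
obj.a = 'new value'      # routed to Logged.__set__
some_variable = obj.a    # routed to Logged.__get__
del obj.a                # routed to Logged.__delete__

calls = Thing.a.calls    # class-level access hands back the descriptor
```

Each of the three statements on obj ends up in a method of the descriptor rather than touching the instance directly.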
Why you want to use descriptors
Let’s see an example:
Despite the lack of proper documentation, there is a bug:
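As a minimal sketch, assuming Order keeps price and quantity as plain attributes (the shape the later listings build on), the class and its bug might look like this:

```python
class Order:
    def __init__(self, name, price, quantity):
        self._name = name
        self.price = price
        self.quantity = quantity

    def total(self):
        return self.price * self.quantity


apple_order = Order('apple', 1, 10)
good_total = apple_order.total()

# Nothing stops a caller from storing a nonsensical value:
apple_order.quantity = -10
bad_total = apple_order.total()  # the total is now negative
```

The assignment succeeds silently, and the error only shows up later as a negative total.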
Instead of adding getter and setter methods and breaking the API, let’s use a property to enforce that quantity is non-negative:
class Order:
    def __init__(self, name, price, quantity):
        self._name = name
        self.price = price
        self._quantity = quantity  # (1)

    @property
    def quantity(self):
        return self._quantity

    @quantity.setter
    def quantity(self, value):
        if value < 0:
            raise ValueError('Cannot be negative.')
        self._quantity = value  # (2)

...

apple_order.quantity = -10  # ValueError: Cannot be negative
We changed quantity from a simple attribute to a non-negative property. Notice on line (1) that the underlying attribute is renamed to _quantity, so that line (2) does not trigger the setter again and end in a RecursionError.
Are we done? Hell no. We forgot that the price attribute cannot be negative either. It might be tempting to just create another property for price, but remember the DRY principle: when you find yourself doing the same thing twice, it’s a good sign that you should extract the reusable code. Also, in our example, more attributes might need to be added to this class in the future. Repeating the code isn’t fun for the writer or the reader. Let’s see how descriptors can help us.
How to write descriptors
With the descriptors in place, our new class definition would become:
class Order:
    price = NonNegative('price')  # (3)
    quantity = NonNegative('quantity')

    def __init__(self, name, price, quantity):
        self._name = name
        self.price = price
        self.quantity = quantity

    def total(self):
        return self.price * self.quantity


apple_order = Order('apple', 1, 10)
apple_order.total()  # 10
apple_order.price = -10  # ValueError: Cannot be negative
apple_order.quantity = -10  # ValueError: Cannot be negative
Notice the class attributes defined before the __init__ method? It’s a lot like the SQLAlchemy example shown at the very beginning of this post. This is where we are heading. We need to define the NonNegative class and implement the descriptor protocol. Here’s how:
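A minimal sketch of such a class for Python 3.5 and earlier, assuming it mirrors the final listing at the end of this post except that the attribute name arrives through __init__ (the `instance is None` guard is an addition for class-level access):

```python
class NonNegative:
    def __init__(self, name):
        # The descriptor cannot discover its own attribute name here,
        # so the caller has to pass it in explicitly.
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if value < 0:
            raise ValueError('Cannot be negative.')
        # Store the value in the instance's own __dict__ under the same name.
        instance.__dict__[self.name] = value


class Order:
    price = NonNegative('price')
    quantity = NonNegative('quantity')

    def __init__(self, name, price, quantity):
        self._name = name
        self.price = price
        self.quantity = quantity

    def total(self):
        return self.price * self.quantity


apple_order = Order('apple', 1, 10)
try:
    apple_order.price = -10
except ValueError as exc:
    error_message = str(exc)
```

Because the check lives in one place, every attribute declared with NonNegative gets the same validation, including the assignments made inside Order.__init__.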
The name attribute is needed because when the NonNegative object is created on line (3), the assignment to the attribute named price hasn’t happened yet. Thus, we need to explicitly pass the name price to the initializer of the object, to use as the key for the instance’s __dict__.
Later, we’ll see how in Python 3.6+ we can avoid the redundancy.
The redundancy could be avoided in earlier versions of Python, but I think explaining how would take too much effort and is not the purpose of this post, so it is not included.
(6): instead of using the builtin function setattr, we need to reach into the __dict__ object directly, because the builtin would be intercepted by the descriptor protocol too and cause a RecursionError.
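A sketch of what goes wrong when __set__ uses setattr, with a deliberately broken, hypothetical BrokenNonNegative descriptor and Item class (names not from the original post):

```python
class BrokenNonNegative:
    def __init__(self, name):
        self.name = name

    def __set__(self, instance, value):
        if value < 0:
            raise ValueError('Cannot be negative.')
        # setattr() finds this descriptor on the class and calls
        # __set__ again, which calls setattr() again, and so on.
        setattr(instance, self.name, value)


class Item:
    price = BrokenNonNegative('price')


item = Item()
try:
    item.price = 1
    outcome = 'stored'
except RecursionError:
    outcome = 'RecursionError'
```

Even a perfectly valid value never gets stored; writing to instance.__dict__ directly sidesteps the descriptor and breaks the loop.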
Welcome to Python 3.6+
We are still repeating ourselves on line (3). How do we get a cleaner API, such that we can simply write price = NonNegative() and quantity = NonNegative()?
Let’s look at the new descriptor protocol in Python 3.6:
object.__set_name__(self, owner, name)
- Called at the time the owning class owner is created. The descriptor has been assigned to name.
With this protocol, we can remove the __init__ method and let Python bind the attribute name to the descriptor:
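A small sketch of the hook in isolation, using a hypothetical Named descriptor (not from the original post) that only records what Python passes to __set_name__:

```python
class Named:
    def __set_name__(self, owner, name):
        # Called once per attribute, as the owning class is created.
        self.owner = owner
        self.name = name

    def __get__(self, instance, owner):
        # Hand back the descriptor itself so it can be inspected.
        return self


class Order:
    price = Named()
    quantity = Named()


recorded = (Order.price.name, Order.quantity.name)
```

The descriptors are created with no arguments, yet each one ends up knowing the attribute name it was assigned to.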
Putting all the code together:
class NonNegative:
    def __get__(self, instance, owner):
        if instance is None:
            # Accessed on the class (e.g. Order.price): return the descriptor.
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if value < 0:
            raise ValueError('Cannot be negative.')
        # Write to __dict__ directly; setattr() would call __set__ again.
        instance.__dict__[self.name] = value

    def __set_name__(self, owner, name):
        # Python 3.6+: called when the owning class is created.
        self.name = name


class Order:
    price = NonNegative()
    quantity = NonNegative()

    def __init__(self, name, price, quantity):
        self._name = name
        self.price = price
        self.quantity = quantity

    def total(self):
        return self.price * self.quantity


apple_order = Order('apple', 1, 10)
apple_order.total()  # 10
apple_order.price = -10  # ValueError: Cannot be negative
apple_order.quantity = -10  # ValueError: Cannot be negative
Python is a general-purpose programming language. I love that it not only has very powerful features that are highly flexible and can bend the language tremendously (e.g. metaclasses), but also has high-level APIs/protocols that serve 99% of the needs (e.g. descriptors). I believe there is a right tool for every job. Descriptors are clearly the right tool for binding behaviors to attributes. Although metaclasses could potentially do the same thing, descriptors solve the problem more gracefully. It’s also pleasing to see Python evolve to serve people’s everyday needs better.
Here’s my conclusion:
- Python 3.6 is by far the greatest Python.
- Descriptors are used to bind behaviors to accessing attributes.