langston-barrrett.github.io
Fun with Mypy: Reifying Runtime Relations on Types
I'm a big fan of type checkers for Python, like Mypy. One fun aspect of these tools is that their type systems tend to have features that reflect the dynamic nature of Python, like type narrowing. This post describes another, less-documented way to use runtime evidence of type information.
First, consider the cast
operation in in the
typing
module. The
docs say:
typing.cast(typ, val)
Cast a value to a type.
This returns the value unchanged. To the type checker this signals that the return value has the designated type, but at runtime we intentionally don't check anything (we want this to be as fast as possible).
In other words, it's up to the programmer to ensure that the runtime
type of the value argument really is a subtype of the type argument.
What if we could make a version of cast
, perhaps
safe_cast
, that makes sure that's the case? It would need
to meet two requirements:
- Performance: No overhead at calls to
safe_cast
, just likecast
. No assertions, no conditionals. - Safety: It should be the case that every call to
safe_cast
is guaranteed (up to bugs in Mypy or programmers doing something obviously tricky) to succeed.
Well, there's good news! We can:
from __future__ import annotations
from dataclasses import dataclass
from typing import Generic, Type, TypeVar, final
Sub = TypeVar("Sub")
Super = TypeVar("Super")
@final
class Subclass_(Generic[Sub, Super]):
def __init__(self, sub: Type[Sub], sup: Type[Super]) -> None:
assert issubclass(sub, sup)
def safe_cast(evidence: Subclass_[Sub, Super], value: Sub) -> Super:
return value # type: ignore[return-value]
Here's what it looks like at the REPL:
>>> Subclass_(bool, int)
Subclass_()
>>> Subclass_(bool, str)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/blog/reify.py", line 13, in __init__
assert issubclass(sub, sup)
AssertionError
>>> safe_cast(Subclass_(bool, int), True) + 2
3
So how does this work? Well, if you've got a value of type
Subclass_[Sub, Super]
in scope, then (unless you've done something
really tricky) it must be the case that at some point, the assertion
written in Subclass_.__init__
passed. This means that
Sub
is a subclass of Super
, and so by the
Liskov substitution principle you can use any value of type
Sub
in a context where a Super
is expected.
To understand the technique in generality, let's refine this example and take a look at a few more.
Subclass
For the sake of usability, let's extend the code with three additional features:
- Throw a custom exception instead of using assertions
- Add the types as fields (for reasons explained below)
@dataclass(frozen=True)
class NotASubclass(Generic[Sub, Super], Exception):
sub: Type[Sub]
sup: Type[Super]
def __str__(self) -> str:
return f"{self.sub} is not a subclass of {self.sup}"
@final
@dataclass
class Subclass(Generic[Sub, Super]):
sub: Type[Sub]
sup: Type[Super]
def __init__(self, sub: Type[Sub], sup: Type[Super]) -> None:
self.sub = sub
self.sup = sup
if not issubclass(sub, sup):
raise NotASubclass(sub, sup)
So why add the types as fields? It's not necessary, but it does extend the reach of our reasoning by enabling us to add "lemmas" like this:
T1 = TypeVar("T1")
T2 = TypeVar("T2")
T3 = TypeVar("T3")
def subclass_transitive(
evidence1: Subclass[T1, T2],
evidence2: Subclass[T2, T3]) -> Subclass[T1, T3]:
return Subclass(evidence1.sub, evidence2.sup)
If you don't like the memory overhead, you can instead "axiomatize"
lemmas, by returning a sentinel value such as
Subclass[object, object]
with a
# type: ignore[return-value]
comment.
SameType
Subclassing is a partial order on types, and in particular, it's antisymmetric. If two types are both subclasses of each other, they're the same type:
@final
@dataclass
class SameType(Generic[T1, T2]):
ty: Type[T1]
def __init__(self, evidence1: Subclass[T1, T2],
evidence2: Subclass[T2, T1]) -> None:
self.ty = evidence1.sub
# These would be necessary for writing any lemmas:
def get1(self) -> Type[T1]:
return self.ty
def get2(self) -> Type[T2]:
# Could also save the evidence and use safe_cast here
return self.ty # type: ignore[return-value]
Here, we don't need a custom exception type nor any assertion in the
constructor, since both of these are "inherited" from
Subclass
.
Some ideas for lemmas:
SameType[T1, T2]
impliesSubclass[T1, T2]
SameType
is symmetricSameType
is transitive
InTuple
Now for something a bit more involved. Recall PEP 585: Type Hinting
Generics In Standard
Collections. This PEP
obviates typing.Dict
, typing.List
,
typing.Tuple
, and more by allowing us to write the
corresponding types directly inside type signatures:python```python
def repeat(i: int, n: int) -> list[int]:
pass
Old style (equivalent):
from typing import List
def repeat_old(i: int, n: int) -> List[int]: pass
Crucially, types like `dict[int, str]` are also values:
``` python
>>> dict[int, str]
dict[int, str]
Moreover, their type arguments are available at runtime:
>>> dict[int, str].__args__
(<class 'int'>, <class 'str'>)
So what can we do with this? Here's a generic class
InTuple[T, Tup]
. It has two type parameters, the second of
which is bounded above by Tuple[Any, ...]
. If you can
construct one, it means that the type T
is one of the types
in the tuple type Tup
.
from typing import Any, Tuple
# For some reason, tuple[Any, ...] doesn't work here, even though it is valid as
# a value at the REPL.
Tup = TypeVar("Tup", bound=Tuple[Any, ...])
T = TypeVar("T")
@dataclass(frozen=True)
class NotInTuple(Generic[T, Tup], Exception):
ty: Type[T]
tup: Type[Tup]
def __str__(self) -> str:
return f"Type {self.ty} is not in tuple type {self.tup}"
@final
@dataclass
class InTuple(Generic[T, Tup]):
"""Witness that ``T`` is in ``Tup``."""
ty: Type[T]
tup: Type[Tup]
def __init__(self, ty: Type[T], tup: Type[Tup]) -> None:
self.ty = ty
self.tup = tup
# Mypy version 0.910 doesn't acknowledge that this attribute is defined,
# but it is.
for tup_ty in tup.__args__: # type: ignore[attr-defined]
if ty == tup_ty:
return
raise NotInTuple(ty, tup)
Let's try it out:
>>> InTuple(int, tuple[int, str])
InTuple(ty=<class 'int'>, tup=tuple[int, str])
>>> InTuple(int, tuple[str, list[str], bool])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/blog/reify.py", line 44, in __init__
raise NotInTuple(ty, tup)
__main__.NotIn: Type <class 'int'> is not in tuple type tuple[str, list[str], bool]
So what can we do with this? Well, probably lots of things! We can write a function that gets the first value of a given type out of a tuple, regardless of its index:python```python def first(evidence: InTuple[T, Tup], tup: Tup) -> T: for elem in tup: if isinstance(elem, evidence.ty): return elem assert False, f"Impossible: Bad 'InTuple' value {evidence=} {tup=}"
``` python
>>> first(InTuple(int, tuple[str, int, bool, int]), ("string", 5, False, 10))
5
This isn't so impressive when we construct a literal
InTuple
value as the first argument, but might be helpful
in more polymorphic scenarios.
It's also worth noting that this function can be a bit unintuitive when combined with certain subclass relationships:
>>> issubclass(bool, int)
True
>>> first(InTuple(int, tuple[bool, int, str, int]), (False, 5, "string", 10))
False
It's worth noting that any of these predicates or relationships could
be negated, simply by inverting the conditional and renaming the class.
You could probably have a lemma that the negation of
InTuple
is reflexive, but I can't think of how you'd use
it.
IsGenericArgument
In fact, parameters to all generic types are available at runtime:
>>> from typing import Mapping, Generic, TypeVar
>>> Mapping[int, str].__args__
(<class 'int'>, <class 'str'>)
>>> T = TypeVar("T")
>>> class G(Generic[T]):
... pass
...
>>> G[int].__args__
(<class 'int'>,)
So we can handily generalize the previous example just by creating a new
exception type, replacing the Tup
variable with something
unconstrained, and renaming the class to IsGenericArgument
.
NewtypeOf
How can we tell if a given type is just a NewType
wrapper
around another one? Well, let's investigate:
>>> from typing import NewType
>>> Pos = NewType("Pos", int)
>>> Pos
<function NewType.<locals>.new_type at 0x7fa08729b4c0>
>>> dir(Pos)
[..., '__supertype__']
>>> Pos.__supertype__
<class 'int'>
Ah, that's just what we need!
from typing import Callable
@dataclass(frozen=True)
class NotANewtypeOf(Generic[Sub, Super], Exception):
sub: Callable[[Super], Sub]
sup: Type[Super]
def __str__(self) -> str:
return f'{getattr(self.sub, "__name__", "<unknown>")} is not a newtype of {self.sup}'
@final
@dataclass
class NewtypeOf(Generic[Sub, Super]):
sub: Callable[[Super], Sub]
sup: Type[Super]
def __init__(self, sub: Callable[[Super], Sub], sup: Type[Super]) -> None:
# Mypy has trouble with Callable fields...
self.sub = sub # type: ignore[assignment]
self.sup = sup
if not getattr(sub, "__supertype__") == sup:
raise NotANewtypeOf(sub, sup)
Some ideas for lemmas:
- Is this transitive?
NewtypeOf
impliesSubtype
Newtype
The above example also leads naturally to our first predicate or
property of interest on types, rather than a relationship between
types, namely, whether or not a given type is in fact a
NewType
.
@dataclass(frozen=True)
class NotANewtype(Generic[Sub, Super], Exception):
sub: Callable[[Super], Sub]
def __str__(self) -> str:
return f'{getattr(self.sub, "__name__", "<unknown>")} is not a newtype'
@final
@dataclass
class Newtype(Generic[Sub, Super]):
sub: Callable[[Super], Sub]
def __init__(self, sub: Callable[[Super], Sub]) -> None:
self.sub = sub # type: ignore[assignment]
if getattr(sub, "__supertype__") is None:
raise NotANewtype(sub)
More Ideas
What other predicates on and relations between types do we care about? A few ideas:
-
Subclassable
: Some types aren't! For example,<class 'function'>
(trytype(lambda: None)
in the REPL). -
IsAbstract
, orNotAbstract
: I don't know if it's possible to check this at runtime, but it would be cool if it were! -
HasMetaclass
: Check that a class has a given metaclass.
Do you have other ideas or use-cases? Drop me a line!