langston-barrrett.github.io

Fun with Mypy: Reifying Runtime Relations on Types

I'm a big fan of type checkers for Python, like Mypy. One fun aspect of these tools is that their type systems tend to have features that reflect the dynamic nature of Python, like type narrowing. This post describes another, less-documented way to use runtime evidence of type information.

First, consider the cast operation in in the typing module. The docs say:

typing.cast(typ, val)

Cast a value to a type.

This returns the value unchanged. To the type checker this signals that the return value has the designated type, but at runtime we intentionally don't check anything (we want this to be as fast as possible).

In other words, it's up to the programmer to ensure that the runtime type of the value argument really is a subtype of the type argument. What if we could make a version of cast, perhaps safe_cast, that makes sure that's the case? It would need to meet two requirements:

Performance: No overhead at calls to safe_cast, just like cast. No assertions, no conditionals.
Safety: It should be the case that every call to safe_cast is guaranteed (up to bugs in Mypy or programmers doing something obviously tricky) to succeed.

Well, there's good news! We can:

from __future__ import annotations

from dataclasses import dataclass
from typing import Generic, Type, TypeVar, final

Sub = TypeVar("Sub")
Super = TypeVar("Super")


@final
class Subclass_(Generic[Sub, Super]):
    def __init__(self, sub: Type[Sub], sup: Type[Super]) -> None:
        assert issubclass(sub, sup)


def safe_cast(evidence: Subclass_[Sub, Super], value: Sub) -> Super:
    return value  # type: ignore[return-value]

Here's what it looks like at the REPL:

>>> Subclass_(bool, int)
Subclass_()
>>> Subclass_(bool, str)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/blog/reify.py", line 13, in __init__
    assert issubclass(sub, sup)
AssertionError
>>> safe_cast(Subclass_(bool, int), True) + 2
3

So how does this work? Well, if you've got a value of type Subclass_[Sub, Super] in scope, then (unless you've done something really tricky) it must be the case that at some point, the assertion written in Subclass_.__init__ passed. This means that Sub is a subclass of Super, and so by the Liskov substitution principle you can use any value of type Sub in a context where a Super is expected.

To understand the technique in generality, let's refine this example and take a look at a few more.

`Subclass`

For the sake of usability, let's extend the code with three additional features:

Throw a custom exception instead of using assertions
Add the types as fields (for reasons explained below)

@dataclass(frozen=True)
class NotASubclass(Generic[Sub, Super], Exception):
    sub: Type[Sub]
    sup: Type[Super]

    def __str__(self) -> str:
        return f"{self.sub} is not a subclass of {self.sup}"


@final
@dataclass
class Subclass(Generic[Sub, Super]):

    sub: Type[Sub]
    sup: Type[Super]

    def __init__(self, sub: Type[Sub], sup: Type[Super]) -> None:
        self.sub = sub
        self.sup = sup
        if not issubclass(sub, sup):
            raise NotASubclass(sub, sup)

So why add the types as fields? It's not necessary, but it does extend the reach of our reasoning by enabling us to add "lemmas" like this:

T1 = TypeVar("T1")
T2 = TypeVar("T2")
T3 = TypeVar("T3")

def subclass_transitive(
        evidence1: Subclass[T1, T2],
        evidence2: Subclass[T2, T3]) -> Subclass[T1, T3]:
    return Subclass(evidence1.sub, evidence2.sup)

If you don't like the memory overhead, you can instead "axiomatize" lemmas, by returning a sentinel value such as Subclass[object, object] with a # type: ignore[return-value] comment.

`SameType`

Subclassing is a partial order on types, and in particular, it's antisymmetric. If two types are both subclasses of each other, they're the same type:

@final
@dataclass
class SameType(Generic[T1, T2]):

    ty: Type[T1]

    def __init__(self, evidence1: Subclass[T1, T2],
                 evidence2: Subclass[T2, T1]) -> None:
        self.ty = evidence1.sub

    # These would be necessary for writing any lemmas:

    def get1(self) -> Type[T1]:
        return self.ty

    def get2(self) -> Type[T2]:
        # Could also save the evidence and use safe_cast here
        return self.ty  # type: ignore[return-value]

Here, we don't need a custom exception type nor any assertion in the constructor, since both of these are "inherited" from Subclass.

Some ideas for lemmas:

SameType[T1, T2] implies Subclass[T1, T2]
SameType is symmetric
SameType is transitive

`InTuple`

Now for something a bit more involved. Recall PEP 585: Type Hinting Generics In Standard Collections. This PEP obviates typing.Dict, typing.List, typing.Tuple, and more by allowing us to write the corresponding types directly inside type signatures:python```python def repeat(i: int, n: int) -> list[int]: pass

Old style (equivalent):

from typing import List

def repeat_old(i: int, n: int) -> List[int]: pass


Crucially, types like `dict[int, str]` are also values:

``` python
>>> dict[int, str]
dict[int, str]

Moreover, their type arguments are available at runtime:

>>> dict[int, str].__args__
(<class 'int'>, <class 'str'>)

So what can we do with this? Here's a generic class InTuple[T, Tup]. It has two type parameters, the second of which is bounded above by Tuple[Any, ...]. If you can construct one, it means that the type T is one of the types in the tuple type Tup.

from typing import Any, Tuple

# For some reason, tuple[Any, ...] doesn't work here, even though it is valid as
# a value at the REPL.
Tup = TypeVar("Tup", bound=Tuple[Any, ...])
T = TypeVar("T")


@dataclass(frozen=True)
class NotInTuple(Generic[T, Tup], Exception):
    ty: Type[T]
    tup: Type[Tup]

    def __str__(self) -> str:
        return f"Type {self.ty} is not in tuple type {self.tup}"


@final
@dataclass
class InTuple(Generic[T, Tup]):
    """Witness that ``T`` is in ``Tup``."""

    ty: Type[T]
    tup: Type[Tup]

    def __init__(self, ty: Type[T], tup: Type[Tup]) -> None:
        self.ty = ty
        self.tup = tup
        # Mypy version 0.910 doesn't acknowledge that this attribute is defined,
        # but it is.
        for tup_ty in tup.__args__:  # type: ignore[attr-defined]
            if ty == tup_ty:
                return
        raise NotInTuple(ty, tup)

Let's try it out:

>>> InTuple(int, tuple[int, str])
InTuple(ty=<class 'int'>, tup=tuple[int, str])

>>> InTuple(int, tuple[str, list[str], bool])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/blog/reify.py", line 44, in __init__
    raise NotInTuple(ty, tup)
__main__.NotIn: Type <class 'int'> is not in tuple type tuple[str, list[str], bool]

So what can we do with this? Well, probably lots of things! We can write a function that gets the first value of a given type out of a tuple, regardless of its index:python```python def first(evidence: InTuple[T, Tup], tup: Tup) -> T: for elem in tup: if isinstance(elem, evidence.ty): return elem assert False, f"Impossible: Bad 'InTuple' value {evidence=} {tup=}"


``` python
>>> first(InTuple(int, tuple[str, int, bool, int]), ("string", 5, False, 10))
5

This isn't so impressive when we construct a literal InTuple value as the first argument, but might be helpful in more polymorphic scenarios.

It's also worth noting that this function can be a bit unintuitive when combined with certain subclass relationships:

>>> issubclass(bool, int)
True
>>> first(InTuple(int, tuple[bool, int, str, int]), (False, 5, "string", 10))
False

It's worth noting that any of these predicates or relationships could be negated, simply by inverting the conditional and renaming the class. You could probably have a lemma that the negation of InTuple is reflexive, but I can't think of how you'd use it.

`IsGenericArgument`

In fact, parameters to all generic types are available at runtime:

>>> from typing import Mapping, Generic, TypeVar

>>> Mapping[int, str].__args__
(<class 'int'>, <class 'str'>)

>>> T = TypeVar("T")
>>> class G(Generic[T]):
...     pass
...
>>> G[int].__args__
(<class 'int'>,)

So we can handily generalize the previous example just by creating a new exception type, replacing the Tup variable with something unconstrained, and renaming the class to IsGenericArgument.

`NewtypeOf`

How can we tell if a given type is just a NewType wrapper around another one? Well, let's investigate:

>>> from typing import NewType
>>> Pos = NewType("Pos", int)
>>> Pos
<function NewType.<locals>.new_type at 0x7fa08729b4c0>
>>> dir(Pos)
[..., '__supertype__']
>>> Pos.__supertype__
<class 'int'>

Ah, that's just what we need!

from typing import Callable


@dataclass(frozen=True)
class NotANewtypeOf(Generic[Sub, Super], Exception):
    sub: Callable[[Super], Sub]
    sup: Type[Super]

    def __str__(self) -> str:
        return f'{getattr(self.sub, "__name__", "<unknown>")} is not a newtype of {self.sup}'


@final
@dataclass
class NewtypeOf(Generic[Sub, Super]):

    sub: Callable[[Super], Sub]
    sup: Type[Super]

    def __init__(self, sub: Callable[[Super], Sub], sup: Type[Super]) -> None:
        # Mypy has trouble with Callable fields...
        self.sub = sub  # type: ignore[assignment]
        self.sup = sup
        if not getattr(sub, "__supertype__") == sup:
            raise NotANewtypeOf(sub, sup)

Some ideas for lemmas:

Is this transitive?
NewtypeOf implies Subtype

`Newtype`

The above example also leads naturally to our first predicate or property of interest on types, rather than a relationship between types, namely, whether or not a given type is in fact a NewType.

@dataclass(frozen=True)
class NotANewtype(Generic[Sub, Super], Exception):
    sub: Callable[[Super], Sub]

    def __str__(self) -> str:
        return f'{getattr(self.sub, "__name__", "<unknown>")} is not a newtype'


@final
@dataclass
class Newtype(Generic[Sub, Super]):

    sub: Callable[[Super], Sub]

    def __init__(self, sub: Callable[[Super], Sub]) -> None:
        self.sub = sub  # type: ignore[assignment]
        if getattr(sub, "__supertype__") is None:
            raise NotANewtype(sub)

More Ideas

What other predicates on and relations between types do we care about? A few ideas:

Subclassable: Some types aren't! For example, <class 'function'> (try type(lambda: None) in the REPL).
IsAbstract, or NotAbstract: I don't know if it's possible to check this at runtime, but it would be cool if it were!
HasMetaclass: Check that a class has a given metaclass.

Do you have other ideas or use-cases? Drop me a line!