From the example in the Django Book, I understand that if I create models as follows:

from xxx import B

class A(models.Model):
    b = ManyToManyField(B)

Django would create a new table (A_B) in addition to table A, with three columns:

  • id
  • a_id
  • b_id

Now I want to add a new column to table A_B. This would be very easy with plain SQL, but can anyone tell me how to do it in Django? I can’t find any useful information in the book.

It’s very easy using Django too! You can use the through argument to define your own many-to-many intermediary table.

The documentation provides an example addressing your issue:

Extra fields on many-to-many relationships

class Person(models.Model):
    name = models.CharField(max_length=128)

    def __str__(self):
        return self.name

class Group(models.Model):
    name = models.CharField(max_length=128)
    members = models.ManyToManyField(Person, through="Membership")

    def __str__(self):
        return self.name

class Membership(models.Model):
    person = models.ForeignKey(Person, on_delete=models.CASCADE)
    group = models.ForeignKey(Group, on_delete=models.CASCADE)
    date_joined = models.DateField()
    invite_reason = models.CharField(max_length=64)
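
For a rough picture of what such a through model means at the database level, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The table names (app_person, app_group, app_membership) and the sample rows are assumptions made up for illustration; the schema Django actually generates may differ in details:

```python
import sqlite3

# In-memory database; all table names and rows here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE app_group (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- The intermediary table: two foreign keys plus the extra columns.
    CREATE TABLE app_membership (
        id INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES app_person(id),
        group_id INTEGER NOT NULL REFERENCES app_group(id),
        date_joined TEXT NOT NULL,
        invite_reason TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO app_person (id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO app_group (id, name) VALUES (1, 'Chess club')")
conn.execute(
    "INSERT INTO app_membership (person_id, group_id, date_joined, invite_reason) "
    "VALUES (?, ?, ?, ?)",
    (1, 1, "2020-01-15", "Founding member"),
)

# Column names of the intermediary table, in declaration order.
membership_columns = [
    row[1] for row in conn.execute("PRAGMA table_info(app_membership)")
]
# Join through the intermediary table and read one of the extra columns.
joined = conn.execute(
    "SELECT p.name, g.name, m.invite_reason "
    "FROM app_membership m "
    "JOIN app_person p ON p.id = m.person_id "
    "JOIN app_group g ON g.id = m.group_id"
).fetchall()
```

The extra columns live on the relation itself, which is why they belong on the Membership model rather than on Person or Group.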

Under the hood, Django automatically creates a through model. It is possible to change the foreign key column names of this automatic model.

I could not test the implications in all scenarios, but so far it works properly for me.

Using the _meta API of Django 1.8 and onwards:

class Person(models.Model):
    pass

class Group(models.Model):
    members = models.ManyToManyField(Person)

Group.members.through._meta.get_field('person').column = 'alt_person_id'
Group.members.through._meta.get_field('group').column = 'alt_group_id'

# Prior to Django 1.8 _meta can also be used, but is more hackish than this
Group.members.through.person.field.column = 'alt_person_id'
Group.members.through.group.field.column = 'alt_group_id'

As @dm03514 has answered, it is indeed very easy to add a column to an M2M
table by explicitly defining the M2M through model and adding the desired
field there.

However, if you would like to add a column to all M2M tables, that approach is
not sufficient, because it would require defining M2M through models for every
ManyToManyField defined across the project.

In my case I wanted to add a “created” timestamp column to all M2M tables that
Django generates “under the hood”, without having to define a separate model
for every ManyToManyField used in the project. I came up with a neat solution,
presented below. Cheers!

Introduction

While Django scans your models at startup, it automatically creates an implicit
through model for every ManyToManyField that does not define one explicitly.

class ManyToManyField(RelatedField):
    # (...)

    def contribute_to_class(self, cls, name, **kwargs):
        # (...)
        super().contribute_to_class(cls, name, **kwargs)

        # The intermediate m2m model is not auto created if:
        #  1) There is a manually specified intermediate, or
        #  2) The class owning the m2m field is abstract.
        #  3) The class owning the m2m field has been swapped out.
        if not cls._meta.abstract:
            if self.remote_field.through:
                def resolve_through_model(_, model, field):
                    field.remote_field.through = model
                lazy_related_operation(resolve_through_model, cls, self.remote_field.through, field=self)
            elif not cls._meta.swapped:
                self.remote_field.through = create_many_to_many_intermediary_model(self, cls)

Source: ManyToManyField.contribute_to_class()

To create this implicit model, Django uses the
create_many_to_many_intermediary_model() function, which constructs a new class
that inherits from models.Model and contains foreign keys to both sides of the
M2M relation. Source: django.db.models.fields.related.create_many_to_many_intermediary_model()
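
The class-building step relies on Python's three-argument type(name, bases, attrs) call. The following is a minimal sketch of that mechanism in plain Python, without Django: Base, make_through_class and the string placeholders are hypothetical stand-ins for models.Model, the Django function and real field objects.

```python
class Base:
    """Hypothetical stand-in for models.Model, for illustration only."""


def make_through_class(from_name, to_name):
    # Build the class name and attribute dict at runtime, the same general
    # way a dynamically generated through model is assembled.
    name = f"{from_name}_{to_name}"
    attrs = {
        # Strings stand in for ForeignKey / DateTimeField instances.
        from_name.lower(): f"FK to {from_name}",
        to_name.lower(): f"FK to {to_name}",
        "created": "timestamp placeholder",
    }
    # type(name, bases, attrs) creates a new class object.
    return type(name, (Base,), attrs)


Through = make_through_class("Group", "Person")
```

Because the attribute dict is ordinary data, adding one more key is all it takes to give every generated class an extra attribute.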

To add a column to all auto-generated M2M through tables, you will need to
monkeypatch this function.

The solution

First, create the new version of the function that will be used to patch the
original Django function. To do so, copy the code of the function from the
Django sources and add the desired fields to the class it returns:

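# For example in: <project_root>/lib/monkeypatching/custom_create_m2m_model.py
from django.db import models
from django.db.models import CASCADE

def create_many_to_many_intermediary_model(field, klass):
    # (...) body copied unchanged from the Django sources; any other
    # module-level names it uses must be imported in this file as well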
    return type(name, (models.Model,), {
        'Meta': meta,
        '__module__': klass.__module__,
        from_: models.ForeignKey(
            klass,
            related_name="%s+" % name,
            db_tablespace=field.db_tablespace,
            db_constraint=field.remote_field.db_constraint,
            on_delete=CASCADE,
        ),
        to: models.ForeignKey(
            to_model,
            related_name="%s+" % name,
            db_tablespace=field.db_tablespace,
            db_constraint=field.remote_field.db_constraint,
            on_delete=CASCADE,
        ),
        # Add your custom-need fields here:
        'created': models.DateTimeField(
            auto_now_add=True,
            verbose_name="Created (UTC)",
        ),
    })

Then enclose the patching logic in a separate function:

# For example in: <project_root>/lib/monkeypatching/patches.py
def django_m2m_intermediary_model_monkeypatch():
    """ We monkey patch function responsible for creation of intermediary m2m
        models in order to inject a "created" timestamp into them.
    """
    from django.db.models.fields import related
    from lib.monkeypatching.custom_create_m2m_model import create_many_to_many_intermediary_model
    setattr(
        related,
        'create_many_to_many_intermediary_model',
        create_many_to_many_intermediary_model
    )

Finally, you have to apply the patch before Django kicks in. Put the following
code in the __init__.py file located next to your Django project's settings.py
file:

# <project_root>/<project_name>/__init__.py
from lib.monkeypatching.patches import django_m2m_intermediary_model_monkeypatch
django_m2m_intermediary_model_monkeypatch()
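
Why the timing matters can be sketched without Django. In the example below, a throwaway module object stands in for django.db.models.fields.related; the names fake_related, create_intermediary and contribute_to_class are hypothetical. The caller looks up the helper by its module-level name at call time, so only calls made after the patch see the replacement:

```python
import types

# Hypothetical stand-in for the Django module being patched.
related = types.ModuleType("fake_related")
exec(
    "def create_intermediary(name):\n"
    "    return {'name': name, 'fields': ['from_id', 'to_id']}\n"
    "\n"
    "def contribute_to_class(name):\n"
    "    # Resolves create_intermediary in the module globals at call time.\n"
    "    return create_intermediary(name)\n",
    related.__dict__,
)

# A "model" processed before the patch gets the original behaviour.
before = related.contribute_to_class("early")


def patched_create_intermediary(name):
    # Same result as the original, plus one extra field.
    return {"name": name, "fields": ["from_id", "to_id", "created"]}


# Replace the module attribute, as the patching function above does.
setattr(related, "create_intermediary", patched_create_intermediary)

# A "model" processed after the patch gets the extra field.
after = related.contribute_to_class("late")
```

Anything built before the setattr call keeps the old shape, which is the reason for patching in the project's __init__.py, before any models are loaded.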

A few other things worth mentioning

  1. Remember that this does not affect m2m tables that have already been
    created in the db, so if you are introducing this solution in a project
    that already had ManyToManyField fields migrated to the db, you will need
    to prepare a custom migration that adds your custom columns to the tables
    created before the monkeypatch. A sample migration is provided below 🙂

    from django.db import migrations, models
    
    def auto_created_m2m_fields(_models):
        """ Retrieves M2M fields from provided models but only those that have auto
            created intermediary models (not user-defined through models).
        """
        for model in _models:
            for field in model._meta.get_fields():
                if (
                        isinstance(field, models.ManyToManyField)
                        and field.remote_field.through._meta.auto_created
                ):
                    yield field
    
    def add_created_to_m2m_tables(apps, schema_editor):
        # Exclude proxy models that don't have separate tables in db
        selected_models = [
            model for model in apps.get_models()
            if not model._meta.proxy
        ]
    
        # Select only m2m fields that have auto created intermediary models and then
        # retrieve m2m intermediary db tables
        tables = [
            field.remote_field.through._meta.db_table
            for field in auto_created_m2m_fields(selected_models)
        ]
    
        # NOTE: the SQL below uses PostgreSQL syntax; adapt it for other databases
        for table_name in tables:
            schema_editor.execute(
                f'ALTER TABLE {table_name} ADD COLUMN IF NOT EXISTS created '
                'timestamp with time zone NOT NULL DEFAULT now()',
            )
    
    
    class Migration(migrations.Migration):
        dependencies = []
        operations = [migrations.RunPython(add_created_to_m2m_tables)]
    
  2. Remember that the solution presented only affects the tables that Django
    creates automatically for ManyToManyField fields that do not define the
    through model. If you already have some explicit m2m through models, you
    will need to add your custom columns there manually.

  3. The patched create_many_to_many_intermediary_model function will also
    apply to the models of all 3rd-party apps listed in your INSTALLED_APPS
    setting.

  4. Last but not least, remember that if you upgrade your Django version, the
    original source code of the patched function may change (!). It’s a good
    idea to set up a simple unit test that will warn you if this happens in
    the future.

To do so, modify the patching function to save the original Django function:

# For example in: <project_root>/lib/monkeypatching/patches.py
def django_m2m_intermediary_model_monkeypatch():
    """ We monkey patch function responsible for creation of intermediary m2m
        models in order to inject a "created" timestamp into them.
    """
    from django.db.models.fields import related
    from lib.monkeypatching.custom_create_m2m_model import create_many_to_many_intermediary_model
    # Save the original Django function for test
    original_function = related.create_many_to_many_intermediary_model
    setattr(
        create_many_to_many_intermediary_model,
        '_original_django_function',
        original_function
    )
    # Patch django function with our version of this function
    setattr(
        related,
        'create_many_to_many_intermediary_model',
        create_many_to_many_intermediary_model
    )

Compute the hash of the source code of the original Django function and prepare
a test that checks whether it is still the same as when you patched it:

def _hash_source_code(_obj):
    from inspect import getsourcelines
    from hashlib import md5
    source_code = "".join(getsourcelines(_obj)[0])
    return md5(source_code.encode()).hexdigest()

def test_original_create_many_to_many_intermediary_model():
    """ This test checks whether the original Django function that has been
        patched has not changed. The hash of the function source code is
        compared, and if it does not match the original hash, the Django version
        may have been upgraded and the patched function may have changed.
    """
    from django.db.models.fields.related import create_many_to_many_intermediary_model
    original_function_md5_hash = "69d8cea3ce9640f64ce7b1df1c0934b8"  # hash obtained before patching (Django 2.0.3)
    original_function = getattr(
        create_many_to_many_intermediary_model,
        '_original_django_function',
        None
    )
    assert original_function
    assert _hash_source_code(original_function) == original_function_md5_hash

Cheers

I hope someone will find this answer useful 🙂