Migrations¶

As development goes on, the application's data schema evolves. As this is NoSQL, there is no notion of “column”, hence no way to update a whole table at once. In a sense, this is a good thing: migrations can be done lazily, with no need to lock the database for hours.

The migration module aims to provide simple tools for the most common migration scenarios.

Migration concepts¶

Migrations involve 2 steps:

detecting the current version

performing operations, if need be

Version detection is always performed as long as a Migration class is associated with the DynamoDBModel, to make sure the object is up to date.

The version is detected by running check_N successively on the raw boto data, where N is an integer revision number. Revision numbers do not need to be consecutive and are tried in decreasing numeric order. This means that N=11 is considered bigger than N=2.

If check_N returns True, the detected version is N.

If check_N returns False, detection goes on with the next lower version.

If no check_N succeeds, VersionError is raised.
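The detection rules above can be sketched in a few lines. This is only an illustration of the described behavior, not the library's actual code: `detect_version`, `DemoMigration` and the local `VersionError` stand-in are hypothetical names defined here so that the snippet runs on its own.

```python
import re


class VersionError(Exception):
    """Local stand-in so the sketch is self-contained."""


def detect_version(migrator, raw_data):
    # Collect the revision number N of every check_N method.
    revisions = []
    for name in dir(migrator):
        match = re.match(r"^check_(\d+)$", name)
        if match:
            revisions.append(int(match.group(1)))
    # Integer sort, highest first: check_11 is tried before check_2.
    for revision in sorted(revisions, reverse=True):
        if getattr(migrator, "check_%d" % revision)(raw_data):
            return revision
    raise VersionError("no check_N matched the raw data")


class DemoMigration(object):
    # Revisions are deliberately non-consecutive.
    def check_2(self, raw_data):
        return u"mail" in raw_data

    def check_11(self, raw_data):
        return u"email" in raw_data
```

Sorting the revisions as integers rather than as strings is what makes 11 rank above 2.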

The migration itself is performed by successively running migrate_to_N on the raw boto data, which enables incremental migrations. The first migrator run is the lowest one with N > current_version. Revision numbers N need not be consecutive, nor do they need a matching check_N.

If your lowest possible version is n, you need a check_n but no migrate_to_n, as there is no lower version to migrate from. Conversely, the latest revision needs both a migrator and a version checker. The migrator is needed to update older objects, while the version checker ensures the Item is at the latest revision: if it returns True, no migration is performed.

At the end of the process, the version is assumed to be the latest. No additional check will be performed. The migrated object needs to be saved manually.
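As a rough sketch of the incremental step, assuming the migrators are simply looked up by name and applied in increasing order of N, the loop could look like the following. `run_migrations` and `DemoMigration` are hypothetical helpers written for this illustration; the library's internals may differ.

```python
import re


def run_migrations(migrator, raw_data, current_version):
    # Keep only the migrate_to_N methods with N above the detected version.
    pending = []
    for name in dir(migrator):
        match = re.match(r"^migrate_to_(\d+)$", name)
        if match and int(match.group(1)) > current_version:
            pending.append(int(match.group(1)))
    # Apply them from the lowest N to the highest, feeding each result
    # into the next migrator.
    for revision in sorted(pending):
        raw_data = getattr(migrator, "migrate_to_%d" % revision)(raw_data)
    return raw_data


class DemoMigration(object):
    def migrate_to_2(self, raw_data):
        raw_data[u"email"] = raw_data.pop(u"mail")
        return raw_data

    def migrate_to_5(self, raw_data):
        raw_data[u"lang"] = u"en"
        return raw_data
```

An item detected at version 1 goes through both migrators, while an item already at version 2 only goes through `migrate_to_5`.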

When will the migration be useful?¶

Non null field is added

detection: no field in raw_data
migration: add the field in raw_data
Note: this is of no use if empty values are allowed, as boto makes no distinction between empty and non-existent values.
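A minimal sketch of this scenario, assuming a hypothetical `nickname` field and default value. The class is written as a plain object so that it runs on its own; in a real project it would subclass `dynamodb_mapper.migration.Migration`.

```python
class AddNicknameMigration(object):
    def check_1(self, raw_data):
        # Revision 1: the field does not exist yet.
        return u"nickname" not in raw_data

    def check_2(self, raw_data):
        # Revision 2: the field is present.
        return u"nickname" in raw_data

    def migrate_to_2(self, raw_data):
        # Fill the new field with a non-empty default (assumed value).
        raw_data[u"nickname"] = u"anonymous"
        return raw_data
```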

Renamed field

detection: old field name in raw_data
migration: insert a new field with the old value and del the old field in raw_data.

Deleted field

detection: the old field still exists in raw_data
migration: del the old field from raw_data
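For illustration, a deleted field could be handled as below. The field name `legacy_score` is made up for this sketch, and the class is a plain object standing in for a `Migration` subclass.

```python
class DropLegacyScoreMigration(object):
    def check_1(self, raw_data):
        # Revision 1: the obsolete field is still stored.
        return u"legacy_score" in raw_data

    def check_2(self, raw_data):
        # Revision 2: the obsolete field is gone.
        return u"legacy_score" not in raw_data

    def migrate_to_2(self, raw_data):
        # pop() with a default does not fail if the field is already absent.
        raw_data.pop(u"legacy_score", None)
        return raw_data
```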

Type change

detection: converting the raw data field to the expected type fails
migration: perform the type conversion manually and serialize the value back before returning the data
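A possible shape for a type change, assuming `energy` was once stored as text and is now expected to be an integer. Here a simple `isinstance` test stands in for the "conversion fails" detection; the class is a self-contained sketch rather than a real `Migration` subclass.

```python
text_type = type(u"")  # unicode on Python 2, str on Python 3


class EnergyTypeMigration(object):
    def check_1(self, raw_data):
        # Revision 1: energy was stored as text.
        return isinstance(raw_data.get(u"energy"), text_type)

    def check_2(self, raw_data):
        # Revision 2: energy is stored as an integer.
        return isinstance(raw_data.get(u"energy"), int)

    def migrate_to_2(self, raw_data):
        # Convert manually and store the converted value back.
        raw_data[u"energy"] = int(raw_data[u"energy"])
        return raw_data
```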

When will it be of no use?¶

Table rename: You need to manually fall back to the old table.
Field migration between tables: You still need some high level magic.

For complex use cases, you may consider freezing your application and running an EMR job on it.

Use case: Rename field ‘mail’ to ‘email’¶

Migration engine¶

from dynamodb_mapper.migration import Migration

class UserMigration(Migration):
    # Is it at least compatible with first revision ?
    def check_1(self, raw_data):
        field_count = 0
        field_count += u"id" in raw_data and isinstance(raw_data[u"id"], unicode)
        field_count += u"energy" in raw_data and isinstance(raw_data[u"energy"], int)
        field_count += u"mail" in raw_data and isinstance(raw_data[u"mail"], unicode)

        return field_count == len(raw_data)

    # No migrator to version 1: it cannot be older than version 1!

    # Is the object Up to date ?
    def check_2(self, raw_data):
        field_count = 0
        field_count += u"id" in raw_data and isinstance(raw_data[u"id"], unicode)
        field_count += u"energy" in raw_data and isinstance(raw_data[u"energy"], int)
        field_count += u"email" in raw_data and isinstance(raw_data[u"email"], unicode)

        return field_count == len(raw_data)

    # migrate from previous revision (1) to this one (the latest)
    def migrate_to_2(self, raw_data):
        raw_data[u"email"] = raw_data[u"mail"]
        del raw_data[u"mail"]
        return raw_data

Enable migrations in model¶

from dynamodb_mapper.model import DynamoDBModel

class User(DynamoDBModel):
    __table__ = "user"
    __hash_key__ = "id"
    __migrator__ = UserMigration # Single line to add !
    __schema__ = {
        "id": unicode,
        "energy": int,
        "email": unicode
    }

Example run¶

Let’s say you have an object at revision 1 in the db. It will look like this:

raw_data_version_1 = {
    u"id": u"Jackson",
    u"energy": 6742348,
    u"mail": u"jackson@tldr-ludia.com",
}

Now, migrate it:

>>> jackson = User.get(u"Jackson")
# Done, jackson is migrated, but let's check it
>>> print jackson.email
jackson@tldr-ludia.com
>>> jackson.save(raise_on_conflict=True)
# Should go fine if no concurrent access

`raise_on_conflict` integration¶

Internally, raise_on_conflict relies on the raw data dict from boto to detect conflicts. This dict is stored in the model instance before the migration engine is triggered, so the raise_on_conflict feature keeps working as expected.

This behavior guarantees that Transactions work as expected even when dealing with migrated objects.
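The idea can be pictured with a toy model in which a plain dict plays the role of the table. Everything here (`load`, `save`, `ConflictError`, `rename_mail`) is hypothetical and only mimics the principle: the snapshot used for conflict detection is taken before the migration runs, so it still matches what the database holds.

```python
import copy


class ConflictError(Exception):
    """Hypothetical error for this sketch; not the library's exception."""


def load(stored_row, migrate):
    # Snapshot of what the database holds, taken before migrating.
    snapshot = copy.deepcopy(stored_row)
    migrated = migrate(copy.deepcopy(stored_row))
    return snapshot, migrated


def save(table, key, snapshot, migrated):
    # Write only if the stored row still equals the pre-migration snapshot.
    if table.get(key) != snapshot:
        raise ConflictError("row changed since it was read")
    table[key] = migrated


def rename_mail(raw_data):
    raw_data[u"email"] = raw_data.pop(u"mail")
    return raw_data
```

If the comparison used the migrated data instead, the first save after a migration would always look like a conflict, since the stored row still has the old shape.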
