.. _migrations:

##########
Migrations
##########

As development goes on, the data schema of an application evolves. Since this
is NoSQL, there is no notion of "column", hence no way to update a whole table
at once. In a sense, this is a good thing: migrations may be done lazily, with
no need to lock the database for hours. The migration module aims to provide
simple tools for the most common migration scenarios.

Migration concepts
==================

Migrations involve 2 steps:

1. detecting the current version
2. if need be, performing operations

Version detection is **always** performed as long as a ``Migration`` class is
associated with the ``DynamoDbModel``, to make sure the object is up to date.
The version is detected by running ``check_N`` successively on the raw boto
data, where ``N`` is a revision integer. Revision numbers do not need to be
consecutive and are sorted in natural decreasing order. This means that
``N=11`` is considered bigger than ``N=2``.

- If ``check_N`` returns ``True``, the detected version will be ``N``.
- If ``check_N`` returns ``False``, detection goes on with the next lower
  version.
- If no ``check_N`` succeeds, :py:class:`~.VersionError` is raised.

The migration itself is performed by successively running ``migrate_to_N`` on
the raw boto data. This enables you to run incremental migrations. The first
migrator run has ``N > current_version``. Revision numbers ``N`` need not be
consecutive nor have ``check_N`` equivalents.

If your lowest possible version is ``n``, you need to have a ``check_n`` but
no ``migrate_to_n``, as there is no lower version to migrate from. On the
contrary, the latest revision needs both a migrator and a version checker:
the migrator is needed to update older objects, while the version checker
ensures the Item is at the latest revision. If it returns ``True``, no
migration is performed.

At the end of the process, the version is assumed to be the latest. No
additional check is performed.
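The detection and migration loop described above can be sketched as follows.
This is a simplified illustration, not the actual dynamodb_mapper
implementation; the ``MigrationSketch`` class and the ``detect_version`` and
``migrate`` helper names are invented for the example, but the ``check_N`` /
``migrate_to_N`` naming convention is the one described in this document::

    import re

    class VersionError(Exception):
        """Raised when no check_N accepts the raw data."""

    class MigrationSketch(object):
        def _revisions(self):
            # Collect N from every check_N method and sort the integers
            # in natural decreasing order, so N=11 beats N=2.
            revs = []
            for name in dir(self):
                match = re.match(r"^check_(\d+)$", name)
                if match:
                    revs.append(int(match.group(1)))
            return sorted(revs, reverse=True)

        def detect_version(self, raw_data):
            # Try each checker, highest revision first; the first one
            # that returns True gives the detected version.
            for n in self._revisions():
                if getattr(self, "check_%d" % n)(raw_data):
                    return n
            raise VersionError("No known revision matches this item")

        def migrate(self, raw_data):
            current = self.detect_version(raw_data)
            # Run every migrate_to_N with N > current, lowest first,
            # so migrations are applied incrementally.
            for n in sorted(self._revisions()):
                if n > current and hasattr(self, "migrate_to_%d" % n):
                    raw_data = getattr(self, "migrate_to_%d" % n)(raw_data)
            return raw_data

Note that, as stated above, a revision may have a ``migrate_to_N`` without a
``check_N``; this sketch only iterates checker-backed revisions for brevity.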
The migrated object needs to be saved manually.

When will the migration be useful?
----------------------------------

Non-null field added
    - **detection**: field missing from raw_data
    - **migration**: add the field to raw_data
    - Note: this is of no use if empty values are allowed, as there is no
      distinction between empty and non-existing values in boto

Renamed field
    - **detection**: old field name present in raw_data
    - **migration**: insert a new field with the old value and ``del`` the
      old field from raw_data

Deleted field
    - **detection**: old field still present in raw_data
    - **migration**: ``del`` the old field from raw_data

Type change
    - **detection**: converting the raw data field to the expected type fails
    - **migration**: perform the type conversion manually and serialize it
      back *before* returning the other data

When will it be of no use?
--------------------------

Table rename
    You need to manually fall back to the old table.

Field migration between tables
    You still need some high-level magic.

For complex use cases, you may consider freezing your application and running
an EMR job on it.

Use case: Rename field 'mail' to 'email'
========================================

Migration engine
----------------

::

    from dynamodb_mapper.migration import Migration

    class UserMigration(Migration):
        # Is it at least compatible with the first revision?
        def check_1(self, raw_data):
            field_count = 0
            field_count += u"id" in raw_data and isinstance(raw_data[u"id"], unicode)
            field_count += u"energy" in raw_data and isinstance(raw_data[u"energy"], int)
            field_count += u"mail" in raw_data and isinstance(raw_data[u"mail"], unicode)
            return field_count == len(raw_data)
        # No migrator to version 1: it can not be older than version 1!

        # Is the object up to date?
        def check_2(self, raw_data):
            field_count = 0
            field_count += u"id" in raw_data and isinstance(raw_data[u"id"], unicode)
            field_count += u"energy" in raw_data and isinstance(raw_data[u"energy"], int)
            field_count += u"email" in raw_data and isinstance(raw_data[u"email"], unicode)
            return field_count == len(raw_data)

        # Migrate from the previous revision (1) to this one (the latest).
        def migrate_to_2(self, raw_data):
            raw_data[u"email"] = raw_data[u"mail"]
            del raw_data[u"mail"]
            return raw_data

Enable migrations in model
--------------------------

::

    from dynamodb_mapper.model import DynamoDBModel

    class User(DynamoDBModel):
        __table__ = "user"
        __hash_key__ = "id"
        __migrator__ = UserMigration  # Single line to add!
        __schema__ = {
            "id": unicode,
            "energy": int,
            "email": unicode,
        }

Example run
-----------

Let's say you have an object at revision 1 in the database. It will look like
this:

::

    raw_data_version_1 = {
        u"id": u"Jackson",
        u"energy": 6742348,
        u"mail": u"jackson@tldr-ludia.com",
    }

Now, migrate it:

>>> jackson = User.get(u"Jackson")
>>> # Done, jackson is migrated, but let's check it
>>> print jackson.email
u"jackson@tldr-ludia.com"
>>> jackson.save(raise_on_conflict=True)  # Should go fine if no concurrent access

``raise_on_conflict`` integration
=================================

Internally, ``raise_on_conflict`` relies on the raw data dict from boto to
perform conflict detection. This dict is stored in the model instance *before*
the migration engine is triggered, so the ``raise_on_conflict`` feature keeps
working as expected. This behavior guarantees that :ref:`transactions` work as
expected even when dealing with migrated objects.

Related exceptions
==================

VersionError
------------

.. autoclass:: dynamodb_mapper.migration.VersionError
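For completeness, the rename migration in this use case can also be exercised
by hand, outside the mapper and without a DynamoDB connection. The sketch
below is standalone: ``migrate_to_2`` is a plain-function copy of the method
from ``UserMigration`` above, and the ``u""`` prefixes are dropped for
brevity::

    raw_data_version_1 = {
        "id": "Jackson",
        "energy": 6742348,
        "mail": "jackson@tldr-ludia.com",
    }

    def migrate_to_2(raw_data):
        # Same body as UserMigration.migrate_to_2: copy the value to
        # the new field name, then drop the old field.
        raw_data["email"] = raw_data["mail"]
        del raw_data["mail"]
        return raw_data

    # Work on a copy so the revision-1 dict stays intact.
    migrated = migrate_to_2(dict(raw_data_version_1))
    # migrated now has an "email" key and no "mail" key.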