Dumping and Reloading CMF Content
tl;dr
A proposal for an architecture supporting dump and reload of content in a CMF site.
Overview
When moving content between systems, or when migrating the CMS between software versions, being able to dump the content to a "useful" filesystem representation can be amazingly freeing. This proposal outlines a pluggable architecture for enabling such operations, leveraging the mechanisms offered by the CMFSetup framework.
Lossy vs. Lossless Dump
Some use cases need to preserve every bit of each content object; such dumps are isomorphic with the ZODB export formats, with every attribute and subobject included, often in formats which are not "useful" (because they are opaque). Even the XML version of the ZODB export format is not very useful (although a sufficiently enterprising team of XSLT hackers might, in time, get something useful out of the format).
Many other applications, however, do not require lossless export; some may even have reasons to want to abandon some of the bits (e.g., to clean out cruft).
The architecture proposed in this document does not mandate either lossy or lossless export / import, treating that choice as a policy left to the plugins which do the actual work.
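As a rough illustration of what leaving that policy to the plugins might look like, the sketch below serializes the same object twice. The Document class and both dump functions are hypothetical stand-ins invented for this example; they are not part of CMF or CMFSetup.

```python
import json


class Document:
    """Hypothetical stand-in for a content object (not a CMF class)."""

    def __init__(self, title, text, **extra):
        self.title = title
        self.text = text
        # Any additional attributes (caches, bookkeeping, cruft) land here.
        self.__dict__.update(extra)


def dump_lossless(obj):
    """Keep every attribute, at the cost of a less readable format."""
    return json.dumps(vars(obj), sort_keys=True)


def dump_lossy(obj):
    """Keep only the fields an editor cares about; drop everything else."""
    return "Title: %s\n\n%s" % (obj.title, obj.text)


doc = Document("Welcome", "Hello, world.", _cached_preview="<p>Hello</p>")
lossless = dump_lossless(doc)
lossy = dump_lossy(doc)
```

The framework would call whichever plugin is registered for the object; whether the cached preview survives is decided entirely inside the plugin.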
Comparison with Zope3's fssync
Zope3 offers a somewhat similar framework for serialization of content to the filesystem. The focus of fssync differs from this proposal: the primary use case for fssync is to facilitate checking content out of the Zope3 application, modifying it on the filesystem with traditional tools, and then checking it back into Zope. In order to support this use case, fssync chooses to create a "lossless" representation, as follows:
- "File-like" objects serialize as a (single) file. Attributes which do not fit into this representation are written into an annotations directory, located under a special directory in its parent.
- "Folderish" objects map to a directory, with the special annotations subdirectory used by both the folder and its child items.
CMFSetup Architecture
CMFSetup offers a couple of useful abstractions for this task:
- ImportContext objects abstract the reading of profile files, allowing the export to be in any of several formats (directory on a filesystem, created-on-the-fly tarball, even a separate folder tree in the same ZODB!)
- ExportContext objects likewise abstract the writing of profile files

We could include some or all of a site's initial content in a site profile, assuming that we register our handlers as part of the MetaProfile for that configuration.
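To make the value of that abstraction concrete, here is a minimal sketch of two export contexts sharing one method, so that the same export code can target a directory or a tarball. Both classes are simplified stand-ins written for this example; the writeDataFile signature is assumed to resemble the one on the framework's export contexts, and the file names are invented.

```python
import io
import os
import tarfile
import tempfile


class SketchDirectoryContext:
    """Stand-in context that writes files under a root directory."""

    def __init__(self, root):
        self.root = root

    def writeDataFile(self, filename, text, content_type, subdir=None):
        target = self.root if subdir is None else os.path.join(self.root, subdir)
        os.makedirs(target, exist_ok=True)
        with open(os.path.join(target, filename), "w", encoding="utf-8") as f:
            f.write(text)


class SketchTarballContext:
    """Stand-in context that builds a gzipped tarball in memory."""

    def __init__(self):
        self._buffer = io.BytesIO()
        self._archive = tarfile.open(fileobj=self._buffer, mode="w:gz")

    def writeDataFile(self, filename, text, content_type, subdir=None):
        name = filename if subdir is None else "/".join((subdir, filename))
        data = text.encode("utf-8")
        info = tarfile.TarInfo(name)
        info.size = len(data)
        self._archive.addfile(info, io.BytesIO(data))

    def getArchive(self):
        # Closing the archive flushes the trailing blocks into the buffer.
        self._archive.close()
        return self._buffer.getvalue()


def export_front_page(context):
    """Export code that neither knows nor cares where the bytes end up."""
    context.writeDataFile("front-page.txt", "Hello", "text/plain",
                          subdir="structure")


with tempfile.TemporaryDirectory() as root:
    export_front_page(SketchDirectoryContext(root))
    on_disk = os.path.exists(os.path.join(root, "structure", "front-page.txt"))

tar_context = SketchTarballContext()
export_front_page(tar_context)
archive_bytes = tar_context.getArchive()
```

Content handlers written against the context API would work unchanged with either target.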
Framework Interfaces
The adapters which must be supplied in order for content objects to play in this framework implement the following interfaces:
    from zope.interface import Interface


    class IFilesystemExporter(Interface):
        """ Plugin interface for site structure export.
        """
        def export(export_context, subdir):
            """ Export our 'context' using the API of 'export_context'.

            o 'export_context' must implement
              Products.GenericSetup.interfaces.IExportContext.

            o 'subdir', if passed, is the relative subdirectory containing our
              context within the site.
            """

        def listExportableItems():
            """ Return a sequence of the child items to be exported.

            o Each item in the returned sequence will implement
              IFilesystemExporter.
            """


    class IFilesystemImporter(Interface):
        """ Plugin interface for site structure import.
        """
        def import_(import_context, subdir):
            """ Import our 'context' using the API of 'import_context'.

            o 'import_context' must implement
              Products.GenericSetup.interfaces.IImportContext.

            o 'subdir', if passed, is the relative subdirectory containing our
              context within the site.
            """
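One way adapters for these interfaces could fit together is sketched below as a dump-and-reload round trip. Everything here is an assumption made for illustration: the in-memory context, the Folder and Document classes, the adapt() lookup, and the ".objects" listing file are hypothetical, and plain classes are used instead of registered zope.interface adapters so the example stays self-contained.

```python
class MemoryContext:
    """Hypothetical context storing files in a dict keyed by relative path."""

    def __init__(self):
        self.files = {}

    def _path(self, filename, subdir):
        return filename if not subdir else subdir + "/" + filename

    def writeDataFile(self, filename, text, content_type, subdir=None):
        self.files[self._path(filename, subdir)] = text

    def readDataFile(self, filename, subdir=None):
        return self.files.get(self._path(filename, subdir))


class Document:
    def __init__(self, id, text=""):
        self.id, self.text = id, text


class Folder:
    def __init__(self, id):
        self.id = id
        self.children = {}


class DocumentAdapter:
    """Plays both the exporter and importer roles for a Document."""

    def __init__(self, context):
        self.context = context

    def listExportableItems(self):
        return ()

    def export(self, export_context, subdir):
        export_context.writeDataFile(self.context.id, self.context.text,
                                     "text/plain", subdir)

    def import_(self, import_context, subdir):
        self.context.text = import_context.readDataFile(self.context.id, subdir)


class FolderAdapter:
    """Maps a Folder to a directory holding a listing plus its children."""

    def __init__(self, context):
        self.context = context

    def _own_dir(self, subdir):
        return self.context.id if not subdir else subdir + "/" + self.context.id

    def listExportableItems(self):
        return [adapt(child) for child in self.context.children.values()]

    def export(self, export_context, subdir):
        own = self._own_dir(subdir)
        lines = ["%s,%s" % (child.id, type(child).__name__)
                 for child in self.context.children.values()]
        export_context.writeDataFile(".objects", "\n".join(lines),
                                     "text/plain", own)
        for item in self.listExportableItems():
            item.export(export_context, own)

    def import_(self, import_context, subdir):
        own = self._own_dir(subdir)
        listing = import_context.readDataFile(".objects", own) or ""
        for line in listing.splitlines():
            child_id, kind = line.split(",")
            child = FACTORIES[kind](child_id)
            self.context.children[child_id] = child
            adapt(child).import_(import_context, own)


FACTORIES = {"Document": Document, "Folder": Folder}


def adapt(obj):
    """Stand-in for an adapter lookup."""
    return FolderAdapter(obj) if isinstance(obj, Folder) else DocumentAdapter(obj)


# Dump a small tree, then reload it into an empty folder.
site = Folder("site")
site.children["index"] = Document("index", "Welcome")
site.children["news"] = Folder("news")
site.children["news"].children["item1"] = Document("item1", "First post")

context = MemoryContext()
adapt(site).export(context, None)

restored = Folder("site")
adapt(restored).import_(context, None)
```

The listing file is what lets the importer recreate children it has never seen; a real plugin would also need a policy for ids that collide with such bookkeeping files.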
Issues
- Should this proposal be tied to the split of GenericSetup out from CMFSetup? Consensus seems to favor this, which is good news, as we won't need to modify imports in the existing framework code.
- Could we make use of the Zope3 fssync framework? If it lacks features we need, could we extend it for our purposes, and submit the changes back to Zope3?