fastimport - Import/export of history via fast-import format

Version 0.10.0dev
Branch lp:bzr-fastimport
Home page https://launchpad.net/bzr-fastimport
Owner jelmer
GNU/Linux Yes
Windows Yes
Mac OS X Yes

FastImport Plugin

The fastimport plugin provides stream-based importing and exporting of data into and out of Bazaar. As well as enabling interchange between multiple VCS tools, fast-import/fast-export can be useful for complex branch operations, e.g. partitioning off part of a code base in order to open-source it.

The normal import recipe is:

bzr fast-export-from-xxx SOURCE project.fi
bzr fast-import project.fi project.bzr

If fast-export-from-xxx doesn’t exist yet for the tool you’re importing from, the alternative recipe is:

front-end > project.fi
bzr fast-import project.fi project.bzr

In either case, if you wish to save disk space, project.fi can be compressed to gzip format after it is generated like this:

(generate project.fi)
gzip project.fi
bzr fast-import project.fi.gz project.bzr

The list of known front-ends and their status is documented on http://bazaar-vcs.org/BzrFastImport/FrontEnds. The fast-export-from-xxx commands provide simplified access to these so that the majority of users can generate a fast-import dump file without needing to study up on all the options - and the best combination of them to use - for the front-end relevant to them. In some cases, a fast-export-from-xxx wrapper will require that certain dependencies are installed so it checks for these before starting. A wrapper may also provide a limited set of options. See the online help for the individual commands for details:

bzr help fast-export-from-cvs
bzr help fast-export-from-darcs
bzr help fast-export-from-hg
bzr help fast-export-from-git
bzr help fast-export-from-mtn
bzr help fast-export-from-p4
bzr help fast-export-from-svn

Once a fast-import dump file is created, it can be imported into a Bazaar repository using the fast-import command. If required, you can manipulate the stream first using the fast-import-filter command. This is useful for creating a repository with just part of a project or for removing large old binaries (say) from history that are no longer valuable to retain. For further details on importing, manipulating and reporting on fast-import streams, see the online help for the commands:

bzr help fast-import
bzr help fast-import-filter
bzr help fast-import-info
bzr help fast-import-query

Finally, you may wish to generate a fast-import dump file from a Bazaar repository. The fast-export command is provided for that purpose.

To report bugs or publish enhancements, visit the bzr-fastimport project page on Launchpad, https://launchpad.net/bzr-fastimport.

fast-export

Purpose

Generate a fast-import stream from a Bazaar branch.

Usage

bzr fast-export SOURCE [DESTINATION]

Options

-v, --verbose Display more information.
--plain Exclude metadata to maximise interoperability.
--marks=FILE Import marks from and export marks to file.
-h, --help Show help message.
-q, --quiet Only display errors and warnings.
--import-marks=FILE
 Import marks from file.
--checkpoint=N Checkpoint every N revisions (default=10000).
-b FILE, --git-branch=FILE
 Name of the git branch to create (default=master).
--usage Show usage message and options.
--export-marks=FILE
 Export marks to file.
-r ARG, --revision=ARG
 See “help revisionspec” for details.

Description

This program generates a stream from a Bazaar branch in fast-import format used by tools such as bzr fast-import, git-fast-import and hg-fast-import.

If no destination is given or the destination is ‘-’, standard output is used. Otherwise, the destination is the name of a file. If the destination ends in ‘.gz’, the output will be compressed into gzip format.

Round-tripping

Recent versions of the fast-import specification support features that allow effective round-tripping of many Bazaar branches. As such, fast-exporting a branch and fast-importing the data produced will create a new repository with equivalent history, i.e. “bzr log -v -p --include-merges --forward” on the old branch and new branch should produce similar, if not identical, results.

Note

Be aware that the new repository may appear to have similar history but internally it is quite different with new revision-ids and file-ids assigned. As a consequence, the ability to easily merge with branches based on the old repository is lost. Depending on your reasons for producing a new repository, this may or may not be an issue.

Interoperability

fast-export can use the following “extended features” to produce a richer data stream:

  • multiple-authors - if a commit has multiple authors (as commonly occurs in pair-programming), all authors will be included in the output, not just the first author
  • commit-properties - custom metadata per commit that Bazaar stores in revision properties (e.g. branch-nick and bugs fixed by this change) will be included in the output.
  • empty-directories - directories, even the empty ones, will be included in the output.

To disable these features and produce output acceptable to git 1.6, use the --plain option. To enable these features, use --no-plain. Currently, --plain is the default but that will change in the near future once the feature names and definitions are formally agreed to by the broader fast-import developer community.

Examples

To produce data destined for import into Bazaar:

bzr fast-export --no-plain my-bzr-branch my.fi.gz

To produce data destined for Git 1.6:

bzr fast-export --plain my-bzr-branch my.fi

To import several unmerged but related branches into the same repository, use the --{export,import}-marks options, and specify a name for the git branch like this:

bzr fast-export --export-marks=marks.bzr project.dev |
       GIT_DIR=project/.git git-fast-import --export-marks=marks.git

bzr fast-export --import-marks=marks.bzr -b other project.other |
       GIT_DIR=project/.git git-fast-import --import-marks=marks.git

If you get a “Missing space after source” error from git-fast-import, see the top of the commands.py module for a work-around.

See also

fast-import, fast-import-filter

fast-export-from-cvs

Purpose

Generate a fast-import file from a CVS repository.

Usage

bzr fast-export-from-cvs SOURCE DESTINATION

Options

--sort=PATH GNU sort program location if not on the path.
--trunk-only Export just the trunk, ignoring tags and branches.
-v, --verbose Display more information.
--encoding=CODEC
 Encoding used for filenames, commit messages and author names if not ascii.
-q, --quiet Only display errors and warnings.
--usage Show usage message and options.
-h, --help Show help message.

Description

Destination is a dump file, typically named xxx.fi where xxx is the name of the project. If ‘-’ is given, standard output is used.

cvs2svn 2.3 or later must be installed as its cvs2bzr script is used under the covers to do the export.

The source must be the path on your filesystem to the part of the repository you wish to convert, i.e. either that path or a parent directory must contain a CVSROOT subdirectory. The path may point to either the top of a repository or to a path within it. In the latter case, only that project within the repository will be converted.

Note

Remote access to the repository is not sufficient - the path must point into a copy of the repository itself. See http://cvs2svn.tigris.org/faq.html#repoaccess for instructions on how to clone a remote CVS repository locally.

By default, the trunk, branches and tags are all exported. If you only want the trunk, use the --trunk-only option.

By default, filenames, log messages and author names are expected to be encoded in ascii. Use the --encoding option to specify an alternative. If multiple encodings are used, specify the option multiple times. For a list of valid encoding names, see http://docs.python.org/lib/standard-encodings.html.

Windows users need to install GNU sort and use the --sort option to specify its location. GNU sort can be downloaded from http://unxutils.sourceforge.net/.

See also

fast-import, fast-import-filter

fast-export-from-darcs

Purpose

Generate a fast-import file from a Darcs repository.

Usage

bzr fast-export-from-darcs SOURCE DESTINATION

Options

--usage Show usage message and options.
--encoding=CODEC
 Encoding used for commit messages if not utf-8.
-v, --verbose Display more information.
-q, --quiet Only display errors and warnings.
-h, --help Show help message.

Description

Destination is a dump file, typically named xxx.fi where xxx is the name of the project. If ‘-’ is given, standard output is used.

Darcs 2.2 or later must be installed as various subcommands are used to access the source repository. The source may be a network URL but using a local URL is recommended for performance reasons.

See also

fast-import, fast-import-filter

fast-export-from-git

Purpose

Generate a fast-import file from a Git repository.

Usage

bzr fast-export-from-git SOURCE DESTINATION

Options

--usage Show usage message and options.
-v, --verbose Display more information.
-q, --quiet Only display errors and warnings.
-h, --help Show help message.

Description

Destination is a dump file, typically named xxx.fi where xxx is the name of the project. If ‘-’ is given, standard output is used.

Git 1.6 or later must be installed as the git fast-export subcommand is used under the covers to generate the stream. The source must be a local directory.

Note

Earlier versions of Git may also work fine but are likely to receive less active support if problems arise.

See also

fast-import, fast-import-filter

fast-export-from-hg

Purpose

Generate a fast-import file from a Mercurial repository.

Usage

bzr fast-export-from-hg SOURCE DESTINATION

Options

--usage Show usage message and options.
-v, --verbose Display more information.
-q, --quiet Only display errors and warnings.
-h, --help Show help message.

Description

Destination is a dump file, typically named xxx.fi where xxx is the name of the project. If ‘-’ is given, standard output is used.

Mercurial 1.2 or later must be installed as its libraries are used to access the source repository. Given the APIs currently used, the source repository must be a local file, not a network URL.

See also

fast-import, fast-import-filter

fast-export-from-mtn

Purpose

Generate a fast-import file from a Monotone repository.

Usage

bzr fast-export-from-mtn SOURCE DESTINATION

Options

--usage Show usage message and options.
-v, --verbose Display more information.
-q, --quiet Only display errors and warnings.
-h, --help Show help message.

Description

Destination is a dump file, typically named xxx.fi where xxx is the name of the project. If ‘-’ is given, standard output is used.

Monotone 0.43 or later must be installed as the mtn git_export subcommand is used under the covers to generate the stream. The source must be a local directory.

See also

fast-import, fast-import-filter

fast-export-from-p4

Purpose

Generate a fast-import file from a Perforce repository.

Usage

bzr fast-export-from-p4 SOURCE DESTINATION

Options

--usage Show usage message and options.
-v, --verbose Display more information.
-q, --quiet Only display errors and warnings.
-h, --help Show help message.

Description

Source is a Perforce depot path, e.g., //depot/project

Destination is a dump file, typically named xxx.fi where xxx is the name of the project. If ‘-’ is given, standard output is used.

bzrp4 must be installed as its p4_fast_export.py module is used under the covers to do the export. bzrp4 can be downloaded from https://launchpad.net/bzrp4/.

The P4PORT environment variable must be set, and you must be logged into the Perforce server.

By default, only the HEAD changelist is exported. To export all changelists, append ‘@all’ to the source. To export a revision range, append a comma-delimited pair of changelist numbers to the source, e.g., ‘100,200’.

See also

fast-import, fast-import-filter

fast-export-from-svn

Purpose

Generate a fast-import file from a Subversion repository.

Usage

bzr fast-export-from-svn SOURCE DESTINATION

Options

--tags-path=STR
 Path in repo to /tags.
-v, --verbose Display more information.
-q, --quiet Only display errors and warnings.
--trunk-path=STR
 Path in repo to /trunk. May be regex:/cvs/(trunk)/proj1/(.*) in which case the first group is used as the branch name and the second group is used to match files.
--usage Show usage message and options.
--branches-path=STR
 Path in repo to /branches.
-h, --help Show help message.
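
How the regex form of --trunk-path splits a repository path can be illustrated with Python's re module. This is only a sketch: it assumes the pattern is matched against the full path from the repository root, and the sample paths and the function name split_trunk_path are hypothetical. The exporter itself may apply the pattern differently.

```python
import re

# Pattern taken from the --trunk-path help text, without the "regex:" prefix.
TRUNK_PATTERN = re.compile(r"/cvs/(trunk)/proj1/(.*)")


def split_trunk_path(path):
    """Return (branch_name, file_path) if path matches the pattern, else None.

    Hypothetical helper: group 1 is used as the branch name and
    group 2 is the part of the path that identifies the file.
    """
    match = TRUNK_PATTERN.match(path)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    # Hypothetical repository paths, for illustration only.
    print(split_trunk_path("/cvs/trunk/proj1/src/main.c"))
    print(split_trunk_path("/cvs/branches/proj1/src/main.c"))
```

With the sample pattern, a path under /cvs/trunk/proj1/ yields the branch name trunk plus the remaining file path, while paths elsewhere in the repository do not match.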

Description

Destination is a dump file, typically named xxx.fi where xxx is the name of the project. If ‘-’ is given, standard output is used.

Python-Subversion (Python bindings to the Subversion APIs) 1.4 or later must be installed as this library is used to access the source repository. The source may be a network URL but using a local URL is recommended for performance reasons.

See also

fast-import, fast-import-filter

fast-import

Purpose

Backend for fast Bazaar data importers.

Usage

bzr fast-import SOURCE [DESTINATION]

Options

--info=ARG Path to file containing caching hints.
--count=ARG Import this many revisions then exit.
-v, --verbose Display more information.
--import-marks=ARG
 Import marks from file.
--format=ARG Specify a format for the created repository. See “bzr help formats” for details.
-q, --quiet Only display errors and warnings.
--trees Update all working trees, not just trunk’s.
--checkpoint=ARG
 Checkpoint automatically every N revisions. The default is 10000.
--user-map=ARG Path to file containing a map of user-ids.
--autopack=ARG Pack every N checkpoints. The default is 4.
--usage Show usage message and options.
--inv-cache=ARG
 Number of inventories to cache.
--export-marks=ARG
 Export marks to file.
-h, --help Show help message.
Import Algorithm:
--classic Use the original algorithm (mutable inventories).
--default Use the preferred algorithm (inventory deltas).
--experimental Enable experimental features.

Description

This command reads a mixed command/data stream and creates branches in a Bazaar repository accordingly. The preferred recipe is:

bzr fast-import project.fi project.bzr

Numerous commands are provided for generating a fast-import file to use as input. These are named fast-export-from-xxx where xxx is one of cvs, darcs, git, hg, mtn, p4 or svn. To specify standard input as the input stream, use a source name of ‘-’ (instead of project.fi). If the source name ends in ‘.gz’, it is assumed to be compressed in gzip format.
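
The source-name rules above can be sketched in Python as follows. This is an illustration of the documented behaviour, not the plugin's actual code; the function name open_source is hypothetical.

```python
import gzip
import sys


def open_source(source):
    """Return a binary file object for a fast-import source name.

    Hypothetical helper mirroring the documented rules:
    '-' means standard input, a name ending in '.gz' is read as
    gzip-compressed data, and anything else is read as a plain file.
    """
    if source == "-":
        return sys.stdin.buffer
    if source.endswith(".gz"):
        return gzip.open(source, "rb")
    return open(source, "rb")
```

Opening the stream in binary mode is a deliberate choice in this sketch: fast-import streams carry raw file contents, so decoding them as text could corrupt blob data.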

project.bzr will be created if it doesn’t exist. If it exists already, it should be empty or be an existing Bazaar repository or branch. If not specified, the current directory is assumed.

fast-import will intelligently select the format to use when creating a repository or branch. If you are running Bazaar 1.17 up to Bazaar 2.0, the default format for Bazaar 2.x (“2a”) is used. Otherwise, the current default format (“pack-0.92” for Bazaar 1.x) is used. If you wish to specify a custom format, use the --format option.

Note

To maintain backwards compatibility, fast-import lets you create the target repository or standalone branch yourself. It is recommended though that you let fast-import create these for you instead.

Branch mapping rules

Git reference names are mapped to Bazaar branch names as follows:

  • refs/heads/foo is mapped to foo
  • refs/remotes/origin/foo is mapped to foo.remote
  • refs/tags/foo is mapped to foo.tag
  • */master is mapped to trunk, trunk.remote, etc.
  • */trunk is mapped to git-trunk, git-trunk.remote, etc.
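
The mapping rules listed above can be expressed as a small Python function. This is a sketch written from the list, not the plugin's implementation; the function name bazaar_branch_name is hypothetical, and the handling of references outside refs/heads, refs/remotes and refs/tags is an assumption.

```python
def bazaar_branch_name(ref):
    """Map a Git reference name to a Bazaar branch name per the listed rules."""
    if ref.startswith("refs/heads/"):
        name, suffix = ref[len("refs/heads/"):], ""
    elif ref.startswith("refs/remotes/"):
        # Drop the remote's own name (e.g. "origin") and keep the branch part.
        name = ref[len("refs/remotes/"):].split("/", 1)[-1]
        suffix = ".remote"
    elif ref.startswith("refs/tags/"):
        name, suffix = ref[len("refs/tags/"):], ".tag"
    else:
        # Assumption: any other reference falls back to its last component.
        name, suffix = ref.rsplit("/", 1)[-1], ""

    # master becomes trunk; a Git branch literally named trunk is renamed
    # so that it cannot collide with the mapped master branch.
    if name == "master":
        name = "trunk"
    elif name == "trunk":
        name = "git-trunk"
    return name + suffix


if __name__ == "__main__":
    for ref in ("refs/heads/foo", "refs/remotes/origin/master", "refs/tags/trunk"):
        print(ref, "->", bazaar_branch_name(ref))
```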

Branch creation rules

When a shared repository is created or found at the destination, branches are created inside it. In the simple case of a single branch (refs/heads/master) inside the input file, the branch is project.bzr/trunk.

When a standalone branch is found at the destination, the trunk is imported there and warnings are output about any other branches found in the input file.

When a branch in a shared repository is found at the destination, that branch is made the trunk and other branches, if any, are created in sister directories.

Working tree updates

The working tree is generated for the trunk branch. If multiple branches are created, a message is output on completion explaining how to create the working trees for other branches.

Custom exporters

The fast-export-from-xxx commands typically call more advanced xxx-fast-export scripts. You are welcome to use the advanced scripts if you prefer.

If you wish to write a custom exporter for your project, see http://bazaar-vcs.org/BzrFastImport for the detailed protocol specification. In many cases, exporters can be written quite quickly using whatever scripting/programming language you like.

User mapping

Some source repositories store just the user name while Bazaar prefers a full email address. You can adjust user-ids while importing by using the --user-map option. The argument is a text file with lines in the format:

old-id = new-id

Blank lines and lines beginning with # are ignored. If old-id has the special value ‘@’, then users without an email address will get one created by using the matching new-id as the domain, unless a more explicit address is given for them. For example, given the user-map of:

@ = example.com
bill = William Jones <bill@example.com>

then user-ids are mapped as follows:

maria => maria <maria@example.com>
bill => William Jones <bill@example.com>

Note

User mapping is supported by both the fast-import and fast-import-filter commands.
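
The user-map rules described in this section can be sketched in Python as follows. This is an illustration only, not the plugin's parser: the function names parse_user_map and map_user are hypothetical, lines without an ‘=’ are skipped as an assumption, and user-ids are assumed to be bare login names when no address is present.

```python
def parse_user_map(lines):
    """Parse 'old-id = new-id' lines into (explicit_map, default_domain).

    Blank lines and lines beginning with '#' are ignored. The special
    old-id '@' supplies the default email domain.
    """
    explicit = {}
    domain = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        old_id, new_id = (part.strip() for part in line.split("=", 1))
        if old_id == "@":
            domain = new_id
        else:
            explicit[old_id] = new_id
    return explicit, domain


def map_user(user_id, explicit, domain=None):
    """Return the mapped user-id; explicit entries win over the default domain."""
    if user_id in explicit:
        return explicit[user_id]
    if domain and "@" not in user_id:
        return "%s <%s@%s>" % (user_id, user_id, domain)
    return user_id


if __name__ == "__main__":
    user_map = ["@ = example.com", "bill = William Jones <bill@example.com>"]
    explicit, domain = parse_user_map(user_map)
    for user in ("maria", "bill"):
        print(user, "=>", map_user(user, explicit, domain))
```

Run against the two-line map from the example, the sketch reproduces the maria and bill mappings shown above.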

Blob tracking

As some exporters (like git-fast-export) reuse blob data across commits, fast-import makes two passes over the input file by default. In the first pass, it collects data about what blobs are used when, along with some other statistics (e.g. total number of commits). In the second pass, it generates the repository and branches.

Note

The initial pass isn’t done if the --info option is used to explicitly pass in information about the input stream. It also isn’t done if the source is standard input. In the latter case, memory consumption may be higher than otherwise because some blobs may be kept in memory longer than necessary.
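
The kind of information the first pass gathers can be approximated with a short Python scan of the stream. This is a simplified sketch, not the plugin's algorithm: the function name count_blob_uses is hypothetical, only the exact-byte-count form of the data command is handled, and only blobs referenced by mark in filemodify (M) commands are counted.

```python
import io
from collections import Counter


def count_blob_uses(stream):
    """Count how many filemodify commands reference each blob mark.

    stream is a binary file object containing a fast-import stream.
    A blob referenced more than once has to be kept available until
    its last use; a blob referenced once can be dropped after it.
    """
    uses = Counter()
    while True:
        line = stream.readline()
        if not line:
            break
        if line.startswith(b"data "):
            # Skip the payload so its content is never parsed as commands.
            stream.read(int(line[5:].strip()))
        elif line.startswith(b"M "):
            # filemodify: M <mode> <dataref> <path>
            parts = line.rstrip(b"\n").split(b" ", 3)
            if len(parts) == 4 and parts[2].startswith(b":"):
                uses[parts[2].decode("ascii")] += 1
    return uses


if __name__ == "__main__":
    sample = (
        b"blob\nmark :1\ndata 5\nhello\n"
        b"commit refs/heads/master\nmark :2\n"
        b"committer A <a@example.com> 0 +0000\n"
        b"data 3\nmsg\n"
        b"M 100644 :1 a.txt\nM 100644 :1 b.txt\n"
    )
    print(count_blob_uses(io.BytesIO(sample)))
```

Because the scan needs the whole stream before any commit is processed, it fits a file source that can be read twice, which is consistent with the pass being skipped for standard input.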

Restarting an import

At checkpoints and on completion, the commit-id -> revision-id map is saved to a file called ‘fastimport-id-map’ in the control directory for the repository (e.g. .bzr/repository). If the import is interrupted or unexpectedly crashes, it can be started again and this file will be used to skip over already loaded revisions. As long as subsequent exports from the original source begin with exactly the same revisions, you can use this feature to maintain a mirror of a repository managed by a foreign tool. If and when Bazaar is used to manage the repository, this file can be safely deleted.

Examples

Import a Subversion repository into Bazaar:

bzr fast-export-from-svn /svn/repo/path project.fi
bzr fast-import project.fi project.bzr

Import a CVS repository into Bazaar:

bzr fast-export-from-cvs /cvs/repo/path project.fi
bzr fast-import project.fi project.bzr

Import a Git repository into Bazaar:

bzr fast-export-from-git /git/repo/path project.fi
bzr fast-import project.fi project.bzr

Import a Mercurial repository into Bazaar:

bzr fast-export-from-hg /hg/repo/path project.fi
bzr fast-import project.fi project.bzr

Import a Darcs repository into Bazaar:

bzr fast-export-from-darcs /darcs/repo/path project.fi
bzr fast-import project.fi project.bzr

See also

fast-export, fast-import-filter, fast-import-info

fast-import-filter

Purpose

Filter a fast-import stream to include/exclude files & directories.

Usage

bzr fast-import-filter [SOURCE]

Options

-v, --verbose Display more information.
-x ARG, --exclude_paths=ARG
 Exclude these paths from commits.
-q, --quiet Only display errors and warnings.
-i ARG, --include_paths=ARG
 Only include commits affecting these paths. Directories should have a trailing /.
--usage Show usage message and options.
--user-map=ARG Path to file containing a map of user-ids.
-h, --help Show help message.

Description

This command is useful for splitting a subdirectory or bunch of files out from a project to create a new project complete with history for just those files. It can also be used to create a new project repository that removes all references to files that should not have been committed, e.g. security-related information (like passwords), commercially sensitive material, files with an incompatible license or large binary files like CD images.

To specify standard input as the input stream, use a source name of ‘-’. If the source name ends in ‘.gz’, it is assumed to be compressed in gzip format.

File/directory filtering

This is supported by the -i and -x options. Excludes take precedence over includes.

When filtering out a subdirectory (or file), the new stream uses the subdirectory (or subdirectory containing the file) as the root. As fast-import doesn’t know in advance whether a path is a file or directory in the stream, you need to specify a trailing ‘/’ on directories passed to the -i (--include_paths) option. If multiple files or directories are given, the new root is the deepest common directory.

Note: If a path has been renamed, take care to specify the original path name, not the final name that it ends up with.
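
The filtering rules above (excludes win over includes, and the new root is the deepest common directory of the included paths) can be sketched in Python as follows. This is an illustration written from the description, not the plugin's code; the function names are hypothetical, and treating exclude patterns with the same trailing ‘/’ convention as includes is an assumption.

```python
import posixpath


def new_root(include_paths):
    """Return the deepest common directory of the include paths ('' = top)."""
    dirs = []
    for path in include_paths:
        # A trailing '/' marks a directory; for a file, use its directory.
        if path.endswith("/"):
            directory = path.rstrip("/")
        else:
            directory = posixpath.dirname(path)
        dirs.append(directory.split("/") if directory else [])
    common = dirs[0]
    for parts in dirs[1:]:
        n = 0
        while n < min(len(common), len(parts)) and common[n] == parts[n]:
            n += 1
        common = common[:n]
    return "/".join(common)


def _matches(path, patterns):
    """True if path equals a file pattern or lies under a directory pattern."""
    for pattern in patterns:
        if pattern.endswith("/"):
            if path.startswith(pattern):
                return True
        elif path == pattern:
            return True
    return False


def filter_path(path, include_paths=(), exclude_paths=()):
    """Return path relative to the new root, or None if it is filtered out."""
    if _matches(path, exclude_paths):
        return None  # excludes take precedence over includes
    if not include_paths:
        return path
    if not _matches(path, include_paths):
        return None
    root = new_root(include_paths)
    return path[len(root) + 1:] if root else path


if __name__ == "__main__":
    print(filter_path("lib/xxx/foo", include_paths=["lib/xxx/"]))
    print(new_root(["lib/xxx/", "lib/yyy/"]))
```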

User mapping

Some source repositories store just the user name while Bazaar prefers a full email address. You can adjust user-ids by using the --user-map option. The argument is a text file with lines in the format:

old-id = new-id

Blank lines and lines beginning with # are ignored. If old-id has the special value ‘@’, then users without an email address will get one created by using the matching new-id as the domain, unless a more explicit address is given for them. For example, given the user-map of:

@ = example.com
bill = William Jones <bill@example.com>

then user-ids are mapped as follows:

maria => maria <maria@example.com>
bill => William Jones <bill@example.com>

Note

User mapping is supported by both the fast-import and fast-import-filter commands.

Examples

Create a new project from a library (note the trailing / on the directory name of the library):

front-end | bzr fast-import-filter -i lib/xxx/ > xxx.fi
bzr fast-import xxx.fi mylibrary.bzr
(lib/xxx/foo is now foo)

Create a new repository without a sensitive file:

front-end | bzr fast-import-filter -x missile-codes.txt > clean.fi
bzr fast-import clean.fi clean.bzr

See also

fast-import

fast-import-info

Purpose

Output information about a fast-import stream.

Usage

bzr fast-import-info SOURCE

Options

--usage Show usage message and options.
-v, --verbose Display more information.
-q, --quiet Only display errors and warnings.
-h, --help Show help message.

Description

This command reads a fast-import stream and outputs statistics and interesting properties about what it finds. When run in verbose mode, the information is output as a configuration file that can be passed to fast-import to assist it in intelligently caching objects.

To specify standard input as the input stream, use a source name of ‘-’. If the source name ends in ‘.gz’, it is assumed to be compressed in gzip format.

Examples

Display statistics about the import stream produced by front-end:

front-end | bzr fast-import-info -

Create a hints file for running fast-import on a large repository:

front-end | bzr fast-import-info -v - > front-end.cfg

See also

fast-import