Obsolete changesets are, for example, created by the Evolve
extension. This patch switches to an unfiltered repository (the
filtered one throws on an attempt to access obsolete revisions) and
then filters out the obsolete revisions when it comes across them.
Fixes #173
When version v171002 introduced a new mapping file format for branches
and authors, that change never made it to the remapping of tags
although the README documents it.
Fixes #172.
Make it possible to completely disable the name sanitizer with the
--no-auto-sanitize flag. Previously the sanitizer was also run on
user-remapped names. As the sanitizer rewrites perfectly legal git
names (such as __.*), this is probably not what the user wants.
Closes #155.
This adds a new command line option (--subrepo-map) that maps
Mercurial subrepos to git submodules.
The --subrepo-map option takes a mapping file as an argument, which
is used to map a subrepo folder to a git submodule.
For more information see README-SUBMODULES.md.
This commit is inspired by the changes made by daolis in PR #38,
which was never merged.
Closes: #51
Closes: #147
From PEP 394 [1]:
* python2 will refer to some version of Python 2.x.
* end users should be aware that python refers to python3 on at least
Arch Linux (that change is what prompted the creation of this PEP),
so python should be used in the shebang line only for scripts that
are source compatible with both Python 2 and 3.
So to make sure that we run correctly on a system where python refers
to python3 and avoid problems like issue #11 we change the shebangs.
[1] https://www.python.org/dev/peps/pep-0394/
From time to time contributors spend time doing work that will not be
accepted as it duplicates functionality that is already provided with
the mapping files. Try to dissuade them from doing that by explaining
the reasons in the comment.
If a branch name starts with '/' it will be split into ['', ...] and
then mapped over with dot(), but dot() does not handle the empty
string. Teach dot() to handle the empty string.
This fixes the underlying problem in issue #91.
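A minimal sketch of the failure mode and the fix follows. It assumes that
dot() inspects the first character of each path component and replaces a
leading '.' with '_'; that behaviour and the helper sanitize_components()
are illustrative assumptions, not the actual code.

```python
def dot(name):
    """Replace a leading '.' with '_' (assumed behaviour of dot())."""
    # Without this guard, name[0] raises IndexError for the empty
    # component produced by a branch name that starts with '/'.
    if not name:
        return name
    if name[0] == '.':
        return '_' + name[1:]
    return name


def sanitize_components(branch):
    """Hypothetical helper: apply dot() to every '/'-separated component."""
    return '/'.join(dot(part) for part in branch.split('/'))
```

With the guard in place, sanitize_components('/feature/.hidden') no longer
raises on the leading empty component.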
As all branches created on the git side are transformed by
sanitize_name(), this should be a safe backwards compatible change. If a
user is doing incremental imports and sanitize_name() now suddenly
modifies the branch name, verify_heads() would already have complained
on the first incremental run.
Thanks go to Steve Tousignant <s.tousignant@gmail.com> for discovering
the problem.
This is a piece of code which frequently attracts pull requests which
are summarily rejected. As there is no "git blame" for rejected pull
requests, try to avoid misguided work by adding a comment at the
relevant place.
When an import is restarted the first new note commit must use
refs/notes/hg^0 as the parent. As refs/notes/hg is only updated at the
end of a session we cannot have it present in all note commits. Neither
can we generate new marks for note commits as that would require a new
mapping scheme from hg version numbers to git marks. A new mapping
scheme would break existing incremental import setups.
We therefore restructure the code to do the notes at the end of an
import session, thus only requiring a refs/notes/hg^0 reference in the
first commit.
Branch and tag names can now be renamed using a mechanism similar to the
-A option for author names.
-B specifies a mapping file for branch names, and -T a mapping file for
tags.
Apparently a bug (http://bz.selenic.com/show_bug.cgi?id=3511) in
multiple released versions of Mercurial could produce commits where
files had absolute paths.
As a "healthy" repo should not contain any absolute paths, it should be
safe to always strip a leading '/' from the path and let the conversion
continue.
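The path fix described above can be sketched as follows. The function name
strip_leading_slash() is hypothetical and only illustrates the idea of
dropping a single leading '/'.

```python
def strip_leading_slash(path):
    """Drop one leading '/' so an absolute path becomes repo-relative."""
    if path.startswith('/'):
        return path[1:]
    return path
```

Paths that are already relative pass through unchanged, so the helper can
be applied to every file name unconditionally.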
When a Mercurial repository does not use utf-8 for encoding author
strings and commit messages, the "-e <encoding>" command line option
can be used to force fast-export to convert incoming metadata from
<encoding> to utf-8.
When "-e <encoding>" is given, we use Python's string
decoding/encoding API to convert metadata on the fly when processing
commits.
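The conversion can be sketched like this, assuming the metadata arrives as
a byte string. The function name to_utf8() and its signature are
hypothetical; only the decode/encode round trip reflects the description
above.

```python
def to_utf8(raw, encoding=None):
    """Re-encode a byte string from `encoding` to utf-8.

    If no encoding was requested, the bytes are passed through untouched.
    """
    if not encoding:
        return raw
    return raw.decode(encoding).encode('utf-8')
```

For example, to_utf8(b'J\xf6rg', 'latin-1') turns the single latin-1 byte
for 'ö' into its two-byte utf-8 form.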
If the --hg-hash argument is given, the converted commits are
annotated with the original hg hash as a git note in the "hg"
namespace.
The notes can be shown by git log using the "--notes=hg" argument.
In a merge commit, the first parent is always the same parent that
would be recorded if the commit were not a merge and the other
parent(s) record the commit(s) being merged in.
Preserving this order is important so that log --first-parent works
properly and also so that the merge history is not distorted by an
incorrect permutation of the DAG.
Remove the code that sorts the merge parents based on node id so
that the correct DAG order is preserved.
The authors file format accepted by git-svnimport and git-cvsimport
actually allows blank lines and comment lines that start with '#'.
Ignore blank lines and lines starting with '#' as the first
non-whitespace character to be compatible with the authors file
format accepted by the referenced tools.
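A sketch of such a parser follows. It assumes each mapping line has the
form `key = value`; load_author_map() and that exact line format are
assumptions made for illustration.

```python
def load_author_map(lines):
    """Build an author mapping, skipping blank and comment lines."""
    mapping = {}
    for line in lines:
        stripped = line.strip()
        # Ignore blank lines and lines whose first non-whitespace
        # character is '#'.
        if not stripped or stripped.startswith('#'):
            continue
        if '=' not in stripped:
            continue  # not a mapping line; skipped in this sketch
        key, value = stripped.split('=', 1)
        mapping[key.strip()] = value.strip()
    return mapping
```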
Add support for a new --hgtags option. When given, any .hgtags
files that may be present are exported.
Normally this is not desirable. However, when attempting to mimic
the actions of other hg exporters that always export any .hgtags
files this option can help produce matching export data.
If the file mode changes (for example from 100644 to 100755), but the
actual text of the file itself does not, then the change could be
missed since the hashes would remain the same.
If the hashes match, also compare the gitmode values before deciding
the file is unchanged.
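The comparison can be sketched as below. FileState and is_unchanged() are
hypothetical names; the exporter's real bookkeeping may look different.

```python
from collections import namedtuple

# Hypothetical record holding the two values that must both match.
FileState = namedtuple('FileState', ['hash', 'gitmode'])


def is_unchanged(old, new):
    """A file is unchanged only if content hash and git mode both match."""
    return old.hash == new.hash and old.gitmode == new.gitmode
```

A file whose content hash is identical but whose mode went from
non-executable to executable is now reported as changed.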
Since hg runs on and supports older versions of Python, hg-fast-export.py
should too. Replace the dictionary comprehension with equivalent code that
supports versions of Python older than 2.7.
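One common way to do this is to pass a generator expression to dict(),
which works on Python 2.4 and later. The function build_map() is a
hypothetical example, not the code that was changed.

```python
def build_map(pairs):
    """Build a dict without using a dictionary comprehension.

    Equivalent to {k: v for k, v in pairs}, which needs Python 2.7+.
    """
    return dict((k, v) for k, v in pairs)
```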
Because on Windows sys.stdout is initially in text mode, any LF
characters written to it will be transformed to CRLF, which causes git
to blow up. This change uses Windows platform-specific code to change
sys.stdout to binary mode.
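A sketch of the platform-specific switch, using the standard msvcrt
module. The wrapper function set_stdout_binary() and its return value are
assumptions added for illustration.

```python
import os
import sys


def set_stdout_binary():
    """Put stdout into binary mode on Windows; do nothing elsewhere.

    Returns True if the mode was changed, False otherwise.
    """
    if sys.platform == 'win32':
        import msvcrt  # only available on Windows
        msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
        return True
    return False
```

Calling this once before any output is written keeps LF characters from
being translated to CRLF in the stream that git reads.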
After an update to Mercurial 2.3 the module 'repo' was removed and the
program crashed when trying to convert a repository. I checked the
imports with 'pyflakes' and removed all unused ones; repo (among
others) was never used.
http://www.selenic.com/repo/hg/rev/1ac628cd7113#l9.1
The previous code did an awful lot of work to infer the parents of an
exported commit, incorporating information from many sources. But
there were multiple bugs in this scheme, sometimes resulting in merge
commits with two parents pointing to the same commit object.
Instead, use a much more straightforward process of mapping the
parents stored in hg.
hg-fast-export uses hg's branch order (from the log) when merging,
which is a problem. Consider the case:
HG repo A has revisions 1-10. Repository B is cloned from that.
Subsequently, A adds revision 11, and B adds a different change which
also has revision 11. If B now pulls from A, A's rev11 will have the
number 12; if A then pulls from B, the reverse also holds. So the logs
are different even though they contain the exact same changes.
hg-fast-export will thus create different git repositories for A and B,
even though the contents are identical for all practical purposes.
In particular, the repos would be identical if A and B had used git from
the beginning.
To fix that, compare HG revisions instead of log positions.
Previously we fed the full revision only for the first one and deltas
for all following ones, including branches being forked off. This doesn't
work with branches that are forked from revision 0. If such a branch is
found, we now also feed the full revision.
Signed-off-by: Rocco Rutte <pdmef@gmx.net>