With many branches, it's pointless to print the validation message as
early as the first revision; the same goes for incremental runs.
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
Now we have three methods of feeding out changes:
1) full for the first revision, or
2) thorough delta for merges (compare checksums with all parents), or
3) simple delta otherwise (only go with the manifest)
This requires some cleanup so that we have only one place where we
actually call the appropriate dumping method.
The export_file_contents() method now also sorts its file list before
writing out anything as this seems to speed up hg data retrieval a bit.
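
A minimal sketch of the single decision point described above. The
function names and return labels are hypothetical and are not taken from
hg2git.py; only the three cases and the sorting step come from this
message.

```python
def pick_export_method(revision, parents):
    """Return which of the three strategies applies (labels are illustrative)."""
    if revision == 0:
        # First revision: export every file in the manifest.
        return "full"
    if len(parents) > 1:
        # Merge: compare checksums against all parents.
        return "thorough delta"
    # Everything else: rely on the manifest only.
    return "simple delta"


def sorted_files(files):
    """Sort the file list before anything is written out."""
    return sorted(files)
```

Keeping the choice in one function means the caller only has to look
up the dumping method for the returned label.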
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
For the mutt and hg repos, it didn't make a difference, but attempting
to run the conversion on the opensolaris repo suggests this is needed.
When we attempt to export some commit, special-case the revision number
0 and export all files the manifest has while labeling this a "full
revision export". Otherwise we do what we did before, labeling this a
"delta revision export".
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
Previously, when no head was present under .git/refs/heads, we simply
died as we couldn't open the file. Now, simply return None in case we
cannot read from it.
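
The behaviour can be sketched roughly as follows. The function name and
arguments are assumptions for illustration, not the actual code; only
the .git/refs/heads location and the "return None" behaviour come from
this message.

```python
import os


def read_head(git_dir, branch):
    """Return the SHA-1 stored for a branch head, or None if unreadable."""
    path = os.path.join(git_dir, "refs", "heads", branch)
    try:
        with open(path) as f:
            return f.read().strip()
    except (IOError, OSError):
        # No such head (or not readable): report "unknown" instead of dying.
        return None
```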
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
The current regex may leave us with keys/values having trailing/leading
spaces in all flavours which will break lookup. Solution: strip() key
and value.
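
A small sketch of the fix, assuming a "key=value" line format; the real
pattern used by the script may differ, and the names below are
hypothetical.

```python
import re

# Assumed line format: key=value (the actual regex may look different).
MAPPING_LINE = re.compile(r'^([^=]+)=(.+)$')


def parse_mapping_line(line):
    """Return (key, value) with surrounding whitespace removed, or None."""
    m = MAPPING_LINE.match(line)
    if m is None:
        return None
    # strip() both parts so that later dictionary lookups match exactly.
    return m.group(1).strip(), m.group(2).strip()
```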
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
By allowing spaces in keys, we allow complete lines like
"Joe User <joe@host>" to be (re-)mapped to something else.
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
This broke incremental imports as the hg2git.sh wrapper overwrites headsfile
with current values after the import is done.
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
IMHO it's highly unusual to have these lines in hg projects but who
knows. As it's slow to parse these types of lines (with regex), it's
disabled by default and the 'author' command of git-fast-import isn't
used at all.
It can be enabled by giving the -s switch to hg2git.sh.
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
In the git repo there may be any number of branches that are not hg
imported branches, so it doesn't make sense to print warnings when a
non-hg head isn't at what it was last time.
Now we get a list of branchtags hg has and only verify these.
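
Roughly, the check now looks like the sketch below. The function and
its arguments are hypothetical; the dictionaries map branch names to
SHA-1 strings.

```python
def mismatched_heads(saved, current, hg_branches):
    """Return hg branches whose git head moved since the last run."""
    return sorted(
        name for name in hg_branches
        # Heads that are not hg branches are never looked at.
        if name in saved and saved[name] != current.get(name)
    )
```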
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
Unfortunately, it's not configurable yet (read: cannot be disabled),
even though matching against a regex all the time may take a while
(especially during an initial import).
This also enables cleaning up usernames by stripping silly leading and
trailing chars like '"' (which is the only one supported ATM).
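
The clean-up part amounts to something like this sketch (the function
name is hypothetical, and only the '"' character is handled, as stated
above):

```python
def clean_user(name):
    """Drop surrounding whitespace and leading/trailing double quotes."""
    return name.strip().strip('"')
```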
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
The mapping is a python dictionary given to the hg2git() function. This
isn't extremely useful as there's no option passing from hg2git.sh to
hg2git.py (yet).
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
Now this can even be used as a module from other python scripts by
simply calling the hg2git() function.
Except for some config values nobody really ever wants to change, it's
even safe to run several hg2git() calls in parallel, as no global vars
or the like are used, by design (but it makes the code uglier).
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
As git-fast-import already generates at least one pack per run, don't
even further split these up on a (default) 1k changeset boundary. Also
rework the documentation on that one a little.
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
To git-fast-import(1) we feed in changed and added files completely, so
there's no real difference except UI output (potentially for debugging).
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
Instead of feeding in everything or only something and getting merges
wrong, build up a list of changed (incl. added) and deleted files by
1) comparing manifest (deleted, added)
2) comparing checksums if file is present in parent and child (change)
The hg-crew and mutt imports now go in <15 minutes and md5 sums match.
Thanks to Theodore Tso for the hint.
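
The two comparison steps can be sketched as below. This is an
illustration under the assumption that each manifest is available as a
dictionary mapping file path to checksum; the function name is
hypothetical and the real code works on hg manifests.

```python
def diff_manifests(parent, child):
    """Return (changed, deleted) file lists between two manifests.

    'changed' includes added files, matching the description above.
    """
    # 1) Manifest comparison: files only in the child were added,
    #    files only in the parent were deleted.
    added = [f for f in child if f not in parent]
    deleted = sorted(f for f in parent if f not in child)
    # 2) Checksum comparison for files present on both sides.
    modified = [f for f in child if f in parent and parent[f] != child[f]]
    return sorted(added + modified), deleted
```

For a merge, the same comparison would be repeated against every
parent.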
While at it, fix a regression where, upon incremental import start, we
always merged a branch in addition to initializing it. A single test
showed that the new detection gets starting off from a merge commit
right, too.
Signed-off-by: Rocco Rutte <pdmef@gmx.net>