This can now even be used as a module from other Python scripts by
simply calling the hg2git() function.
Apart from some config values nobody ever really wants to change, it's even
safe to run several hg2git() calls in parallel, as global vars and
the like are intentionally avoided (at the cost of uglier code).
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
As git-fast-import already generates at least one pack per run, don't
split these up any further on a (default) 1k changeset boundary. Also
rework the related documentation a little.
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
We feed changed and added files to git-fast-import(1) in full, so
there's no real difference except in UI output (potentially useful for
debugging).
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
Instead of feeding in everything, or feeding in only a subset and getting
merges wrong, build up a list of changed (incl. added) and deleted files by
1) comparing manifests (deleted, added)
2) comparing checksums if a file is present in both parent and child (changed)
The hg-crew and mutt imports now complete in <15 minutes and md5 sums match.
Thanks to Theodore Tso for the hint.
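
The two comparison steps above can be sketched roughly as follows. This is
only an illustration, not the code from this commit: it assumes each manifest
is available as a plain dict mapping file path to a checksum, and the name
diff_manifests() is made up for the example.

```python
def diff_manifests(parent, child):
    """Compare two manifests given as {path: checksum} dicts.

    Hypothetical helper for illustration only; the dict layout is an
    assumption, not necessarily how Mercurial manifests are accessed.
    Returns three sorted lists of paths: (added, changed, deleted).
    """
    # Step 1: compare manifests to find added and deleted paths.
    added = sorted(path for path in child if path not in parent)
    deleted = sorted(path for path in parent if path not in child)
    # Step 2: for paths present in both, compare checksums.
    changed = sorted(
        path for path in child
        if path in parent and parent[path] != child[path]
    )
    return added, changed, deleted


if __name__ == "__main__":
    parent = {"README": "aa11", "setup.py": "bb22", "old.txt": "cc33"}
    child = {"README": "aa11", "setup.py": "dd44", "new.txt": "ee55"}
    added, changed, deleted = diff_manifests(parent, child)
    print("added:", added)
    print("changed:", changed)
    print("deleted:", deleted)
```

With such lists, only the added and changed files would need to be fed to
git-fast-import in full, while deleted files only need a delete command.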
While at it, fix a regression where, at the start of an incremental import,
we always merged a branch in addition to initializing it. A single test
showed that the new detection also gets starting off from a merge commit
right.
Signed-off-by: Rocco Rutte <pdmef@gmx.net>