Commit Graph

11 Commits

Author SHA1 Message Date
chrisjbillington
b961f146df Support Python 3
Port hg-fast-export to Python 2/3 polyglot code.

Since mercurial accepts and returns bytestrings for all repository data,
the approach I've taken here is to use bytestrings throughout the
hg-fast-export code. All strings pertaining to repository data are
bytestrings. This means the code is using the same string datatype for
this data on Python 3 as it did (and still does) on Python 2.

Repository data coming from subprocess calls to git, or read from files,
is also left as the bytestrings either returned from
subprocess.check_output or as read from the file in 'rb' mode.

Regexes and string literals that are used with repository data have
all had a b'' prefix added.

When repository data is used in error/warning messages, it is decoded
with the UTF8 codec for printing.
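
A minimal sketch of the bytestring convention described above, not the
actual hg-fast-export code. The regex and the split_author helper are
hypothetical names made up for illustration; only the b'' prefix and the
UTF8 decoding for messages come from the description.

```python
import re
import sys

# Pattern applied to repository data, so it carries a b'' prefix.
# (Hypothetical pattern, for illustration only.)
author_re = re.compile(b'^([^<]+) <([^>]+)>$')

def split_author(author):
    """Split b'Name <email>' into (name, email), both bytestrings."""
    match = author_re.match(author)
    if match is None:
        # Repository data is decoded only when it is printed in a message.
        sys.stderr.write('Warning: cannot parse author %s\n'
                         % author.decode('utf8'))
        return None
    return match.group(1), match.group(2)
```

Everything returned to the caller stays a bytestring; the only native
string is the warning text itself.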

With this patch, hg-fast-export.py writes binary output to
sys.stdout.buffer on Python 3 - on Python 2 this doesn't exist and it
still uses sys.stdout.
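
One way to pick the output stream as described, shown as a sketch under
the assumption that a getattr fallback is acceptable; binary_stream and
emit are hypothetical helper names, not functions from the script.

```python
import sys

def binary_stream(stream):
    """Return the byte-oriented layer of a stream.

    Python 3 text streams expose it as .buffer; Python 2's sys.stdout
    has no such attribute and already accepts bytestrings.
    """
    return getattr(stream, 'buffer', stream)

def emit(data, stream=None):
    """Write a bytestring to stdout (or to the given stream)."""
    out = binary_stream(sys.stdout if stream is None else stream)
    out.write(data)
```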

The only strings that are left as "native" strings and not coerced to
bytestrings are filepaths passed in on the command line, and dictionary
keys for internal data structures used by hg-fast-export.py that do
not originate in repository data.

Mapping files are read in 'rb' mode, so their contents arrive as
bytestrings. When an encoding is given, the contents are decoded with
that encoding and then immediately re-encoded with UTF8, and the
resulting bytestrings are returned.
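
The decode-then-re-encode step can be sketched as follows.
normalize_mapping_line is a hypothetical name, and handling one line at
a time is an assumption; the description only states the round trip.

```python
def normalize_mapping_line(raw_line, encoding=None):
    """Return a mapping-file line as a UTF8 bytestring.

    raw_line is a bytestring as read from a file opened in 'rb' mode.
    """
    if encoding:
        # Decode with the user-supplied encoding, then go straight back
        # to bytes so the rest of the code only ever sees bytestrings.
        return raw_line.decode(encoding).encode('utf8')
    return raw_line
```

For example, a latin-1 encoded name is converted to its UTF8 byte
sequence, while a line read without an encoding passes through
unchanged.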

Other necessary changes were:

 - Indexing bytestrings with a single index returns an integer on
   Python 3. These indexing operations have been replaced with a
   one-element slice: x[0] -> x[0:1] or x[-1] -> x[-1:] so as to return
   a bytestring.

 - raw_hash.encode('hex_codec') replaced with binascii.hexlify(raw_hash)

 - str(integer) -> b'%d' % integer

 - 'string_escape' codec replaced with 'unicode_escape' (which was
   backported to python 2.7). Strings decoded with this codec were then
   immediately re-encoded with UTF8.

 - Calls to map() intended to execute their contents immediately were
   unwrapped or converted to list comprehensions, since on Python 3
   map() returns a lazy iterator and does not execute until iterated
   over.
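
The replacements listed above, shown side by side on small sample
values. The values themselves are made up for illustration; only the
idioms come from the list.

```python
import binascii

data = b'abc'

# Indexing: data[0] is the integer 97 on Python 3 but 'a' on Python 2;
# a one-element slice is a bytestring on both.
first = data[0:1]   # b'a'
last = data[-1:]    # b'c'

# Hex encoding without the Python 2-only 'hex_codec' str codec.
hex_hash = binascii.hexlify(b'\x00\xff')   # b'00ff'

# Integer to bytestring (bytes %-formatting needs Python 2 or 3.5+).
count = b'%d' % 42   # b'42'

# Unescape, then return to a UTF8 bytestring.
unescaped = b'a\\tb'.decode('unicode_escape').encode('utf8')   # b'a\tb'

# map() used only for side effects: on Python 3 nothing runs until the
# result is iterated, so an explicit loop is the portable spelling.
seen = []
for item in (1, 2, 3):
    seen.append(item)
```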

hg-fast-export.sh has been modified to not require Python 2. Instead, if
PYTHON has not been defined, it checks python2, python, then python3,
and uses the first one that exists and can import the mercurial module.
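
The selection order can be illustrated in Python, although the real
logic lives in the shell script. This is a sketch only: pick_python and
its parameters are hypothetical, and it assumes Python 3.3+ for
shutil.which and subprocess.DEVNULL.

```python
import os
import shutil
import subprocess

def pick_python(module='mercurial',
                candidates=('python2', 'python', 'python3'),
                environ=None):
    """Return the interpreter to use, or None if no candidate works."""
    environ = os.environ if environ is None else environ
    # An explicit PYTHON setting always wins.
    if environ.get('PYTHON'):
        return environ['PYTHON']
    for name in candidates:
        path = shutil.which(name)
        if path is None:
            continue  # this interpreter is not installed
        # Accept the first interpreter that can import the module.
        status = subprocess.call([path, '-c', 'import ' + module],
                                 stdout=subprocess.DEVNULL,
                                 stderr=subprocess.DEVNULL)
        if status == 0:
            return path
    return None
```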
2020-02-13 14:35:19 -05:00
Frej Drejhammar
1d0f6cb7ca Fix broken support for bare repositories
The change in 6cf9397bd6 broke support for
bare repositories. In a bare repo git rev-parse --show-toplevel would
return an empty string and cwd would then be changed to the user's home
directory. In the home directory git rev-parse --git-dir would either
fail or return an unrelated repo.

Problem reported by Ralf Rösch.
2016-10-01 14:45:48 +02:00
Frej Drejhammar
6cf9397bd6 Do not rely on git internals, support Git >= 2.10
Fast-export has traditionally sourced the internal git-sh-setup from
Git, but following the release of Git 2.10 this no longer works.
Fast-export only uses the functionality of git-sh-setup for two things:
cd:ing to the git repo dir and setting up the GIT_REPO environment
variable. To future-proof fast-export, start doing what we need by hand
in fast-export.

Acknowledgments to Louis Sautier who reported the problem and tested the
fix.
2016-09-14 14:15:11 +02:00
Kyle J. McKay
779e2f6da8 hg-fast-export.sh/hg-reset.sh: replace egrep with grep
According to the POSIX standard, egrep is an obsolescent equivalent
of grep -E.  In fact, the patterns actually being used with egrep do
not require use of extended regular expressions at all, so a plain
'grep' can be used rather than 'grep -E'.

Replace egrep with grep to improve compatibility across systems.
2014-04-13 15:40:07 +02:00
Frej Drejhammar
276f54c38f Merge branch 'from-jmcmullan' into develop
Conflicts:
	hg-fast-export.py
2008-12-20 19:57:39 +01:00
Jason S. McMullan
b4833029a4 hg export: Support tag movement
HG tag movement is now supported with this patch.

This patch creates a .git/hg2git-mapping file, which maps
HG revision numbers to HG hashes. Combined with the
.git/hg2git-marks file, which maps HG revisions to GIT hashes,
we can now reprocess all tags at the end of each hg export
operation.
2008-12-11 09:05:05 -05:00
Rocco Rutte
fdbb1decaa hg2git: Update copyrights and maintainership information.
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2008-11-25 11:25:22 +01:00
Jonathan Nieder
8be4e6b3d0 hg-fast-export: work still if git-commands are not in PATH
In git 1.6.0, most git tools with a dash in the name will no
longer be installed in $bindir.  This patch makes hg-fast-export
use the "git <command>" form so it will work even if "git" is
the only piece of git machinery in the user's PATH.

On the other hand, the "git <command>" form does not help for
sourcing a shell script (with ".").  So use the full path to
source "git-sh-setup".

Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
2008-07-31 07:18:16 +02:00
Rocco Rutte
258f03a9ba Fix shell substitution typo
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-10-22 08:28:41 +00:00
Rocco Rutte
4cc1d7cf17 Allow for $PYTHON environment variable specifying python binary to use
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-10-22 09:34:40 +02:00
Rocco Rutte
8aff9df2c3 hg-reset.sh: Helper for partially re-importing from hg
Given a hg revision to reset to, these scripts get the latest changes
per hg branch and print git SHA1. The user then needs to manually reset
branches as needed, tune the state file and can re-import things again.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-19 09:04:42 +00:00