Commit Graph

62 Commits

Author SHA1 Message Date
Ekin Dursun
c49dd0cf60 Remove Python 2 compatibility code
Python 2 support was removed recently, so we don't need the
compatibility code anymore.
2023-11-18 20:22:18 +03:00
Felipe Contreras
e1e15b2091 Avoid revsymbol()
We can just do repo[rev].

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
2023-03-09 19:48:44 -06:00
Felipe Contreras
534d2bdd92 Don't deal with the node in get_changeset()
It's not necessary.

It could be fetched with repo[rev].node(), but why bother?

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
2023-03-09 19:48:44 -06:00
Felipe Contreras
23f41c0ff1 Use revision directly instead of revnode
We don't need the revnode.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
2023-03-09 19:48:44 -06:00
Felipe Contreras
7886016978 hg2git: set proper default branch
So that cfg_master is picked up in get_branch().

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
2023-03-09 19:48:44 -06:00
Frej Drejhammar
3910044a97 Avoid crash during rev-parse when the default encoding is ascii
In some locales the default encoding is ascii in which case
subprocess.check_output() will fail if it is given a non-ascii ref as
one of the arguments. By forcing the ref to be utf8 we will avoid a
crash while still behaving correctly when the default encoding is
utf8.

The credits for this fix go to Nikita Bazhinov for discovering the fix
and Chris J Billington for explaining it.

Co-Authored-By: Nikita Bazhinov <nbazhinov@syntellect.ru>
Co-Authored-By: Chris J Billington <chrisjbillington@gmail.com>
2020-07-10 16:41:38 +02:00
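
The encoding fix in the commit above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `rev_parse_command` is a hypothetical helper, and it only builds the argument list, so it runs without git or a repository.

```python
def rev_parse_command(ref):
    """Build a 'git rev-parse' argv list with the ref forced to UTF-8 bytes."""
    if isinstance(ref, str):
        # Encoding explicitly avoids relying on the locale's default codec,
        # which may be ascii and would reject non-ascii characters.
        ref = ref.encode("utf8")
    return ["git", "rev-parse", "--verify", "--quiet", ref]


# Hypothetical branch name containing a non-ascii character (u-umlaut).
command = rev_parse_command("refs/heads/b\u00fcro")
```

The resulting list could then be handed to subprocess.check_output(); because the ref is already bytes, no implicit encoding step is left to fail.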
Toni Sissala
90eeef2ff4 Fix TypeError when using -M command line argument
hg-fast-export.sanitize_name expects branch name to be a bytes
object. Command line parser gives out str objects. Convert
possible str object to bytes in hg2git.set_default_branch().
2020-03-25 11:19:25 +02:00
chrisjbillington
b961f146df Support Python 3
Port hg-fast-export to Python 2/3 polyglot code.

Since mercurial accepts and returns bytestrings for all repository data,
the approach I've taken here is to use bytestrings throughout the
hg-fast-export code. All strings pertaining to repository data are
bytestrings. This means the code is using the same string datatype for
this data on Python 3 as it did (and still does) on Python 2.

Repository data coming from subprocess calls to git, or read from files,
is also left as the bytestrings either returned from
subprocess.check_output or as read from the file in 'rb' mode.

Regexes and string literals that are used with repository data have
all had a b'' prefix added.

When repository data is used in error/warning messages, it is decoded
with the UTF8 codec for printing.

With this patch, hg-fast-export.py writes binary output to
sys.stdout.buffer on Python 3 - on Python 2 this doesn't exist and it
still uses sys.stdout.

The only strings that are left as "native" strings and not coerced to
bytestrings are filepaths passed in on the command line, and dictionary
keys for internal data structures used by hg-fast-export.py, that do
not originate in repository data.

Mapping files are read in 'rb' mode, and thus bytestrings are read from
them. When an encoding is given, their contents are decoded with that
encoding, but then immediately encoded again with UTF8 and they are
returned as the resulting bytestrings.

Other necessary changes were:

 - indexing bytestrings with a single index returns an integer on Python 3.
   These indexing operations have been replaced with a one-element
   slice: x[0] -> x[0:1] or x[-1] -> x[-1:] so as to return a bytestring.

 - raw_hash.encode('hex_codec') replaced with binascii.hexlify(raw_hash)

 - str(integer) -> b'%d' % integer

 - 'string_escape' codec replaced with 'unicode_escape' (which was
    backported to python 2.7). Strings decoded with this codec were then
    immediately re-encoded with UTF8.

 - Calls to map() intended to execute their contents immediately were
   unwrapped or converted to list comprehensions, since on Python 3 map()
   returns a lazy iterator that does not execute until iterated over.

hg-fast-export.sh has been modified to not require Python 2. Instead, if
PYTHON has not been defined, it checks python2, python, then python3,
and uses the first one that exists and can import the mercurial module.
2020-02-13 14:35:19 -05:00
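
The Python 3 behaviours listed in the commit above can be checked with a few lines. This is a small illustration using made-up values; none of it is taken from the project's source.

```python
import binascii

data = b"default"

# Indexing bytes with a single index yields an int on Python 3,
# while a one-element slice yields a bytes object.
first_as_int = data[0]        # 100, the byte value of "d"
first_as_bytes = data[0:1]    # b"d"
last_as_bytes = data[-1:]     # b"t"

# Hex-encode a raw hash with binascii instead of the 'hex_codec' codec.
raw_hash = b"\x00\xff\x10"
hex_hash = binascii.hexlify(raw_hash)   # b"00ff10"

# Format an integer straight into bytes (supported since Python 3.5).
rev_label = b"%d" % 42                  # b"42"

# map() is lazy on Python 3: nothing runs until it is iterated.
seen = []
lazy = map(seen.append, [1, 2, 3])
ran_before_iteration = list(seen)       # still empty at this point
for _ in lazy:
    pass
```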
Dave Townsend
b54046d3aa Avoid showing a warning when the mercurial repository has obsolete markers.
2019-10-20 19:49:25 +02:00
Daniel Small
2bb173ef68 hg 4.7: Replace call to util.email with templatefilters.email
This change is required for Mercurial 4.7 support and fixes #137.
2018-08-11 15:49:08 +02:00
Frej Drejhammar
ac60034ba3 Adhere to PEP 394
From PEP 394 [1]:

* python2 will refer to some version of Python 2.x.

* end users should be aware that python refers to python3 on at least
  Arch Linux (that change is what prompted the creation of this PEP),
  so python should be used in the shebang line only for scripts that
  are source compatible with both Python 2 and 3.

So to make sure that we run correctly on a system where python refers
to python3 and avoid problems like issue #11 we change the shebangs.

[1] https://www.python.org/dev/peps/pep-0394/
2018-08-11 15:07:19 +02:00
Frej Drejhammar
e200cec39f Adapt to changes in Mercurial 4.6
Starting with Mercurial 4.6 repo.lookup() no longer accepts raw hashes
for lookups.
2018-06-10 15:51:09 +02:00
Frej Drejhammar
f8792d9c5c Switch from os.popen() to subprocess.check_output() for running git rev-parse
os.popen() uses the shell, which is dangerous when the branch name
contains characters that are interpreted by the shell. Therefore switch
to subprocess.check_output(), which doesn't involve the shell.

This closes issue #66.
2016-04-15 15:43:21 +02:00
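
Why the switch in the commit above matters can be shown with a short sketch. To keep it runnable without git, the child process here is the Python interpreter itself, and `echo_argument` and the branch name are hypothetical.

```python
import subprocess
import sys


def echo_argument(arg):
    """Pass arg to a child process as one argv entry, without a shell."""
    # With a list argument and the default shell=False, no shell parses
    # the string, so ';', '$' or backticks reach the child literally.
    child = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.argv[1])",
        arg,
    ]
    return subprocess.check_output(child)


# A hypothetical branch name containing shell metacharacters.
result = echo_argument("feature; echo injected")
```

The child receives the whole string as a single argument, so the text after the semicolon is never executed as a command.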
zed
e87c9cb3b8 Add option for specifying the text encoding used by Mercurial
When a mercurial repository does not use utf-8 for encoding author
strings and commit messages the "-e <encoding>" command line option
can be used to force fast-export to convert incoming meta data from
<encoding> to utf-8.

When "-e <encoding>" is given, we use Python's string
decoding/encoding API to convert meta data on the fly when processing
commits.
2014-10-25 22:39:08 +02:00
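
The on-the-fly conversion behind "-e <encoding>" in the commit above amounts to a decode followed by an encode. A minimal sketch, assuming metadata arrives as bytes; `to_utf8` is a hypothetical name.

```python
def to_utf8(raw, encoding=None):
    """Convert metadata bytes from `encoding` to UTF-8; pass through if unset."""
    if not encoding:
        return raw
    return raw.decode(encoding).encode("utf8")


# An author name stored as latin-1 (o-umlaut is the single byte 0xf6).
latin1_author = "J\u00f6rg".encode("latin-1")
utf8_author = to_utf8(latin1_author, "latin-1")
```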
frej
e63f780004 Merge pull request #6 from aried3r/master
Fix for Mercurial 2.3 compatibility
2012-08-10 09:10:44 -07:00
Anton Rieder
0dcbd3d195 Organized imports
After an update to Mercurial 2.3 the module 'repo' was removed and the
program crashed when trying to convert a repository. I checked the
imports with 'pyflakes' and removed all unused ones, repo (among
others) was never used.

http://www.selenic.com/repo/hg/rev/1ac628cd7113#l9.1
2012-08-07 01:35:09 +02:00
Daniel Harding
4ce8835d11 Make hg-fast-export work on Windows
* use sys.stdout.write instead of print to avoid end-of-line issues
* use os.devnull instead of hard-coding /dev/null
2012-05-28 18:57:30 +01:00
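
The portability points in the commit above can be illustrated as follows. `run_quietly` is a hypothetical helper, and the child process is the Python interpreter so the sketch runs anywhere.

```python
import os
import subprocess
import sys


def run_quietly(argv):
    """Run argv with its stderr discarded, on POSIX and Windows alike."""
    # os.devnull names the platform's null device, so '/dev/null'
    # does not have to be hard-coded.
    with open(os.devnull, "w") as sink:
        return subprocess.check_output(argv, stderr=sink)


output = run_quietly([
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('noise'); sys.stdout.write('data')",
])
```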
Barry Wardell
0a9570c676 Support the case where the author field has an empty email address, i.e. it is of the form 'name <>'.
2011-11-26 15:07:43 +01:00
Paul O’Shannessy
3e00d99d39 Use hg methods to extract name and email when doing user fixup
2011-10-18 16:20:54 -07:00
Frej Drejhammar
486690e176 Remove \" from the user string before trying to extract name and email
Signed-off-by: Frej Drejhammar <frej.drejhammar@gmail.com>
Reported-by: Cole Robinson <crobinso@redhat.com>

Thanks to Cole Robinson for reporting the bug and providing a fix
which was adapted to this patch.

The original bug report:

I was recently converting a few mercurial repositories to git, and
noticed certain commits had their date reset to Jan 1 1970.

An example repo:

http://hg.fedorahosted.org/hg/virt-manager

An example commit:

http://hg.fedorahosted.org/hg/virt-manager/rev/41182500ddef

After some poking, it seems the culprit was that the "author:" was
surrounded by quotation marks
2011-03-18 19:12:45 +01:00
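
The three author fixup commits above deal with quoted user strings and empty email addresses. The sketch below shows both cases with a plain regular expression. It stands in for the Mercurial helpers the project uses, and `split_user` and the sample addresses are hypothetical.

```python
import re

# Name, optional whitespace, then an email in angle brackets (may be empty).
_USER = re.compile(r"^(.*?)\s*<([^>]*)>")


def split_user(user):
    """Split an hg user string into (name, email), tolerating quotes and '<>'."""
    # Drop stray double quotes before trying to extract anything.
    user = user.replace('"', "")
    match = _USER.match(user)
    if match is None:
        return user.strip(), ""
    return match.group(1).strip(), match.group(2).strip()


quoted = split_user('"Joe User <joe@example.com>"')
empty_email = split_user("name <>")
```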
Rocco Rutte
1464dabbff Maintain backwards compatibility for ui setup
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2009-05-25 15:17:33 +02:00
Rocco Rutte
ff19982cc2 Update to work with mercurial ui refactorings
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2009-05-14 14:53:42 +02:00
Fabrizio Chiarello
a984e233c2 hg-fast-export: add option to track remote branches under a custom namespace
Add -o, --origin <name> to allow user to set a namespace used
when importing remote branches.

Signed-off-by: Fabrizio Chiarello <ponch@autistici.org>
2008-12-20 19:51:02 +01:00
Rocco Rutte
fdbb1decaa hg2git: Update copyrights and maintainership information.
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2008-11-25 11:25:22 +01:00
Fabrizio Chiarello
1ab60e492b hg-fast-export: Make default branch customizable
Add -M, --default-branch <branch_name> to allow user to set
the default branch to pull into.

Signed-off-by: Fabrizio Chiarello <ponch@autistici.org>
2008-09-19 08:03:44 +02:00
Jonathan Nieder
8be4e6b3d0 hg-fast-export: still work if git commands are not in PATH
In git 1.6.0, most git tools with a dash in the name will no
longer be installed in $bindir.  This patch makes hg-fast-export
use the "git <command>" form so it will work even if "git" is
the only piece of git machinery in the user's PATH.

On the other hand, the "git <command>" form does not help for
sourcing a shell script (with ".").  So use the full path to
source "git-sh-setup".

Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
2008-07-31 07:18:16 +02:00
Rocco Rutte
205c76749a Revert "hg2git: Replaces space with "_" in branches name"
The get_branch() function's purpose is to detect whether a mercurial
branch name actually should be considered the default branch.

Sanitizing branch and tag names for git is done in sanitize_name().

Noted by Jonathan Nieder.

This reverts commit cdfdae36c8.
2008-06-03 13:53:08 +02:00
Rocco Rutte
d89b42a631 Clarify where 'HEAD' branch name comes from
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2008-05-30 13:44:07 +02:00
Felipe Zimmerle
cdfdae36c8 hg2git: Replaces space with "_" in branches name
Since space doesn't conform to GIT branches name standards,
it should be replaced with another character.

Signed-off-by: Felipe Zimmerle <felipe.zimmerle@indt.org.br>
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2008-05-30 13:41:56 +02:00
Rocco Rutte
8551771d2b hg2git.py: Allow consumers to modify keys of dicts returned by load_cache()
By default, the key is not changed. This will allow us to fix up the
off-by-one issue with marks restored using load_cache().

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-10-22 09:50:52 +02:00
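
The key hook described in the commit above might look like the sketch below. It makes several assumptions: the cache is given as an iterable of lines rather than a filename, each line has the form ":<key> <value>", and the "+ 1" adjustment is purely illustrative.

```python
def load_cache(lines, get_key=lambda key: key):
    """Parse ':<key> <value>' lines into a dict, passing keys through get_key."""
    cache = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2 or not fields[0].startswith(":"):
            continue  # skip malformed lines
        cache[get_key(fields[0][1:])] = fields[1]
    return cache


# By default the key is left unchanged.
unchanged = load_cache([":a b\n"])
# A consumer can shift numeric keys, for example to correct an off-by-one.
marks = load_cache([":0 1\n", ":1 2\n"], get_key=lambda key: int(key) + 1)
```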
Rocco Rutte
3d1f111d30 hg2git.py: Use git-rev-parse to get SHA1s instead of reading files below refs/ directly
This should now also properly support packed refs.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-10-22 09:48:48 +02:00
Rocco Rutte
fb5cd150a6 hg2git.py: Map 'HEAD', 'default' and '' hg branches to 'master' in git
Also add a note where HEAD is coming from.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-10-22 09:44:12 +02:00
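
The branch mapping in the commit above can be sketched as a small lookup. The `cfg_master` name is borrowed from a later commit in this log; the function body is an assumption, not the project's code.

```python
# Assumed module-level name of the git branch that receives hg's default branch.
cfg_master = "master"


def get_branch(name):
    """Map hg's 'HEAD', 'default' and empty branch names to the git default."""
    if name in ("HEAD", "default", ""):
        return cfg_master
    return name
```

Any other branch name passes through unchanged; sanitizing names for git is a separate step.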
Rocco Rutte
5cc155e367 hg-reset.py: Print details for changed branches only
It doesn't make sense to suggest resetting branch HEADs to their current
value.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-19 09:27:37 +00:00
Rocco Rutte
7044bdd4d1 Add hg2git.py with library routines
Unfortunately, I can't do 'import hg-fast-export' from python itself, so
we need to move some common methods into 'hg2git.py' which is to be used
as a library for common hg->git routines.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-19 08:45:42 +00:00
Rocco Rutte
c84790da82 Use MIT license, adjust hg2git script names to match fast-export repo style
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-14 10:29:24 +00:00
Rocco Rutte
f9879136a9 hg2git.py: Only print verification message for branches we have
It's pointless for many branches to print the validation message for the
first revision already; the same counts for incremental runs.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-14 10:13:27 +00:00
Rocco Rutte
287365c160 hg2git.py: Add simple delta revision feed
Now we have three methods of feeding out changes
  1) full for first revision or
  2) thorough delta for merges (compare checksums with all parents) or
  3) simple delta else (only got with manifest)

This requires some cleanup so that we have only one place where we
actually call the appropriate dumping method.

The export_file_contents() method now also sorts its file list before
writing out anything as this seems to speed up hg data retrieval a bit.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-14 10:02:15 +00:00
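
The choice between the three feed methods in the commit above can be sketched as a single dispatch point. `choose_export_method` and its parameters are hypothetical; only the three cases come from the commit message.

```python
def choose_export_method(revision, parent_count):
    """Pick how a revision's changes are fed out."""
    if revision == 0:
        # First revision: export every file in the manifest.
        return "full"
    if parent_count > 1:
        # Merge: compare checksums against all parents.
        return "thorough delta"
    # Everything else: rely on the manifest alone.
    return "simple delta"
```

Keeping the decision in one function matches the stated goal of calling the dumping method from a single place.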
Rocco Rutte
af2237607c hg2git.py: Create only lightweight tags
The annotated tag with commit message summary was primarily only for
debugging.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-14 08:34:18 +00:00
Rocco Rutte
d988112549 hg2git.py: add -f/--force option to bypass validation checks
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-13 16:43:20 +00:00
Rocco Rutte
ad283a91ca hg2git.py: Bail out for certain errors
New is that we also check for multiple tips having the same branch name,
i.e. no unnamed heads.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-13 16:31:57 +00:00
Rocco Rutte
d9bb3271a4 Add a note about hg's unnamed branches and multiple heads
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-13 15:27:29 +00:00
Rocco Rutte
5732cd0313 hg2git.py: For the first revision, feed out full manifest
For the mutt and hg repos, it didn't make a difference, but attempting
to run the conversion on the opensolaris repo suggests that this is needed.

When we attempt to export some commit, special-case the revision number
0 and export all files the manifest has while labeling this a "full
revision export". Otherwise we do what we did before labeling this a
"delta revision export".

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-13 10:59:22 +00:00
Rocco Rutte
191928202b hg2git.py: Don't die for non-existent heads
Previously, when no head was present under .git/refs/heads, we simply
died as we couldn't open the file. Now, simply return None in case we
cannot read from it.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-12 11:13:48 +00:00
Rocco Rutte
cedbd0fb86 hg2git.py: Remove leading/trailing spaces from authormap
The current regex may leave us with keys/values having trailing/leading
spaces in all flavours which will break lookup. Solution: strip() key
and value.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-12 10:45:32 +00:00
Rocco Rutte
59a481a2b0 hg2git.py: Allow for spaces in authorfile
By allowing spaces in keys we allow for (re-)mapping complete lines
like "Joe User <joe@host>" to be mapped to something else.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-12 10:26:46 +00:00
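
The two author map commits above (spaces allowed in keys, surrounding whitespace stripped) can be sketched as follows. The "<hg author>=<git author>" line format and the `parse_author_map` name are assumptions made for illustration.

```python
import re

# Assumed line format: "<hg author>=<git author>"; spaces allowed in both parts.
_MAPPING_LINE = re.compile(r"^([^=]+)=(.+)$")


def parse_author_map(lines):
    """Return a dict of author mappings with keys and values stripped."""
    authors = {}
    for line in lines:
        match = _MAPPING_LINE.match(line.rstrip("\n"))
        if match is None:
            continue  # skip blank or malformed lines
        # strip() keeps stray spaces around '=' from breaking later lookups.
        authors[match.group(1).strip()] = match.group(2).strip()
    return authors


author_map = parse_author_map([
    "joe = Joe User <joe@example.com>\n",
    "Joe User <joe@host> = Joe User <joe@example.com>\n",
    "\n",
])
```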
Rocco Rutte
230a320c84 Basic support for an author map
As git-(cvs|svn)import support it, make future git-hgimport :) support it, too.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-12 08:54:30 +00:00
Rocco Rutte
20b4ca920b hg2git.py: Fix typo saving status to headsfile instead of statusfile
This broke incremental imports as hg2git.sh wrapper overwrites headsfile
with current values after the import is done.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-12 08:13:40 +00:00
Rocco Rutte
80f028a16c hg2git.py: Display our max revision as progress, not tip
Displaying tip doesn't make sense when we have some max given with -m/--max.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-12 08:12:08 +00:00
Rocco Rutte
469d4f3305 hg2git.py: Disable parsing Signed-off-by lines and add -s to enable
IMHO it's highly unusual to have these lines in hg projects but who
knows. As it's slow to parse these types of lines (with regex), it's
disabled by default and the 'author' command of git-fast-import isn't
used at all.

It can be enabled by giving the -s switch to hg2git.sh.

Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-12 08:00:18 +00:00
Rocco Rutte
045eea436c Basic support for command line options in hg2git.py
Signed-off-by: Rocco Rutte <pdmef@gmx.net>
2007-03-12 07:33:40 +00:00