Fast-Export/hg-reset.py

#!/usr/bin/env python3
# Copyright (c) 2007, 2008 Rocco Rutte <pdmef@gmx.net> and others.
# License: GPLv2
from mercurial import node
from hg2git import setup_repo,load_cache,get_changeset,get_git_sha1
from optparse import OptionParser
import sys
# Commit of 2020-02-10 21:39:13 -05:00: Support Python 3
#
# Port hg-fast-import to Python 2/3 polyglot code.
#
# Since mercurial accepts and returns bytestrings for all repository data,
# the approach taken here is to use bytestrings throughout the
# hg-fast-import code. All strings pertaining to repository data are
# bytestrings, so the code uses the same string datatype for this data on
# Python 3 as it did (and still does) on Python 2.
#
# Repository data coming from subprocess calls to git, or read from files,
# is also left as the bytestrings returned from subprocess.check_output or
# read from the file in 'rb' mode.
#
# Regexes and string literals that are used with repository data have all
# had a b'' prefix added.
#
# When repository data is used in error/warning messages, it is decoded
# with the UTF8 codec for printing.
#
# With this patch, hg-fast-export.py writes binary output to
# sys.stdout.buffer on Python 3. On Python 2 this doesn't exist and it
# still uses sys.stdout.
#
# The only strings that are left as "native" strings and not coerced to
# bytestrings are filepaths passed in on the command line, and dictionary
# keys for internal data structures used by hg-fast-import.py that do not
# originate in repository data.
#
# Mapping files are read in 'rb' mode, and thus bytestrings are read from
# them. When an encoding is given, their contents are decoded with that
# encoding, then immediately encoded again with UTF8, and returned as the
# resulting bytestrings.
#
# Other necessary changes were:
#
# - Indexing bytestrings with a single index returns an integer on
#   Python 3. These indexing operations have been replaced with a
#   one-element slice: x[0] -> x[0:1] or x[-1] -> x[-1:], so as to return
#   a bytestring.
# - raw_hash.encode('hex_codec') replaced with binascii.hexlify(raw_hash)
# - str(integer) -> b'%d' % integer
# - 'string_escape' codec replaced with 'unicode_escape' (which was
#   backported to python 2.7). Strings decoded with this codec were then
#   immediately re-encoded with UTF8.
# - Calls to map() intended to execute their contents immediately were
#   unwrapped or converted to list comprehensions, since map() is an
#   iterator and does not execute until iterated over.
#
# hg-fast-export.sh has been modified to not require Python 2. Instead, if
# PYTHON has not been defined, it checks python2, python, then python3, and
# uses the first one that exists and can import the mercurial module.
from binascii import hexlify

def heads(ui,repo,start=None,stop=None,max=None):
  # this is copied from mercurial/revlog.py and differs only in
  # accepting a max argument for xrange(startrev+1,...) defaulting
  # to the original repo.changelog.count()
  if start is None:
    start = node.nullid
  if stop is None:
    stop = []
  if max is None:
    max = repo.changelog.count()
  stoprevs = dict.fromkeys([repo.changelog.rev(n) for n in stop])
  startrev = repo.changelog.rev(start)
  reachable = {startrev: 1}
  heads = {startrev: 1}

  parentrevs = repo.changelog.parentrevs
  for r in range(startrev + 1, max):
    for p in parentrevs(r):
      if p in reachable:
        if r not in stoprevs:
          reachable[r] = 1
        heads[r] = 1
      if p in heads and p not in stoprevs:
        del heads[p]

  return [(repo.changelog.node(r), b"%d" % r) for r in heads]

def get_branches(ui,repo,heads_cache,marks_cache,mapping_cache,max):
  h=heads(ui,repo,max=max)
  stale=dict.fromkeys(heads_cache)
  changed=[]
  unchanged=[]
  for node,rev in h:
    _,_,user,(_,_),_,desc,branch,_=get_changeset(ui,repo,rev)
    del stale[branch]
    git_sha1=get_git_sha1(branch)
    cache_sha1=marks_cache.get(b"%d" % (int(rev)+1))
    if git_sha1 is not None and git_sha1==cache_sha1:
      unchanged.append([branch,cache_sha1,rev,desc.split(b'\n')[0],user])
    else:
      changed.append([branch,cache_sha1,rev,desc.split(b'\n')[0],user])
  changed.sort()
  unchanged.sort()
  return stale,changed,unchanged

def get_tags(ui,repo,marks_cache,mapping_cache,max):
  l=repo.tagslist()
  good,bad=[],[]
  for tag,node in l:
    if tag==b'tip': continue
    rev=int(mapping_cache[hexlify(node)])
    cache_sha1=marks_cache.get(b"%d" % (int(rev)+1))
    _,_,user,(_,_),_,desc,branch,_=get_changeset(ui,repo,rev)
    if int(rev)>int(max):
      bad.append([tag,branch,cache_sha1,rev,desc.split(b'\n')[0],user])
    else:
      good.append([tag,branch,cache_sha1,rev,desc.split(b'\n')[0],user])
  good.sort()
  bad.sort()
  return good,bad

def mangle_mark(mark):
  return b"%d" % (int(mark)-1)

if __name__=='__main__':
  def bail(parser,opt):
    sys.stderr.write('Error: No option %s given\n' % opt)
    parser.print_help()
    sys.exit(2)

  parser=OptionParser()

  parser.add_option("--marks",dest="marksfile",
      help="File to read git-fast-import's marks from")
  parser.add_option("--mapping",dest="mappingfile",
      help="File to read last run's hg-to-git SHA1 mapping")
  parser.add_option("--heads",dest="headsfile",
      help="File to read last run's git heads from")
  parser.add_option("--status",dest="statusfile",
      help="File to read status from")
  parser.add_option("-r","--repo",dest="repourl",
      help="URL of repo to import")
  parser.add_option("-R","--revision",type=int,dest="revision",
      help="Revision to reset to")

  (options,args)=parser.parse_args()

  if options.marksfile is None: bail(parser,'--marks option')
  if options.mappingfile is None: bail(parser,'--mapping option')
  if options.headsfile is None: bail(parser,'--heads option')
  if options.statusfile is None: bail(parser,'--status option')
  if options.repourl is None: bail(parser,'--repo option')
  if options.revision is None: bail(parser,'-R/--revision')

  heads_cache=load_cache(options.headsfile)
  marks_cache=load_cache(options.marksfile,mangle_mark)
  state_cache=load_cache(options.statusfile)
  mapping_cache = load_cache(options.mappingfile)

  l=int(state_cache.get(b'tip',options.revision))
  if options.revision+1>l:
    sys.stderr.write('Revision is beyond last revision imported: %d>%d\n' % (options.revision,l))
    sys.exit(1)

  ui,repo=setup_repo(options.repourl)

  stale,changed,unchanged=get_branches(ui,repo,heads_cache,marks_cache,mapping_cache,options.revision+1)
  good,bad=get_tags(ui,repo,marks_cache,mapping_cache,options.revision+1)

  print("Possibly stale branches:")
  for b in stale:
    sys.stdout.write('\t%s\n' % b.decode('utf8'))

  # Tag entries are [tag,branch,cache_sha1,rev,desc,user]; get_tags stores
  # rev as an int, so it is formatted with %d rather than decoded.
  print("Possibly stale tags:")
  for b in bad:
    sys.stdout.write(
      '\t%s on %s (r%d)\n'
      % (b[0].decode('utf8'), b[1].decode('utf8'), b[3])
    )

  # Branch entries are [branch,cache_sha1,rev,desc,user]; here rev is the
  # bytestring produced by heads().
  print("Unchanged branches:")
  for b in unchanged:
    sys.stdout.write('\t%s (r%s)\n' % (b[0].decode('utf8'),b[2].decode('utf8')))

  print("Unchanged tags:")
  for b in good:
    sys.stdout.write(
      '\t%s on %s (r%d)\n'
      % (b[0].decode('utf8'), b[1].decode('utf8'), b[3])
    )

  print("Reset branches in '%s' to:" % options.headsfile)
  for b in changed:
    sys.stdout.write(
      '\t:%s %s\n\t\t(r%s: %s: %s)\n'
      % (
        b[0].decode('utf8'),
        b[1].decode('utf8'),
        b[2].decode('utf8'),
        b[4].decode('utf8'),
        b[3].decode('utf8'),
      )
    )

  print("Reset ':tip' in '%s' to '%d'" % (options.statusfile,options.revision))
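
The Python 3 porting notes near the top of this file list several bytestring idioms that the script relies on (one-element slices, `hexlify`, `b'%d'` formatting, decoding for display, and eager iteration instead of `map()`). The following is a minimal, standalone sketch of those idioms, assuming Python 3.5 or later. The sample values `raw_hash`, `rev` and `desc` are hypothetical and are not taken from any repository or from `hg2git`.

```python
from binascii import hexlify

# Hypothetical sample data standing in for repository values.
raw_hash = b'\x00\xff\x10'
rev = 41
desc = b'first line\nsecond line'

# Indexing bytes with a single index yields an int on Python 3;
# a one-element slice keeps the result a bytestring.
first_as_int = desc[0]
first_as_bytes = desc[0:1]
last_as_bytes = desc[-1:]

# binascii.hexlify replaces raw_hash.encode('hex_codec').
hex_hash = hexlify(raw_hash)

# b'%d' % n replaces str(n) when a bytestring key is needed.
mark_key = b'%d' % (rev + 1)

# Repository data stays as bytes and is decoded only for display.
summary = desc.split(b'\n')[0]
printable = summary.decode('utf8')

# map() is lazy on Python 3: nothing runs until it is iterated.
calls = []
lazy = map(calls.append, [1, 2, 3])
ran_before = list(calls)  # snapshot taken before iteration: still empty
for _ in lazy:            # iterating is what actually performs the calls
  pass
```

Keeping every repository value as `bytes` until the moment it is printed is what lets the same code paths run on both Python 2 and Python 3, at the cost of explicit `.decode('utf8')` calls in the reporting section.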