554 Commits

Author SHA1 Message Date
Felipe Contreras
c3cbf1e04d Add wr_data helper
No functional changes.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
2023-03-03 19:34:29 -06:00
Felipe Contreras
4c10270302 Fix data handling
The length should exactly match the length of the data. For example, if
the data is "hello", only 5 characters should be written to the stream.
Thus it should always be `len(data)`, not `len(data)+1` as it currently
is in some places.

Since the first commit of hg2git.py there has been a wtf comment;
presumably Rocco was confused by this common discrepancy.

We can shuffle the logic around by adding '\n' to the data, and removing
the +1 from the length.

Also, the data should be written without a newline (wr_no_nl).

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
2023-03-03 19:33:45 -06:00
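
The length rule from the "Fix data handling" commit above can be sketched as follows. This is a minimal illustration, not the code from hg2git.py: the explicit `stream` parameter and the in-memory buffer are assumptions made so the example runs on its own.

```python
import io

def wr_data(stream, data):
    """Write a 'data <count>' header followed by exactly len(data) raw bytes."""
    # The count is len(data), with no +1 for a trailing newline.
    stream.write(b'data %d\n' % len(data))
    # The payload itself is written without appending a newline.
    stream.write(data)

buf = io.BytesIO()  # stand-in for the real output stream (assumption)
wr_data(buf, b'hello')
print(buf.getvalue())  # b'data 5\nhello'
```

With "hello" as the payload, the header announces 5 bytes and exactly 5 bytes follow.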
Frej Drejhammar
723d8032ba Merge branch 'PR/294' 2022-11-25 16:31:18 +01:00
df
268299a358 Fix typo in README
Added dash to match the actual usage of the 'ignore-unnamed-heads' option
2022-11-19 18:15:04 +01:00
Frej Drejhammar
6700b164d0 Merge branch 'PR/293'
Closes #292
v221024
2022-10-23 14:47:04 +02:00
chrisjbillington
13c273f10c Resolve unicode escape sequences not being processed correctly
In `process_unicode_escape_sequences()`, any backslash escape sequences
in the original string are escaped upon the first
`.encode('unicode-escape')` and therefore round-trip the sequence of
`.encode('unicode-escape').decode('unicode-escape')`.

That is not what we want: these sequences should pass through the
`.encode` step unchanged, so that they are converted to the character
they represent upon `.decode()`.

This patch changes the `.encode()` step to pass through any ascii
characters unchanged, only escaping non-ascii characters. This ensures
any existing backslash escape sequences will be interpreted as the
character they represent upon `.decode()`.
2022-10-23 11:51:33 +11:00
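
The difference described in the commit above can be reproduced with a short sketch. The two function names are hypothetical, and using `.encode('ascii', 'backslashreplace')` as the ASCII-preserving step is an assumption about one way to get the described behaviour, not a quote from the patch.

```python
def escape_everything(s):
    # 'unicode-escape' also escapes backslashes already present in the text,
    # so an existing sequence such as \u00e9 survives the round trip unchanged.
    return s.decode('utf8').encode('unicode-escape').decode('unicode-escape').encode('utf8')

def escape_non_ascii_only(s):
    # ASCII characters (including backslashes) pass through untouched; only
    # non-ASCII characters are turned into escape sequences (assumed approach).
    return s.decode('utf8').encode('ascii', 'backslashreplace').decode('unicode-escape').encode('utf8')

# A bytestring holding a literal backslash escape plus a real UTF-8 'é'.
raw = b'caf\\u00e9 \xc3\xa9'

print(escape_everything(raw))      # the \u00e9 escape is left as literal text
print(escape_non_ascii_only(raw))  # the \u00e9 escape becomes UTF-8 'é'
```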
Frej Drejhammar
667404e836 Merge branch 'PR291' v220921 2022-09-21 18:31:16 +02:00
Nicolas Vanhoren
38e236962d Update README.md to change recommendation for crlf filtering 2022-09-21 01:37:39 +02:00
Frej Drejhammar
dbb8158527 Merge branch 'frej/submodule-doc-improvement' 2022-02-10 20:05:07 +01:00
Frej Drejhammar
bb0bcda7ba Merge branch 'frej/fix-re-future-warning' 2022-02-10 20:04:14 +01:00
Frej Drejhammar
838b654614 Remove inconsistencies from submodule documentation
The submodule documentation is not consistent with regard to the
example directory structure. Update the example to be consistent.

Closes #277.
2022-02-09 15:58:48 +01:00
Frej Drejhammar
f179afce65 Fix FutureWarning about nested sets in re
Since Python 3.7 the re module warns for syntax which could, in the
future, be misparsed as a nested set. Avoid this by escaping the
literal `[` we search for in the regexp.

Reported by Monte Davidoff @mndavidoff

Closes #269.
2022-02-09 15:37:29 +01:00
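
The warning addressed by the commit above can be observed with a small sketch. The pattern is a made-up example of a character set that starts with a literal `[`, not the regexp from fast-export, and `future_warnings_for` is a hypothetical helper.

```python
import re
import warnings

def future_warnings_for(pattern):
    """Compile a pattern and return the FutureWarnings raised while parsing it."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        re.purge()  # clear the compile cache so the pattern is parsed again
        re.compile(pattern)
    return [w for w in caught if issubclass(w.category, FutureWarning)]

unescaped = b'[[ ~^:]'   # set begins with a literal '[' (hypothetical pattern)
escaped = b'[\\[ ~^:]'   # same set, with the '[' escaped

print(len(future_warnings_for(unescaped)))
print(len(future_warnings_for(escaped)))
print(re.sub(escaped, b'_', b'a[b c'))
```

Escaping the bracket does not change which characters the set matches; it only removes the ambiguity the `re` module warns about.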
Frej Drejhammar
5b7ca5aaec Give proper error message when refusing to overwrite existing branch
If fast-export was asked to export a Mercurial branch to Git and a
branch of the same name already existed in the Git repo but it was not
created by fast export, fast-export would crash while trying to format
an error message claiming that the destination branch was modified
behind its back.

This patch extends fast-export to detect the situation above and give
a proper error message which hopefully is less confusing to the user.

Credit for discovering the original crash goes to Shun-ichi Goto
<gotoh@taiyo.co.jp>.

Closes: #269.
v210917
2021-08-27 16:04:40 +02:00
Frej Drejhammar
4227621eed Update contribution guidelines and make github display them
Try to make it clear that sloppy, throw-it-over-the-fence patches
won't be accepted without revision, and try to make sure a potential
contributor sees the warning while creating a pull request.
2021-07-29 15:28:01 +02:00
Frej Drejhammar
bdfc0c08c7 Merge branch 'frej/issue-258'
Closes 258
2021-02-26 16:44:31 +01:00
Frej Drejhammar
001749e69d Merge branch 'PR/260'
Closes 257
2021-02-26 16:40:12 +01:00
SirIntellegence
20c22a3110 Add plugin support for the 'extra' field
Permits plugins to import other information such as svn conversion revisions
2021-02-22 13:09:48 -07:00
Frej Drejhammar
f741bf39f2 bugfix: Avoid starting incremental conversions from scratch
Keys and values in the state cache are byte strings, therefore a
lookup of 'tip' will always fail. The failure makes the conversion
start over from the beginning, but as fast-export is deterministic the
results are the same, just very inefficient. The bug has existed since
the port to Python 3.

This patch switches the 'tip' lookup to use a byte string which should
make incremental conversions restart at the last converted commit. As
'x' == b'x' in Python 2, this should be a backwards compatible change.

Bug reported and fix suggested by Tomas Kolda.

Fixes #258.
2021-02-19 16:47:53 +01:00
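
The failing lookup described in the commit above comes down to `str` and `bytes` keys being distinct in Python 3. A minimal sketch, assuming a plain dict as a stand-in for the state cache and a hypothetical `resume_revision` helper:

```python
# Stand-in for the state cache: keys and values are byte strings.
state_cache = {b'tip': b'41'}

def resume_revision(cache):
    """Return the last converted revision, or 0 to start from scratch."""
    # Looking up the str 'tip' would always miss on Python 3; use bytes.
    return int(cache.get(b'tip', b'0'))

print(state_cache.get('tip'))        # None: str key never matches a bytes key
print(resume_revision(state_cache))  # 41
```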
Frej Drejhammar
427663c766 Merge branch 'PR/254' 2021-01-10 15:18:28 +01:00
Ray Luo
056756f193 Remove some ".py" wording
Avoid confusion about which file is the main entry point to fast-export,
in order to avoid the issue mentioned here

https://github.com/frej/fast-export/issues/158#issuecomment-754482516

Also fix a typo
2021-01-09 02:06:52 -08:00
Frej Drejhammar
588e03bb23 Merge branch 'PR/251' 2020-11-15 15:34:27 +01:00
Jason Winnebeck
89da4ad8af Document --ignore-unnamed-heads option 2020-11-14 21:24:54 -05:00
Frej Drejhammar
b0d5e56c8d Merge branch 'PR/247' v201029 2020-10-29 19:01:04 +01:00
Frej Drejhammar
787e8559b9 Fix typo in README 2020-10-29 19:00:30 +01:00
Henrik Tunedal
ab500a24a7 Add plugin for dropping commits from output 2020-10-29 12:04:27 +01:00
Frej Drejhammar
ead75895b0 Enable code analysis
Merge github generated workflow into master
2020-10-10 16:26:53 +02:00
Frej Drejhammar
bf5f14ddab Create codeql-analysis.yml 2020-10-10 13:15:54 +00:00
Frej Drejhammar
7057ce2c2b Allow plugins to modify the committer
Since they were introduced, plugins have been able to modify the author
of a commit, but not the committer. This patch adds the necessary
support for allowing them to also modify the committer.
2020-09-30 17:47:33 +02:00
Frej Drejhammar
2b6f735b8c Update section about submitting patches in README
Try to cover the most common reasons for requesting changes in PRs.
2020-09-09 14:08:00 +02:00
Frej Drejhammar
71acb42a09 Merge branch 'PR/236-v2' into master
Implement a plugin converting unnamed heads to branches
2020-07-31 17:08:04 +02:00
Ondrej Stanek
a7955bc49b Update head2branch plugin to accept hg commit hash
The revision number isn't a unique identifier of commits across
repository clones and forks, while the hg hash is guaranteed to be stable.
2020-07-31 10:50:57 +02:00
Ondrej Stanek
9c6dea9fd4 Pass original hg commit hash to plugins 2020-07-31 10:50:51 +02:00
Ethan Furman
21827a53f7 Add head2branch plugin
Support converting unnamed heads to named branches during mercurial
conversions.

Co-Authored-By: ostan89@gmail.com
2020-07-31 10:49:08 +02:00
Ethan Furman
5c1cbf82b0 Add revision to commit_data for commit plugins
Co-Authored-By: ostan89@gmail.com
2020-07-31 10:48:33 +02:00
Ondrej Stanek
50631c4b34 Add option --ignore-unnamed-heads
This option allows the user to ignore only unnamed heads (compared to --force
which ignores all non-fatal issues). The intended use is for a future plugin
converting unnamed heads to named branches.
2020-07-31 10:30:53 +02:00
Ethan Furman
2a9dd53d14 Show all unnamed heads at once
Co-Authored-By: ostan89@gmail.com
2020-07-31 10:27:07 +02:00
Frej Drejhammar
597093eaf1 Merge branch 'fix-233'
Closes #233
2020-07-10 16:52:17 +02:00
Frej Drejhammar
3910044a97 Avoid crash during rev-parse when the default encoding is ascii
In some locales the default encoding is ascii in which case
subprocess.check_output() will fail if it is given a non-ascii ref as
one of the arguments. By forcing the ref to be utf8 we will avoid a
crash while still behaving correctly when the default encoding is
utf8.

The credits for this fix go to Nikita Bazhinov for discovering the fix
and Chris J Billington for explaining it.

Co-Authored-By: Nikita Bazhinov <nbazhinov@syntellect.ru>
Co-Authored-By: Chris J Billington <chrisjbillington@gmail.com>
2020-07-10 16:41:38 +02:00
Frej Drejhammar
44c50d0fae Merge branch 'PR/226' 2020-05-07 20:10:24 +02:00
chrisjbillington
d29d30363b Fix backward incompatible change for hg < 5.1
The port to Python 3 in b961f146 changed `repo.branchmap().iteritems()`
to use `.items()` instead. However, the object returned by mercurial
isn't a dictionary and its `.items()` method was only introduced (as an
alias for `iteritems`) in hg 5.1. `iteritems()` still exists, so let's
keep using it for now to retain compatibility with hg < 5.1.
2020-05-06 11:59:49 -04:00
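
The compatibility concern in the commit above can be illustrated with a stand-in object. `LegacyBranchMap` and `branch_heads` are hypothetical names for this sketch; the class only mimics an object that offers `iteritems()` but no `items()`, and is not Mercurial's actual branchmap.

```python
class LegacyBranchMap:
    """Hypothetical stand-in for a branchmap that lacks an items() method."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def iteritems(self):
        return iter(self._entries.items())

def branch_heads(branchmap):
    # iteritems() is available on both old and new objects in this sketch,
    # so calling it keeps the code working where items() is missing.
    return {name: heads[-1] for name, heads in branchmap.iteritems()}

bm = LegacyBranchMap({b'default': [b'aaa', b'bbb']})
print(hasattr(bm, 'items'))
print(branch_heads(bm))
```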
Frej Drejhammar
f102d2a69f Merge branch 'PR/223'
Closes #223
2020-05-06 16:31:13 +02:00
Ondrej Stanek
cf0e5837b6 Allow converting a repository with git and hg subrepos
In the verification phase, fast-export wrongly expects both hg and git
subrepositories to have a corresponding line in the subrepo-map file.
In fact, only hg subrepos need a line in subrepo-map that references a
converted subrepo; git subrepositories do not.
2020-05-06 16:30:05 +02:00
Frej Drejhammar
61d22307af Merge branch 'PR/217'
Closes: #215
2020-03-26 20:17:20 +01:00
chrisjbillington
3b3f86b71e Allow utf8 in mappings
We were previously processing entries in mapping files (when
`--mappings-are-raw` is not given) with
`.decode('unicode_escape').encode('utf8')` to replace backslash escape
sequences in bytestrings with the utf-8 encoded characters they
represent. However, it turns out that `.decode('unicode_escape')`
assumes latin-1 encoding if it encounters non-ascii bytes:
https://bugs.python.org/issue21331. So this gave incorrect
results if non-ascii utf8 data was present in the mapping.

To fix this, we now add an extra layer of
`.decode('utf8').encode('unicode-escape')` in order to convert any
non-ascii characters into
their backslash escape sequences. Then the subsequent
`.decode('unicode_escape')` only encounters ascii characters and gives
correct results.
2020-03-25 12:33:42 -04:00
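
The latin-1 assumption mentioned in the commit above is easy to observe directly. A short sketch on a single UTF-8 encoded character; the variable names are illustrative only:

```python
raw = b'\xc3\xa9'  # UTF-8 encoding of 'é'

# Decoding the raw bytes directly treats each byte as a latin-1 character,
# so the two-byte sequence turns into two separate characters.
naive = raw.decode('unicode_escape').encode('utf8')

# Decoding as UTF-8 first and escaping the result means the final
# unicode_escape step only ever sees ASCII.
layered = raw.decode('utf8').encode('unicode-escape').decode('unicode_escape').encode('utf8')

print(naive)    # b'\xc3\x83\xc2\xa9' (mangled)
print(layered)  # b'\xc3\xa9' (unchanged)
```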
Frej Drejhammar
e51844cd65 Merge branch 'PR/214'
Closes: #213
2020-03-25 16:09:01 +01:00
Toni Sissala
90eeef2ff4 Fix TypeError when using -M command line argument
hg-fast-export.sanitize_name expects the branch name to be a bytes
object, but the command line parser produces str objects. Convert a
possible str object to bytes in hg2git.set_default_branch().
2020-03-25 11:19:25 +02:00
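
The kind of conversion described in the commit above might look like the following. `to_bytes` is a hypothetical helper, and the UTF-8 default is an assumption; the actual change inside `hg2git.set_default_branch()` may be written differently.

```python
import re

def to_bytes(name, encoding='utf8'):
    """Return name as bytes, encoding it first if it arrived as str."""
    if isinstance(name, bytes):
        return name
    return name.encode(encoding)

branch = 'my branch'  # what a command line parser would hand over (str)

try:
    # A bytes pattern cannot be applied to a str, which raises TypeError.
    re.sub(b' ', b'_', branch)
except TypeError as exc:
    print('TypeError:', exc)

print(re.sub(b' ', b'_', to_bytes(branch)))  # b'my_branch'
```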
Frej Drejhammar
7f4d9c3ad4 Merge branch 'PR/211' 2020-03-10 17:51:47 +01:00
Pi Delport
b37420f404 Fix link markup for hg-export-tool 2020-03-09 16:41:26 +02:00
Frej Drejhammar
f2aa47fdf7 Merge branch 'PR/210'
Closes #210.
2020-03-08 19:43:23 +01:00
chrisjbillington
6361b44c33 Fix bug in ignoring .git files/folders on Windows
Mercurial internally stores (most) filepaths using forward slashes, and
returns them as such from its Python API, even on Windows.

So the splitting up of filepaths with `os.path.sep` was incorrect,
resulting in `.git` files (those within a subdirectory, anyway)
not being ignored on Windows as intended. Splitting on `b'/'` regardless
of OS fixes this.
2020-03-08 19:40:50 +01:00
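
The effect of the separator choice in the commit above can be simulated on any OS by using `ntpath`, the Windows flavour of `os.path`. `has_git_component` is a hypothetical helper written for this sketch, not the function used in fast-export.

```python
import ntpath

def has_git_component(path, sep):
    """Report whether any component of a bytes path is exactly b'.git'."""
    return b'.git' in path.split(sep)

# Paths as described in the commit: forward slashes, even on Windows.
hg_path = b'vendor/.git/config'

windows_sep = ntpath.sep.encode('ascii')  # b'\\', what os.path.sep is on Windows

print(has_git_component(hg_path, windows_sep))  # False: path is never split
print(has_git_component(hg_path, b'/'))         # True
```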