On the first synchronization, the clone has the implicit branch "master". This cannot be
changed in JGit. When we fetch the refs from the repository that should be mirrored, the
master branch of the clone will be updated to the revision of the remote repository (if
it has a master branch). If now the master branch shall be filtered from mirroring (ie.
if it is rejected), we normally would delete the ref in this clone. But because it is
the current branch, it cannot be deleted. We detect this and later,
after we have pushed the result, delete the master branch by pushing
an empty ref to the central repository.
Introduces a maximum size for the simple workdir cache. On cache overflow workdirs are evicted using an LRU strategy.
Furthermore parallel requests for the same repository will now block until the workdir is released.
Currently the mirror command implementation for git fires post receive repository hook events, that return every changeset of the repository instead of those really added by the single mirror update.
This fixes this issue by first creating a working copy, running the fetch and the update in this clone, and then pushing back the result to the central repository. This triggers the internal mechanisms used in other commands like the modify command.
The downside of this is, that we first have to create the clone, so for big mirrors a cached working copy is strongly recommended.
When SCM-Manager is used behind a reverse proxy like
Nginx it may be the case, that lfs PUT requests are
buffered by the reverse proxy and will be sent to the
SCM-Manager after the whole file has been received. Due
to the expiration time of 5 minutes for the authentivation
token that had been requested by Git before the upload
has been started, this request from the proxy to
SCM-Manager fails if the upload from the client to the
reverse proxy took longer than these 5 minutes.
To solve this, we make this expiration time configurable,
so that whenever you have very large files or small
bandwidth the expiration timeout can be increased.
Fixes the transmission of messages from post commit hooks in Git repositories. We therefore use a new method patched in jGit for SCM-Manager. This simplifies the trigger logic a lot.
Only show relevant information for repository on repository information page. The initialization code example is only shown if the repository is still empty.
Validate filepath and filename to prevent path traversal in modification
command and provide validations for editor plugin.
Co-authored-by: René Pfeuffer <rene.pfeuffer@cloudogu.com>
Capture metrics about the lifetime of working copies used, for example, by the merge and modify commands. Working copies are internal repository clones that can place a large load on the server. Therefore, these metrics can be helpful in identifying sources of large server load.
Co-authored-by: Sebastian Sdorra <sebastian.sdorra@cloudogu.com>
With this pull request, diffs for Git are loaded in chunks. This means, that for diffs with a lot of files only a part of them are loaded. In the UI a button will be displayed to load more. In the REST API, the number of files can be specified. This only works for diffs, that are delivered as "parsed" diffs. Currently, this is only available for Git.
Co-authored-by: Sebastian Sdorra <sebastian.sdorra@cloudogu.com>
We will fire an RepositoryImportHookEvent instead of PostReceiveRepositoryHookEvent for repository imports with metadata. The event is only fired if all parts of the repository could be successfully imported. The extra event is required to avoid heavy recalculations which can be triggered by the PostReceiveRepositoryHookEvent for example the scm-statistic-plugin uses the PostReceiveRepositoryHookEvent to calculate its statistics.
Co-authored-by: Eduard Heimbuch <eduard.heimbuch@cloudogu.com>
Fire post receive repository hook event after pull from remote
and after unbundle (Git, HG and SVN)
Co-authored-by: René Pfeuffer <rene.pfeuffer@cloudogu.com>
The change tackles a sporadic error in integration tests, where during or after a change in a git repository a pack file could not be read correctly due to a "short compressed stream". The exception could look like this:
ERROR org.eclipse.jgit.internal.storage.file.ObjectDirectory - Exception caught while accessing pack file scm-home/repositories/8gSNr4ogc5X/data/objects/pack/pack-51a3500283ece83ab8efa7edfb9370e6f97311b5.pack, the pack file might be corrupt. Caught 1 consecutive errors while trying to read this pack.
java.io.EOFException: Short compressed stream at 197
at org.eclipse.jgit.internal.storage.file.PackFile.decompress(PackFile.java:381)
at org.eclipse.jgit.internal.storage.file.PackFile.load(PackFile.java:815)
at org.eclipse.jgit.internal.storage.file.PackFile.get(PackFile.java:284)
at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openPackedObject(ObjectDirectory.java:455)
at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openPackedFromSelfOrAlternate(ObjectDirectory.java:413)
at org.eclipse.jgit.internal.storage.file.ObjectDirectory.openObject(ObjectDirectory.java:404)
at org.eclipse.jgit.internal.storage.file.WindowCursor.open(WindowCursor.java:132)
at org.eclipse.jgit.treewalk.CanonicalTreeParser.reset(CanonicalTreeParser.java:191)
at org.eclipse.jgit.treewalk.TreeWalk.parserFor(TreeWalk.java:1344)
at org.eclipse.jgit.treewalk.TreeWalk.addTree(TreeWalk.java:732)
at sonia.scm.repository.spi.Differ.create(Differ.java:102)
at sonia.scm.repository.spi.Differ.diff(Differ.java:63)
We found out that this seems to be linked to the asynchronous execution of post commit hooks, that originally are triggered in the midst of the processing of a receive pack.
With this change the fireing of these triggers is delayed until the end of the internal git processing. The central class for this is the GitHookEventFacade, where hook events are either
- put into a ThreadLocal, so that they can be triggered later, or
- triggered in a thread that is joined with an internal JGit thread for internal pushes (eg. for modify command or merge command)
With this change, work dirs are created in the
directory of the repository and no longer in the
global scm work dir directory. This is relevant due
to two facts:
1. Repositories may contain confidential data and therefore
reside in special directories (that may be mounted on
special drives). It may be considered a breach when these
directories are cloned or otherwise copied to global
temporary drives.
2. Big repositories may overload global temp spaces. It may be
easier to create special drives with more space for such
big repositories.
* Add store exporter to collect the repository metadata
* Add EnvironmentInformationXmlGenerator
* Collect export data and put into compressed tar archive output stream
* Create full repository export endpoint.
* Add full repository export to ui
* Ignore irrelevant files from config store directory
* write metadata stores to file since a baos could teardown the server memory
* Migrate store name for git lfs files (#1504)
Changes the directory name for the git LFS blob store by
removing the repository id from the store name.
This is necessary for im- and exports of lfs blob stores,
because the original name had the repository id as a part
of it and therefore the old store would not be found when
the repository is imported with another id.
Existing blob files will be moved to the new store location
by an update step.
Co-authored-by: Eduard Heimbuch <eduard.heimbuch@cloudogu.com>
* Introduce util for migrations (#1505)
With this util it is more simple to rename
or delete stores.
* Rename files in export
Co-authored-by: René Pfeuffer <rene.pfeuffer@cloudogu.com>