1
1
mirror of https://github.com/go-gitea/gitea synced 2025-10-26 08:58:24 +00:00
Commit Graph

304 Commits

Author SHA1 Message Date
Peter Gardfjäll
e28cc79c92 Improve sync performance for pull-mirrors (#19125)
This addresses https://github.com/go-gitea/gitea/issues/18352

It aims to improve performance (and resource use) of the `SyncReleasesWithTags` operation for pull-mirrors.

For large repositories with many tags, `SyncReleasesWithTags` can be a costly operation (taking several minutes to complete). The reason is two-fold:
    
1. on sync, every upstream repo tag is compared (for changes) against existing local entries in the release table to ensure that they are up-to-date.
    
2. the procedure for getting _each tag_ involves a series of git operations    
    ```bash
     git show-ref --tags -- v8.2.4477
     git cat-file -t 29ab6ce9f36660cffaad3c8789e71162e5db5d2f
     git cat-file -p 29ab6ce9f36660cffaad3c8789e71162e5db5d2f
     git rev-list --count 29ab6ce9f36660cffaad3c8789e71162e5db5d2f
     ```    

     of which the `git rev-list --count` can be particularly heavy.
    
This PR optimizes performance for pull-mirrors. We utilize the fact that a pull-mirror is always identical to its upstream and rebuild the entire release table on every sync and use a batch `git for-each-ref .. refs/tags` call to retrieve all tags in one go.
    
For large mirror repos, with hundreds of annotated tags, this brings down the duration of the sync operation from several minutes to a few seconds. A few unscientific examples run on my local machine:

- https://github.com/spring-projects/spring-boot (223 tags)
  - before: `0m28,673s`
  - after: `0m2,244s`
- https://github.com/kubernetes/kubernetes (890 tags)
  - before: `8m00s`
  - after: `0m8,520s`
- https://github.com/vim/vim (13954 tags)
  - before: `14m20,383s`
  - after: `0m35,467s`

 

I added a `foreachref` package which contains a flexible way of specifying which reference fields are of interest (`git-for-each-ref(1)`) and to produce a parser for the expected output. These could be reused in other places where `for-each-ref` is used.  I'll add unit tests for those if the overall PR looks promising.
2022-03-31 14:30:40 +02:00
wxiaoguang
b877504b03 Refactor git.Command.Run*, introduce RunWithContextString and RunWithContextBytes (#19266)
This follows 
* https://github.com/go-gitea/gitea/issues/18553

Introduce `RunWithContextString` and `RunWithContextBytes` to help the refactoring. Add related unit tests. They keep the same behavior to save stderr into err.Error() as `RunInXxx` before.

Remove `RunInDirTimeoutPipeline` `RunInDirTimeoutFullPipeline` `RunInDirTimeout` `RunInDirTimeoutEnv`  `RunInDirPipeline`  `RunInDirFullPipeline`  `RunTimeout`, `RunInDirTimeoutEnvPipeline`, `RunInDirTimeoutEnvFullPipeline`, `RunInDirTimeoutEnvFullPipelineFunc`.

Then remaining `RunInDir` `RunInDirBytes` `RunInDirWithEnv` can be easily refactored in next PR with a simple search & replace:
* before: `stdout, err := RunInDir(path)`
* next: `stdout, _, err := RunWithContextString(&git.RunContext{Dir:path})`

Other changes:
1. When `timeout <= 0`, use default. Because `timeout==0` is meaningless and could cause bugs. And now many functions becomes more simple, eg: `GitGcRepos` 9 lines to 1 line. `Fsck` 6 lines to 1 line.
2. Only set defaultCommandExecutionTimeout when the option `setting.Git.Timeout.Default > 0`
2022-03-31 13:56:22 +02:00
wxiaoguang
c83168104b Use a more general (and faster) method to sanitize URLs with credentials (#19239)
Use a more general method to sanitize URLs with credentials: Simple and intuitive / Faster /  Remove all credentials in all URLs
2022-03-31 10:25:40 +08:00
6543
3e88af898a Make git.OpenRepository accept Context (#19260)
* OpenRepositoryCtx -> OpenRepository
* OpenRepository -> openRepositoryWithDefaultContext, only for internal usage
2022-03-30 03:13:41 +08:00
zeripath
889a8c268c Use full output of git show-ref --tags to get tags for PushUpdateAddTag (#19235)
Strangely #19038 appears to relate to an issue whereby a tag appears to
be listed in `git show-ref --tags` but then does not appear when `git
show-ref --tags -- short_name` is called.

As a solution though I propose to stop the second call as it is
unnecessary and only likely to cause problems.

I've also noticed that the tags calls are wildly inefficient and aren't using the common cat-files - so these have been added.

I've also noticed that the git commit-graph is not being written on mirroring - so I've also added writing this to the migration which should improve mirror rendering somewhat. 

Fix #19038

Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
2022-03-29 19:12:33 +02:00
Lunny Xiao
c29fbc6d23 Hide sensitive content on admin panel progress monitor (#19218)
Sanitize urls within git process descriptions.

Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Co-authored-by: Andrew Thornton <art27@cantab.net>
2022-03-27 12:54:09 +01:00
zeripath
41b60d94db Do not include global arguments in process manager (#19226)
The git command by default adds a number of global arguments. These are not
helpful to be displayed in the process manager and so should be skipped for
default process descriptions.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-03-27 10:09:56 +01:00
zeripath
2d21d2af9e Make migrations SKIP_TLS_VERIFY apply to git too (#19132)
Make SKIP_TLS_VERIFY apply to git data migrations too through adding the `-c http.sslVerify=false` option to the git clone command.

Fix #18998

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-03-19 14:16:38 +00:00
techknowlogick
0b15a729cf rm .sample hooks which aren't used (#19101) 2022-03-16 10:33:07 +00:00
zeripath
8ddb5490e8 Don't show context cancelled errors in attribute reader (#19006)
Fix #18997

Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2022-03-08 08:30:14 +00:00
6543
efd10f1ab4 git backend ignore replace objects (#18979)
* git backend ignore replace objects

* comment
2022-03-02 20:13:19 +00:00
zeripath
4482f62a26 Prevent dangling GetAttribute calls (#18754)
It appears possible that there could be a hang due to unread data from the
repo-attribute command pipes. This PR simply closes these during the defer.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-02-14 18:03:56 +01:00
Lunny Xiao
1b1658d887 Fix isempty detection of git repository (#18746)
* Fix isempty detection of git repository

* Fix IsEmpty check
2022-02-14 00:01:23 +08:00
Martin Scholz
26718a785a Change git.cmd to RunWithContext (#18693)
Change all `cmd...Pipeline` commands to `cmd.RunWithContext`.

#18553

Co-authored-by: Martin Scholz <martin.scholz@versasec.com>
2022-02-11 13:47:22 +01:00
zeripath
eb748f5f3c Add apply-patch, basic revert and cherry-pick functionality (#17902)
This code adds a simple endpoint to apply patches to repositories and
branches on gitea. This is then used along with the conflicting checking
code in #18004 to provide a basic implementation of cherry-pick revert.

Now because the buttons necessary for cherry-pick and revert have 
required us to create a dropdown next to the Browse Source button
I've also implemented Create Branch and Create Tag operations.

Fix #3880 
Fix #17986 

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-02-09 20:28:55 +00:00
6543
3043eb36bf Delete old git.NewCommand() and use it as git.NewCommandContext() (#18552) 2022-02-06 20:01:47 +01:00
zeripath
9419dd2b62 Stop logging an error when notes are not found (#18626)
This is an unnecessary logging event.

Fix #18616

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-02-06 15:11:35 +08:00
Gusted
f87d5ea9ee Remove go 1.15 support (#18511)
- Remove support for go 1.15(go.mod already requires go 1.16).

Co-authored-by: John Olheiser <john.olheiser@gmail.com>
2022-02-01 10:46:45 +08:00
zeripath
f7b152f126 Ensure git tag tests and others create test repos in tmpdir (#18447)
* Ensure git tag tests and other create test repos in tmpdir

There are a few places where tests appear to reuse testing repos which
causes random CI failures.

This PR simply changes these tests to ensure that cloning always happens
into new temporary directories.

Fix #18444

* Change log root for integration tests to use the REPO_TEST_DIR

There is a potential race in the drone integration tests whereby test-mysql etc
will start writing to log files causing make test-check fail.

Fix #18077

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-01-29 12:41:44 +00:00
Lunny Xiao
401e5c8174 Fix broken when no commits and default branch is not master (#18422)
* Fix broken when no commits and default branch is not master

* Fix IsEmpty check

* Improve codes

* Add timeout
2022-01-28 10:51:16 +08:00
6543
80adbebbc8 Unexport git.GlobalCommandArgs (#18376)
Unexport the git.GlobalCommandArgs variable.
2022-01-25 18:15:58 +00:00
Gusted
c2e13fb763 Fix partial cloning a repo (#18373)
- Pass the Global command args into serviceRPC.
- Fixes error with partial cloning.
- Add partial clone test
- Include diff

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
2022-01-23 22:19:32 +01:00
Lunny Xiao
35fdefc1ff Always use git command but not os.Command (#18363) 2022-01-23 00:57:52 -05:00
6543
54e9ee37a7 format with gofumpt (#18184)
* gofumpt -w -l .

* gofumpt -w -l -extra .

* Add linter

* manual fix

* change make fmt
2022-01-20 18:46:10 +01:00
zeripath
5cb0c9aa0d Propagate context and ensure git commands run in request context (#17868)
This PR continues the work in #17125 by progressively ensuring that git
commands run within the request context.

This now means that the if there is a git repo already open in the context it will be used instead of reopening it.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-01-19 23:26:57 +00:00
6543
ff00b8688b Fix NPE on try to get tag reference via API (#18245)
* fix npe

* rm gitRepo from Tag
2022-01-12 20:37:46 +00:00
luzpaz
af92473920 Fix source typos (#18227)
Follow-up to #18219
2022-01-10 23:46:26 +08:00
luzpaz
8c647bf0f6 Fix various typos (#18219)
Found via `codespell -q 3 -S ./options/locale,./vendor -L ba,pullrequest,pullrequests,readby,te,unknwon`

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
2022-01-10 17:32:37 +08:00
Gusted
4b3bfd7e89 Enable partial clone by default (#18195)
- Enable partial clones(which are by default disabled from git) by
default, unless configured otherwise.
- Resolves #18190
2022-01-06 06:38:38 +01:00
zeripath
ffc08c1914 Do not read or write git reference files directly (#18079)
Git will and can pack references into packfiles and therefore if you write/read the
files directly you will get false results. Instead you should use update-ref and
show-ref. To that end I have created three new functions in git/repo_commit.go that
will do this correctly.

Related #17191

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-12-23 21:44:00 +08:00
99rgosse
e0cf3d86c4 Migrated Repository will show modifications when possible (#17191)
* Read patches to get history
2021-12-23 16:32:29 +08:00
zeripath
7cc7f0ed75 TestRepository_GetTag intermittently panics due to an NPE (#18043)
There are repeated panics in tests due to TestRepository_GetTag failing
to run properly.  This happens when we attempt to reset the internal
repo for a tag which has failed to load. The problem is - the panic that
this is causing is preventing us from finding what the real error is.

This PR simply moves the failure out so we have a chance to see what
really is failing.

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2021-12-21 11:10:16 +08:00
Gusted
ff2fd08228 Simplify parameter types (#18006)
Remove repeated type declarations in function definitions.
2021-12-20 04:41:31 +00:00
zeripath
fb5f7791ef Prevent off-by-one error on comments on newly appended lines (#18029)
* Prevent off-by-one error on comments on newly appended lines

There was a bug in CutDiffAroundLine whereby if a file without a terminal new line
has a patch which appends lines to it and a comment is placed on one of those lines
the comment diff will be a line out of place.

This fixes CutDiffAroundLine to simply ignore the missing terminal newline - however,
we should really improve this rendering to add a marker to say that there was a
previously missing terminal newline.

Fix #17875

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-12-20 02:15:49 +00:00
zeripath
f1e85622da Improve TestPatch to use git read-tree -m and implement git-merge-one-file functionality (#18004)
The current TestPatch conflict code uses a plain git apply which does not properly
account for 3-way merging. However, we can improve things using `git read-tree -m` to
do a three-way merge then follow the algorithm used in merge-one-file. We can also use 
`--patience` and/or `--histogram` to generate a nicer diff for applying patches too.

Fix #13679
Fix #6417

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-12-19 05:19:25 +01:00
zeripath
8354670708 Prevent hang in git cat-file if repository is not a valid repository and other fixes (#17991)
This PR contains multiple fixes. The most important of which is:

* Prevent hang in git cat-file if the repository is not a valid repository 
    
    Unfortunately it appears that if git cat-file is run in an invalid
    repository it will hang until stdin is closed. This will result in
    deadlocked /pulls pages and dangling git cat-file calls if a broken
    repository is tried to be reviewed or pulls exists for a broken
    repository.

    Fix #14734
    Fix #9271
    Fix #16113

Otherwise there are a few small other fixes included which this PR was initially intending to fix:

* Fix panic on partial compares due to missing PullRequestWorkInProgressPrefixes
* Fix links on pulls pages  due to regression from #17551 - by making most /issues routes match /pulls too - Fix #17983
* Fix links on feeds pages due to another regression from #17551 but also fix issue with syncing tags - Fix #17943
* Add missing locale entries for oauth group claims
* Prevent NPEs if ColorFormat is called on nil users, repos or teams.
2021-12-16 19:01:14 +00:00
zeripath
9e6e1dc950 Improve checkBranchName (#17901)
The current implementation of checkBranchName is highly inefficient
involving opening the repository, the listing all of the branch names
checking them individually before then using using opened repo to get
the tags.

This PR avoids this by simply walking the references from show-ref
instead of opening the repository (in the nogogit case).

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-12-08 19:08:16 +00:00
mscherer
34b5436ae1 Refactor various strings (#17784)
Fixes #16478

Co-authored-by: Gusted <williamzijl7@hotmail.com>

Co-authored-by: Gusted <williamzijl7@hotmail.com>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
2021-12-02 15:28:08 +08:00
zeripath
01087e9eef Make Requests Processes and create process hierarchy. Associate OpenRepository with context. (#17125)
This PR registers requests with the process manager and manages hierarchy within the processes.

Git repos are then associated with a context, (usually the request's context) - with sub commands using this context as their base context.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-11-30 20:06:32 +00:00
Gusted
d1f5584039 Simplify code for wrting SHA to name-rev (#17696) 2021-11-18 04:50:22 -05:00
zeripath
3c4724d70e Add .gitattribute assisted language detection to blame, diff and render (#17590)
Use check attribute code to check the assigned language of a file and send that in to
chroma as a hint for the language of the file.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-11-17 20:37:00 +00:00
wxiaoguang
750a8465f5 A better go code formatter, and now make fmt can run in Windows (#17684)
* go build / format tools
* re-format imports
2021-11-17 20:34:35 +08:00
zeripath
bbffcc3aec Multiple Escaping Improvements (#17551)
There are multiple places where Gitea does not properly escape URLs that it is building and there are multiple places where it builds urls when there is already a simpler function available to use this.
    
This is an extensive PR attempting to fix these issues.

1. The first commit in this PR looks through all href, src and links in the Gitea codebase and has attempted to catch all the places where there is potentially incomplete escaping.
2. Whilst doing this we will prefer to use functions that create URLs over recreating them by hand.
3. All uses of strings should be directly escaped - even if they are not currently expected to contain escaping characters. The main benefit to doing this will be that we can consider relaxing the constraints on user names and reponames in future. 
4. The next commit looks at escaping in the wiki and re-considers the urls that are used there. Using the improved escaping here wiki files containing '/'. (This implementation will currently still place all of the wiki files the root directory of the repo but this would not be difficult to change.)
5. The title generation in feeds is now properly escaped.
6. EscapePound is no longer needed - urls should be PathEscaped / QueryEscaped as necessary but then re-escaped with Escape when creating html with locales Signed-off-by: Andrew Thornton <art27@cantab.net>

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-11-16 18:18:25 +00:00
KN4CK3R
f99d50fc9f Read expected buffer size (#17409)
* Read expected buffer size.

* Changed name.
2021-10-24 22:12:43 +01:00
Lunny Xiao
4a57c9ea17 Fix some lints (#17337)
Fix some linting problems.
2021-10-17 20:47:12 +01:00
zeripath
58cd55d353 Check for context exceeded in WalkGitLog (#17319)
There is a slight race in checking of a context deadline exceed in #16467
which leads to a 500 on the repository page.

The solution is to check the error coming back from `*LogNameStatusRepoParser.Next()`
and if it is the `ContextDeadlineExceeded` break from the loop.

Fix #17314

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-10-15 19:41:34 +01:00
zeripath
a889d0cc8c Add buttons to allow loading of incomplete diffs (#16829)
This PR adds two buttons to the stats and the end of the diffs list to load the (some of) the remaining incomplete diff sections.

Contains #16775
    
Signed-off-by: Andrew Thornton <art27@cantab.net>


## Screenshots

### Show more button at the end of the diff
![Screenshot from 2021-09-04 11-12-37](https://user-images.githubusercontent.com/1824502/132091009-b1f6113e-2c04-4be5-8a04-b8ecea56887b.png)

### Show more button at the end of the diff stats box
![Screenshot from 2021-09-04 11-14-54](https://user-images.githubusercontent.com/1824502/132091063-86da5a6d-6628-4b82-bea9-3655cd9f40f6.png)
2021-10-15 17:05:33 +01:00
zeripath
01b9d35f1a Disable core.protectNTFS (#17300)
core.protectNTFS protects NTFS from files which may be difficult to remove or interact
with using the win32 api, however, it also appears to prevent such files from
being entered into the git indexes - fundamentally causing breakages with PRs that
affect these files. However, deliberately setting this to false may cause security
issues due to the remain sparse checkout of files in the merge pipeline.

The only sensible option therefore is to provide an optional setting which admins
could set which would forcibly switch this off if they are affected by this issue.

Fix #17092

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-10-13 14:20:11 -04:00
a1012112796
bb39359668 Add a simple way to rename branch like gh (#15870)
- Update default branch if needed
- Update protected branch if needed
- Update all not merged pull request base branch name
- Rename git branch
- Record this rename work and auto redirect for old branch on ui

Signed-off-by: a1012112796 <1012112796@qq.com>
Co-authored-by: delvh <dev.lh@web.de>
2021-10-08 19:03:04 +02:00
zeripath
001dbf100d Defer Last Commit Info (#16467)
One of the biggest reasons for slow repository browsing is that we wait
until last commit information has been generated for all files in the
repository.

This PR proposes deferring this generation to a new POST endpoint that
does the look up outside of the main page request.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-10-08 15:08:22 +02:00