Commit Graph

18 Commits

Author SHA1 Message Date
Nanguan Lin dc04044716
Replace assert.Fail with assert.FailNow (#27578)
assert.Fail() will continue to execute the code while assert.FailNow()
not. I thought those uses of assert.Fail() should exit immediately.
PS: perhaps it's a good idea to use
[require](https://pkg.go.dev/github.com/stretchr/testify/require)
somewhere because the assert package's default behavior does not exit
when an error occurs, which makes it difficult to find the root error
reason.
2023-10-11 11:02:24 +00:00
Lunny Xiao b167f35113
Add context parameter to some database functions (#26055)
To avoid deadlock problem, almost database related functions should be
have ctx as the first parameter.
This PR do a refactor for some of these functions.
2023-07-22 22:14:27 +08:00
wxiaoguang 18f26cfbf7
Improve queue and logger context (#24924)
Before there was a "graceful function": RunWithShutdownFns, it's mainly
for some modules which doesn't support context.

The old queue system doesn't work well with context, so the old queues
need it.

After the queue refactoring, the new queue works with context well, so,
use Golang context as much as possible, the `RunWithShutdownFns` could
be removed (replaced by RunWithCancel for context cancel mechanism), the
related code could be simplified.

This PR also fixes some legacy queue-init problems, eg:

* typo : archiver: "unable to create codes indexer queue" => "unable to
create repo-archive queue"
* no nil check for failed queues, which causes unfriendly panic

After this PR, many goroutines could have better display name:

![image](https://github.com/go-gitea/gitea/assets/2114189/701b2a9b-8065-4137-aeaa-0bda2b34604a)

![image](https://github.com/go-gitea/gitea/assets/2114189/f1d5f50f-0534-40f0-b0be-f2c9daa5fe92)
2023-05-26 07:31:55 +00:00
wxiaoguang 6f9c278559
Rewrite queue (#24505)
# ⚠️ Breaking

Many deprecated queue config options are removed (actually, they should
have been removed in 1.18/1.19).

If you see the fatal message when starting Gitea: "Please update your
app.ini to remove deprecated config options", please follow the error
messages to remove these options from your app.ini.

Example:

```
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].ISSUE_INDEXER_QUEUE_TYPE`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].UPDATE_BUFFER_LEN`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [F] Please update your app.ini to remove deprecated config options
```

Many options in `[queue]` are are dropped, including:
`WRAP_IF_NECESSARY`, `MAX_ATTEMPTS`, `TIMEOUT`, `WORKERS`,
`BLOCK_TIMEOUT`, `BOOST_TIMEOUT`, `BOOST_WORKERS`, they can be removed
from app.ini.

# The problem

The old queue package has some legacy problems:

* complexity: I doubt few people could tell how it works.
* maintainability: Too many channels and mutex/cond are mixed together,
too many different structs/interfaces depends each other.
* stability: due to the complexity & maintainability, sometimes there
are strange bugs and difficult to debug, and some code doesn't have test
(indeed some code is difficult to test because a lot of things are mixed
together).
* general applicability: although it is called "queue", its behavior is
not a well-known queue.
* scalability: it doesn't seem easy to make it work with a cluster
without breaking its behaviors.

It came from some very old code to "avoid breaking", however, its
technical debt is too heavy now. It's a good time to introduce a better
"queue" package.

# The new queue package

It keeps using old config and concept as much as possible.

* It only contains two major kinds of concepts:
    * The "base queue": channel, levelqueue, redis
* They have the same abstraction, the same interface, and they are
tested by the same testing code.
* The "WokerPoolQueue", it uses the "base queue" to provide "worker
pool" function, calls the "handler" to process the data in the base
queue.
* The new code doesn't do "PushBack"
* Think about a queue with many workers, the "PushBack" can't guarantee
the order for re-queued unhandled items, so in new code it just does
"normal push"
* The new code doesn't do "pause/resume"
* The "pause/resume" was designed to handle some handler's failure: eg:
document indexer (elasticsearch) is down
* If a queue is paused for long time, either the producers blocks or the
new items are dropped.
* The new code doesn't do such "pause/resume" trick, it's not a common
queue's behavior and it doesn't help much.
* If there are unhandled items, the "push" function just blocks for a
few seconds and then re-queue them and retry.
* The new code doesn't do "worker booster"
* Gitea's queue's handlers are light functions, the cost is only the
go-routine, so it doesn't make sense to "boost" them.
* The new code only use "max worker number" to limit the concurrent
workers.
* The new "Push" never blocks forever
* Instead of creating more and more blocking goroutines, return an error
is more friendly to the server and to the end user.

There are more details in code comments: eg: the "Flush" problem, the
strange "code.index" hanging problem, the "immediate" queue problem.

Almost ready for review.

TODO:

* [x] add some necessary comments during review
* [x] add some more tests if necessary
* [x] update documents and config options
* [x] test max worker / active worker
* [x] re-run the CI tasks to see whether any test is flaky
* [x] improve the `handleOldLengthConfiguration` to provide more
friendly messages
* [x] fine tune default config values (eg: length?)

## Code coverage:

![image](https://user-images.githubusercontent.com/2114189/236620635-55576955-f95d-4810-b12f-879026a3afdf.png)
2023-05-08 19:49:59 +08:00
flynnnnnnnnnn e81ccc406b
Implement FSFE REUSE for golang files (#21840)
Change all license headers to comply with REUSE specification.

Fix #16132

Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
2022-11-27 18:20:29 +00:00
Lunny Xiao 86c85c19b6
Refactor AssertExistsAndLoadBean to use generics (#20797)
* Refactor AssertExistsAndLoadBean to use generics

* Fix tests

Co-authored-by: zeripath <art27@cantab.net>
2022-08-16 10:22:25 +08:00
Lunny Xiao 1a9821f57a
Move issues related files into models/issues (#19931)
* Move access and repo permission to models/perm/access

* fix test

* fix git test

* Move functions sequence

* Some improvements per @KN4CK3R and @delvh

* Move issues related code to models/issues

* Move some issues related sub package

* Merge

* Fix test

* Fix test

* Fix test

* Fix test

* Rename some files
2022-06-13 17:37:59 +08:00
6543 d8905cb623
Dont overwrite err with nil & rename PullCheckingFuncs to reflect there usage (#19572)
- dont overwrite err with nil unintentionaly
- rename CheckPRReadyToMerge to CheckPullBranchProtections
- rename prQueue to prPatchCheckerQueue

from #9307

Co-authored-by: delvh <dev.lh@web.de>
2022-05-02 01:54:44 +02:00
zeripath c88547ce71
Add Goroutine stack inspector to admin/monitor (#19207)
Continues on from #19202.

Following the addition of pprof labels we can now more easily understand the relationship between a goroutine and the requests that spawn them. 

This PR takes advantage of the labels and adds a few others, then provides a mechanism for the monitoring page to query the pprof goroutine profile.

The binary profile that results from this profile is immediately piped in to the google library for parsing this and then stack traces are formed for the goroutines.

If the goroutine is within a context or has been created from a goroutine within a process context it will acquire the process description labels for that process. 

The goroutines are mapped with there associate pids and any that do not have an associated pid are placed in a group at the bottom as unbound.

In this way we should be able to more easily examine goroutines that have been stuck.

A manager command `gitea manager processes` is also provided that can export the processes (with or without stacktraces) to the command line.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-03-31 19:01:43 +02:00
zeripath a82fd98d53
Pause queues (#15928)
* Start adding mechanism to return unhandled data

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Create pushback interface

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Add Pausable interface to WorkerPool and Manager

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Implement Pausable and PushBack for the bytefifos

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Implement Pausable and Pushback for ChannelQueues and ChannelUniqueQueues

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Wire in UI for pausing

Signed-off-by: Andrew Thornton <art27@cantab.net>

* add testcases and fix a few issues

Signed-off-by: Andrew Thornton <art27@cantab.net>

* fix build

Signed-off-by: Andrew Thornton <art27@cantab.net>

* prevent "race" in the test

Signed-off-by: Andrew Thornton <art27@cantab.net>

* fix jsoniter mismerge

Signed-off-by: Andrew Thornton <art27@cantab.net>

* fix conflicts

Signed-off-by: Andrew Thornton <art27@cantab.net>

* fix format

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Add warnings for no worker configurations and prevent data-loss with redis/levelqueue

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Use StopTimer

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2022-01-22 21:22:14 +00:00
wxiaoguang 81926d61db
Decouple unit test, remove intermediate `unittestbridge` package (#17662)
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
2021-11-16 16:53:21 +08:00
wxiaoguang df64fa4865
Decouple unit test code from business code (#17623) 2021-11-12 22:36:47 +08:00
Lunny Xiao a4bfef265d
Move db related basic functions to models/db (#17075)
* Move db related basic functions to models/db

* Fix lint

* Fix lint

* Fix test

* Fix lint

* Fix lint

* revert unnecessary change

* Fix test

* Fix wrong replace string

* Use *Context

* Correct committer spelling and fix wrong replaced words

Co-authored-by: zeripath <art27@cantab.net>
2021-09-19 19:49:59 +08:00
zeripath ba526ceffe
Multiple Queue improvements: LevelDB Wait on empty, shutdown empty shadow level queue, reduce goroutines etc (#15693)
* move shutdownfns, terminatefns and hammerfns out of separate goroutines

Coalesce the shutdownfns etc into a list of functions that get run at shutdown
rather then have them run at goroutines blocked on selects.

This may help reduce the background select/poll load in certain
configurations.

* The LevelDB queues can actually wait on empty instead of polling

Slight refactor to cause leveldb queues to wait on empty instead of polling.

* Shutdown the shadow level queue once it is empty

* Remove bytefifo additional goroutine for readToChan as it can just be run in run

* Remove additional removeWorkers goroutine for workers

* Simplify the AtShutdown and AtTerminate functions and add Channel Flusher

* Add shutdown flusher to CUQ

* move persistable channel shutdown stuff to Shutdown Fn

* Ensure that UPCQ has the correct config

* handle shutdown during the flushing

* reduce risk of race between zeroBoost and addWorkers

* prevent double shutdown

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-05-15 16:22:26 +02:00
6543 a19447aed1
migrate from com.* to alternatives (#14103)
* remove github.com/unknwon/com from models

* dont use "com.ToStr()"

* replace "com.ToStr" with "fmt.Sprint" where its easy to do

* more refactor

* fix test

* just "proxy" Copy func for now

* as per @lunny
2020-12-25 11:59:32 +02:00
zeripath 875c5e1305 Only check for conflicts/merging if the PR has not been merged in the interim (#10132)
* Only check for merging if the PR has not been merged in the interim

* fixup! Only check for merging if the PR has not been merged in the interim

* Try to fix test failure

* Use PR2 not PR1 in tests as PR1 merges automatically

* return already merged error

* enforce locking

* enforce locking - fix-test

* enforce locking - fix-testx2

* enforce locking - fix-testx3

* move pullrequest checking to after merge

This might improve the chance that the race does not affect us but does not prevent it.

* Remove minor race with getting merge commit id

* fixup

* move check pr after merge

* Remove unnecessary prepareTestEnv - onGiteaRun does this for us

* Add information about when merging occuring

* fix fmt

* More logging

* Attempt to fix mysql

* Try MySQL fix again

* try again

* Try again?!

* Try again?!

* Sigh

* remove the count - perhaps that will help

* next remove the update id

* next remove the update id - make it updated_unix instead

* On failure to merge ensure that the pr is rechecked for conflict errors

* On failure to merge ensure that the pr is rechecked for conflict errors

* Update models/pull.go

* Update models/pull.go

Co-Authored-By: guillep2k <18600385+guillep2k@users.noreply.github.com>

* Apply suggestions from code review

Co-Authored-By: guillep2k <18600385+guillep2k@users.noreply.github.com>

Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: guillep2k <18600385+guillep2k@users.noreply.github.com>
2020-02-10 01:09:31 +02:00
zeripath 2c903383b5
Add Unique Queue infrastructure and move TestPullRequests to this (#9856)
* Upgrade levelqueue to version 0.2.0

This adds functionality for Unique Queues

* Add UniqueQueue interface and functions to create them

* Add UniqueQueue implementations

* Move TestPullRequests over to use UniqueQueue

* Reduce code duplication

* Add bytefifos

* Ensure invalid types are logged

* Fix close race in PersistableChannelQueue Shutdown
2020-02-02 23:19:58 +00:00
Lunny Xiao 82e0383d21 Move some pull request functions from models to services (#9266)
* Move some pull request functions from models to services

* Fix test
2019-12-06 21:44:10 -05:00