mirror of https://github.com/go-gitea/gitea synced 2025-01-09 09:24:25 +00:00
gitea/tests/gitea-repositories-meta
Bruno Sofiato f64fbd9b74
Updated tokenizer to better matching when search for code snippets ()
This PR improves the accuracy of Gitea's code search. 

Currently, Gitea does not consider statements such as
`console.log("hello")` as hits when the user searches for `log`. The
culprit is how both ES (Elasticsearch) and Bleve tokenize the file
contents (in both cases, `console.log` is a single token).

For ES, we changed the tokenizer to
[simple_pattern_split](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-simplepatternsplit-tokenizer.html#:~:text=The%20simple_pattern_split%20tokenizer%20uses%20a,the%20tokenization%20is%20generally%20faster.),
configured so that tokens are words formed by digits and letters. For
Bleve, the index uses a
[letter](https://blevesearch.com/docs/Tokenizers/) tokenizer.
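
The effect of the tokenizer change can be illustrated with a small standalone sketch. This is not Gitea's, Elasticsearch's, or Bleve's code. The helpers `splitKeepingDots`, `splitOnNonAlnum`, and `hasToken` are hypothetical names that use only the Go standard library. `splitKeepingDots` is a rough stand-in for the old behavior, where `console.log` stays one token, and `splitOnNonAlnum` approximates the "digits and letters" tokens described above.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitKeepingDots is a hypothetical stand-in for the old behavior:
// a dot between letters does not end a token, so "console.log" stays whole.
func splitKeepingDots(s string) []string {
	return strings.FieldsFunc(s, func(r rune) bool {
		return r != '.' && !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// splitOnNonAlnum approximates the new behavior: every rune that is
// neither a letter nor a digit separates tokens.
func splitOnNonAlnum(s string) []string {
	return strings.FieldsFunc(s, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// hasToken reports whether term matches one of the tokens exactly,
// which is how a simple term query would decide on a hit.
func hasToken(tokens []string, term string) bool {
	for _, t := range tokens {
		if t == term {
			return true
		}
	}
	return false
}

func main() {
	line := `console.log("hello")`

	oldTokens := splitKeepingDots(line)
	newTokens := splitOnNonAlnum(line)

	fmt.Println(oldTokens, hasToken(oldTokens, "log")) // [console.log hello] false
	fmt.Println(newTokens, hasToken(newTokens, "log")) // [console log hello] true
}
```

With the coarser split, a search for `log` finds no matching token. Once the dot is treated as a separator, `log` becomes a token of its own and the line is a hit.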

Resolves 

---------

Signed-off-by: Bruno Sofiato <bruno.sofiato@gmail.com>
2024-11-06 20:51:20 +00:00
..
| Name | Last commit | Date |
| --- | --- | --- |
| limited_org | | |
| migration/lfs-test.git | | |
| org3 | Replace 'userxx' with 'orgxx' in all test files when the user type is org () | 2023-09-14 02:59:53 +00:00 |
| org26 | | |
| org41/repo61.git | Allow non-admin users to delete review requests () | 2024-02-24 12:38:43 +00:00 |
| org42/search-by-path.git | Updated tokenizer to better matching when search for code snippets () | 2024-11-06 20:51:20 +00:00 |
| privated_org | | |
| user2 | Support repo license () | 2024-10-01 15:25:08 -04:00 |
| user5/repo4.git | | |
| user12/repo10.git | | |
| user13/repo11.git | | |
| user27 | Substitute variables in path names of template repos too () | 2023-06-20 21:14:47 +00:00 |
| user30 | Add a simple test for external renderer () | 2022-12-12 20:45:21 +08:00 |
| user40/repo60.git | Allow non-admin users to delete review requests () | 2024-02-24 12:38:43 +00:00 |