Mirror of https://github.com/go-gitea/gitea.git (synced 2025-10-30 18:36:22 +01:00)

	Preserve BOM in web editor (#28935)
The `ToUTF8*` functions were stripping the BOM, but a BOM is valid in UTF-8, so stripping must be optional depending on the use case. This does:

- Add an options struct to all `ToUTF8*` functions that by default strips the BOM, to preserve existing behaviour
- Remove the `ToUTF8` function; it was dead code
- Rename `ToUTF8WithErr` to `ToUTF8`
- Preserve BOM in Monaco Editor
- Remove an unnecessary newline in the textarea value. Browsers seem to have ignored it, but it's better not to rely on this behaviour.

Fixes: https://github.com/go-gitea/gitea/issues/28743
Related: https://github.com/go-gitea/gitea/issues/6716, which seems to have once introduced a mechanism that strips and re-adds the BOM, but from what I can tell, this mechanism was removed at some point after that PR.
@@ -135,7 +135,7 @@ func (b *Indexer) addUpdate(ctx context.Context, batchWriter git.WriteCloserErro
 			Id(id).
 			Doc(map[string]any{
 				"repo_id":    repo.ID,
-				"content":    string(charset.ToUTF8DropErrors(fileContents)),
+				"content":    string(charset.ToUTF8DropErrors(fileContents, charset.ConvertOpts{})),
 				"commit_id":  sha,
 				"language":   analyze.GetCodeLanguage(update.Filename, fileContents),
 				"updated_at": timeutil.TimeStampNow(),