This function was never used and will be superseded by
`RopeSliceExt::is_grapheme_boundary` (which accepts a byte index rather
than a character index) once we transition to Ropey v2. In the meantime
any callers should convert to byte index and use the `RopeSliceExt`
extension rather than form new dependencies on this.
This fixes a regression from the switch to tree-house with one of the
custom predicates in indent queries: `#not-kind-eq?`. This predicate
should be allowed to be written multiple times in a pattern. For example
in the Go indents:
; Switches and selects aren't indented, only their case bodies are.
; Outdent all closing braces except those closing switches or selects.
(
(_ "}" @outdent) @outer
(#not-kind-eq? @outer "select_statement")
(#not-kind-eq? @outer "type_switch_statement")
(#not-kind-eq? @outer "expression_switch_statement")
)
So instead of an `Option<T>` of one we need a `Vec<T>` and we need to
check that all of these predicates are individually satisfied (basically
`iter().all(/* node kind is not expected kind for that capture */)`).
This avoids using any custom configuration in a user-defined
`languages.toml` config for the syntax test cases. The test cases should
only use the builtin `languages.toml` config.
Also the xtask crate reimplemented `default_lang_loader` and
`default_lang_config`. These functions are replaced with calls into
`helix_core`.
`InactiveQueryCursor::new` might reuse a query cursor from a
thread-local cache if one is available, rather than create a new cursor.
Currently tree-house does not reset cached cursors back to defaults
(i.e. byte range and match limit). For now we can patch around this here
but eventually this should be fixed in `tree-house` upstream. Then this
patch can be reverted.
In practice this caused textobjects like `]f` to get "stuck" trying to
move to the next function if it was out of the current view. This is
because the highlight query cursor sets the range of the cursor to the
current viewport. We can reset the byte range to defaults to fix the
textobject behavior.
The switch to tree-house accidentally dropped some shebang parsing code
from the loader's function to detect by shebang. This change restores
that. The new code is slightly different as it's using a `regex_cursor`
regex on the Rope rather than eagerly converting the text to a
`Cow<str>` and running a regular regex across it.
This type also exists on `Editor`. This change brings it to the
`Document` as well because the replacement for `Syntax` in the child
commits will eliminate `Syntax`'s copy of `syn_loader`. `Syntax` will
also be responsible for returning the highlighter and query iterators
(which will borrow the loader), so the loader must be separated from
that type.
In the long run, when we make a larger refactor to have
`Document::apply` be a function of the `Editor` instead of the
`Document`, we will be able to drop this field on `Document` - it is
currently only necessary for `Document::apply`. Once we make that
refactor, we will be able to eliminate the surrounding `Arc` in
`Arc<ArcSwap<syntax::Loader>>` and use the `ArcSwap` directly instead.
This resolves a TODO in the core diagnostic module to refactor this
type. It was originally an alias of `LanguageServerId` for simplicity.
Refactoring as an enum is a necessary step towards introducing
"internal" diagnostics - diagnostics emitted by core features such as
a spell checker. Fully supporting this use-case will require further
larger changes to the diagnostic type, but the change to the provider
can be made first.
Note that `Copy` is not derived for `DiagnosticProvider` (as it was
previously because `LanguageServerId` is `Copy`). In the child commits
we will add the `identifier` used in LSP pull diagnostics which is a
string - not `Copy`.
The dependabot file was matching on tree-sitter crates - that's a relic
from v0.6.0 and lower where grammars were regular dependencies.
The remaining changes are unused crates that were forgotten about during
shuffles like moving path canonicalization from helix-core to
helix-loader (and then again from helix-loader to helix-stdx).
Path completion items always have documentation but future core (i.e.
non-LSP) completions may not always have documentation - for example
word completion from the current buffer.
These functions are the equivalent of 23b424a46 for grapheme clusters.
In order to add the `is_grapheme_boundary` function we also need to
query whether a byte index lies on a character boundary, so this change
also adds `is_char_boundary`.
The `Name` variant's inner type can be switched to `RopeSlice` since
the parent commit removed the usage of `&str`. In doing this we need to
switch from a regular `Regex` to a `rope::Regex`, which is mostly a
matter of renaming the type.
The `Filename` and `Shebang` variants can also switch to `RopeSlice`
which avoids allocations in cases where the text doesn't reside on
different chunks of the rope. Previously `Filename`'s `Cow` was always
the owned variant because of the conversion to a `PathBuf`.
This splits the `InjectionLanguageMarker::Name` into two: one that
preforms the previous behavior (using the language configurations'
`injection_regex` fields and performing a match) and a new variant that
looks up directly by `language_id` with equality.
The old variant is used when capturing the injection language like we
do in the markdown queries for codefences. That captured text is part of
the document being highlighted so we might need a regex to recognize a
language like JavaScript as either "js" or "javascript". But the text
passed in the `(#set! injection.language "name")` property can be
looked up directly. This property is in the query code so there's no
need to be flexible in what we accept: we can require that the
`(#set! injection.language ..)` properties refer to languages by their
configured ID. This should save a noticeable amount of work for the
common case of injections: `(#set! injection.language)` is used much
more often than `@injection.language`.