
typescript - How can I avoid ambiguity in TypeScript template literal type inference?

Reposted. Author: 行者123. Updated: 2023-12-04 08:13:11

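I'm trying to write a type that validates that a given input string consists of valid class names separated by one or more whitespace characters. The input may also have leading or trailing whitespace.
My current type is very close, but the TS compiler can infer the template literal in more than one way, which means the grammar is ambiguous. This leads to unwanted results.
First we define the primitive types: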

// To avoid recursion as much as possible
type Spaces = (
  | "     "
  | "    "
  | "   "
  | "  "
  | " "
);
type Whitespace = Spaces | "\n" | "\t";
type ValidClass = 'a-class' | 'b-class' | 'c-class';
Then the utility types:
// Utility type to provide nicer error messages
type Err<Message extends string> = `Error: ${Message}`;

type TrimEnd<T extends string> = (
  T extends `${infer Rest}${Whitespace}`
    ? TrimEnd<Rest>
    : T
);
type TrimStart<T extends string> = (
  T extends `${Whitespace}${infer Rest}`
    ? TrimStart<Rest>
    : T
);
type Trim<T extends string> = TrimEnd<TrimStart<T>>;
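To make the trimming recursion concrete, here is a minimal sketch that can be pasted into a playground. `Ws`, `TrimStartDemo`, `TrimEndDemo` and `TrimDemo` are illustrative stand-ins (with a simplified single-character whitespace union), not names from the question. The annotated constants only compile if the types resolve as described.

```typescript
// Simplified stand-in for `Whitespace` (assumption: single characters only).
type Ws = " " | "\n" | "\t";

// Each step strips exactly one whitespace character, then recurses.
type TrimStartDemo<T extends string> =
  T extends `${Ws}${infer Rest}` ? TrimStartDemo<Rest> : T;
type TrimEndDemo<T extends string> =
  T extends `${infer Rest}${Ws}` ? TrimEndDemo<Rest> : T;
type TrimDemo<T extends string> = TrimEndDemo<TrimStartDemo<T>>;

// Leading and trailing whitespace is removed...
const trimmedDemo: TrimDemo<"\n  a-class \t"> = "a-class";
// ...while whitespace between tokens is left untouched.
const innerKept: TrimDemo<" a-class  b-class "> = "a-class  b-class";
```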
And finally the actual type that checks the input string:
// Forces the string to be trimmed before starting recursive loop.
type SplitToValidClasses<T extends string> = SplitToValidClassesInner<Trim<T>>;

// Splits the input string into array of `Array<Token | 'Error: ...'>`
// strings. The input is converted to an array format mostly because I found it
// easier to work with arrays in other TS generics, instead of e.g space separated
// values.
type SplitToValidClassesInner<T extends string> =
  // Does `T` contain more than one string? For example 'aaaa\n\n bbbb'
  T extends `${infer Head}${Whitespace}${infer Tail}`
    // Yes, `T` could be inferred into three parts.
    // Is `Head` a valid class name?
    ? Trim<Head> extends ValidClass
      // Yes, it's a valid name. Continue recursively with the rest of the string,
      // but trim whitespace from both sides.
      ? [Trim<Head>, ...SplitToValidClassesInner<Trim<Tail>>]
      : [Err<`'${Head}' is not a valid class`>]
    : T extends `${infer Tail}`
      ? Tail extends ValidClass
        ? [Tail]
        : [Err<`'${Tail}' is not a valid class`>]
      : [never];

// This works
type CorrectResult = SplitToValidClasses<'a-class b-class c-class'>
But when testing with different inputs, we notice incorrect results:
// Should be ["a-class", "b-class", "c-class"]
type Input1 = `a-class b-class  c-class`;
type Result = SplitToValidClasses<Input1>;

// Should be ["a-class", "b-class", "c-class", "a-class"]
type Result2 = SplitToValidClasses<`

a-class b-class
c-class

a-class
`>;

// Should be ["a-class", "Error: 'wrong-class' is not a valid class"]
type Result3 = SplitToValidClasses<`
a-class
wrong-class
c-class
`>;
The problem occurs in the template inference:
type SplitToValidClassesInnerFirstLevelDebug<T extends string> =
  T extends `${infer Head}${Whitespace}${infer Tail}`
    ? [Head, Whitespace, Tail]
    : never;

// The grammar is ambiguous, leading to
// ["a-class b-class" | "a-class", Whitespace, "c-class" | "b-class  c-class"]
// Removing the ambiguity should fix the issue
type Result4 = SplitToValidClassesInnerFirstLevelDebug<Input1>;
Playground link
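One way to read the ambiguous result: a union in a placeholder position turns the pattern into a union of patterns, each pattern is matched on its own, and the inferred candidates are then merged. The sketch below isolates that with a hypothetical two-member union `Gap`; all names in it are illustrative and not part of the question.

```typescript
// Hypothetical, deliberately tiny whitespace union.
type Gap = " " | "  ";

// Each single-separator pattern is unambiguous on its own:
type ViaOneSpace<T extends string> =
  T extends `${infer Head} ${infer Tail}` ? [Head, Tail] : never;
type ViaTwoSpaces<T extends string> =
  T extends `${infer Head}  ${infer Tail}` ? [Head, Tail] : never;

// With the union, both patterns contribute candidates for Head and Tail.
type AmbiguousSplit<T extends string> =
  T extends `${infer Head}${Gap}${infer Tail}` ? [Head, Tail] : never;

// Input: "a", one space, "b", two spaces, "c".
const viaOne: ViaOneSpace<"a b  c"> = ["a", "b  c"];
const viaTwo: ViaTwoSpaces<"a b  c"> = ["a b", "c"];

// The merged result accepts a Head from one pattern and a Tail from the other,
// which is the kind of mismatched pair the question runs into.
const mixedPair: AmbiguousSplit<"a b  c"> = ["a b", "b  c"];
```

This suggests the ambiguity comes from the multi-character members of Spaces overlapping with each other, rather than from the recursion itself.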
Apart from what Anders Hejlsberg explains in his PR, I couldn't find much documentation on the details of how template literals are inferred:

For inference to succeed the starting and ending literal character spans (if any) of the target must exactly match the starting and ending spans of the source. Inference proceeds by matching each placeholder to a substring in the source from left to right: A placeholder followed by a literal character span is matched by inferring zero or more characters from the source until the first occurrence of that literal character span in the source. A placeholder immediately followed by another placeholder is matched by inferring a single character from the source.
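The two matching rules in that quote can be checked with a small sketch. `SplitOnFirstSpace` and `FirstChar` are illustrative names, not from the PR; the annotated constants only compile if inference behaves as the quote describes.

```typescript
// Placeholder followed by a literal span (" "):
// Head is inferred up to the FIRST occurrence of that span.
type SplitOnFirstSpace<T extends string> =
  T extends `${infer Head} ${infer Tail}` ? [Head, Tail] : never;

// Placeholder immediately followed by another placeholder:
// the first one is inferred as a single character.
type FirstChar<T extends string> =
  T extends `${infer Head}${infer Tail}` ? [Head, Tail] : never;

const onFirstSpace: SplitOnFirstSpace<"a b c"> = ["a", "b c"];
const firstChar: FirstChar<"abc"> = ["a", "bc"];
```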


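How can this type be implemented without producing ambiguous results? One approach I thought of is to parse the input recursively, character by character, but that quickly hits the recursion limit in TS.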

Best answer

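I came up with two solutions, but neither fully solves the original problem, because the types become either too complex or too deeply recursive. The second solution definitely scales better than the first.
Solution 1: recursive parsing
This solution parses the input string recursively. The Split type splits the input string on whitespace and returns an array of tokens (words).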

type EndOfInput = '';

// Validates given `UnprocessedInput` input string
// It recursively iterates through each character in the string,
// and appends characters into the second type parameter `Current` until the
// token has been consumed. When the token is fully consumed, it is added to
// `Result` and `Current` memory is cleared.
//
// NOTE: Do not pass anything else than the first type parameter. Other type
// parameters are for internal tracking during recursive loop
//
// See https://github.com/microsoft/TypeScript/pull/40336 for more template literal
// examples.
type Split<UnprocessedInput extends string, Current extends string = '', Result extends string[] = []> =
  // Have we reached the end of the input string?
  UnprocessedInput extends EndOfInput
    // Yes. Is `Current` empty?
    ? Current extends EndOfInput
      // Yes, we're at the end of processing and there is nothing more to add to the result
      ? Result
      // No, add the last item to the result and return it
      : [...Result, Current]
    // No, use template literal inference to get the first char and the rest of the string
    : UnprocessedInput extends `${infer Head}${infer Rest}`
      // Is the next character whitespace?
      ? Head extends Whitespace
        // Yes. Is `Current` empty?
        ? Current extends EndOfInput
          // Yes, continue "eating" whitespace
          ? Split<Rest, Current, Result>
          // No, it means we went from a token to whitespace, meaning the token
          // is fully parsed and can be added to the result
          : Split<Rest, '', [...Result, Current]>
        // No, add the character to `Current`
        : Split<Rest, `${Current}${Head}`, Result>
      // This shouldn't happen since UnprocessedInput is restricted by the
      // `extends string` constraint.
      // For example Split<null> would be a `never` type if it didn't
      // already fail with "Type 'null' does not satisfy the constraint 'string'"
      : [never];
This works for smaller inputs but not for larger strings, because of the TS recursion limit:
type Result5 = Split<`
a


b

c`>

// Fails for larger string values, because of the recursion limit
type Result6 = Split<`aaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbbbbbb`>
Playground link
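For readers who find the type-level loop hard to follow, the same state machine can be written as an ordinary runtime function. This is only a sketch for comparison: `splitTokens` and `WHITESPACE_CHARS` are illustrative names, and the whitespace set is assumed to be space, newline and tab.

```typescript
// Assumed whitespace characters (mirrors a single-character `Whitespace`).
const WHITESPACE_CHARS = " \n\t";

function splitTokens(input: string): string[] {
  const result: string[] = []; // plays the role of `Result`
  let current = "";            // plays the role of `Current`

  for (let i = 0; i < input.length; i++) {
    const head = input.charAt(i); // like `${infer Head}${infer Rest}`
    if (WHITESPACE_CHARS.indexOf(head) !== -1) {
      // Whitespace: either keep eating it, or close the pending token.
      if (current !== "") {
        result.push(current);
        current = "";
      }
    } else {
      // Non-whitespace: append the character to the pending token.
      current += head;
    }
  }

  // End of input: flush the last pending token, if any.
  if (current !== "") result.push(current);
  return result;
}

const tokensDemo = splitTokens("\na\n\n\nb\n\nc");
```

The type-level version needs one recursive instantiation per character, which is why it runs out of depth on long inputs, whereas the runtime loop has no such limit.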
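Solution 2: known class names as tokens
Since we actually have the valid class names as a string union, we can use that union as part of the template literal type to consume a whole class name at once.
To understand this solution, let's build it up piece by piece. First, let's use ValidClass in the template literal: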
type SplitDebug1<T extends string> =
  T extends `${ValidClass}${Whitespace}${infer Tail}`
    ? [ValidClass, Whitespace, Tail]
    : never

// The grammar is not ambiguous anymore!
// [ValidClass, Whitespace, "b-class c-class"]
type Result1 = SplitDebug1<"a-class b-class c-class">
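This solves the ambiguity problem, but now we can no longer access the parsed Head, because ValidClass just refers to the whole union type ValidClass = "a-class" | "b-class" | "c-class". Unfortunately, at the time of writing TypeScript has no syntax for inferring a placeholder and constraining it in the same position (constraints of the form infer X extends C were only added later, in TypeScript 4.7), so the following is not possible: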
type SplitDebug2<T extends string> =
  T extends `${infer Head extends ValidClass ? infer Head : never}${Whitespace}${infer Tail}`
    ? [Head, Whitespace, Tail]
    : never

// Still just [ValidClass, Whitespace, "b-class c-class"]
type Result2 = SplitDebug1<"a-class b-class c-class">
But here is the hack. We can use the already known Tail in a reverse match as a way to get at Head:
type SplitDebug3<T extends string> =
  T extends `${ValidClass}${Whitespace}${infer Tail}`
    ? T extends `${infer Head}${Whitespace}${Tail}`
      ? [Head, Whitespace, Tail]
      : never
    : never

// Now we know the first valid token, aka the class name!
// ["a-class", Whitespace, "b-class c-class"]
type Result3 = SplitDebug3<"a-class b-class c-class">
This trick can be used to parse the valid class names. The full solution:

// Demonstrating with large amount of class names
// Breaks to "too complex union type" with 20k class names
type Digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9';
type ValidClass1000 = `class-${Digit}${Digit}${Digit}`;

type SplitToValidClasses<T extends string> = SplitToValidClassesInner<Trim<T>>;
type SplitToValidClassesInner<T extends string> =
  T extends `${ValidClass1000}${Whitespace}${infer Tail}`
    ? T extends `${infer Head}${Whitespace}${Tail}`
      ? Trim<Head> extends ValidClass1000
        ? [Trim<Head>, ...SplitToValidClassesInner<Trim<Tail>>]
        : [Err<`'${Head}' is not a valid class`>]
      : never
    : T extends `${infer Tail}`
      ? Tail extends ValidClass1000
        ? [Tail]
        : [Err<`'${Tail}' is not a valid class`>]
      : [never];

// ["class-001", "class-002", "class-003", "class-004", "class-000"]
type Result4 = SplitToValidClasses<`

class-001 class-002
class-003
class-004 class-000

`>
Playground link
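This is the best solution I could come up with, and it also works for fairly large union types. The error messages could be polished, but they still point at the right place.
Although it supports a large number of members in the union, it does not work for our real-world use case, where a single union holds roughly 40k Tailwind class names. That type represents every class name that might be added during development (unused ones are purged in production).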
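As a usage sketch for a small union, the token type can gate a function parameter so that invalid strings are rejected at the call site. Everything here is an assumption for illustration: `Ws`, `KnownClass`, `Tokens`, `AllValid` and `classNames` are hypothetical names, the whitespace union is simplified to single characters, and the approach relies on TypeScript inferring T through a conditional parameter type.

```typescript
// Simplified stand-ins for the answer's types (illustrative names).
type Ws = " " | "\n" | "\t";
type KnownClass = "a-class" | "b-class" | "c-class";

type TrimStartD<T extends string> = T extends `${Ws}${infer R}` ? TrimStartD<R> : T;
type TrimEndD<T extends string> = T extends `${infer R}${Ws}` ? TrimEndD<R> : T;
type TrimD<T extends string> = TrimEndD<TrimStartD<T>>;

// Same "match the known class, then reverse-match with Tail" trick as above.
type Tokens<T extends string> =
  T extends `${KnownClass}${Ws}${infer Tail}`
    ? T extends `${infer Head}${Ws}${Tail}`
      ? [Head, ...Tokens<TrimD<Tail>>]
      : never
    : T extends KnownClass
      ? [T]
      : [`Error: '${T}' is not a valid class`];

// Resolves to T when every token is a known class, otherwise to never.
type AllValid<T extends string> =
  Tokens<TrimD<T>> extends KnownClass[] ? T : never;

// Returns the class list normalized to single spaces.
function classNames<T extends string>(input: AllValid<T>): string {
  return String(input).trim().split(/\s+/).join(" ");
}

const accepted = classNames(" a-class  b-class\n");

// @ts-expect-error 'wrong-class' is not in KnownClass, so the parameter type is never
const rejected = classNames("a-class wrong-class");
```

The check exists only at compile time: the second call still runs and simply returns the normalized string. The sketch inherits the same union-size limits described above.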

About "typescript - How can I avoid ambiguity in TypeScript template literal type inference?": a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65844206/
