
R: Sorting a vector of strings containing both characters and numbers, alphabetically and numerically

Reposted. Author: 行者123. Updated: 2023-12-02 04:39:51

I have a vector of strings that contain both characters and numeric values. For example:

a=c("ILLUMINA:420:C2D7UACXX:1:1102:14591:91480","ILLUMINA:420:C2D7UACXX:1:1102:14592:3881","ILLUMINA:420:C2D7UACXX:1:1102:14592:37103","ILLUMINA:420:C2D7UACXX:1:1102:14592:37356")

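I want to sort the vector so that characters are ordered alphabetically and numbers are ordered numerically. The strings always have the format "ILLUMINA:420:C2D7UACXX:1:<number>:<number>:<number>", so in practice the ordering only depends on the last three colon-separated numbers.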

I tried mixedsort {gtools}, but the result is the same as with sort and sort.int, namely:

> mixedsort(a)
[1] "ILLUMINA:420:C2D7UACXX:1:1102:14591:91480" "ILLUMINA:420:C2D7UACXX:1:1102:14592:37103"
[3] "ILLUMINA:420:C2D7UACXX:1:1102:14592:37356" "ILLUMINA:420:C2D7UACXX:1:1102:14592:3881"

Clearly the correct order should be:

[1] "ILLUMINA:420:C2D7UACXX:1:1102:14591:91480" "ILLUMINA:420:C2D7UACXX:1:1102:14592:3881" 
[3] "ILLUMINA:420:C2D7UACXX:1:1102:14592:37103" "ILLUMINA:420:C2D7UACXX:1:1102:14592:37356"

Is there a quick solution?

Best answer

EDIT: solution changed completely after the OP's clarification.

You can extract the last 3 numeric fields from each string and read them into a data.frame:

dat = read.table(text=sub('.*:1:([0-9]+):([0-9]+):([0-9]+)','\\1|\\2|\\3',a),sep='|')
dat
    V1    V2    V3
1 1102 14591 91480
2 1102 14592  3881
3 1102 14592 37103
4 1102 14592 37356
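
The conversion to numeric columns is what makes the ordering come out right: compared as strings, "3881" sorts after "37103" because the comparison proceeds character by character and "8" comes after "7". A small check, assuming the vector a from the question:

```r
# Example input from the question.
a <- c("ILLUMINA:420:C2D7UACXX:1:1102:14591:91480",
       "ILLUMINA:420:C2D7UACXX:1:1102:14592:3881",
       "ILLUMINA:420:C2D7UACXX:1:1102:14592:37103",
       "ILLUMINA:420:C2D7UACXX:1:1102:14592:37356")

# Pull the last three fields out with the same regex as in the answer.
dat <- read.table(text = sub('.*:1:([0-9]+):([0-9]+):([0-9]+)', '\\1|\\2|\\3', a),
                  sep = '|')

# read.table parses the digit-only fields as integer columns.
col_classes <- sapply(dat, class)

# String comparison versus numeric comparison of the same two values.
as_strings <- "3881" < "37103"   # FALSE: "8" sorts after "7"
as_numbers <- 3881 < 37103       # TRUE
```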

Then use the 3 columns to order the original vector:

 a[with(dat,order(V1,V2,V3))]
[1] "ILLUMINA:420:C2D7UACXX:1:1102:14591:91480" "ILLUMINA:420:C2D7UACXX:1:1102:14592:3881"
[3] "ILLUMINA:420:C2D7UACXX:1:1102:14592:37103" "ILLUMINA:420:C2D7UACXX:1:1102:14592:37356"
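
The same idea can also be sketched without a regular expression, by splitting each string on the colon and ordering on the trailing fields. The helper name sort_by_last_fields and its parameters n and sep are hypothetical and not part of the original answer; the sketch assumes every string has at least n separator-delimited fields and that the last n of them are numeric.

```r
# Hypothetical helper: sort strings by their last n fields, compared numerically.
sort_by_last_fields <- function(x, n = 3, sep = ":") {
  # Split each string into its fields (fixed = TRUE: sep is not a regex).
  parts <- strsplit(x, sep, fixed = TRUE)
  # One column per string, one row per key field; matrix() keeps the
  # shape consistent even when n is 1.
  keys <- matrix(vapply(parts, function(p) as.numeric(tail(p, n)), numeric(n)),
                 nrow = n)
  # Order by the first key row, breaking ties with the following rows.
  x[do.call(order, lapply(seq_len(n), function(i) keys[i, ]))]
}

# Example input from the question, deliberately reversed before sorting.
a <- c("ILLUMINA:420:C2D7UACXX:1:1102:14591:91480",
       "ILLUMINA:420:C2D7UACXX:1:1102:14592:3881",
       "ILLUMINA:420:C2D7UACXX:1:1102:14592:37103",
       "ILLUMINA:420:C2D7UACXX:1:1102:14592:37356")
sorted <- sort_by_last_fields(rev(a))
```

Because the fields before the last three are identical in every string, ordering on the trailing numbers alone is enough here; if the prefix could vary, it would have to be added as a further key.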

This question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/21107295/
