
How to specify more spaces for the delimiter using cut?




Is there any way to specify a field delimiter of multiple spaces with the cut command (something like " "+)?
For example: in the following string, I'd like to reach the value '3744'. What field delimiter should I use?




$ ps axu | grep jboss

jboss 2574 0.0 0.0 3744 1092 ? S Aug17 0:00 /bin/sh /usr/java/jboss/bin/run.sh -c example.com -b 0.0.0.0


cut -d' ' is not what I want, since it only handles a single space.
awk is not what I am looking for either, but how can I do this with cut?




Thanks.



More replies

The best answer is using tr, as shown here: stackoverflow.com/a/4483833/168143


Not directly relevant to the actual question being asked, but instead of ps+grep you could use pgrep, which is available in most modern distros. It will return the result in exactly the form you need.


Possible duplicate of How to make the 'cut' command treat multiple characters as one delimiter?


These days I just use hck as a drop-in cut replacement. By default it splits on all whitespace, like awk. And the key feature is that you can specify a delimiter with -d like cut, but unlike cut, that delimiter can be a regex! No more needing to pre-process with tr -s before passing to cut. You can find hck here: github.com/sstadick/hck


Does this answer your question? Does CUT support multiple spaces as the delimiter?


Recommended answers

Actually awk is exactly the tool you should be looking into:




ps axu | grep '[j]boss' | awk '{print $5}'


or you can ditch the grep altogether since awk knows about regular expressions:




ps axu | awk '/[j]boss/ {print $5}'


But if, for some bizarre reason, you really can't use awk, there are other simpler things you can do, like collapse all whitespace to a single space first:




ps axu | grep '[j]boss' | sed 's/\s\s*/ /g' | cut -d' ' -f5





That grep trick, by the way, is a neat way to only get the jboss processes and not the grep jboss one (ditto for the awk variant as well).




The grep process will have a literal grep [j]boss in its process command so will not be caught by the grep itself, which is looking for the character class [j] followed by boss.




This is a nifty way to avoid the | grep xyz | grep -v grep paradigm that some people use.

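A quick side-by-side sketch of why this works (the exact ps output will depend on what is running, of course):

ps axu | grep jboss
# matches the jboss process AND the grep itself, because the command line
# "grep jboss" contains the string "jboss"

ps axu | grep '[j]boss'
# matches only the real jboss process: the grep's own command line contains
# the literal text "[j]boss", which the pattern (the character class [j]
# followed by "boss") does not match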



The awk version is probably the best way to go, but you can also use cut if you first squeeze the repeated spaces with tr:




ps axu | grep jbos[s] | tr -s ' ' | cut -d' ' -f5
#        ^^^^^^^^^^^^   ^^^^^^^^^   ^^^^^^^^^^^^^
#        |              |           |
#        |              |           get 5th field
#        |              |
#        |              squeeze spaces
#        |
#        avoid grep itself appearing in the list


I like to use the tr -s command for this:




ps aux | tr -s '[:blank:]' | cut -d' ' -f3


This squeezes all whitespace down to a single space. That way, telling cut to use a space as the delimiter works as expected.

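As a quick sanity check, this is the effect on a hand-pasted approximation of the sample line from the question (the exact spacing does not matter, which is the point):

echo 'jboss     2574  0.0  0.0   3744  1092 ?  S  Aug17  0:00 /bin/sh /usr/java/jboss/bin/run.sh' | tr -s '[:blank:]'
# -> jboss 2574 0.0 0.0 3744 1092 ? S Aug17 0:00 /bin/sh /usr/java/jboss/bin/run.sh

echo 'jboss     2574  0.0  0.0   3744  1092 ?  S  Aug17  0:00 /bin/sh /usr/java/jboss/bin/run.sh' | tr -s '[:blank:]' | cut -d' ' -f5
# -> 3744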



I am going to nominate tr -s [:blank:] as the best answer.




Why would we want to use cut? It has the magic syntax that says "we want the third field and every field after it, omitting the first two fields":




cat log | tr -s '[:blank:]' | cut -d' ' -f 3-


I do not believe awk or Perl's split has an equally simple equivalent when we do not know how many fields there will be, i.e. output the 3rd field through field X.

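For the record, awk can be coaxed into printing "the 3rd field to the end"; it is just noticeably wordier than cut's -f 3- syntax. A rough sketch (awk splits on runs of whitespace by default, so no tr stage is needed):

awk '{ for (i = 3; i <= NF; i++) printf "%s%s", $i, (i < NF ? OFS : ORS) }' log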



Shorter/simpler solution: use cuts (a "cut on steroids" I wrote):



ps axu | grep '[j]boss' | cuts 4


Note that cuts field indexes are zero-based, so the 5th field is specified as 4.




http://arielf.github.io/cuts/




And even shorter (not using cut at all) is:




pgrep jboss


One way around this is to go:




$ ps axu | grep jboss | sed 's/\s\+/ /g' | cut -d' ' -f3


to replace multiple consecutive spaces with a single one.




Personally, I tend to use awk for jobs like this. For example:




ps axu| grep jboss | grep -v grep | awk '{print $5}'


As an alternative, there is always perl:




ps aux | perl -lane 'print $F[3]'


Or, if you want to get all fields starting at field #3 (as stated in one of the answers above):




ps aux | perl -lane 'print "@F[3 .. $#F]"'


If you want to pick columns from a ps output, any reason to not use -o?




e.g.




ps ax -o pid,vsz
ps ax -o pid,cmd


Minimum column width is allocated, with no padding and only a single-space field separator.




ps ax --no-headers -o pid:1,vsz:1,cmd

3443 24600 -bash
8419 0 [xfsalloc]
8420 0 [xfs_mru_cache]
8602 489316 /usr/sbin/apache2 -k start
12821 497240 /usr/sbin/apache2 -k start
12824 497132 /usr/sbin/apache2 -k start


pid and vsz are given a 10-character width, with a single-space field separator.




ps ax --no-headers -o pid:10,vsz:10,cmd

3443 24600 -bash
8419 0 [xfsalloc]
8420 0 [xfs_mru_cache]
8602 489316 /usr/sbin/apache2 -k start
12821 497240 /usr/sbin/apache2 -k start
12824 497132 /usr/sbin/apache2 -k start


Used in a script:




oldpid=12824
echo "PID: ${oldpid}"
echo "Command: $(ps -ho cmd ${oldpid})"


I've implemented a patch that adds a new -m command-line option to cut(1), which works in field mode and treats multiple consecutive delimiters as a single delimiter. This basically solves the OP's question in a rather efficient way, by treating several spaces as one delimiter right within cut(1).



In particular, with my patch applied, the following command will perform the desired operation. It's as simple as that: just add -m to the invocation of cut(1) and use -d ' ' -f 5 to extract the desired values from the process list produced by ps(1):



ps axu | grep jboss | cut -d ' ' -m -f 5

I also submitted this patch upstream, and let's hope that it will eventually be accepted and merged into the coreutils project.



There are some further thoughts about adding even more whitespace-related features to cut(1), and having some feedback on all that from different people would be great, preferably on the coreutils mailing list. I'm willing to implement more patches for cut(1) and submit them upstream, which would make this utility more versatile and more usable in various real-world scenarios.




Another way, if you must use the cut command:




ps axu | grep [j]boss | awk '$1=$1' | cut -d' ' -f5


In Solaris, replace awk with nawk or /usr/xpg4/bin/awk

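The '$1=$1' stage works because assigning to a field makes awk rebuild the whole record joined with its output field separator (a single space by default), and the assignment also serves as a truthy pattern, so the normalized line gets printed. A tiny demo on a made-up input line:

printf 'a   b\t\tc\n' | awk '$1=$1'
# -> a b c

printf 'a   b\t\tc\n' | awk '$1=$1' | cut -d' ' -f2
# -> b
# (caveat: a line whose first field is "0" or empty evaluates as false,
# so this pattern would drop it)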



I still like the way Perl handles fields with white space.

First field is $F[0].




$ ps axu | grep dbus | perl -lane 'print $F[4]'


My approach is to store the PID in a file in /tmp, and to find the right process using the -S option for ssh. That might be a misuse, but it works for me.




#!/bin/bash

TARGET_REDIS=${1:-redis.someserver.com}
PROXY="proxy.somewhere.com"

LOCAL_PORT=${2:-6379}

if [ "$1" == "stop" ] ; then
    kill `cat /tmp/sshTunel${LOCAL_PORT}-pid`
    exit
fi

set -x

ssh -f -i ~/.ssh/aws.pem centos@$PROXY -L $LOCAL_PORT:$TARGET_REDIS:6379 -N -S /tmp/sshTunel$LOCAL_PORT ## AWS DocService dev, DNS alias
# SSH_PID=$! ## Only works with &
SSH_PID=`ps aux | grep sshTunel${LOCAL_PORT} | grep -v grep | awk '{print $2}'`
echo $SSH_PID > /tmp/sshTunel${LOCAL_PORT}-pid


A better approach might be to query for the SSH_PID right before killing it, since the file might be stale and could point at the wrong process.

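A minimal sketch of that idea: at stop time, look the PID up with pgrep -f, matching the same sshTunel${LOCAL_PORT} socket name already used above, instead of trusting the file:

if [ "$1" == "stop" ] ; then
    # look the PID up right now rather than reading the possibly stale file
    SSH_PID=$(pgrep -f "sshTunel${LOCAL_PORT}")
    [ -n "$SSH_PID" ] && kill $SSH_PID
    exit
fi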


More replies

Great answer. I'll be coming back to look this up again next time I need it.


The grep trick seems to not work in crontab files. Any reason?


I keep learning and forgetting the grep trick. Thanks for my most recent reminder. Maybe this time it'll stick. But I wouldn't bet on it.


@Michael, you should set up a cron job somewhere to mail that tip (and possibly others) to you once a month :-)


Oliver, sometimes the best answer to "how do I do X with Y?" is "Don't use Y, use Z instead". Since OP accepted this answer, it's likely I convinced them of that :-)


Fancy illustration.


tr -s ' ' is mighty nice! I hope I can remember that better than awk


@Chris I have to object :D Awk is way better for these things!!


@fedorqui When it comes to printing the nth field to the end, the cut -f5- syntax ("-fN-") is much simpler than awk.


@Weekend agreed.


I think this should be the answer; it is closer to the OP's request (they asked to use cut). This approach is 5-10% slower than the awk approach (because there is one more pipe stage for tr), but in general this will be irrelevant.


Strange, this does not work on OS X. The sed command does not change multiple spaces to one space.


\s is a GNU sed extension. On OS X you can pass the -E flag to sed to enable extended regular expressions, then use [[:space:]] in place of \s, like so: sed -E 's/[[:space:]]+/ /g'


That can be compressed down to ps axu | awk '/[j]boss/ {print $5}'.


Isn't awk slower (especially when there are some superfluous other processes) than sed / grep / cut?


This does not work with the output of lsof. I tried lsof | perl -lane 'print $F[5]'; this sometimes gets the 5th column, sometimes the 6th.


I think the question just was how to use delimiters that might contain a varying number of spaces. For this purpose the answer was correct.


In lsof the problem is that the number of columns is not always consistent in each row.


You can use this answer: Get a certain column of an output with content aligned right and some columns not always filled


My previous answer to this question was deleted because it wasn't tailored specifically to this question. Thus, I answered this question again, providing a much more specific answer. I hope it's fine now.

