Script to append an occurrence count after each string: help needed, thanks


#1

Post by lixin1292006 » 2018-12-16 14:12

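I need help with the following text manipulation.
In the text below, take the parenthesized part of each line, e.g. ((DR,((SR,(SA,SG)),((SG,SR),SA))),HS), and append to every uppercase label the number of times that label has occurred so far on that line. That is:
((DR,((SR,(SA,SG)),((SG,SR),SA))),HS) -> ((DR1,((SR1,(SA1,SG1)),((SG2,SR2),SA2))),HS1)
The first time DR occurs it gets a 1 appended (DR1); the second time it occurs it gets a 2 (DR2).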

#NEXUS
begin trees;
TREE * UNTITLED = [&R] ((DR,((SR,(SA,SG)),((SG,SR),SA))),HS);
TREE * UNTITLED = [&R] (((((SG,(SR,SA)),(SR,SA)),DR),SG),HS);
TREE * UNTITLED = [&R] (((DR,(SA,(SG,SR))),(SA,(SR,SG))),HS);
TREE * UNTITLED = [&R] ((DR,(((SA,SR),SG),((SA,SR),SG))),HS);
end

Expected output of the script:

#NEXUS
begin trees;
TREE * UNTITLED = [&R] ((DR1,((SR1,(SA1,SG1)),((SG2,SR2),SA2))),HS1);
TREE * UNTITLED = [&R] (((((SG1,(SR1,SA1)),(SR2,SA2)),DR1),SG2),HS1);
TREE * UNTITLED = [&R] (((DR1,(SA1,(SG1,SR1))),(SA2,(SR2,SG2))),HS1);
TREE * UNTITLED = [&R] ((DR1,(((SA1,SR1),SG1),((SA2,SR2),SG2))),HS1);
end

Re: Script to append an occurrence count after each string: help needed, thanks

#2

Post by astolia » 2018-12-16 20:51

awk '{s=$0;l=length(s);f=0;p=0;for(i=1;i<=l;i++){c=substr(s,i,1);if(f==1&&c>="A"&&c<="Z"){if(p==0){p=i}}else{if(f==1){if(p>0){k=substr(s,p,i-p);printf "%d",++m[k];p=0}}else{if(c=="("){f=1}}}printf "%s",c}delete m;printf "\n"}'
$ cat a.txt
#NEXUS
begin trees;
TREE * UNTITLED = [&R] ((DR,((SR,(SA,SG)),((SG,SR),SA))),HS);
TREE * UNTITLED = [&R] (((((SG,(SR,SA)),(SR,SA)),DR),SG),HS);
TREE * UNTITLED = [&R] (((DR,(SA,(SG,SR))),(SA,(SR,SG))),HS);
TREE * UNTITLED = [&R] ((DR,(((SA,SR),SG),((SA,SR),SG))),HS);
end
$ awk '{s=$0;l=length(s);f=0;p=0;for(i=1;i<=l;i++){c=substr(s,i,1);if(f==1&&c>="A"&&c<="Z"){if(p==0){p=i}}else{if(f==1){if(p>0){k=substr(s,p,i-p);printf "%d",++m[k];p=0}}else{if(c=="("){f=1}}}printf "%s",c}delete m;printf "\n"}' a.txt
#NEXUS
begin trees;
TREE * UNTITLED = [&R] ((DR1,((SR1,(SA1,SG1)),((SG2,SR2),SA2))),HS1);
TREE * UNTITLED = [&R] (((((SG1,(SR1,SA1)),(SR2,SA2)),DR1),SG2),HS1);
TREE * UNTITLED = [&R] (((DR1,(SA1,(SG1,SR1))),(SA2,(SR2,SG2))),HS1);
TREE * UNTITLED = [&R] ((DR1,(((SA1,SR1),SG1),((SA2,SR2),SG2))),HS1);
end
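
The one-liner above scans each line one character at a time, so it is hard to read. Below is a rough sketch of the same idea in a more readable form, using awk's match() to pull out each run of uppercase letters. It is not the code from the post. The function name number_labels is made up for illustration, and it assumes that labels are runs of ASCII uppercase letters and that only the text from the first "(" onward should be changed.

```shell
#!/bin/sh
# number_labels: hypothetical helper name, not from the original thread.
# Reads the named files (or stdin) and, on each line, appends a per-line
# occurrence count to every run of uppercase letters found at or after
# the first "(". Lines without "(" are printed unchanged.
number_labels() {
    awk '
    {
        split("", count)                # reset the counters for this line
        paren = index($0, "(")          # position of the first "("
        if (paren == 0) { print; next } # nothing to rewrite on this line

        head = substr($0, 1, paren - 1) # left untouched, e.g. "TREE * UNTITLED = [&R] "
        rest = substr($0, paren)
        out = ""

        # Peel off one label at a time: copy the text before it, then the
        # label followed by how many times it has been seen on this line.
        while (match(rest, /[A-Z]+/)) {
            label = substr(rest, RSTART, RLENGTH)
            n = ++count[label]
            out = out substr(rest, 1, RSTART - 1) label n
            rest = substr(rest, RSTART + RLENGTH)
        }
        print head out rest
    }' "$@"
}
```

Usage would be something like number_labels a.txt, or piping the text in on stdin. Unlike the character loop, this version also numbers a label that ends the line with nothing after it; that never happens in the sample input, where every tree ends in ");".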