Regex – Shell script – list files, read files, and write data to a new file

Published: 2020-12-14 05:50:28 | Category: Encyclopedia | Source: compiled from the web


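I have a particular problem with shell scripting.
Simple scripts are no problem for me, but I'm new to this and want to build myself a simple database file.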

So, what I want to do is:

- Search for file types (e.g. .nfo) <-- should be no problem :)
- Read the contents of each file found and pick out some strings
- Write those strings from each file to a new file. The information from each file found should be one line in the new file

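I hope I have explained my "project" well.

My problem right now is understanding how to tell the script to search for the files, then read each of them and write some of the information inside to a new file.

Let me explain it a bit better.
Searching for the files gives me back: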

file1.nfo
file2.nfo
file3.nfo

OK, now from each file I need the information between two tags, i.e.
file1.nfo:

<user>test1</user>

file2.nfo:

<user>test2</user>

So the new file should now contain:

file1.nfo:test1
file2.nfo:test2

OK:

find -name *.nfo  > /test/database.txt

prints out the list of files.
And

sed -n '/<user*/,/</user>/p' file1.nfo

gives me back the complete file, not just the information between <user> and </user>.

I tried to work through it step by step, reading up as I went, but it looks very difficult.

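What am I doing wrong, and what would be the best way to list all the files and write each file name plus the content between two strings to a file?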

EDIT (new):

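OK, here is an update with more information.
I have learned a lot by now and searched the web for my problem. I can find plenty of information, but I don't know how to put it together so that I can use it.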

What works now with awk is that I get the file name and the string.

Here is the complete information (I thought I could carry on by myself with a little help, but I can't :( )

Here is an example, /test/file1.nfo:

<string1>STRING 1</string1>
<string2>STRING 2</string2>
<string3>STRING 3</string3>
<string4>STRING 4</string4>
<personal informations>
<hobby>Baseball</hobby>
<hobby>Basketball</hobby>
</personal informations>

And here is an example of /test/file2.nfo:

<string1>STRING 1</string1>
<string2>STRING 2</string2>
<string3>STRING 3</string3>
<string4>STRING 4</string4>
<personal informations>
<hobby>Soccer</hobby>
<hobby>Traveling</hobby>
</personal informations>

The file I want to create must look like this:

STRING 1:::/test/file1.nfo:::Date of file:::STRING 4:::STRING 3:::Baseball,Basketball:::STRING 2
STRING 1:::/test/file2.nfo:::Date of file:::STRING 4:::STRING 3:::Soccer,Traveling:::STRING 2

"Date of file" should be the creation date of the file, so I can see how old the file is.

So that is what I need, and it doesn't look easy.

Thanks a lot.

UPDATE: error with -printf

find: unrecognized: -printf

Usage: find [PATH]... [OPTIONS] [ACTIONS]

Search for files and perform actions on them.
First failed action stops processing of current file.
Defaults: PATH is current directory, action is '-print'

    -follow         Follow symlinks
    -xdev           Don't descend directories on other filesystems
    -maxdepth N     Descend at most N levels. -maxdepth 0 applies
                    actions to command line arguments only
    -mindepth N     Don't act on first N levels
    -depth          Act on directory *after* traversing it

Actions:
    ( ACTIONS )     Group actions for -o / -a
    ! ACT           Invert ACT's success/failure
    ACT1 [-a] ACT2  If ACT1 fails, stop, else do ACT2
    ACT1 -o ACT2    If ACT1 succeeds, stop, else do ACT2
                    Note: -a has higher priority than -o
    -name PATTERN   Match file name (w/o directory name) to PATTERN
    -iname PATTERN  Case insensitive -name
    -path PATTERN   Match path to PATTERN
    -ipath PATTERN  Case insensitive -path
    -regex PATTERN  Match path to regex PATTERN
    -type X         File type is X (one of: f,d,l,b,c,...)
    -perm MASK      At least one mask bit (+MASK), all bits (-MASK), or exactly MASK bits are set in file's mode
    -mtime DAYS     mtime is greater than (+N), less than (-N), or exactly N days in the past
    -mmin MINS      mtime is greater than (+N), less than (-N), or exactly N minutes in the past
    -newer FILE     mtime is more recent than FILE's
    -inum N         File has inode number N
    -user NAME/ID   File is owned by given user
    -group NAME/ID  File is owned by given group
    -size N[bck]    File size is N (c:bytes,k:kbytes,b:512 bytes(def.))
                    +/-N: file size is bigger/smaller than N
    -links N        Number of links is greater than (+N), less than (-N), or exactly N
    -prune          If current file is directory, don't descend into it
If none of the following actions is specified, -print is assumed
    -print          Print file name
    -print0         Print file name, NUL terminated
    -exec CMD ARG ; Run CMD with all instances of {} replaced by
                    file name. Fails if CMD exits with nonzero
    -delete         Delete current file/directory. Turns on -depth option

Solution

All you need is:

find -name '*.nfo' | xargs awk -F'[><]' '{print FILENAME,$3}'

If your files contain more than what you showed in your sample input, then this may be what you need:

... awk -F'[><]' '/<user>/{print FILENAME,$3}' file
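To see why `$3` is the field that holds the value, here is a small sketch of how awk splits a line when the field separator is the bracket expression `[><]`. The function name `show_fields` and the sample line are made up for this illustration; they are not part of the original answer.

```shell
#!/bin/sh
# show_fields is a hypothetical helper name, used only for illustration.
# It prints every field awk sees when either "<" or ">" acts as a separator.
show_fields() {
    printf '%s\n' "$1" |
    awk -F'[><]' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'
}

# The text before the first "<" is an empty $1, the tag name lands in $2,
# and the text between the opening and closing tag lands in $3.
show_fields '<user>test1</user>'
```

This is also why the `/<user>/` pattern in the second command matters: without it, every line with at least one tag would print whatever happens to be in its third field.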

Try this (untested):

> outfile
find -name '*.nfo' -printf "%p %Tc\n" |
# default IFS: first word goes to fname, the rest of the line to tstamp
while read -r fname tstamp
do
      awk -v tstamp="$tstamp" -F'[><]' -v OFS=":::" '
          # collect the values of each tag, comma-separated when repeated
          { a[$2] = a[$2] sep[$2] $3; sep[$2] = "," }
          END {
              print a["string1"],FILENAME,tstamp,a["string4"],a["string3"],a["hobby"],a["string2"]
          }
      ' "$fname" >> outfile
done

The above only works if your file names don't contain spaces. If they can, we'd need to adjust the loop.
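One possible adjustment for file names with spaces, as a sketch rather than a tested drop-in: print the timestamp first, followed by a tab and then the path, so that everything after the first tab is the file name. The function name `build_db` and its two arguments are hypothetical. The sketch assumes GNU find (for `-printf`) and file names without tabs or newlines, and it uses a plain `YYYY-MM-DD` date instead of `%Tc`.

```shell
#!/bin/sh
# build_db is a hypothetical name: build_db DIR OUTFILE
# Assumes GNU find (-printf) and file names without tabs or newlines.
build_db() {
    dir=$1
    out=$2
    tab=$(printf '\t')
    : > "$out"    # truncate (or create) the output file

    # Timestamp first, then a tab, then the path: the rest of the line
    # after the first tab is the file name, so embedded spaces survive.
    find "$dir" -name '*.nfo' -printf '%TY-%Tm-%Td\t%p\n' |
    while IFS=$tab read -r tstamp fname
    do
        awk -v tstamp="$tstamp" -F'[><]' -v OFS=':::' '
            # a[tag] collects the values of each tag, comma-separated
            { a[$2] = a[$2] sep[$2] $3; sep[$2] = "," }
            END {
                print a["string1"], FILENAME, tstamp, a["string4"], a["string3"], a["hobby"], a["string2"]
            }
        ' "$fname" >> "$out"
    done
}
```

It could be called as `build_db /test /test/database.txt`. Because only the tab is in `IFS`, spaces inside the path are left alone by `read`.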

If your find doesn't support -printf (suggestion: seriously consider getting a modern find!):

> outfile
find -name '*.nfo' -print |
while IFS= read -r fname
do
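      # %y is the last-modification time; stat -c is GNU/BusyBox syntax
      tstamp=$(stat -c"%y" "$fname")
      awk -v tstamp="$tstamp" -F'[><]' -v OFS=":::" '
          { a[$2] = a[$2] sep[$2] $3; sep[$2] = "," }
          END {
              print a["string1"],FILENAME,tstamp,a["string4"],a["string3"],a["hobby"],a["string2"]
          }
      ' "$fname" >> outfile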
done

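If you don't have stat either, search for other ways to get a timestamp from a file, or consider parsing the output of ls -l. That is unreliable, but if it's all you have...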
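One such alternative, offered as an assumption about your system rather than a guarantee: GNU coreutils and BusyBox both let `date -r FILE` print a file's last-modification time. BSD and macOS `date` give `-r` a different meaning, so this is not portable there. The helper name `file_mtime` is made up for this sketch. Note that this is the modification time; traditional Unix file metadata does not record a creation time, so the modification time is the usual stand-in for a file's age.

```shell
#!/bin/sh
# file_mtime is a hypothetical helper name.
# Assumption: date accepts -r FILE (true for GNU coreutils and BusyBox).
file_mtime() {
    date -r "$1" '+%Y-%m-%d %H:%M:%S'
}
```

Inside the loop it would replace the stat call, for example `tstamp=$(file_mtime "$fname")`.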

