Trying to modify awk code -
    awk 'BEGIN{OFS=","}
         FNR == 1 {if (NR > 1) {print fn,fnr,nl} fn=FILENAME; fnr = 1; nl = 0}
         {fnr = FNR}
         /error/ && FILENAME ~ /\.gz$/ {nl++}
         { cmd="gunzip -cd " FILENAME cmd; close(cmd) }
         END {print fn,fnr,nl}
        ' /tmp/appscraps/* > /tmp/test.txt
The above scans all files in the given directory. It prints the file name, the number of lines in each file, and the number of lines found containing 'error'.
I'm now trying to make the script execute a command if any of the files it reads in isn't a regular file, i.e., if the file is a gzip file, run a particular command.
Above is my attempt to include the gunzip command in there, on my own. Unfortunately, it isn't working. Also, I cannot "gunzip" all the files in the directory beforehand, because not all files in the directory will be of "gzip" type. Some will be regular files.
So I need the script to treat any .gz file it finds in a different way so that it can read it, count and print the number of lines in it, and the number of lines it found matching the pattern supplied (just as it would if the file had been a regular file).
Any help?
This part of your script makes no sense:
    {if (NR > 1) {print fn,fnr,nl} fn=FILENAME; fnr = 1; nl = 0} {fnr = FNR} /error/ && FILENAME ~ /\.gz$/ {nl++}
Let me restructure it a bit and comment it so it's clearer what it does:
    {   # For every line of every input file, do the following:

        # If this is the 2nd or subsequent line, print the values of these variables:
        if (NR > 1) {
            print fn,fnr,nl
        }

        fn = FILENAME   # Set fn to FILENAME. Since this occurs on the first line of
                        # every file, this is the value fn will have when printed above,
                        # so why not get rid of fn and print FILENAME?

        fnr = 1         # Set fnr to 1. This is over-written below,
                        # so setting fnr here is pointless.

        nl = 0
    }
    {   # For every line of every input file, do the following
        # (note the unnecessary "}" "{" above):

        fnr = FNR       # Set fnr to FNR. Since this occurs on the first line of
                        # every file, this is the value fnr will have when printed above,
                        # so why not get rid of fnr and print FNR-1?
    }
    /error/ && FILENAME ~ /\.gz$/ {
        nl++            # Increment the value of nl. Since nl is set to 0 above,
                        # this will only ever set it to 1, so why not set it to 1?
                        # I suspect the real intent is not to set it to 0 above.
    }
You have code above testing for a file name that ends in ".gz", but then you're running gunzip on every file in the next block.
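Leaving gzip aside for a moment, here is a rough sketch of the per-file bookkeeping that script seems to be aiming for, for plain files only. The scratch directory, the sample files `a.log` and `b.log`, and the variable names `n` and `summary` are made up for illustration. Note that an empty file would simply be skipped by this approach.

```shell
#!/bin/sh
# Hypothetical sample data in a scratch directory (names are illustrative only).
dir=$(mktemp -d)
printf 'ok\nerror one\nok\n' > "$dir/a.log"
printf 'error two\n' > "$dir/b.log"

summary=$(awk -v OFS=',' '
    # A new file has started: report the previous file and reset the match count.
    FNR == 1 && NR > 1 { print fn, n, nl + 0; nl = 0 }
    # Remember the current file name and how many of its lines we have seen.
    { fn = FILENAME; n = FNR }
    # Count lines containing the pattern.
    /error/ { nl++ }
    # Report the last file, provided any input was read at all.
    END { if (NR) print fn, n, nl + 0 }
' "$dir"/*)

printf '%s\n' "$summary"
rm -rf "$dir"
```

Each output line is file name, line count, and matching-line count, separated by commas.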
Beyond that, just call gunzip from the shell as everyone else has suggested. awk is a tool for parsing text; it's not an environment from which to call other tools. That's what a shell is for.
For example, assuming your comment (prints the file name, the number of lines in each file, and the number of lines found containing 'error') accurately describes what you want your awk script to do, and assuming it makes sense to test for the word "error" directly in a ".gz" file using awk:
    for file in /tmp/appscraps/*.gz
    do
        awk -v OFS=',' '/error/{nl++} END{print FILENAME, NR+0, nl+0}' "$file"
        gunzip -cd "$file"
    done > /tmp/test.txt
Much clearer and simpler, isn't it?
If it doesn't make sense to test for the word "error" directly in a ".gz" file, then you can do this instead:
    for file in /tmp/appscraps/*.gz
    do
        zcat "$file" | awk -v file="$file" -v OFS=',' '/error/{nl++} END{print file, NR+0, nl+0}'
        gunzip -cd "$file"
    done > /tmp/test.txt
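To see why decompressing first matters, here is a small sketch that runs the same awk counting over the raw compressed bytes and over the decompressed stream. The file `sample.gz` and the variables `raw` and `decoded` are hypothetical. I use `gzip -cd` here rather than `zcat` only because `zcat` on some systems expects a `.Z` suffix.

```shell
#!/bin/sh
# Hypothetical sample: three lines of text, two of them containing "error".
dir=$(mktemp -d)
printf 'error x\nok\nerror y\n' | gzip -c > "$dir/sample.gz"

# Reading the compressed bytes directly: awk sees binary data, so neither
# the line count nor the match count says anything about the original text.
raw=$(LC_ALL=C awk '/error/{nl++} END{print NR+0, nl+0}' "$dir/sample.gz")

# Decompressing to stdout first: awk sees the original three lines.
decoded=$(gzip -cd "$dir/sample.gz" | awk '/error/{nl++} END{print NR+0, nl+0}')

printf 'raw:     %s\n' "$raw"
printf 'decoded: %s\n' "$decoded"
rm -rf "$dir"
```

Only the `decoded` figures (lines, then matching lines) reflect the text that was compressed.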
To handle gz and non-gz files as you've described in your comment below:
    for file in /tmp/appscraps/*
    do
        case $file in
            *.gz ) cmd="zcat" ;;
            *    ) cmd="cat" ;;
        esac
        "$cmd" "$file" | awk -v file="$file" -v OFS=',' '/error/{nl++} END{print file, NR+0, nl+0}'
    done > /tmp/test.txt
I left out the gunzip since you don't need it as far as I can tell from your stated requirements. If I'm wrong, explain what you need it for.
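If you want to try the mixed gz and non-gz approach on throwaway data before pointing it at the real directory, something along these lines should do. The scratch directory, the sample files, and the helper function `count_errors` are all hypothetical, and `gzip -cd` stands in for `zcat` as an assumption about what is installed.

```shell
#!/bin/sh
# Hypothetical scratch data: one plain file and one gzip-compressed file.
dir=$(mktemp -d)
printf 'ok\nerror one\nok\n' > "$dir/a.log"
printf 'error x\nerror y\n' | gzip -c > "$dir/b.log.gz"

# count_errors FILE: print "FILE,lines,matching-lines" for a plain or .gz file.
count_errors() {
    file=$1
    # Pick the reader by file name, then feed its output to awk.
    case $file in
        *.gz ) gzip -cd -- "$file" ;;
        *    ) cat -- "$file" ;;
    esac | awk -v file="$file" -v OFS=',' '/error/{nl++} END{print file, NR+0, nl+0}'
}

out=$(
    for file in "$dir"/*
    do
        count_errors "$file"
    done
)

printf '%s\n' "$out"
rm -rf "$dir"
```

The `NR+0` and `nl+0` make awk print 0 rather than an empty field when a file is empty or has no matching lines.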