regex - I cannot seem to get my regular expressions in Perl to recognize underscore (_) characters


Sorry to bring up such a simple problem, but I am trying to write a series of regular expressions in Perl to extract certain types of data from a file. For some reason, I cannot seem to get Perl to match lines of data that have an underscore (_) in them.

If I want the lines that start with

"ch2    flybase exon    " 

or

"ch3    flybase exon    " 

(the white spaces are tab characters), the following code works well:

if ($_ =~ m/^ch[ 2-3]   flybase exon    /) {print outputfile;} 

However, if I want to match lines with more complex chromosome names (i.e. more than just the letters 'ch' followed by a number), such as:

ch4_group1
ch4_group2
ch4_group3
ch4_group4
ch4_group5
chxl_group1a
chxl_group1e
chxl_group3a
chxl_group3b
chxr_group3a
chxr_group5
chxr_group6
chxr_group8
unknown_group_1
unknown_group_10
unknown_group_100
unknown_group_101

I have tried the following code without success:

if ($_ =~ m/^ch4_group[1-5] flybase exon    /) {print outputfile;}
if ($_ =~ m/^chx._group[0-9]+[a-z]* flybase exon    /) {print outputfile;}
if ($_ =~ m/^unknown_group_[0-9]+   flybase exon    /) {print outputfile;}
if ($_ =~ m/^unknown_singleton_[0-9]+   flybase exon    /) {print outputfile;}

I have also tried including a \ in front of the _, but that did not help.

Any suggestions would be appreciated.

Assuming you're using the x, m, and i options, make the following changes:

^ch4_group[1-5] flybase exon
should be:
^ch4_group[1-5]\s*flybase\sexon\s*$

^chx._group[0-9]+[a-z]* flybase exon
should be:
^chx._group[0-9]+[a-z]*\s+flybase\sexon\s*$

^unknown_group_[0-9]+ flybase exon
should be:
^unknown_group_[0-9]+\s*flybase\sexon\s*$

^unknown_singleton_[0-9]+ flybase exon
should be:
^unknown_singleton_[0-9]+\s*flybase\sexon\s*$
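
For what it's worth, the underscore is an ordinary character in a Perl regex and needs no escaping, so the likely culprit is the whitespace between the columns rather than the `_`. Here is a minimal sketch of the patterns above folded into a single test, assuming the fields are separated by tabs and that more columns follow `exon` (which is why it leaves off the trailing `$`). The helper name `is_wanted_line` and the sample lines are made up for illustration; they are not from the original code.

```perl
use strict;
use warnings;

# One pattern for all the chromosome names in the question.
# /x lets the pattern be laid out with whitespace and comments,
# /i ignores case. \s+ matches tabs as well as spaces, so nothing
# depends on literal whitespace typed into the regex.
my $wanted = qr{
    ^
    (?: ch[23]                        # ch2, ch3
      | ch4_group[1-5]                # ch4_group1 .. ch4_group5
      | chx._group[0-9]+[a-z]*        # chxl_group1a, chxr_group5, ...
      | unknown_group_[0-9]+
      | unknown_singleton_[0-9]+
    )
    \s+ flybase \s+ exon \s
}xi;

# Hypothetical helper: returns 1 if the line starts with one of the
# wanted chromosome names followed by the flybase and exon columns.
sub is_wanted_line {
    my ($line) = @_;
    return $line =~ $wanted ? 1 : 0;
}

# Made-up sample lines; the fields are tab-separated.
my @samples = (
    "ch4_group3\tflybase\texon\t100\t200",
    "chxl_group1a\tflybase\texon\t5\t90",
    "unknown_group_101\tflybase\texon\t1\t2",
    "ch4_group3\tflybase\tgene\t100\t200",
);

for my $line (@samples) {
    print "$line\n" if is_wanted_line($line);
}
```

With these samples, the first three lines are printed and the last one is skipped because its third column is not `exon`. In the real script you would call the helper on `$_` inside the read loop and print to your output filehandle instead.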

