Run grep -o -F with the search strings, then eliminate duplicate occurrences of matched strings with sort -u, and finally check that the count of remaining lines equals the count of the input strings:

    $ grep -o -F $'string1\nstring2\nstring3' input|sort -u|wc -l
    $ grep -o -F $'string1\nstring3' input|sort -u|wc -l
    $ grep -o -F $'string1\nstring2\nfoo' input|sort -u|wc -l

One shortcoming of this solution (it fails to meet the "partial matches should be OK" requirement) is that grep doesn't detect overlapping matches. For example, although the text abcd matches both abc and bcd, grep finds only one of them:

    $ grep -o -F $'abc\nbcd' <<< abcd

Note that this approach works only for fixed strings. It cannot be extended to regexes, because a single regex can match multiple different strings and we cannot track which match corresponds to which regex. The best you can do is store the matches in a temporary file, and then run grep multiple times using one regex at a time.

The solution implemented as a bash script:

    echo "Usage: $(basename "$0") input_file string1 "
    grep -o -F "$newline_separated_list_of_strings" "$infile"

    matchall input string1 string2 string3

A recursive solution.
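The count-and-compare idea above can be sketched as a small bash function. This is a minimal sketch, not the original matchall script: the function name `contains_all` and its argument order are assumptions made for illustration, and it inherits the limitations already noted (fixed strings only, and overlapping or nested search strings may be undercounted).

```shell
#!/usr/bin/env bash

# Hypothetical helper (the name is not from the original post): succeeds only
# if every fixed string given after the file name occurs somewhere in the file.
contains_all() {
    local infile=$1
    shift
    local patterns found wanted

    # One search string per line, the form grep -F accepts for several patterns.
    patterns=$(printf '%s\n' "$@")

    # -o prints each matched part on its own line; sort -u removes repeats,
    # so the line count is the number of distinct strings that were found.
    found=$(grep -o -F -e "$patterns" "$infile" | sort -u | wc -l)

    # Count the distinct search strings as well, so that passing the same
    # string twice does not make the comparison fail.
    wanted=$(printf '%s\n' "$@" | sort -u | wc -l)

    [ "$found" -eq "$wanted" ]
}
```

Because the function's exit status carries the result, it can be used directly in a condition, for example `if contains_all input string1 string2 string3; then echo "all found"; fi`.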
Make use of the -o | --only-matching option of grep, which forces it to output only the matched parts of a matching line, with each such part on a separate output line.

In case brevity is what you're looking for, here's the GNU awk one-liner to do just what you asked for:

    awk 'NR=FNR

First, you probably want to use awk. Since you eliminated that option in the question statement, yes, it is possible to do, and this provides a way to do it. It is likely MUCH slower than using awk, but here it is if you want to do it anyway.

This is based on the following assumptions:

- Invoking grep multiple times is unacceptable.
- The use of any other external tools is unacceptable.
- Invoking grep less than once is acceptable.
- It must return success if everything is found, failure when not.
- Using bash instead of external tools is acceptable.
- bash version is >= 3 for the regular expression version.

This might meet all of your requirements (the regex version is missing some comments; look at the string version instead):

    #!/bin/bash
    filename="$1"       # Filename is first parameter
    shift               # Move it out of the way so that "$@" is useful
    strings=( "$@" )    # Search strings into an array
    declare -a matches  # Array to keep track of which strings already match
    # Initialise the array tracking what we have matches for
    for ((i=0; i<${#strings[@]}; i++)); do
        matches[$i]=0
    done

The script's exit status reports whether everything was found, so that an if can be used to check the result.

Results (measured with time, real time rounded to the closest half second):

- Perl non-re optimised: 5s (removed Getopt::Std and regex support for faster startup)
- Perl re optimised: 7s (removed Getopt::Std and non-regex support for faster startup)

(Invoking grep multiple times, especially with the recursive method, did better than I expected.)
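The pure-bash approach described above (track one flag per search string while reading the file line by line) can be sketched as follows. This is an illustrative reconstruction under the listed assumptions, not the answer's original code: the function name `multimatch` and the loop bodies are my own, and only the fixed-string variant is shown.

```shell
#!/usr/bin/env bash

# Hypothetical function name; returns 0 only if every search string occurs
# somewhere in the file. Uses bash built-ins only, no external tools.
multimatch() {
    local filename=$1           # File name is the first parameter
    shift                       # Remaining parameters are the search strings
    local strings=( "$@" )
    local -a matches            # matches[i] becomes 1 once strings[i] is seen
    local i line

    # Start with every string marked as not yet found.
    for ((i = 0; i < ${#strings[@]}; i++)); do
        matches[i]=0
    done

    # Read the file line by line; the extra test keeps a final line that
    # lacks a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        for ((i = 0; i < ${#strings[@]}; i++)); do
            # The quoted expansion is matched literally, as a substring.
            if [[ $line == *"${strings[i]}"* ]]; then
                matches[i]=1
            fi
        done
    done < "$filename"

    # Fail as soon as one string turns out never to have matched.
    for ((i = 0; i < ${#strings[@]}; i++)); do
        (( matches[i] )) || return 1
    done
    return 0
}
```

A call such as `multimatch input string1 string2 string3 && echo "all found"` relies on the return status. Reading every line in a shell loop is the reason this is expected to be much slower than awk on large files.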
I have multiple text files in a folder, where the first file contains some paths as strings and the others contain path+file_name entries.

I am using grep -rHw "/abc/bce/12345/input/part3" test/ to match a line from file1 and extract the corresponding info from file2, file3, and so on. The problem is that when I take the first line from file1 and try to retrieve the path+file_name, it picks up all the similar lines from file2, file3, and so on. What I want are the lines in file2.txt, file3.txt, ... that (a) contain the string "/abc/bce/12345/input/part3" plus an added string (the file names), but (b) don't contain the other string, "/abc/bce/12345/input/part3/err". I don't know how to do that when file1 is being compared with multiple files in a continuous manner.

I need a generalized solution for all cases. Please let me know if there are any other ways to get this done through a shell script.

The syntax is: use single quotes in the pattern: grep 'pattern' file1 file2. Next, use extended regular expressions: grep -E 'pattern1|pattern2'. Finally, try on older Unix shells/OSes: grep -e pattern1 -e pattern2.

Awk is the tool that the guys who invented grep, shell, etc. invented to do general text manipulation jobs like this, so I'm not sure why you'd want to try to avoid it.
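For the path-matching question above, one possible grep-only sketch is to read each path from the first file, search for that path followed by a slash as a fixed string, and then filter out lines that contain the path's err entry. The function name `lines_under_paths` and its two parameters are hypothetical, and the "err" name is taken literally from the question. It also assumes the file of paths lives outside the searched directory.

```shell
#!/usr/bin/env bash

# Hypothetical helper: for every path listed in $1, print the lines found
# under directory $2 that contain "<path>/" but not "<path>/err".
lines_under_paths() {
    local paths_file=$1 search_dir=$2 path

    while IFS= read -r path || [[ -n $path ]]; do
        [[ -n $path ]] || continue      # Skip empty lines in the paths file

        # -r: recurse, -H: print the file name, -F: treat the path literally.
        # The trailing slash keeps ".../part3" from matching ".../part30".
        # The second grep drops every line that mentions the err entry.
        grep -rHF -- "$path/" "$search_dir" | grep -vF -- "$path/err"
    done < "$paths_file"
}
```

Invoked as `lines_under_paths file1.txt test/`, this prints the matching lines prefixed with the file they came from. Using -F rather than -w means the path is never interpreted as a regular expression.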