I have a bunch of Apache logs in the standard log format. I want to get all the log lines that did not come from a web crawler.
Let's say I have a file robot_patterns with entries like:
    Googlebot
    msnbot-media
    YandexBot
    bingbot
If I run the command

    grep -f robot_patterns *.log

I get all the entries from bots matching the above patterns. My actual list has ~30 entries for bots and agents that I wish to ignore.
But I want to find all the entries that are NOT from bots, so I try

    grep -v -f robot_patterns *.log

and grep returns no results. This is not what I expect or want, and I am not finding an obvious way to get it. It seems that when the -v option is combined with multiple patterns in a file, grep only returns a line if it matches EVERY pattern.
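
A minimal sketch of the behavior in question, assuming GNU grep. The sample.log contents, the temporary directory, and the robot_patterns_blank and robot_patterns_clean file names are made up for illustration. With -v and -f together, grep prints the lines that match none of the patterns. One possible cause of empty output (an assumption, not confirmed for this particular file) is a blank line in the pattern file: an empty pattern matches every line, so -v then excludes everything.

```shell
# Work in a throwaway directory (hypothetical sample data throughout).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two made-up log lines: one from a crawler, one from a regular browser.
cat > sample.log <<'EOF'
1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"
5.6.7.8 - - [10/Oct/2023:13:55:40 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64)"
EOF

# Pattern file with one pattern per line and no blank lines.
printf 'Googlebot\nmsnbot-media\nYandexBot\nbingbot\n' > robot_patterns

# -v -f keeps only lines that match NONE of the patterns.
non_bot=$(grep -v -f robot_patterns sample.log)

# Same idea, but with a trailing blank line in the pattern file.
# The empty pattern matches every line, so -v drops every line.
printf 'Googlebot\nbingbot\n\n' > robot_patterns_blank
with_blank=$(grep -v -f robot_patterns_blank sample.log)

# Removing blank lines from the pattern file restores the expected result.
sed '/^$/d' robot_patterns_blank > robot_patterns_clean
cleaned=$(grep -v -f robot_patterns_clean sample.log)

printf 'clean patterns:\n%s\n' "$non_bot"
printf 'with blank line:\n%s\n' "$with_blank"
printf 'after stripping blanks:\n%s\n' "$cleaned"
```

If the real robot_patterns file turns out to contain a blank line, stripping it as in the last step should be enough. Adding -F would also treat each entry as a fixed string rather than a regular expression, which is usually what a list of bot names calls for.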