
05-23-2003, 01:25 AM
|
|
Web Hosting Guru
|
|
Join Date: May 2001
Location: WA, USA
Posts: 319
|
|
monitoring httpd/access question with a twist
hi people,
OK, I am aware I can telnet or SSH into my box and use this command to view my access log in real time:
tail -f /home/log/httpd/access
But what if I want to monitor access to one specific file, and be notified somehow whenever that file is accessed? Or maybe make a little stats page which will show me?
I suppose if I wanted to be patient I could check my logs for each site after they have been generated at 4am, but even so it would still be nice if something could tell me "hey, Joe just accessed that page you sent him", for example.
Even if it just scanned the logfile every X minutes, that would be acceptable.
Is this possible?
__________________
http://printers.abbey-lane.com
|

05-23-2003, 07:48 AM
|
|
Web Hosting Master
|
|
Join Date: Aug 2001
Location: Atlanta
Posts: 1,166
|
|
The easiest way to do it would be to build a script around the command you've already identified:
tail -f path/to/logfile | grep specific_file
This will only return results when specific_file is accessed.
Note that in this example specific_file is a TEXT string value, not a path to a file. It's just watching the log for occurrences of the text value.
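A small sketch of the "text, not a path" point, using made-up log lines and a made-up name report.html (both are assumptions for illustration). It also shows grep's -F option, which treats the pattern as a fixed string rather than a regular expression, so the dot in a filename is matched literally:

```shell
# Build a tiny sample log (hypothetical entries in Apache common log style).
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [23/May/2003:01:25:00 -0700] "GET /index.html HTTP/1.0" 200 512
10.0.0.2 - - [23/May/2003:01:26:10 -0700] "GET /report.html HTTP/1.0" 200 2048
10.0.0.3 - - [23/May/2003:01:27:30 -0700] "GET /reportXhtml HTTP/1.0" 404 128
EOF

# -F matches the literal text "report.html" anywhere in a line;
# without -F the dot would also match the "X" in the third line.
matches=$(grep -F 'report.html' "$log")
printf '%s\n' "$matches"

rm -f "$log"
```

Only the second sample line is printed, because grep is comparing text against each log line and never looks at the file on disk.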
Brandon
|

05-23-2003, 09:01 PM
|
|
Web Hosting Guru
|
|
Join Date: May 2001
Location: WA, USA
Posts: 319
|
|
Ah, now that's interesting! Now I just have to figure out how to make some kind of a notification or reporting system work.
Could :
tail -f path/to/logfile | grep specific_file
...... be made to run every hour (cron?) and then output to a specific file - maybe a txt file which could be beneath /web so as to be visible from a web browser?
Not sure if cron is only used to run files or whether it can also execute a command.
|

05-23-2003, 10:57 PM
|
|
Web Hosting Master
|
|
Join Date: Nov 2002
Location: Michigan
Posts: 695
|
|
Don't use the '-f' switch with tail if you're going to run this from cron or send grep's output to a file or to mail. With -f, tail never exits by definition, so the pipeline never finishes, and grep's output to a file or pipe is buffered, so nothing may show up for a long time.
Just use plain old grep to check for a string:
grep filename /path/to/logfile
To put that in a cron job every hour, shell in as root (you need to be root for read access to the logs) and do
crontab -e
which will put you in a vi editing session in root's crontab file. Make an entry like
05 * * * * grep filename /path/to/logfile > /home/sites/siteX/web/info.txt
Then exit the session (:wq) and your new cron entry will be saved. This will run the grep command every hour at 5 minutes past the hour, and dump the results to a file called info.txt in siteX's web directory.
You could also pipe grep's output to a mail command like:
05 * * * * grep filename /path/to/logfile | mail -s "hourly log scan" yourusername
which would send the output to user 'yourusername' with a subject line of 'hourly log scan'.
|

05-24-2003, 12:40 AM
|
|
Web Hosting Guru
|
|
Join Date: May 2001
Location: WA, USA
Posts: 319
|
|
Hi Bruce,
I hate to sound cringing, but you are an absolute star! I'm sure this isn't the first time you've given me such amazingly good info.
I can't wait to give this a try - - will let you all know how I get on with it. 
|

05-24-2003, 01:23 AM
|
|
Web Hosting Master
|
|
Join Date: Aug 2001
Location: Atlanta
Posts: 1,166
|
|
Rookie mistake on my part. That's why Bruce gets paid the big bucks!!
Thanks Bruce
Brandon
|

05-28-2003, 07:41 PM
|
|
Web Hosting Guru
|
|
Join Date: May 2001
Location: WA, USA
Posts: 319
|
|
This works like a charm Bruce, thank you. I wonder if there's a way for it to only email me if the search was *not* empty, or if the search only found something new? Maybe that's not possible using this method.
I am using it to search log/httpd/access and haven't quite decided how to approach the log rotation issue. That seems to be a common problem though - even webalizer loses data once the logs rotate.
|

05-28-2003, 08:16 PM
|
|
Web Hosting Master
|
|
Join Date: Nov 2002
Location: Michigan
Posts: 695
|
|
Doing it interactively at the shell level would be more difficult. You could dump the results to a file, then call a Perl script to examine the file and only do something with it if it contains something you care about.
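For comparison, here is a rough sketch of the "only mail when the search is not empty" idea in plain shell rather than Perl. It relies on grep exiting with status 0 only when it finds a match. The function names scan_and_notify and notify are made up for this example, and notify just prints what would be mailed; on a real box it would be replaced by something like mail -s "hourly log scan" yourusername.

```shell
# notify SUBJECT
# Hypothetical stand-in for: mail -s "$1" yourusername
# It only prints what would have been sent (body is read from stdin).
notify() {
    echo "WOULD MAIL [$1]:"
    cat
}

# scan_and_notify PATTERN LOGFILE
# Calls notify only when grep finds at least one matching line.
scan_and_notify() {
    if hits=$(grep -F -- "$1" "$2"); then
        printf '%s\n' "$hits" | notify "hourly log scan"
    fi
}

# Demo with a made-up one-line log file.
log=$(mktemp)
echo '10.0.0.2 - - [28/May/2003:19:41:00 -0700] "GET /report.html HTTP/1.0" 200 2048' > "$log"
scan_and_notify 'report.html' "$log"    # produces a notification
scan_and_notify 'missing.html' "$log"   # produces nothing
rm -f "$log"
```

Saved as a script, this could be called from the same hourly cron entry in place of the bare grep command.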
Just search the logs prior to rotation. On the RaQs, just check prior to 4am. On the Qube 3 and RaQ 550, I think the logs are parsed and rotated every 15 minutes, so you'd have to sneak it in before that, or else tinker with the log rotation setting somehow and get it to update the stats less often. Not sure exactly how to go about that though; I'm sure it's just some sort of a timer setting someplace.
You might want to investigate alternative solutions like logcheck, which are designed to look for certain things in log files. The rotation timing would still be an issue, but it might solve the "only send new stuff" issue, etc.
http://rpmfind.net/linux/RPM/contrib....1-1.i386.html
Note - I haven't actually set this up myself, so pointing you at the app is my only possible contribution at this time. 
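If a full tool like logcheck is more than is needed, the "only send new stuff" part can be roughly approximated in shell by remembering how many lines of the log were already seen. This is only a sketch and not how logcheck itself works; the function name new_lines and the state file are assumptions made up for the example.

```shell
# new_lines LOGFILE STATEFILE
# Prints only the lines added to LOGFILE since the previous call, using
# a line count remembered in STATEFILE (a hypothetical helper file).
new_lines() {
    last=0
    if [ -s "$2" ]; then last=$(cat "$2"); fi
    total=$(($(wc -l < "$1")))
    # A shorter file than last time suggests the log was rotated.
    if [ "$total" -lt "$last" ]; then last=0; fi
    if [ "$total" -gt "$last" ]; then
        sed -n "$((last + 1)),${total}p" "$1"
    fi
    echo "$total" > "$2"
}

# Demo with made-up files.
log=$(mktemp)
state=$(mktemp)
printf 'line 1\nline 2\n' > "$log"
new_lines "$log" "$state"     # prints line 1 and line 2
printf 'line 3\n' >> "$log"
new_lines "$log" "$state"     # prints only line 3
rm -f "$log" "$state"
```

The output can be piped into grep as before, for example new_lines /path/to/logfile /path/to/statefile | grep filename, so that only fresh matches are reported on each run.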
|

05-28-2003, 09:11 PM
|
|
Web Hosting Guru
|
|
Join Date: May 2001
Location: WA, USA
Posts: 319
|
|
Hi Bruce,
That's great, thanks. I have 3 instances of this search utility running set for
56 * * * *
57 * * * *
58 * * * *
So hopefully they will catch the log before it rotates. I think I'll just stick with the method you showed me until I have the time to investigate anything else like the perl script fully.
|

05-31-2003, 04:11 PM
|
|
Web Hosting Guru
|
|
Join Date: May 2001
Location: WA, USA
Posts: 319
|
|
This is beginning to develop quite nicely. I thought I would add this to the thread in case anyone else has been monitoring it and can find it useful.
I have changed it so it will look for the specific filename, output that to a file, then search that file and email me every entry except those bearing my IP address.
That way, I am emailed every instance of the file being accessed, apart from when I access it. Of course it only works because I have a static IP address.
All on one line.....
grep file-to-monitor /home/log/httpd/access > /home/sites/site1/web/output.txt | grep -v 123.456.789.123 /home/sites/site1/web/output.txt | mail -s "hourly log scan" usernametoemail
|

05-31-2003, 04:20 PM
|
|
Web Hosting Guru
|
|
Join Date: May 2001
Location: WA, USA
Posts: 319
|
|
I was just thinking, if I wanted this to search for, say, 50 different items all at 56 * * * *, would this be a really serious load on my server?
If so, what would be a sensible way to spread the searches, bearing in mind the log rotation issue at 4am?
|

05-31-2003, 04:24 PM
|
|
Web Hosting Master
|
|
Join Date: Nov 2002
Location: Michigan
Posts: 695
|
|
Hm. Shouldn't that first | be a ; ??
Part 1 greps the filename from the access log and outputs the results to output.txt.
Part 2 pipes every line from output.txt which doesn't contain the specified IP address to the mail command.
You need to separate Parts 1 and 2 with a semicolon, since they are separate commands, and you only want Part 2 to run after Part 1 is complete.
Otherwise it looks pretty good! (And you could tack on
; rm /home/sites/site1/web/output.txt
to delete the file, unless you also want output.txt to be web accessible at all times.)
|

05-31-2003, 04:28 PM
|
|
Web Hosting Master
|
|
Join Date: Nov 2002
Location: Michigan
Posts: 695
|
|
An alternative, which doesn't require an intermediary file, would be
cat /home/log/httpd/access | grep -v ip.ad.re.ss | grep file-to-monitor | mail -s "hourly log scan" you@example.com
So you'd "look" through access, first excluding any line that contains your IP address, then including any line that contains the filename, then pipe the results of that to the mail command.
You might want to switch the IP address exclusion and filename inclusion for efficiency.
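A sketch of that switched order, wrapped in a function so it can be tried on a sample file. The name filter_hits and the sample addresses are made up for illustration. It lets grep read the log directly instead of going through cat, and uses -F so the dots in the IP address are taken literally, plus -w so that an address such as 192.0.2.10 does not also exclude 192.0.2.100.

```shell
# filter_hits FILE_TO_MONITOR MY_IP LOGFILE
# Keeps lines mentioning the file first (usually the smaller set),
# then drops any of those lines that contain MY_IP.
filter_hits() {
    grep -F -- "$1" "$3" | grep -v -w -F -- "$2"
}

# Demo with a made-up log; 192.0.2.10 plays the role of "my" static IP.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [31/May/2003:16:11:00 -0700] "GET /report.html HTTP/1.0" 200 2048
198.51.100.7 - - [31/May/2003:16:12:00 -0700] "GET /report.html HTTP/1.0" 200 2048
198.51.100.7 - - [31/May/2003:16:13:00 -0700] "GET /index.html HTTP/1.0" 200 512
EOF
filter_hits 'report.html' '192.0.2.10' "$log"   # prints only the 198.51.100.7 report.html line
rm -f "$log"
```

In the cron entry the result would be piped to mail exactly as in the one-liner above.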
|

05-31-2003, 04:31 PM
|
|
Web Hosting Master
|
|
Join Date: Nov 2002
Location: Michigan
Posts: 695
|
|
When looking for multiple items, you could probably do some sort of awk regular expression at the 'command line' if the items were similar enough.
If they can't be matched with a regex, you might need a quick-and-dirty Perl script to do it for you or something.
To avoid the logrotation issue, you could just make a working copy of access at xx:59 and do all your searching through the copy at your leisure, then delete the file when you're done...
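As a rough sketch of both ideas together: take one working copy of the log, then search it for many items in a single pass. When the items are plain strings rather than patterns, grep can read them from a file with -f, which avoids running 50 separate searches. The function name scan_copy and the pattern file are assumptions made up for this example.

```shell
# scan_copy LOGFILE PATTERNFILE
# Copies the log once, then searches the copy for every string listed in
# PATTERNFILE (one fixed string per line) in a single grep pass.
scan_copy() {
    copy=$(mktemp)
    cp "$1" "$copy"
    status=0
    grep -F -f "$2" "$copy" || status=$?
    rm -f "$copy"            # delete the working copy when done
    return "$status"
}

# Demo with made-up files.
log=$(mktemp)
patterns=$(mktemp)
printf '%s\n' 'GET /a.html' 'GET /b.html' 'GET /c.html' > "$log"
printf '%s\n' 'a.html' 'c.html' > "$patterns"
scan_copy "$log" "$patterns"   # prints the a.html and c.html lines
rm -f "$log" "$patterns"
```

One thing to watch: a blank line in the pattern file matches every line of the log, so the list should contain no empty lines.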
|

05-31-2003, 04:40 PM
|
|
Web Hosting Guru
|
|
Join Date: May 2001
Location: WA, USA
Posts: 319
|
|
Hi Bruce,
Great ideas, thank you! Also, I replaced the | with a ; as you suggested.
That working copy idea is a great one, why on earth didn't I think of that?
|