(Download all the files used in this tutorial here)
When running a website, whether it’s a small personal homepage, or a large site for a company, it’s useful to be able to know who is visiting your site, and what they’re up to while they’re there. If you use PHP and MySQL to build your site, then it becomes a simple task to retrieve some information about your visitors and write it into the database, to be viewed at a later date.
Using this information, you can develop strategies to improve your website, and learn about your visitor’s browsing trends. You might even be able to tailor specific pages for a visitor’s purpose based on what search terms they used to find that page.
Store and view useful information about all your visitors
In this tutorial, I’ll take you through the steps required to build and install a very simple visitor tracking tool on your website, so you can start analysing your visitors behaviours.
During this tutorial we’ll need to create 5 files. If you’d rather just get straight to the point and try installing this application yourself, you can download all the files used in this article here
- ip_report.php - To access and view visitor statistics
- ip_tracker.php - The plugin module which will update the database with the relevant values
- three test pages - To let you see the tool in action!
- session_end.php - An optional file, used during testing to manually reset a session
We’ll also need to create a new MySQL table, visitor_tracking, the code for that is included in the zip file, as well as in the area below.
Let’s consider first, what we might want to achieve using a tool like this. Basically, we want to be able to access information about our visitors. But what information? And in what format?
We’re going to be storing our visitor data in a MySQL database. We’re also going to put measures in place to ensure that we don’t confuse unique visits with page hits. unique visit is when one single person visits your website and when that happens, we want only one record in the database. But, we also want to see the individual page hits which show how many individual pages each person has viewed. We can use this information to see how long a particular person stayed on the site.
Along with this visit data, we’re going to store some other information about the user being tracked. Thanks to the PHP array $_SERVER, we can get hold of a wide range of useful stats and information about the person. Here is a sample of some of the information we can retrieve about people arriving at our site.
- Referring page - Accessed through the $_SERVER['HTTP_REFERER'] variable. You can use this to see if people have come from a search engine, and this variable will also contain the search terms they used to find you. Very useful for marketing and analysis
- IP Address - A handy value to label your visitor records with. This is available through $_SERVER['REMOTE_ADDR']
- The current page address - To see which pages our visitors are accessing, we can retrieve the filename for the current page through $_SERVER['SCRIPT_NAME']. We can also grab the query string with $_SERVER['QUERY_STRING']
A selection of other values is available through the $_SERVER array too, but for now, we’re just going to store and record two values using this array - the current page location, and the IP address of the visitor
Note : IP Addresses can sometimes change, even while a user is browsing a page, so we can’t rely on these to track them. Instead, we’re going to rely on the session a user generates while they visit the site. This is unique and unchanging for each individual user, so we can make use of this fact to track them across multiple pages on our site.
Run this query in your MySQL management tool to generate the table we’re going to be using for this application. This is just a simple example of a visitor tracking tool, and doesn’t contain many fields, but you can add in as many new fields as you like, and it’s possible to track extra visitor data such as the type of browser they’re using, their location and more!
DROP TABLE IF EXISTS `visitor_tracking`; CREATE TABLE IF NOT EXISTS `visitor_tracking` ( `entry_id` int(11) NOT NULL AUTO_INCREMENT, `visitor_id` int(11) DEFAULT NULL, `ip_address` varchar(15) NOT NULL, `page_name` text, `query_string` text, `timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY (`entry_id`), KEY `visitor_id` (`visitor_id`,`timestamp`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
First we’re going to create the file that will do the visitor tracking for us. This file will be include()’d at the top of each of the pages we wish to track. We’ll also be tracking by query string, so this will work if your site only uses one file (e.g. index.php) and generates pages based on the value of an id (e.g. index.php?page_id=52)
To start off, we need to set up our database connection and also grab some values from the server about our current visitor, so we can save these to the database later
1 2 3 4 5 6 7 8 9 10 11 | //define our "maximum idle period" to be 30 minutes $mins = 30; //set the time limit before a session expires ini_set ("session.gc_maxlifetime", $mins * 60); session_start(); $ip_address = $_SERVER["REMOTE_ADDR"]; $page_name = $_SERVER["SCRIPT_NAME"]; $query_string = $_SERVER["QUERY_STRING"]; $current_page = $page_name."?".$query_string; //connect to the database using your database settings include("db_connect.php"); |
Let’s explain a few things here
- $mins = 30 - We need to define a “timeout” limit for our visitor tracking tool. If a visitor clicks on a new page within, say, 29 minutes, then that’ll be counted as part of the same visit. If, however, they go away for an hour or so and come back, we might want to treat that as a new visit. Increase this value if you want to let visitors stay away from the site longer and still count them in the same visit
- session_start(); - Start up the session. Very important that you don’t include any HTML whitespace or other characters before this function call
- $ip_address = … etc. - Grab some useful values from the server including the current visitor’s IP address, the name of the page we’re on and anything that’s in the query string. The current page is generated by joining these last two together
- include(”db_connect.php”); - Just include whatever code you need in a seperate file to connect to the database. In my case, the code is
//change these values to whatever your own database server requires $link = mysql_connect('localhost', 'root', 'pass'); //Connects to the database at "localhost" if(!$link) { //halt execution if cannot connect die("Cannot connect to the database!"); } mysql_select_db('test', $link); //Assuming you have a database named "test" set up
Next, we’re going to consider two different scenarios - first, if the visitor is returning to the site after already having viewed a page. We’re going to set a value in the $_SESSION array to remember this person, so…
12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | if(isset($_SESSION["tracking"])){ //update the visitor log in the database, based on the current visitor //id held in $_SESSION["visitor_id"] $visitor_id = $_SESSION["visitor_id"]; if($_SESSION["current_page"] != $current_page) { $sql = "INSERT INTO visitor_tracking (ip_address, page_name, query_string, visitor_id) VALUES ('$ip_address', '$page_name', '$query_string', '$visitor_id')"; if(!mysql_query($sql)){ echo "Failed to update visitor log"; } $_SESSION["current_page"] = $current_page; } } |
In this first conditional branch, we’re checking to see if the session variable “tracking” has been set. If so, then this means that we’re already tracking this visitor and so must insert a new entry into the database based on their visitor ID. We’ll see in just a moment how we generate this visitor ID and also set the appropriate session variables. As long as the session is valid (if the user has been idle for less than 30 minutes), then these values will exist, otherwise they’ll be deleted and we can count this as a new visit, which is dealt with in the second condition of the IF statement.
- line 16 - We check that the current page doesn’t match the last page the user viewed. We don’t want to track multiple reloads of the same page
- lines 18-20 - The SQL statement to add a new entry to the visitors table. Note how $visitor_id will always be set to the same value, the value held in “visitor_id” in the $_SESSION variable. This is so we can track the same visitor across multiple page views. All entries for a specific “visit” by a specific user will have the same visitor ID.
Now we’re going to look at the other conditional branch - when a user visits the page for the first time
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 | else { //set a session variable so we know that this visitor is being tracked $_SESSION["tracking"] = true; //insert a new row into the database for this person $sql = "INSERT INTO visitor_tracking (ip_address, page_name, query_string) VALUES ('$ip_address', '$page_name', '$query_string')"; if(!mysql_query($sql)){ echo "Failed to add new visitor into tracking log"; } else { //find the next available visitor_id for the database //to assign to this person $entry_id = mysql_insert_id(); $lowest_sql = mysql_query("SELECT MAX(visitor_id) as next FROM visitor_tracking"); $lowest_row = mysql_fetch_array($lowest_sql); $lowest = $lowest_row["next"]; if(!isset($lowest)) $lowest = 1; else $lowest++; //update the visitor entry with the new visitor id mysql_query("UPDATE visitor_tracking SET visitor_id = '$lowest' WHERE entry_id = '$entry_id'"); //place the current visitor_id into the session so we can use it on //subsequent visits to track this person $_SESSION["visitor_id"] = $lowest; //save the current page to session so we don't track if someone just refreshes the page $_SESSION["current_page"] = $current_page; } } |
Now we’ll look at the file where we generate the report data. In the “head” of the file (before we add any HTML data) we’ll add in some PHP preprocessing, code that runs before we need to produce any output
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 | <?php include("db_connect.php"); //retrieve the appropriate visitor data $view = $_GET["view"]; //set a default value for $view if($view!="all" && $view!="record") $view = "all"; if($view == "all") { //show all recent visitors $sql = "SELECT visitor_id, GROUP_CONCAT(DISTINCT ip_address) as ip_address_list, COUNT(DISTINCT ip_address) as ip_total, COUNT(visitor_id) as page_count, MIN(timestamp) as start_time, MAX(timestamp) as end_time FROM visitor_tracking GROUP BY visitor_id"; $result = mysql_query($sql); if($result==false){ $view = "error"; $error = "Could not retrieve values"; } } else { //show pages for a specific visitor $visitor_id = $_GET['id']; //rung $visitor_id through filter_var to check it's not an invalid //value, or a hack attempt if(!filter_var($visitor_id, FILTER_VALIDATE_INT, 0)){ $error = "Invalid ID specified"; $view = "error"; } else { $sql = "SELECT timestamp, page_name, query_string, ip_address FROM visitor_tracking WHERE visitor_id = '$visitor_id'"; $result = mysql_query($sql); } } function display_date($time){ return date("F j, Y, g:i a", $time); } ?> |
Some important things to notice here -
- lines 11-14 - The SQL code used when we’re viewing the main page which details all the visitors that have arrived at the site, along with a few bits of information about them. Note the use of the MySQL function GROUP_CONCAT. This is an especially useful function, since it allows us to return a list of all the IP addresses (DISTINCT addresses in this case, since we don’t need duplicates) as a single string, seperated by commas.
See also how we use COUNT(visitor_id) to get a total page count for that specific visitor. And thanks to the GROUP BY keyword, we can return exactly one record per visitor, which makes processing and displaying the results much simpler
- lines 29 & 30 - This is the SQL code we use when we’re displaying a specific user’s visit details. This is a simpler query, since we just need a list of all the pages they viewed, and the time they viewed them at.
- lines 34-36 - A simple function used to format the date and time later on in the file
We’ll then generate the body of the page itself, including any HTML output necessary
37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 | <html> <head> <title>IP Tracker Report Page</title> <style> html {font-family:tahoma,verdana,arial,sans serif;} body {font-size:62.5%;} table tr th{ font-size:0.8em; background-color:#ddb; padding:0.2em 0.6em 0.2em 0.6em; } table tr td{ font-size:0.8em; background-color:#eec; margin:0.3em; padding:0.3em; } </style> </head> <body> <h1>IP Tracker Report</h1> <?php if($view=="all") { //display all of the results grouped by visitor if($row = mysql_fetch_array($result)){ ?> <table> <tbody> <tr> <th>Id</th> <th>IP Address(es)</th> <th>Entry Time</th> <th>Exit Time</th> <th>Duration</th> <th>Pages visited</th> <th>Actions</th> </tr> <?php do{ if($row["ip_total"] > 1) $ip_list = "Multiple addresses"; else $ip_list = $row["ip_address_list"]; $start_time = strtotime($row["start_time"]); $end_time = strtotime($row["end_time"]); $start = display_date($start_time); $end = display_date($end_time); $duration = $end_time - $start_time; if($duration >= 60) { $duration = number_format($duration/60, 1)." minutes"; } else { $duration = $duration." seconds"; } echo "<tr>"; echo "<td>{$row["visitor_id"]}</td>"; echo "<td>$ip_list</td>"; echo "<td>$start</td>"; echo "<td>$end</td>"; echo "<td>$duration</td>"; echo "<td>{$row["page_count"]}</td>"; echo "<td><a href='ip_report.php?view=record& id={$row["visitor_id"]}'>view</a></td>"; echo "</tr>"; } while ($row = mysql_fetch_array($result)); ?> </tbody> </table> <?php } else { ?> <h3>No records in the table yet</h3> <?php } ?> <?php } elseif($view=="record"){ ?> <h3>Showing records for Visitor <?php echo $visitor_id; ?></h3> <p><a href="ip_report.php">back</a></p> <?php //show all pages for a single visitor if($row = mysql_fetch_array($result)){ ?> <table> <tbody> <tr> <th>Page viewed</th> <th>Query string</th> <th>Time of view</th> </tr> <?php do{ if($row["ip_total"] > 1) $ip_list = "More than 1"; else $ip_list = $row["ip_address_list"]; $time = display_date(strtotime($row["timestamp"])); echo "<tr>"; echo "<td>{$row["page_name"]}</td>"; echo "<td>{$row["query_string"]}</td>"; echo "<td>$time</td>"; echo "</tr>"; } while ($row = mysql_fetch_array($result)); ?> </tbody> </table> <?php } else { ?> <h3>No records for this visitor</h3> <?php } } elseif($view=="error") { ?> <h3>There was an error</h3> <?php echo $error; } ?> </body> </html> |
Have a look through the code above. You should be able to see variables being used that were set in the preceding PHP code, to control which page we view (whether it’s a page of all the available visitors, or a detailed view of one visitor’s session)
I’ve also created three test files in the zip package, to let you see how the file ip_tracker.php above should be used. You just need to add an include() call to the file to add it to each page, and then the tracking functionality should be present on all pages where the file is added.
e.g.
<?php include("ip_tracker.php"); ?> <h1>Page 1</h1> <p>This page is now being tracked with IP Tracker!</p>
To test the application, simply visit the test pages on your server and see how the report page at ip_report.php changes as you go. If you’re running the script on a local machine, you’ll probably see your IP show up as 127.0.0.1. From anywhere else, if the script is hosted online, you should see something different
To reset the session and see how the output changes when you start a new “visit”, open up session_end.php in your browser (included in the zip package), and this will destroy the current session, essentially making you a new visitor when you revisit the test pages.
session_end.php
<?php session_start(); //load up the current session session_destroy(); //destroys all information associated with the current session ?>
Download all the files used in this tutorial here
User tracking can be a very database intensive process, especially on busy sites, so don’t use this script on any sites where performance is important. This application serves only as a demonstration of the potential for PHP to perform visitor tracking and many improvements would need to be made before it could be used on data-heavy sites, but feel free to use the code on your own site, and modify or improve it as you feel necessary. Let me know if it’s been of any use to you!




September 18th, 2008 at 7:42 pm
would i be able to use this code for my site or is there restrictions? is this just a guide or could i use it for a site, i dont want to get into trouble by using this code.
September 18th, 2008 at 10:46 pm
Yes, feel free to use any of the code on the site in your own projects or applications. I’m glad it’s of use
September 19th, 2008 at 6:03 pm
This is really cool, i have a visitor tracker for my website, but this one just expanded my ideas, thanks a lot.
September 24th, 2008 at 7:49 am
Is there any way this can be used to track video views on UGC sites such as youtube, MetaCafe, etc.? Any advice would be helpful…thanks!
September 24th, 2008 at 12:15 pm
Thnx alot man, very helpfull
September 30th, 2008 at 4:13 pm
Thats amazing. Thats everything I need lol.
October 6th, 2008 at 7:42 pm
Hi,
i am building a site for the employment market.
now on the site there will be adverts from advertisers, and rss feeds. So to be revenue i will be trying put a plan of shared income with people.
but i have a problem.
I know from the script above you can definaely see how when and how much a visitor activities are and therefore see if they are being actually hinest doing proper clicks to sites or just hot clicking for revenue.
Now advertisers, very well know, people don’t click thousands of times everyday on a site.
So predicament one, is it possible through the ssesion and using the ip tracking cookie, I believe. To set a number of sites the visitor actually can do or set a cut off code here once the visitor has done [X] number of clicks, any done after do not count in income or restrict access to outside sites. IE when they click on the link it is made in active for 24 hours.
But any links that are connected to the site [Pages on the same server as the site], are allowd to be done.
The other thing which is puzzling many people is how to stop people sending traffic to the site in the form of automated traffic. I won’t list any because there are loads.
Maybe like a flag type statement which cuts off the connections from a incoming point after the page has been accessed [X] number of times.
Thanks for the above, as you are a star.
Malbanks
October 10th, 2008 at 6:22 am
Thank you!
November 17th, 2008 at 10:20 am
This is really very helpful for web development.It’ great…Thank you!
December 7th, 2008 at 10:45 pm
in the ip section all i get is “multiple address” instead of showing real ip address plzz help!! thanx