If you are preparing to do some serious analysis of your web server logs, it makes sense to transfer the logs to a database, so that you can easily extract information by running SQL queries. This article shows how to easily import IIS logs into MS SQL Server.
IIS logs each web server hit as a separate line in a log file. Information is logged in W3C Extended Log format, but the individual fields that are logged is configurable.
At the top of your log file will be a section that specifies the fields which are being logged:
#Software: Microsoft Internet Information Services 6.0
#Version: 1.0
#Date: 2008-09-11 14:00:00
#Fields: date time s-sitename s-ip cs-method cs-uri-stem cs-uri-query cs-username c-ip cs(User-Agent) cs(Cookie) cs(Referer) sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken
First, let's create a database to store our logs.
-- Create a database called "WebLogs"
IF EXISTS (SELECT *
FROM master..sysdatabases
WHERE name = N'WebLogs')
DROP DATABASE WebLogs
GO
CREATE DATABASE WebLogs
GO
Now let's create a table with columns that match every field of the log file.
-- Create a table for logs.
USE WebLogs;
CREATE TABLE [JDSWEB01] (
[date] [date] NULL,
[time] [time] NULL,
[s-sitename] [varchar] (32) NULL,
[s-ip] [varchar] (16) NULL,
[cs-method] [varchar] (8) NULL,
[cs-uri-stem] [varchar] (255) NULL,
[cs-uri-query] [varchar] (2048) NULL,
[cs-username] [varchar] (16) NULL,
[c-ip] [varchar] (16) NULL,
[cs(User-Agent)] [varchar] (1024) NULL,
[cs(Cookie)] [varchar] (2048) NULL,
[cs(Referer)] [varchar] (4096) NULL,
[sc-status] [int] NULL,
[sc-substatus] [int] NULL,
[sc-win32-status] [int] NULL,
[sc-bytes] [bigint] NULL,
[cs-bytes] [bigint] NULL,
[time-taken] [int] NULL
)
SQL Server has a neat feature where you can do a bulk import of data from a character-delimited text file (like a CSV file). The fields in the log file are separated with a single space character, so the import query becomes:
-- Import data from web server log into table
USE WebLogs;
BULK INSERT [JDSWEB01] FROM 'c:\temp\ex080912.log'
WITH (
FIELDTERMINATOR = ' ',
ROWTERMINATOR = '\n'
)
Now that you have imported all of your log data into SQL Server, you should read my Tech Tip on extracting useful information from it with SQL queries.
As a bonus for reading this far, here are some useful tips:
- Web server logs quickly become quite large. A popular website will generate gigabytes of logs per day.
- Fortunately, these log files contain a lot of redundant information, so they compress really well. 5 GB of log files may compress to 200 MB (a 25:1 ratio).
- Don't ever open a large log file in Notepad. Notepad is written in an inefficient way that will cause it to become unresponsive and use an entire CPU while the file is opening (which takes a long time with Notepad). Use a good text editor like TextPad instead.
- The time required for a BULK INSERT varies linearly with the number of lines in the log file. Expect a ballpark figure of 1 minute per gigabyte of log file.
- Remove the headers at the top of the log file to avoid errors like: Bulk insert data conversion error (type mismatch) for row 1, column 1 (date).
- The headers appear every time the web server is restarted, or when a new log file is created, so the headers may appear in the middle of a log file too.
- You may need to tweak the size of the VARCHARS for some fields if you get an error message like: Bulk insert data conversion error (truncation) for row 8394, column 10 (cs(User-Agent)).
- Don't make your VARCHARs too big because SQL Server has a limitation of 0860 bytes. If the total size of all your columns exceeds this, you will have problems: INSERT or UPDATE of a row in this table will fail if the resulting row length exceeds 8060 bytes.
- If you cannot get rid of truncation errors entirely (and still stay within the 8060 byte limit for a row), you can add MAXERRORS=somebignumber to the WITH clause of the BULK INSERT.
- Double check that your BULK INSERT worked correctly by counting the number of records in the table, and comparing with the number of lines in the log file.
- Use a datatype bigger than an int for any field that you will SUM() (such as [sc-bytes]) to avoid overflow errors like "Arithmetic overflow error converting expression to data type int."
- Don't bother using Log2SQL (a commercial software product), it tries to make all fields 255 characters (obvious truncation problems), and it is completely unnecessary because of SQL Server's BULK INSERT feature.
- Queries may take a long time if there is lots of data in the database. You can speed things up considerably if you remote any log entries for static content (*.gif, *.js etc). Note that this will make your hits per second and sent/received bytes incorrect, but (for me anyway) my key metric is pages rather than raw hits.
Tech tips from JDS

Browser Console
Read More

Glide Variables
Read More

Understanding Database Indexes in ServiceNow
Read More

Fast-track ServiceNow upgrades with Automated Testing Framework (ATF)
Read More

Read More

Splunk .conf18
Read More

ServiceNow Catalog Client Scripts: G_Form Clear Values
Read More

Is DevPerfOps a thing?
Read More

The benefits of performance testing with LoadRunner
Read More

Monitoring Atlassian Suite with AppDynamics
Read More

5 quick tips for customising your SAP data in Splunk
Read More

How to maintain versatility throughout your SAP lifecycle
Read More

How to revitalise your performance testing in SAP
Read More

Reserve and import data through Micro Focus ALM
Read More

How to effectively manage your CMDB in ServiceNow
Read More

ServiceNow and single sign-on
Read More

How to customise the ServiceNow Service Portal
Read More

Integrating a hand-signed signature to an Incident Form in ServiceNow
Read More

Integrating OMi (Operations Manager i) with ServiceNow
Read More

Implementing an electronic signature in ALM
Read More

Service portal simplicity
Read More

Learning from real-world cloud security crises
Read More

Static Variables and Pointers in ServiceNow
Read More

Citrix and web client engagement on an Enterprise system
Read More

Understanding outbound web services in ServiceNow
Read More

How to solve SSL 3 recording issues in HPE VuGen
Read More

How to record Angular JS Single Page Applications (SPA)
Read More

Calculating Pacing for Performance Tests
Read More

Vugen and GitHub Integration
Read More

What’s new in LoadRunner 12.53
Read More

Filtered Reference Fields in ServiceNow
Read More

ServiceNow performance testing tips
Read More

Monitor Dell Foglight Topology Churn with Splunk
Read More

Straight-Through Processing with ServiceNow
Read More

Splunk: Using Regex to Simplify Your Data
Read More

ServiceNow Choice List Dependencies
Read More

Tips for replaying RDP VuGen scripts in BSM or LoadRunner
Read More

Incorporating iSPI metric reports into MyBSM dashboard pages
Read More

Using SV contexts to simulate stored data
Read More

What’s new in LoadRunner 12.02
Read More

Recycle Bin for Quality Center
Read More

LoadRunner Correlation with web_reg_save_param_regexp
Read More

LoadRunner 11.52
Read More

QC for Testers – Quiz
Read More

Agile Performance Tuning with HP Diagnostics
Read More

What’s new in HP Service Virtualization 2.30
Read More

Understanding LoadRunner Virtual User Days (VUDs)
Read More

Problems recording HTTPS with VuGen
Read More

Improving the management and efficiency of QTP execution
Read More

Performance testing Oracle WebCenter with LoadRunner
Read More

Generating custom reports with Quality Center OTA using Python
Read More

Asynchronous Communication: Scripting For Cognos
Read More

How to fix common VuGen recording problems
Read More

Monitoring Active Directory accounts with HP BAC
Read More

URL Attachments in Quality Center
Read More

What’s new in LoadRunner 11.00?
Read More

Restore old License Usage stats after upgrading Quality Center
Read More

Changing LoadRunner/VuGen log options at runtime
Read More

Restricting large attachments in Quality Center
Read More

Retrieving Quality Center user login statistics
Read More

A comparison of open source load testing tools
...
Read More

Worst practices in performance testing
Read More

LoadRunner Sales Questions
Read More

LoadRunner Analysis: Hints and tips
Read More

LoadRunner in Windows 7
HP Loadrunner 11 is now available. This new version now natively supports Windows 7 and Windows Server 2008. I ...
Read More

Using the QuickTest Professional “commuter” license
Read More

Installing HP Diagnostics
Read More

Understanding LoadRunner licensing
Read More

VuGen scripting for YouTube video
Read More

Creating a Web + MMS vuser
Read More

Why you should use backwards dates
Read More

How to get the host’s IP address from within VuGen
Read More

VuGen scripting for BMC Remedy Action Request System 7.1
Read More

Unique usernames for BPM scripts
Read More

Mapping drives for LoadRunner Windows monitoring
Read More

VuGen feature requests
Read More

LoadRunner script completion checklist
Read More

Querying Quality Center user roles
Read More

Querying the Quality Center Database
Read More

HPSU 2009 Presentation – Performance Testing Web 2.0
Read More

Scaling HP Diagnostics
Read More

Global variables aren’t really global in LoadRunner
Read More

Client-side certificates for VuGen
Read More

Detect malicious HTML/JavaScript payloads with WebInspect (e.g. ASPROX, Gumblar, Income Iframe)
Read More

VuGen code snippets
Read More

Integrating QTP with Terminal Emulators
Read More

Why you must add try/catch blocks to Java-based BPM scripts
Read More

Querying a MySQL database with LoadRunner
Read More

ANZTB 2009 Presentation: Performance Testing Web 2.0
Read More

How to make QTP “analog mode” steps more reliable
Read More

Testing multiple browsers in a Standardized Operating Environment (SOE)
Read More

DNS-based load balancing for virtual users
Read More

What’s new in LoadRunner 9.50?
Read More

Calculating the difference between two dates or timestamps
Read More

The “is it done yet” loop
Read More

Think time that cannot be ignored
Read More

Understanding aggregate variance within LoadRunner analysis
Read More

Load balancing vusers without a load balancer
Read More

Harvesting file names with VuGen
Read More

Parameterising Unix/Posix timestamps in VuGen
Read More

HP Software trial license periods
Read More

How to handle HTTP POSTs with a changing number of name-value pairs
Read More

VuGen string comparison behaviour
Read More

Persistent data in VuGen with MySQL
Read More

How to write a Performance Test Plan
Read More

Unable to add virtual machine
To get ...
Read More

LoadRunner scripting languages
Read More

WDiff replacement for VuGen
Read More

Testing web services with a standard Web Vuser
Read More

Why your BPM scripts should use Download Filters
Read More

Querying your web server logs
Read More

Importing IIS Logs into SQL Server
Read More

QTP “Uninstall was not completed” problem
Read More

VuGen correlation for SAP Web Dynpro
Read More

How to save $500 on your HP software license
Read More

Testing and monitoring acronyms
Read More

Solving VuGen script generation errors
Read More

An introduction to SiteScope EMS Topology
Read More

Using the BAC JMX Console
Read More
This solution will probably not fit most users. BULK INSERT doesn’t’ have the ability to filter out rows based on string values. It only allows us to start the reading from specific line in the file. But IIS writes headers each time the application starts and not only in the beginning of the log file. You can check my blog for full solutions using Transact SQL and more important full explanation (I think it is full, but I am always open to get feedback, comments, and idea to improve it):
http://ariely.info/Blog/tabid/83/EntryId/212/Parse-and-import-IIS-log-files-using-Transact-SQL.aspx
Thanks, a simple way to import data into SQL. Now I can make easily statistics, about IIS use
Thank you for good information.
May I save on my blog?
Thanks.Good article. More useful….
Thanks.Good article. How this can be used to import Iplanet WebServer logs?