LoadRunner as a WebCrawler

Today I had the opportunity to use LoadRunner for a task completely different from what it was designed for.

The project I’m working on wanted a way to “warm up the caches” before the site goes live, so we needed something that crawls the production site at a specific time and speed, excludes a set of links, and goes only 3 levels deep.
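As an illustration only, the "exclude a set of links" and "3 levels deep" rules could be expressed in plain C roughly as below. This is a sketch, not the code from the script: the names MAX_DEPTH, excluded_fragments, is_excluded() and should_crawl() are hypothetical, and the excluded fragments are placeholders.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 3  /* the requirement above: only go 3 levels deep */

/* Hypothetical exclusion list; these fragments are placeholders, not from the post. */
static const char *const excluded_fragments[] = { "/logout", "/admin", "mailto:" };

/* Returns 1 if the URL contains any excluded fragment, 0 otherwise. */
static int is_excluded(const char *url)
{
    size_t i;
    size_t n = sizeof(excluded_fragments) / sizeof(excluded_fragments[0]);

    for (i = 0; i < n; i++) {
        if (strstr(url, excluded_fragments[i]) != NULL)
            return 1;
    }
    return 0;
}

/* Returns 1 if a link found at the given depth (1-based) should be requested. */
static int should_crawl(const char *url, int depth)
{
    if (url == NULL || url[0] == '\0')
        return 0;
    if (depth < 1 || depth > MAX_DEPTH)
        return 0;
    return !is_excluded(url);
}
```

A substring match is the simplest possible rule; a real exclusion list might need prefix matching or patterns instead.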

The resulting script was very easy to write once the initial think-through was done. The code is split into three simple sub-functions: Process_Level1(), Process_Level2() and Process_Level3(). Each function requests the set of URLs captured at the previous processing level.
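Since the three functions differ only in which captured list they read, the parameter names can be built from the level number. With "Ord=All", web_reg_save_param stores the matches as NAME_1, NAME_2, ... plus NAME_count. The helpers below are a hypothetical sketch in standard C (url_param_name() and url_count_name() are not part of the script or of LoadRunner) showing how those references could be formatted.

```c
#include <stdio.h>

/*
 * Writes the parameter reference for one captured URL, e.g.
 * level 2, index 5 -> "{URL_LIST2_5}".
 * Returns 0 on success, -1 if buf is too small.
 */
static int url_param_name(char *buf, size_t buf_size, int level, int index)
{
    int n = snprintf(buf, buf_size, "{URL_LIST%d_%d}", level, index);
    return (n < 0 || (size_t)n >= buf_size) ? -1 : 0;
}

/*
 * Writes the reference for the match counter, e.g. level 2 -> "{URL_LIST2_count}".
 * Returns 0 on success, -1 if buf is too small.
 */
static int url_count_name(char *buf, size_t buf_size, int level)
{
    int n = snprintf(buf, buf_size, "{URL_LIST%d_count}", level);
    return (n < 0 || (size_t)n >= buf_size) ? -1 : 0;
}
```

In a script, the resulting string would be passed to lr_eval_string() to read the captured value.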

The following example is a snippet of the code that I used to crawl the site. {BaseURL} is the protocol://domain:port of the site in question.

Note: In the following example it’s assumed that “URL_LIST1” contains all the URLs we want to process in level 1.

Example code

void Process_Level1()
{
	int i;
	char buf[2048];
	char buf2[2048];
	char *pos;
	int res;
	int count;

	count = atoi(lr_eval_string("{URL_LIST1_count}"));

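	if (count > 0)
	for (i = 1; i <= count; i++)
	{
		// NOTE: the lines from here down to "res++" were lost from the original
		// listing and are reconstructed; the "/logout" exclusion is a placeholder.
		sprintf(buf, "{URL_LIST1_%d}", i);
		strncpy(buf2, lr_eval_string(buf), sizeof(buf2) - 1);
		buf2[sizeof(buf2) - 1] = '\0';

		res = 0;
		pos = (char *)strstr(buf2, "/logout");
		if (pos != NULL) res++;

		if (res == 0)
		{
			lr_save_string( lr_eval_string(buf), "URL" );

			// Replace &amp; with & - NONSTANDARD FUNCTION
			lr_replace( "URL", "&amp;", "&" );

			web_reg_save_param("URL_LIST2", // save all href="" URL's
							   "LB=href=\"",
							   "RB=\"",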
							   "Ord=All",
							   "Search=Body",
							   "NotFound=Warning",
							   LAST );

			web_url("URL", 
				"URL={BaseURL}{URL}", 
				"TargetFrame=", 
				"Resource=0", 
				"RecContentType=text/html", 
				"Mode=HTML", 
				LAST);
 
			// Process all "URL_LIST2" entries
			Process_Level2();
		}
	}
}

Please note that the “lr_replace()” function is not a standard LoadRunner function.
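For readers who do not have such a helper, the core of a replace function can be sketched in standard C as follows. This is not the lr_replace() used above, whose source is not shown here; replace_all() is a hypothetical name, and it works on plain strings rather than on LoadRunner parameters.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copies src into dst, replacing every occurrence of `from` with `to`.
 * Returns 0 on success, -1 if `from` is empty or dst is too small
 * (the contents of dst are unspecified on failure).
 */
static int replace_all(const char *src, const char *from, const char *to,
                       char *dst, size_t dst_size)
{
    size_t from_len = strlen(from);
    size_t to_len = strlen(to);
    size_t out = 0;

    if (from_len == 0 || dst_size == 0)
        return -1;

    while (*src != '\0') {
        if (strncmp(src, from, from_len) == 0) {
            /* Match: emit the replacement, keeping room for the terminator. */
            if (out + to_len >= dst_size)
                return -1;
            memcpy(dst + out, to, to_len);
            out += to_len;
            src += from_len;
        } else {
            /* No match: copy one character through unchanged. */
            if (out + 1 >= dst_size)
                return -1;
            dst[out++] = *src++;
        }
    }
    dst[out] = '\0';
    return 0;
}
```

Inside a LoadRunner script, a wrapper along these lines would presumably read the parameter with lr_eval_string(), run the replacement into a local buffer, and store the result back with lr_save_string().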


3 thoughts on “LoadRunner as a WebCrawler”

  1. Kim,

    Thank you for providing this code. It helped me load test our website by crawling the site and visiting every page. My colleague and I enhanced the script to recursively crawl the site and maintain a list of visited URLs so that it visits each page only once. Here’s the script for those who need this functionality:
    ———————————————————
    char **myList;
    int numListElements = 0;
    int listSize = 1;

    Action()
    {

    web_reg_save_param("URL_LIST1",
    "LB=href=\"",
    "RB=\"",
    "Ord=All",
    "Search=Body",
    "NotFound=Warning",
    LAST );

    web_url("Home Page",
    "URL={BaseURL}",
    "TargetFrame=",
    "Resource=0",
    "RecContentType=text/html",
    "Referer=",
    "Snapshot=t1.inf",
    "Mode=HTML",
    LAST);

    Process_URLs(1);

    free(myList);
    myList = 0;
    numListElements = 0;
    listSize = 1;

    return 0;
    }

    Process_URLs(int index)
    {
    int i;
    int nextIndex;
    char listName[255];
    char listCountParamName[255];
    char listItemParamName[255];
    int count;
    int res_count;
    char *resourceName;

    nextIndex = (index + 1);

    sprintf(listCountParamName, "{URL_LIST%d_count}", index);
    count = atoi(lr_eval_string(listCountParamName));

    if (count > 0){
    for (i = 1; i <= count; i++){
    sprintf(listItemParamName, "{URL_LIST%d_%d}", index, i);

    lr_save_string(lr_eval_string(listItemParamName), "URL");

    if (isItemInList(lr_eval_string("{URL}")) == 0) {

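    /* Copy the URL into its own buffer: sizeof() on a pointer gives the pointer size, and the lr_eval_string() result must not be stored directly. */
    char *str = (char *)malloc(strlen(lr_eval_string("{URL}")) + 1);
    strcpy(str, lr_eval_string("{URL}"));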
    addItemToList(str);

    sprintf(listName, "URL_LIST%d", nextIndex);

    web_reg_save_param(listName,
    "LB=href=\"",
    "RB=\"",
    "Ord=All",
    "Search=Body",
    "NotFound=Warning",
    LAST );

    resourceName = (char *) strrchr(lr_eval_string("{URL}"), '/');

    web_url(resourceName,
    "URL={BaseURL}{URL}",
    "TargetFrame=",
    "Resource=0",
    "RecContentType=text/html",
    "Mode=HTML",
    LAST);

    Process_URLs(nextIndex);

    }
    }
    }
    }

    void addItemToList(char *item) {
    char **newList;
    int i;

    if (!myList) {
    myList = (char **) malloc(listSize * sizeof(char *));
    }

    if (++numListElements > listSize) {
    newList = (char**) malloc(listSize * 2 * sizeof(char *));
    for (i = 0; i < listSize; ++i) {
    newList[i] = myList[i];
    }
    listSize *= 2;
    free(myList);
    myList = newList;
    }

    myList[numListElements - 1] = item;
    }

    int isItemInList(char *item) {
    int i;

    for (i = 0; i < numListElements; ++i) {
    if (!strcmp(item, myList[i])) {
    return 1;
    }
    }

    return 0;
    }

    void printList() {
    int i;

    for (i = 0; i < numListElements; ++i) {
    lr_output_message("%s", myList[i]);
    }
    }
    ———————————————————

  2. Hey I was just wondering if you could please post the code for the other two processes, as I’m really interested in seeing this work.

    cheers 🙂

    • Hi, the other two functions are the same as the first one; they just process a different URL list. Level 2 uses URL_LIST2 and level 3 uses URL_LIST3.
