how to get redirect chain

how to get redirect chain

Wayne Xin
Hi,
 
I’m new to this list, and this may be an old topic. When I retrieve a page using HtmlUnit, there are usually several levels of redirection. I would like to get the full redirect chain. One suggestion is to override HttpWebConnection. However, all I have there is a WebRequest and a WebResponse.
 
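   @Override
   public WebResponse getResponse(WebRequest request) throws IOException {
       // Delegate to HttpWebConnection; only the request and the response are visible here.
       return super.getResponse(request);
   }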
 
I have 2 problems:
 
1.       It’s hard for me to tell whether the web request is for the original entry page or for a resource loaded while parsing some other page. I’m missing the “enclosing page” that triggered the HTTP connection.
2.       In the response, I could try to parse out the href. But if the redirect response is not well formed, how do I guarantee that the href I extract is the page the parser is actually going to visit?
 
Has anybody been successful in retrieving the redirect chain?
 
Thanks a lot.
 
-Wayne

 


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Htmlunit-user mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/htmlunit-user
Re: how to get redirect chain

asashour
Hi Wayne,

The logic is in WebClient.loadWebResponseFromWebConnection().

I am afraid there is currently no way to know exactly, except by copying the method into your code and doing the same checks.

Hope that helps,

Ahmed
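
The "do the same checking yourself" idea can be sketched without any HtmlUnit types. This is only an illustration under stated assumptions: RedirectChainTracer, Response, and the fetcher function are hypothetical names, not HtmlUnit API. With HtmlUnit, the fetcher could presumably be backed by a WebClient whose automatic redirects are turned off (getOptions().setRedirectEnabled(false)), reading WebResponse.getStatusCode() and getResponseHeaderValue("Location"). It covers only 30x redirects, not meta refresh or JavaScript redirects.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper (not an HtmlUnit class): follows 30x redirects by hand and records every hop. */
final class RedirectChainTracer {

    /** Stand-in for a response: the status code and the raw Location header (null if absent). */
    static final class Response {
        final int status;
        final String location;

        Response(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303 || status == 307 || status == 308;
    }

    /** Returns the start URL followed by every URL reached through a 30x response. */
    static List<URI> trace(URI start, Function<URI, Response> fetcher, int maxRedirects) {
        List<URI> chain = new ArrayList<>();
        chain.add(start);
        URI current = start;
        for (int i = 0; i < maxRedirects; i++) {
            Response response = fetcher.apply(current);
            if (!isRedirect(response.status) || response.location == null) {
                break; // final page, or a redirect without a target
            }
            try {
                // Location may be relative, so resolve it against the URL that was just requested.
                current = current.resolve(response.location.trim());
            } catch (IllegalArgumentException malformed) {
                break; // unparseable Location: stop rather than guess
            }
            chain.add(current);
        }
        return chain;
    }

    public static void main(String[] args) {
        // Fake server for the demo: /a -> /b -> http://example.org/c
        Map<String, Response> fake = Map.of(
                "http://example.com/a", new Response(302, "/b"),
                "http://example.com/b", new Response(301, "http://example.org/c"));
        Function<URI, Response> fetcher =
                url -> fake.getOrDefault(url.toString(), new Response(200, null));
        System.out.println(trace(URI.create("http://example.com/a"), fetcher, 20));
    }
}
```

Resolving each Location against the URL that was just requested handles relative targets, and stopping on an unparseable header avoids guessing where a malformed redirect would lead. The maxRedirects cap keeps a redirect loop from running forever.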



From: Wayne <[hidden email]>
To: "[hidden email]" <[hidden email]>
Sent: Thursday, March 30, 2017 8:14 PM
Subject: [Htmlunit-user] how to get redirect chain

Re: how to get redirect chain

Wayne Xin

Thank you very much, Ahmed. Reading loadWebResponseFromWebConnection() was helpful. It seems to cover only the 30x redirects. Where would I find the <meta http-equiv="refresh…"/> redirect and the JavaScript redirect (window.location)?

 

-Wayne

 

From: Ahmed Ashour <[hidden email]>
Reply-To: Ahmed Ashour <[hidden email]>, "[hidden email]" <[hidden email]>
Date: Thursday, March 30, 2017 at 4:04 PM
To: "[hidden email]" <[hidden email]>
Subject: Re: [Htmlunit-user] how to get redirect chain

 

Re: how to get redirect chain

asashour
Hi Wayne,

Check RefreshHandler and HtmlPage.executeRefreshIfNeeded().

I guess it would be easier for you to use the following workaround:

new Exception().getStackTrace()

Then see which StackTraceElement suits your need, e.g. is it called from Window.setLocation or from somewhere else?

Ahmed
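
The stack-trace workaround might look roughly like the sketch below. CallerCheck, calledFrom, and loadPage are hypothetical names introduced for illustration, and the nested Window class is only a stand-in for HtmlUnit's own Window. In real code the check would sit inside the overridden getResponse and look for whichever HtmlUnit frames matter to you.

```java
/** Hypothetical helper: reports whether the current call was reached through a given class and method. */
final class CallerCheck {

    /** Scans the current stack for a frame whose simple class name and method name both match. */
    static boolean calledFrom(String simpleClassName, String methodName) {
        for (StackTraceElement frame : new Exception().getStackTrace()) {
            String className = frame.getClassName();
            // Drop the package and any enclosing class so that only the simple name is compared.
            int cut = Math.max(className.lastIndexOf('.'), className.lastIndexOf('$'));
            String simple = className.substring(cut + 1);
            if (simple.equals(simpleClassName) && frame.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    /** Stand-in for the place where a request is issued, e.g. an overridden getResponse. */
    static boolean loadPage() {
        return calledFrom("Window", "setLocation");
    }

    /** Stand-in for a JavaScript-driven navigation. */
    static final class Window {
        static boolean setLocation() {
            return loadPage();
        }
    }

    public static void main(String[] args) {
        System.out.println("via Window.setLocation: " + Window.setLocation());
        System.out.println("direct call: " + loadPage());
    }
}
```

Matching on names is brittle, since it breaks if the library renames or restructures its internals, so this is best treated as a diagnostic aid rather than a stable contract.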


From: Wayne <[hidden email]>
To: Ahmed Ashour <[hidden email]>; "[hidden email]" <[hidden email]>
Sent: Friday, March 31, 2017 7:31 AM
Subject: Re: [Htmlunit-user] how to get redirect chain
