How to set proxy authorization with username/password in an Ubuntu Server 18.04 environment #132

Open
luorixiangyang opened this issue Jun 24, 2023 · 8 comments


@luorixiangyang

How do I set up proxy authorization with a username and password in an Ubuntu Server 18.04 environment?
I found lots of examples, but none of them solves my requirement, which is to scrape a web page like https://developer.apple.com/documentation/accelerate/bnns/shape/3656199-init

thanks!

@rbri
Collaborator

rbri commented Jun 24, 2023

Looks like there is something missing ;-) will have a deeper look

@rbri
Collaborator

rbri commented Jun 24, 2023

As a workaround can you please try something like

String PROXY_HOST = ....;
int PROXY_PORT = .....;

WebDriver webDriver = new HtmlUnitDriver(BrowserVersion.FIREFOX, true) {
    @Override
    protected WebClient modifyWebClient(WebClient client) {
        final WebClient webClient = super.modifyWebClient(client);

        webClient.getOptions().setProxyConfig(new ProxyConfig(PROXY_HOST, PROXY_PORT, null));
        final DefaultCredentialsProvider credentialsProvider = (DefaultCredentialsProvider) webClient.getCredentialsProvider();
        credentialsProvider.addCredentials("username", "password", PROXY_HOST, PROXY_PORT);

        return webClient;
    }
};
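
To check the proxy credentials on their own, outside HtmlUnit, a minimal JDK-only sketch may help. It assumes the proxy uses the HTTP Basic scheme, which this thread does not confirm. The class name ProxyAuthSketch, its helper methods, and the host, port, and credentials shown are all hypothetical placeholders, not part of HtmlUnit or Selenium.

```java
import java.net.Authenticator;
import java.net.InetSocketAddress;
import java.net.PasswordAuthentication;
import java.net.Proxy;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class; uses only the JDK and makes no network calls.
final class ProxyAuthSketch {

    // Builds the value of a Proxy-Authorization header for the HTTP Basic scheme
    // (assumption: the proxy accepts Basic authentication).
    static String basicProxyAuthorization(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder().encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    // Describes an HTTP proxy endpoint without resolving or contacting the host.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    // Authenticator that answers proxy challenges only and ignores server challenges.
    static Authenticator proxyAuthenticator(final String user, final String password) {
        return new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                if (getRequestorType() == RequestorType.PROXY) {
                    return new PasswordAuthentication(user, password.toCharArray());
                }
                return null;
            }
        };
    }

    public static void main(String[] args) {
        // Placeholder values; replace them with the real proxy settings.
        Authenticator.setDefault(proxyAuthenticator("user", "pass"));
        Proxy proxy = httpProxy("proxy.example.com", 3128);
        System.out.println(proxy.type());
        System.out.println(basicProxyAuthorization("user", "pass"));
    }
}
```

A connection opened with url.openConnection(proxy) would then consult the default Authenticator when the proxy asks for credentials. As far as I know, JDK 8 updates from 8u111 onward disable Basic authentication for HTTPS tunnelling by default through the jdk.http.auth.tunneling.disabledSchemes system property, so a check against an https target may need that property adjusted.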

@luorixiangyang
Author

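Here are the details.
The pom.xml dependencies are as follows:

...
<dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>4.10.0</version>
</dependency>
<dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>htmlunit-driver</artifactId>
    <version>4.10.0</version>
</dependency>
...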
The source code is as follows:

public static WebDriver createProxyWebDriver() {
    String PROXY_HOST = ProxyHost;
    int PROXY_PORT = ProxyPort;

    // configure the WebDriver to use the proxy
    WebDriver webDriver = new HtmlUnitDriver(BrowserVersion.FIREFOX, true) {
        @Override
        protected WebClient modifyWebClient(WebClient client) {
            final WebClient webClient = super.modifyWebClient(client);

            webClient.getOptions().setProxyConfig(new ProxyConfig(PROXY_HOST, PROXY_PORT, null));
            final DefaultCredentialsProvider credentialsProvider = (DefaultCredentialsProvider) webClient
                    .getCredentialsProvider();
            credentialsProvider.addCredentials(ProxyUser, ProxyPass, PROXY_HOST, PROXY_PORT, null);

            return webClient;
        }
    };
    return webDriver;
}

public static String getPageOnDynamicWeb(String url) {
    WebDriver client = createProxyWebDriver();
    client.get(url);
    String response = client.getPageSource();
    client.close();
    return response;
}

public static void main(String[] args) throws Exception {
    String response = "";
    String url = "https://developer.apple.com/documentation/accelerate/bnns/shape/3656199-init"; // the target url
    response = getPageOnDynamicWeb(url);
    ClearInnerToWriteFile(
            "/home/luori/_fly/workspaces/javaworkspace/selenium-base/logs/apple_api_page_html.html",
            response);
}

Running the code above throws an exception like the following:
......
Caused by: net.sourceforge.htmlunit.corejs.javascript.EvaluatorException: invalid property id (https://developer.apple.com/tutorials/js/chunk-vendors.fc64ed7e.js#10)
at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory$HtmlUnitErrorReporter.error(HtmlUnitContextFactory.java:435)
at net.sourceforge.htmlunit.corejs.javascript.Parser.addError(Parser.java:257)
at net.sourceforge.htmlunit.corejs.javascript.Parser.reportError(Parser.java:336)
at net.sourceforge.htmlunit.corejs.javascript.Parser.reportError(Parser.java:327)
at net.sourceforge.htmlunit.corejs.javascript.Parser.reportError(Parser.java:320)
at net.sourceforge.htmlunit.corejs.javascript.Parser.objectLiteral(Parser.java:3499)
......

@luorixiangyang
Author

Environment:
Ubuntu Server 18.04
Google Chrome: 114.0.5735.133
ChromeDriver: 114.0.5735.90
JDK: 1.8.0_271

@luorixiangyang
Author

Please use the target URL https://developer.apple.com/documentation/accelerate/bnns/shape/3656199-init to test the correct approach.
Thanks!

@luorixiangyang
Author

I need to point out that https://developer.apple.com/documentation/accelerate/bnns/shape/3656199-init serves dynamic web content, so its JavaScript files must be executed during scraping. I can get the static web content but cannot capture the dynamic parts.

@rbri
Collaborator

rbri commented Jun 25, 2023

Had a deeper look and there are several problems with this page. Long story short - HtmlUnit does not support the whole modern JavaScript syntax (because it is based on Rhino). We are working on improving this, but I fear there will be no real progress until the end of this year.

Two options: help us improve Rhino, or use Selenium with real browsers.

@luorixiangyang
Author

Got it! I will also check whether I can contribute to HtmlUnit to help with this issue.
