Running on Windows fails silently due to certificate verification problem #21

Closed
bakunin75 opened this issue Jan 7, 2020 · 8 comments

@bakunin75

Trying to get dwdweather 0.11.1 running on Python 3.7.4.

Problem:
Issuing the command
dwdweather weather 02667 20190717T11 --resolution hourly --categories air_temperature
leads to
2020-01-07 16:33:48,803 [dwdweather.client ] INFO : Requesting https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/recent
2020-01-07 16:33:48,928 [dwdweather.client ] WARNING: Station "2667" has no data for category "air_temperature"

The issue is also present when importing DwdWeather in Python using the minimal example in the README.

I tracked the problem to client.py. For some reason the find_resource_file function in get_measurements gets stuck in the try block, but no error is raised.

resource_list = self.get_resource_index(index_uri, "zip")
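
A symptom like this, a lookup inside a try block that quietly yields nothing, is what a broad except clause produces when the HTTP request underneath fails. The following is a minimal sketch and not the actual code from client.py: fetch_index, find_resource_quiet and find_resource_loud are hypothetical names, and the TLS failure is simulated so that the example runs without network access.

```python
import logging
import ssl

logger = logging.getLogger("sketch")


def fetch_index(uri):
    """Stand-in for an HTTP request; always fails like a bad TLS handshake."""
    raise ssl.SSLError("certificate verify failed")


def find_resource_quiet(uri):
    """Broad except: the TLS error vanishes and the caller only sees None."""
    try:
        return fetch_index(uri)
    except Exception:
        return None


def find_resource_loud(uri):
    """Same lookup, but the failure is logged and re-raised."""
    try:
        return fetch_index(uri)
    except ssl.SSLError as ex:
        logger.error("TLS problem while requesting %s: %s", uri, ex)
        raise
```

With the quiet variant, a caller cannot tell "this station has no data" apart from "the request never succeeded", which would be consistent with the warning in the log above.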

@amotl
Member

amotl commented Jan 7, 2020

Dear @bakunin75,

thanks for writing in.

We have been running dwdweather2 on Python 3.7.4 and it has worked well so far, see #7 (comment). Invoking the command you outlined above gives us:

$ dwdweather weather 02667 20190717T11 --resolution hourly --categories air_temperature
2020-01-07 23:00:43,306 [dwdweather.client   ] INFO   : Acquiring dataset for resolution "hourly" from "https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly"
2020-01-07 23:00:43,309 [dwdweather.core     ] INFO   : Using cache database /Users/amo/.dwd-weather/dwdweather2.db
2020-01-07 23:00:43,310 [dwdweather.commands ] INFO   : Querying data for station "2667" and categories "['air_temperature']" at "2019-07-17 11:00:00"
2020-01-07 23:00:43,315 [dwdweather.core     ] INFO   : Downloading measurements for station 2667 and timeranges ['recent']
2020-01-07 23:00:43,315 [dwdweather.core     ] INFO   : Station information: null
2020-01-07 23:00:43,315 [dwdweather.core     ] INFO   : Downloading "air temperature" data (TU)
2020-01-07 23:00:43,315 [dwdweather.client   ] INFO   : Requesting https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/recent
2020-01-07 23:00:43,541 [dwdweather.client   ] INFO   : Fetching resource https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/recent/stundenwerte_TU_02667_akt.zip
2020-01-07 23:00:43,590 [dwdweather.client   ] INFO   : Reading from Zip: produkt_tu_stunde_20180706_20200106_02667.txt
2020-01-07 23:00:43,595 [dwdweather.core     ] INFO   : Importing measurements for station "2667" and category "{'key': 'TU', 'name': 'air_temperature'}"
2020-01-07 23:00:43,595 [dwdweather.core     ] INFO   : Importing "air temperature" data from "https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/recent/stundenwerte_TU_02667_akt.zip/produkt_tu_stunde_20180706_20200106_02667.txt"
100%|██████████████████████████████████| 13202/13202 [00:04<00:00, 2701.92it/s]
{
    "airtemp_humidity": 68.0,
    "airtemp_quality_level": 3,
    "airtemp_temperature": 17.7,
    "cloudiness_quality_level": null,
    "cloudiness_source": null,
    "cloudiness_total_cover": null,
    "datetime": 2019071711,
    "precipitation_fallen": null,
    "precipitation_form": null,
    "precipitation_height": null,
    "precipitation_quality_level": null,
    "pressure_normalized": null,
    "pressure_quality_level": null,
    "pressure_station": null,
    "soiltemp_quality_level": null,
    "soiltemp_temperature_002": null,
    "soiltemp_temperature_005": null,
    "soiltemp_temperature_010": null,
    "soiltemp_temperature_020": null,
    "soiltemp_temperature_050": null,
    "soiltemp_temperature_100": null,
    "solar_atmosphere": null,
    "solar_duration": null,
    "solar_end_of_interval": null,
    "solar_global": null,
    "solar_quality_level": null,
    "solar_sky": null,
    "solar_zenith": null,
    "station_id": 2667,
    "sun_duration": null,
    "sun_quality_level": null,
    "visibility_quality_level": null,
    "visibility_source": null,
    "visibility_value": null,
    "wind_direction": null,
    "wind_quality_level": null,
    "wind_speed": null
}

You might want to add the --reset-cache option or drop the cache database manually in order to check whether the cache has anything to do with this.
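
Dropping the cache database manually amounts to deleting the SQLite file named in the log output above. A minimal sketch, assuming the file lives at ~/.dwd-weather/dwdweather2.db as in that log line (the location on other systems, Windows in particular, may differ). drop_cache is a hypothetical helper, not part of dwdweather.

```python
from pathlib import Path


def drop_cache(path=None):
    """Delete the cache database; return True if a file was removed."""
    if path is None:
        # Assumption: default location taken from the log output above.
        path = Path.home() / ".dwd-weather" / "dwdweather2.db"
    path = Path(path)
    if path.is_file():
        path.unlink()
        return True
    return False
```

Calling drop_cache() without arguments targets the assumed default location; the next dwdweather run would then rebuild the database from scratch.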

With kind regards,
Andreas.

@amotl
Member

amotl commented Jan 7, 2020

> I tracked the problem to client.py. For some reason the find_resource_file function in get_measurements gets stuck in the try block, but no error is raised.

This observation could also indicate network connectivity problems.

@bakunin75
Author

Thanks for the replies. I'm curious, which OS are you running on?

Upon further investigation, I tracked the problem to the following call:
response = self.http.get(uri + u'/')

I created a minimal example (Windows).

from requests_cache import CachedSession
from bs4 import BeautifulSoup
import os

APP_NAME = "dwdweather2"
APP_VERSION = "0.11.1"

cache_name = os.path.join(os.getenv("APPDATA"),"dwdcache", "dwd_cache")

http = CachedSession(
    backend="sqlite",
    cache_name=cache_name,
    expire_after=300,
    user_agent=APP_NAME + "/" + APP_VERSION,
)

baseurl = "https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/air_temperature/recent"
extension = "zip"
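# verify=True is the requests default; it is spelled out because this call is the one that fails.
response = http.get(baseurl + "/", verify=True)
content = response.content

# Collect the links to all zip files from the directory listing.
soup = BeautifulSoup(content, "html.parser")
ret_list = [
    baseurl + "/" + node.get("href")
    for node in soup.find_all("a")
    if node.get("href", "").endswith(extension)
]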

This one throws
OpenSSL.SSL.Error: [('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')]
With verify=False (which you wouldn't want), the request runs through and yields the expected list of zip files.
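
To narrow down whether the local trust store is involved, the handshake can be repeated with the standard library alone. This is a sketch under assumptions: trust_store_summary and check_tls are hypothetical helpers, and the ssl module may consult a different set of CA certificates than requests or pyOpenSSL do, so a success here does not rule out a problem there.

```python
import socket
import ssl


def trust_store_summary():
    """Return where the ssl module looks for CA certificates by default."""
    paths = ssl.get_default_verify_paths()
    return {"cafile": paths.cafile, "capath": paths.capath}


def check_tls(host, port=443, timeout=10):
    """Attempt a verified TLS handshake; return (ok, detail)."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                # On success, report who issued the server certificate.
                issuer = dict(item[0] for item in tls.getpeercert()["issuer"])
                return True, issuer.get("organizationName", "unknown issuer")
    except ssl.SSLCertVerificationError as ex:
        # Verification failed: report the reason given by OpenSSL.
        return False, ex.verify_message
    except OSError as ex:
        # Anything else, e.g. no route, refused connection or timeout.
        return False, str(ex)
```

Running check_tls("opendata.dwd.de") on the affected machine and on a working one, and comparing the reported issuer, can hint at whether something in between is re-signing the traffic.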

@amotl
Member

amotl commented Jan 8, 2020

> I'm curious, which OS are you running on?

I am running macOS 10.13.6.

> certificate verify failed

Strange thing. Maybe some CA certificates are not properly installed on your machine or it is really about connectivity woes on your side. Will you be able to check using a different internet uplink if you get the chance to?

> With verify=False (which you wouldn't want), the request runs through.

If nothing else helps you or other users on Windows, I might actually consider setting verify=False, if nobody objects.

@amotl changed the title on Jan 8, 2020: DwdWeather python3 -> Running on Windows croaks with "certificate verify failed"
@amotl changed the title on Jan 8, 2020: Running on Windows croaks with "certificate verify failed" -> Running on Windows fails silently

@bakunin75
Author

bakunin75 commented Jan 8, 2020

> Strange thing. Maybe some CA certificates are not properly installed on your machine or it is really about connectivity woes on your side. Will you be able to check using a different internet uplink if you get the chance to?

I will try that, but I probably won't get the chance until the weekend. It's possible that a company firewall or proxy is causing this problem.

If not, this issue may be linked to pyca/pyopenssl#823 (see first reply), but I can't work that into my minimal example. I wonder whether anyone has ever run this module successfully on Windows before.

@bakunin75
Author

bakunin75 commented Jan 8, 2020

By the way, the dwdbulk package mentioned in #22 (comment) suffers from the same SSL issue on Windows (which it does not support anyway).

I've tested both modules on a Linux VM and both work fine, so I probably won't investigate further on the Windows front and will just stick to Linux.

@amotl
Member

amotl commented May 30, 2020

Dear @bakunin75,

thanks for your answer. While I was thinking about closing this issue, I believe we should keep it open for a while. The program really should not silently swallow the certificate verification error.

For all others running into the same thing: As requests_cache's CachedSession does not accept the verify argument, you might want to set it at runtime within DwdCdcClient.setup_cache like

self.http.verify = False

in order to work around that problem.
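
The verify attribute of a requests session accepts more than a boolean: it can also hold the path to a CA bundle file, which keeps verification enabled when, for example, a company proxy re-signs TLS traffic. A minimal sketch, assuming a session object with a verify attribute such as the one held in self.http. configure_verification and any bundle path passed to it are hypothetical.

```python
import warnings


def configure_verification(session, ca_bundle=None, insecure=False):
    """Set session.verify, preferring a CA bundle over switching checks off."""
    if ca_bundle is not None:
        # Verification stays on, but against the given CA bundle file.
        session.verify = str(ca_bundle)
    elif insecure:
        # Last resort: any server certificate is accepted.
        warnings.warn("TLS certificate verification is disabled")
        session.verify = False
    else:
        # Default behaviour of requests.
        session.verify = True
    return session
```

Within setup_cache this could be called as configure_verification(self.http, ca_bundle=path_to_bundle), where path_to_bundle points at a PEM file containing the certificate authority that signs the intercepted traffic.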

With kind regards,
Andreas.

@amotl changed the title on May 30, 2020: Running on Windows fails silently -> Running on Windows fails silently due to certificate verification problem

@amotl
Member

amotl commented Jul 5, 2020

> So I probably won't investigate further on the Windows front and will just stick to Linux.

All right, thanks!

@amotl closed this as completed on Jul 5, 2020