
Decoding stores that were recently encrypted by Yahoo! Finance #953

Open
wants to merge 6 commits into
base: main
from

Conversation


@raphi6 raphi6 commented Dec 18, 2022

Sorry for any inconvenience. I am new to working with git in such a
professional manner, so expect errors with this pull request.

Changes:

In pandas_datareader/yahoo/daily.py:

I have added function decrypt_cryptojs_aes() to decode the
stores that were previously not allowing any stock data to be 
accessed from Yahoo! Finance due to their new change.

Additionally, I changed _read_one_data() so that it reads the
decoded stores and passes on stock data correctly.
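
For readers unfamiliar with what a function like decrypt_cryptojs_aes() has to do, here is a minimal sketch. It assumes the payload uses the OpenSSL-compatible "Salted__" format that CryptoJS produces by default when given a passphrase. The PR's actual code is not shown in this thread, so the names `evp_bytes_to_key` and `split_salted_payload` and the demo password are hypothetical. Only the key derivation and header parsing are shown, using the standard library.

```python
import hashlib
from base64 import b64decode, b64encode


def evp_bytes_to_key(password: bytes, salt: bytes, key_len: int = 32, iv_len: int = 16):
    """Derive (key, iv) as OpenSSL's EVP_BytesToKey does with MD5 and one iteration."""
    derived = b""
    block = b""
    while len(derived) < key_len + iv_len:
        # Each round hashes the previous digest together with the password and salt.
        block = hashlib.md5(block + password + salt).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]


def split_salted_payload(encoded: str):
    """Split a base64 'Salted__' payload into (salt, ciphertext)."""
    raw = b64decode(encoded)
    if raw[:8] != b"Salted__":
        raise ValueError("payload does not start with the 'Salted__' header")
    return raw[8:16], raw[16:]


# Demo on a synthetic payload (assumption: not real Yahoo! Finance data).
demo = b64encode(b"Salted__" + b"12345678" + b"\x00" * 16).decode()
salt, ciphertext = split_salted_payload(demo)
key, iv = evp_bytes_to_key(b"hypothetical-password", salt)
```

With pycryptodome installed, the remaining step would be `AES.new(key, AES.MODE_CBC, iv=iv).decrypt(ciphertext)` (from `Crypto.Cipher`), followed by `unpad(plaintext, 16, style="pkcs7")` (from `Crypto.Util.Padding`), which is the call that appears in the tracebacks later in this thread.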

I have tested this on a limited number of stocks in my personal
project and it works well.
I ran test_yahoo.py: 16 tests passed and 4 failed. However, that
is still more passing tests than the current version on GitHub, given
Yahoo! Finance's new change (I don't think any Yahoo! stocks work at the moment).
I am unsure why those tests are failing, so some help would be great.
I am sure this can be used as a temporary fix!

I also don't know how to run the 3rd and 4th bullet points below.

…m Yahoo! Finance as encrypted due to a recent change in their API (around 2 days ago). Fix decodes the data in decrypt_cryptojs_aes() and makes small changes to _read_one_data() to parse it properly later.
@CKDarling

Please make this PR a priority. The Yahoo API is entirely bricked across related packages.

@satoshi

satoshi commented Dec 23, 2022

I think we need to update requirements.txt with packaging and pycryptodome. Some folks also mentioned pycryptodomex, but I didn't need that package in my testing.

Updating with recommendations from @satoshi
@raphi6
Author

raphi6 commented Dec 25, 2022

I think we need to update requirements.txt with packaging and pycryptodome. Some folks also mentioned pycryptodomex, but I didn't need that package in my testing.

Thanks, just updated it now.

@mariamragab

It seems your main failures in the Azure DevOps logs are that you are failing both the linter (flake8) and the formatter (black) checks.

To pass the formatter, in your project root directory, run:
black pandas_datareader
then run
black --check pandas_datareader
to check that worked and commit your changes.

To pass the linter, in your project root directory, run
git diff upstream/master -u -- "*.py" | flake8 --diff
Then you will have to fix the issues manually. Once you do, run it again to confirm you didn't miss anything.

Finally, commit your changes! Good luck :)

@sangar3

sangar3 commented Dec 28, 2022

Can't wait till this request gets committed! I use a data reader a lot and this error is causing a lot of problems in my code.
thank you for your work @raphi6

@raphi6
Author

raphi6 commented Dec 28, 2022

It seems your main failures in the Azure DevOps logs are that you are failing both the linter (flake8) and the formatter (black) checks.

To pass the formatter, in your project root directory, run: black pandas_datareader then run black --check pandas_datareader to check that worked and commit your changes.

To pass the linter, in your project root directory, run git diff upstream/master -u -- "*.py" | flake8 --diff Then you will have to fix the issues manually. Once you do, run it again to confirm you didn't miss anything.

Finally, commit your changes! Good luck :)

I meant to say I changed 3 files in the above commit. I also noticed that it removed some 'u's from pandas_datareader/tests/io/test_jsdmx.py and pandas_datareader/tests/yahoo/test_options.py,
and I have no idea what that does or whether it breaks anything.

But now I am struggling with the second command using flake8. I installed it with pip and tried to run the above and get the error message:
fatal: bad revision 'upstream/master'
usage: flake8 [options] file file ...
flake8: error: unrecognized arguments: --diff

I have been trying to understand what the problem is, but I'm completely unfamiliar with git diff and flake8.
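
A note on the two messages above: `fatal: bad revision 'upstream/master'` means git could not resolve a ref by that name, which typically happens when no remote called `upstream` has been added and fetched, or when the branch has a different name (this PR targets `main`). The flake8 message is consistent with flake8 6.0 and later, where the `--diff` option was removed. A rough alternative, sketched below, is to list the changed Python files and pass them to flake8 directly. The function names and the default `base_ref` are assumptions for illustration, not part of this project's tooling.

```python
import subprocess
import sys
from typing import List


def changed_python_files(name_only_output: str) -> List[str]:
    """Keep only the .py paths from `git diff --name-only` output."""
    paths = (line.strip() for line in name_only_output.splitlines())
    return [path for path in paths if path.endswith(".py")]


def lint_changed_files(base_ref: str = "upstream/main") -> int:
    """Run flake8 on the Python files that differ from base_ref (assumed ref name)."""
    diff = subprocess.run(
        # --diff-filter=d leaves out deleted files, which flake8 could not open.
        ["git", "diff", "--name-only", "--diff-filter=d", base_ref],
        capture_output=True,
        text=True,
        check=True,
    )
    files = changed_python_files(diff.stdout)
    if not files:
        return 0
    return subprocess.run([sys.executable, "-m", "flake8", *files]).returncode
```

Calling `lint_changed_files()` from the repository root returns flake8's exit code. It still needs the base ref to exist locally, for example after adding the main repository as a remote and fetching it.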

@raphi6
Author

raphi6 commented Dec 28, 2022

Can't wait till this request gets committed! I use a data reader a lot and this error is causing a lot of problems in my code. thank you for your work @raphi6

Me too, my dissertation is using this library and it won't work until this gets accepted :) I have already submitted

@spot92

spot92 commented Jan 5, 2023

Has anyone reached out to get this merged? And does it work on Windows?

@robliou

robliou commented Jan 5, 2023

Has anyone reached out to get this merged? And does it work on Windows?

Asking the same question here. I'm trying to use Tiingo instead but that doesn't seem to be working either?

@raphi6
Author

raphi6 commented Jan 5, 2023

Has anyone reached out to get this merged? And does it work on Windows?

I've emailed @bashtage a couple of times with no luck; I believe he is the only one who can merge. Also, I'm on Windows 10 and it works.
Maybe you can try contacting him as well, @robliou?

@spot92

spot92 commented Jan 5, 2023

If you've already emailed (or contacted on github or whatever) him/her, then all there is to do is wait

@raphi6
Author

raphi6 commented Jan 12, 2023

Sometimes I get the following error :
If anyone could help out that would be great

"""
Traceback (most recent call last):
File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 406, in <module>
Backtest().range_of_days()
File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 392, in range_of_days
var = VaR(stock_list, temp_start, temp_end, weights, alpha).historical_var() * np.sqrt(t)
File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 38, in __init__
yahoo_data = pandasdr.get_data_yahoo(s, end=end, start=start)['Close']
File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\data.py", line 80, in get_data_yahoo
return YahooDailyReader(*args, **kwargs).read()
File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\base.py", line 258, in read
df = self._dl_mult_symbols(self.symbols)
File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\base.py", line 268, in _dl_mult_symbols
stocks[sym] = self._read_one_data(self.url, self._get_params(sym))
File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\yahoo\daily.py", line 238, in _read_one_data
data = new_j["HistoricalPriceStore"]
UnboundLocalError: local variable 'new_j' referenced before assignment

Process finished with exit code 1
"""

@Fconel

Fconel commented Jan 14, 2023

ranaroussi/yfinance#1291 (comment): this worked for me, but keeping the unpad block size at 16.

         encrypted_stores = data['context']['dispatcher']['stores']
-        _cs = data["_cs"]
-        _cr = data["_cr"]
-
-        _cr = b"".join(int.to_bytes(i, length=4, byteorder="big", signed=True) for i in json.loads(_cr)["words"])
-        password = hashlib.pbkdf2_hmac("sha1", _cs.encode("utf8"), _cr, 1, dklen=32).hex()
+        password_key = next(key for key in data.keys() if key not in ["context", "plugins"])
+        password = data[password_key]

         encrypted_stores = b64decode(encrypted_stores)
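
To make the diff above easier to follow, here is a small standalone sketch of the two ways of obtaining the password that it swaps between. The removed lines derive it from `_cs` and `_cr` with PBKDF2, and the added lines take the value of whichever top-level key is neither `context` nor `plugins`. The function names and the fallback order in `find_password` are assumptions for illustration; the response layout is only what the diff implies.

```python
import hashlib
import json


def password_from_cs_cr(data: dict) -> str:
    """Older scheme (removed lines): PBKDF2-HMAC-SHA1 over `_cs`, salted with `_cr`."""
    words = json.loads(data["_cr"])["words"]
    # `_cr` holds signed 32-bit words; pack them big-endian into the salt bytes.
    salt = b"".join(
        int.to_bytes(w, length=4, byteorder="big", signed=True) for w in words
    )
    return hashlib.pbkdf2_hmac("sha1", data["_cs"].encode("utf8"), salt, 1, dklen=32).hex()


def password_from_extra_key(data: dict) -> str:
    """Newer scheme (added lines): the one key that is neither 'context' nor 'plugins'."""
    key = next(k for k in data if k not in ("context", "plugins"))
    return data[key]


def find_password(data: dict) -> str:
    """Hypothetical helper: prefer `_cs`/`_cr` when present, else use the extra key."""
    if "_cs" in data and "_cr" in data:
        return password_from_cs_cr(data)
    return password_from_extra_key(data)
```

Note that `password_from_extra_key` raises `StopIteration` when no such key exists, so a caller would probably want to turn that into a clearer error.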

@CarlosEspinoTimon

Sometimes I get the following error : If anyone could help out that would be great

"""
Traceback (most recent call last):
File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 406, in <module>
Backtest().range_of_days()
File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 392, in range_of_days
var = VaR(stock_list, temp_start, temp_end, weights, alpha).historical_var() * np.sqrt(t)
File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 38, in __init__
yahoo_data = pandasdr.get_data_yahoo(s, end=end, start=start)['Close']
File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\data.py", line 80, in get_data_yahoo
return YahooDailyReader(*args, **kwargs).read()
File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\base.py", line 258, in read
df = self._dl_mult_symbols(self.symbols)
File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\base.py", line 268, in _dl_mult_symbols
stocks[sym] = self._read_one_data(self.url, self._get_params(sym))
File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\yahoo\daily.py", line 238, in _read_one_data
data = new_j["HistoricalPriceStore"]
UnboundLocalError: local variable 'new_j' referenced before assignment

Process finished with exit code 1
"""

Hi @raphi6 !

First of all, thank you very much for trying to fix this. I have just started playing with this and now it is broken 😢

The error you are seeing happens because the response for the particular stock you are requesting does not have the keys _cr and _cs. The variable new_j is created ONLY if that condition is met, but the rest of your code relies on it.
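
The pattern can be reproduced in isolation. This is a minimal sketch and not the PR's code: the function names are hypothetical and the dictionary literal stands in for the real decryption call.

```python
def read_store_buggy(j: dict) -> str:
    """Mirrors the failing pattern: new_j is bound only inside the if-branch."""
    if "_cs" in j and "_cr" in j:
        new_j = {"HistoricalPriceStore": "decrypted data"}  # stand-in for the decrypt call
    return new_j["HistoricalPriceStore"]  # UnboundLocalError when the branch was skipped


def read_store_guarded(j: dict) -> str:
    """Fails early with a descriptive error instead of an unbound variable."""
    if "_cs" not in j or "_cr" not in j:
        raise KeyError("response has no '_cs'/'_cr' keys, cannot decrypt the stores")
    new_j = {"HistoricalPriceStore": "decrypted data"}  # stand-in for the decrypt call
    return new_j["HistoricalPriceStore"]
```

Raising a descriptive error is only one option. Falling back to an unencrypted code path would be another, depending on what the response actually contains.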

I have tried 'AAPL', 'GOOGL' and 'AMZN', and none of them returns the keys _cs and _cr, so I was wondering if you could share an example of a stock that returns those keys.

I also want to ask: how did you find the _cs and _cr keys? Is there documentation for the API we are consuming? (Sorry if it's a dumb question, but I am not able to find it.)

NOTE: Also, it feels weird that they are encrypting something... and also sharing the key. That's pointless (or I might be missing something 😅 )

@bneumayer

bneumayer commented Jan 14, 2023

Sometimes I get the following error : If anyone could help out that would be great

"""
Traceback (most recent call last):
File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 406, in <module>
Backtest().range_of_days()
File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 392, in range_of_days
var = VaR(stock_list, temp_start, temp_end, weights, alpha).historical_var() * np.sqrt(t)
File "C:\Users\rapha\PycharmProjects\PROJECT\risk\var.py", line 38, in __init__
yahoo_data = pandasdr.get_data_yahoo(s, end=end, start=start)['Close']
File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\data.py", line 80, in get_data_yahoo
return YahooDailyReader(*args, **kwargs).read()
File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\base.py", line 258, in read
df = self._dl_mult_symbols(self.symbols)
File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\base.py", line 268, in _dl_mult_symbols
stocks[sym] = self._read_one_data(self.url, self._get_params(sym))
File "C:\Users\rapha\PycharmProjects\PROJECT\venv\lib\site-packages\pandas_datareader\yahoo\daily.py", line 238, in _read_one_data
data = new_j["HistoricalPriceStore"]
UnboundLocalError: local variable 'new_j' referenced before assignment

Process finished with exit code 1
"""

Hi @raphi6,

Thanks for all your work! In my opinion this isn't an error that occurs sometimes or for specific stocks but something must have been changed on Yahoo's side again. Your solution worked for me until yesterday and since yesterday I get the same error (and I am only requesting one specific fund all the time).

@hellc

hellc commented Jan 15, 2023

hellc@87dda3f fixed

@hellc

hellc commented Jan 15, 2023

If u cant wait to merge use this
pip install git+https://github.com/hellc/pandas-datareader.git@87dda3f297df8f4b3253c6f2d5006b5ac43a9150

@raphi6
Author

raphi6 commented Jan 15, 2023

If u cant wait to merge use this pip install git+https://github.com/hellc/pandas-datareader.git@87dda3f297df8f4b3253c6f2d5006b5ac43a9150

Encryption genius! Thank you so much Ivan! I will test this out later tonight hopefully.

@raphi6
Author

raphi6 commented Jan 15, 2023

If u cant wait to merge use this pip install git+https://github.com/hellc/pandas-datareader.git@87dda3f297df8f4b3253c6f2d5006b5ac43a9150

Do I have to do any merging? Sorry I am quite new to Git

@hellc

hellc commented Jan 15, 2023

If u cant wait to merge use this pip install git+https://github.com/hellc/pandas-datareader.git@87dda3f297df8f4b3253c6f2d5006b5ac43a9150

Do I have to do any merging? Sorry I am quite new to Git

Accept this PR into your branch and u would be fine. raphi6#1

@raphi6
Author

raphi6 commented Jan 18, 2023

@CharliesAngel1 What does it mean that these are approved? Do we still have to wait for a merge?

Author

@raphi6 raphi6 left a comment

All looks good.

@sangar3

sangar3 commented Jan 20, 2023

pip install git+https://github.com/hellc/pandas-datareader.git@87dda3f297df8f4b3253c6f2d5006b5ac43a9150

you are a legend bro

thanks again

@bneumayer

All looks good.

Just, to be sure - is this solution working for everybody? @hellc's solution does not work for me (out of the box) as it requires packaging version 22 or higher, which is not available for my setting, but it seems the changes were accepted into @raphi6's solution, right? I saw the update but I still get only errors for my request.
If these solutions work for everyone else, it's obviously me. Which is OK, I just want to make sure :)

@hellc

hellc commented Jan 24, 2023

All looks good.

Just, to be sure - is this solution working for everybody? @hellc's solution does not work for me (out of the box) as it requires packaging version 22 or higher, which is not available for my setting, but it seems the changes were accepted into @raphi6's solution, right? I saw the update but I still get only errors for my request. If these solutions work for everyone else, it's obviously me. Which is OK, I just want to make sure :)

It's still working for BTC pairs, so that was enough for me. Let us know which pairs are not working for you; maybe we can figure out what else they have changed and fix it too.

@sangar3

sangar3 commented Jan 25, 2023

All looks good.

Just, to be sure - is this solution working for everybody? @hellc's solution does not work for me (out of the box) as it requires packaging version 22 or higher, which is not available for my setting, but it seems the changes were accepted into @raphi6's solution, right? I saw the update but I still get only errors for my request. If these solutions work for everyone else, it's obviously me. Which is OK, I just want to make sure :)

It's still working for BTC pairs, so that was enough for me. Let us know which pairs are not working for you; maybe we can figure out what else they have changed and fix it too.

I am getting this error for all the pairs now,

File "rsi.py", line 14, in <module>
data = web.DataReader(stock, 'yahoo', start, end)
File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas\util\_decorators.py", line 211, in wrapper
return func(*args, **kwargs)
File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas_datareader\data.py", line 370, in DataReader
return YahooDailyReader(
File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas_datareader\base.py", line 253, in read
df = self._read_one_data(self.url, params=self._get_params(self.symbols))
File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas_datareader\yahoo\daily.py", line 227, in _read_one_data
new_j = decrypt_cryptojs_aes(
File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas_datareader\yahoo\daily.py", line 81, in decrypt_cryptojs_aes
plaintext = unpad(plaintext, 16, style="pkcs7")
File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\Crypto\Util\Padding.py", line 92, in unpad
raise ValueError("Padding is incorrect.")
ValueError: Padding is incorrect.

@bneumayer

bneumayer commented Jan 28, 2023

I am getting this error for all the pairs now,

File "rsi.py", line 14, in <module>
data = web.DataReader(stock, 'yahoo', start, end)
File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas\util\_decorators.py", line 211, in wrapper
return func(*args, **kwargs)
File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas_datareader\data.py", line 370, in DataReader
return YahooDailyReader(
File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas_datareader\base.py", line 253, in read
df = self._read_one_data(self.url, params=self._get_params(self.symbols))
File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas_datareader\yahoo\daily.py", line 227, in _read_one_data
new_j = decrypt_cryptojs_aes(
File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\pandas_datareader\yahoo\daily.py", line 81, in decrypt_cryptojs_aes
plaintext = unpad(plaintext, 16, style="pkcs7")
File "C:\Users\sg17a\anaconda3\envs\pythondev\lib\site-packages\Crypto\Util\Padding.py", line 92, in unpad
raise ValueError("Padding is incorrect.")
ValueError: Padding is incorrect.

Can confirm. When I try:

import pandas_datareader.data as web
symbol = '0P0001ICNW.F'
res_yahoo = web.DataReader(symbol, 'yahoo')

the result is the same for me:
ValueError: Padding is incorrect.

@VoxLight

VoxLight commented Feb 4, 2023

Also going to bump this ValueError: Padding is incorrect. issue.

I installed pandas-datareader with:
pip install git+https://github.com/raphi6/pandas-datareader.git@87dda3f297df8f4b3253c6f2d5006b5ac43a9150
And I got the following error:

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): finance.yahoo.com:443
DEBUG:urllib3.connectionpool:https://finance.yahoo.com:443 "GET /quote/ATEN/history?period1=1674947342&period2=1675587599&interval=1d&frequency=1d&filter=history HTTP/1.1" 200 None
DEBUG:root:################################################################
ERROR:root:Padding is incorrect.
Traceback (most recent call last):
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\libs\stock_data.py", line 66, in _get_ticker_data
    __get_recent_price(ticker)
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\libs\stock_data.py", line 51, in __get_recent_price
    data = __get_data(ticker).tail(1)
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\libs\stock_data.py", line 41, in __get_data
    data = web.DataReader(ticker, data_source='yahoo', start=start, end=dt.datetime.today())
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\env\lib\site-packages\pandas\util\_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\env\lib\site-packages\pandas_datareader\data.py", line 379, in DataReader
    ).read()
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\env\lib\site-packages\pandas_datareader\base.py", line 253, in read
    df = self._read_one_data(self.url, params=self._get_params(self.symbols))
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\env\lib\site-packages\pandas_datareader\yahoo\daily.py", line 227, in _read_one_data
    new_j = decrypt_cryptojs_aes(
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\env\lib\site-packages\pandas_datareader\yahoo\daily.py", line 81, in decrypt_cryptojs_aes
    plaintext = unpad(plaintext, 16, style="pkcs7")
  File "C:\Users\tkkt3\Downloads\XLTickers-master\XLTickers-master\env\lib\site-packages\Crypto\Util\Padding.py", line 92, in unpad
    raise ValueError("Padding is incorrect.")
ValueError: Padding is incorrect.

I went into pycryptodome to figure out where this error is coming from:
https://github.com/Legrandin/pycryptodome/blob/8bba4a056fb6b5cb7cc9616da3d36893f759efe8/lib/Crypto/Util/Padding.py#L92
Line 81 of daily.py,
plaintext = unpad(plaintext, 16, style="pkcs7")
raises this error because the following check inside unpad is triggered:

        padding_len = bord(padded_data[-1])
        if padding_len<1 or padding_len>min(block_size, pdata_len):
            raise ValueError("Padding is incorrect.")

In other words, the last byte of the decrypted data is not a valid padding length. I'm not familiar with cryptography, but I did this investigation to hopefully get someone on the right track. Hopefully someone fixes this soon.
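
For context on why that check trips, below is a pure-Python sketch of PKCS#7 unpadding that mirrors the check quoted above. It is a simplified illustration, not pycryptodome's implementation, and the name `pkcs7_unpad` is hypothetical. When AES-CBC data is decrypted with the wrong key or IV, the output is effectively random bytes, so the final byte rarely encodes a valid padding length. That would be consistent with the password derivation no longer matching what Yahoo! sends, although this thread does not confirm the cause.

```python
def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    """Strip PKCS#7 padding, raising ValueError when the padding is malformed."""
    if not padded or len(padded) % block_size:
        raise ValueError("Input data is not padded")
    padding_len = padded[-1]  # the last byte states how many padding bytes were added
    if padding_len < 1 or padding_len > min(block_size, len(padded)):
        raise ValueError("Padding is incorrect.")
    if padded[-padding_len:] != bytes([padding_len]) * padding_len:
        raise ValueError("PKCS#7 padding is incorrect.")
    return padded[:-padding_len]


# Correctly padded input: 5 data bytes plus 11 bytes of value 0x0b.
good = b"hello" + b"\x0b" * 11
# Stand-in for a wrong-key decryption result: the last byte (0x00) is not a valid length.
garbage = b"\x00" * 16
```

Here `pkcs7_unpad(good)` gives back the five data bytes, while `pkcs7_unpad(garbage)` raises the same "Padding is incorrect." message seen in the tracebacks.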

@uad1098

uad1098 commented Mar 11, 2023

Haven't heard anything in a month. Any status on fixing issue 953/952? Will Pandas-datareader ever work again to scrape price data from Yahoo?

@VoxLight

Haven't heard anything in a month. Any status on fixing issue 953/952? Will Pandas-datareader ever work again to scrape price data from Yahoo?

More than likely, Pandas-datareader will eventually become functional again. If you need to access market data right now, I recommend checking out the yfinance library. It's clearly possible to get data from Yahoo! Finance, the question is just when it will be supported inside of pandas again.

Successfully merging this pull request may close these issues.

Response format from Yahoo seems to have changed I keep getting this error.