Add SuperFlatten function: flatten a multi-level dictionary to one level #390

Open
NOBB2333 opened this issue Mar 8, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@NOBB2333

NOBB2333 commented Mar 8, 2024

When using the flatten feature, I found that nested dictionaries are flattened, but a list of dictionaries is left as it is: flattening stops at the list and does not go any further down.
I also looked at the function in 'benedict/core/flatten.py' and wrote a "super" expansion that also expands lists of dictionaries nested inside dictionaries.

from pprint import pprint
from benedict import benedict
import json

# Original data with multi-level nested dictionaries and lists
complex_data = {
    'person': {
        'name': 'John',
        'age': 30,
        'addresses': [
            {'street': '123 Main St', 'city': 'New York'},
            {'street': '456 Elm St', 'city': 'San Francisco'}
        ]
    },
    'company': {
        'name': 'ABC Inc.',
        'employees': [
            {'name': 'Alice', 'department': 'HR'},
            {'name': 'Bob', 'department': 'Engineering'}
        ]
    }
}

# Flatten the complex data
flat_data = benedict(complex_data).flatten(separator='_')

# Print the flattened data
print('Flattened Data:')
pprint(dict(flat_data))


# Flattened Data:
# {'company_employees': [{'name': 'Alice', 'department': 'HR'},
#                        {'name': 'Bob', 'department': 'Engineering'}],
#  'company_name': 'ABC Inc.',
#  'person_addresses': [{'street': '123 Main St', 'city': 'New York'},
#                       {'street': '456 Elm St', 'city': 'San Francisco'}],
#  'person_age': 30,
#  'person_name': 'John'}

Using SuperFlatten, list items are concatenated with "\n" (see the attached file ___Json结果展开解析.txt):


[{'company_employees__department': 'HR\nEngineering',
  'company_employees__name': 'Alice\nBob',
  'company_name': 'ABC Inc.',
  'person_addresses__city': 'New York\nSan Francisco',
  'person_addresses__street': '123 Main St\n456 Elm St',
  'person_age': '30',
  'person_name': 'John'}]
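
The SuperFlatten code itself is only in the attachment, so the following is a minimal sketch of the behaviour the output above suggests, not the author's implementation. The function name `super_flatten` and its parameters are hypothetical. It assumes that list items are flat dicts, that a double separator marks a key that came from a list, and that every value is converted to a string.

```python
def super_flatten(data, separator="_", list_separator="__", joiner="\n"):
    """Flatten nested dicts; merge lists of dicts column by column (hypothetical sketch)."""
    flat = {}

    def walk(value, path):
        if isinstance(value, dict):
            for key, item in value.items():
                walk(item, f"{path}{separator}{key}" if path else str(key))
        elif isinstance(value, list) and all(isinstance(i, dict) for i in value):
            # Collect each field across all list items, then join the values.
            columns = {}
            for item in value:
                for key, field in item.items():
                    columns.setdefault(key, []).append(str(field))
            for key, fields in columns.items():
                flat[f"{path}{list_separator}{key}"] = joiner.join(fields)
        elif isinstance(value, list):
            # List of plain values: join them directly.
            flat[path] = joiner.join(str(i) for i in value)
        else:
            flat[path] = str(value)

    walk(data, "")
    return flat
```

This form is compact for spreadsheet export, but it is lossy: the original types and the boundaries between list items cannot be recovered from the joined strings, and an empty list produces no key at all.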

@NOBB2333 NOBB2333 added the enhancement New feature or request label Mar 8, 2024
@fabiocaccamo
Owner

fabiocaccamo commented Mar 12, 2024

@NOBB2333 thank you for this suggestion. I'm open to adding support for this feature if it meets all the requirements below:

  • this feature should be optional and backward-compatible; ideally it would be enabled by a new argument, e.g. deep=True
  • list items should be flattened like mydict.mylist[0].mysublist[0] and so on ...
  • list items must not be concatenated by any separator; this seems to be a really personal/project-specific need.
  • the unflatten method must be updated too: unflatten(flatten(dict)) should return the initial dict.
  • add a couple of test cases to avoid regressions
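
As an illustration of the requirements above, here is a rough sketch of index-based flattening with a matching inverse. It is not benedict's implementation: `flatten_deep` and `unflatten_deep` are hypothetical names, and the sketch assumes string keys that contain neither the separator nor square brackets. Empty dicts and lists are kept as leaf values so that the round trip holds.

```python
import re


def flatten_deep(data, separator="."):
    """Flatten nested dicts and lists into keys like 'a.b[0].c' (hypothetical sketch)."""
    flat = {}

    def walk(value, path):
        if isinstance(value, dict) and value:
            for key, item in value.items():
                walk(item, f"{path}{separator}{key}")
        elif isinstance(value, list) and value:
            for index, item in enumerate(value):
                walk(item, f"{path}[{index}]")
        else:
            flat[path] = value  # scalar, or an empty dict/list kept as a leaf

    for key, item in data.items():
        walk(item, str(key))
    return flat


def _get(container, step):
    """Return the child at step, or None if it does not exist yet."""
    if isinstance(step, int):
        return container[step] if step < len(container) else None
    return container.get(step)


def _assign(container, step, value):
    """Store value at step, padding lists with None when needed."""
    if isinstance(step, int):
        container.extend([None] * (step + 1 - len(container)))
    container[step] = value


def unflatten_deep(flat, separator="."):
    """Rebuild the nested structure produced by flatten_deep."""
    result = {}
    for flat_key, value in flat.items():
        # Turn 'a.b[0].c' into the steps 'a', 'b', 0, 'c'.
        steps = []
        for part in flat_key.split(separator):
            steps.append(part.split("[", 1)[0])
            steps.extend(int(i) for i in re.findall(r"\[(\d+)\]", part))
        node = result
        for step, next_step in zip(steps, steps[1:]):
            child = _get(node, step)
            if child is None:
                child = [] if isinstance(next_step, int) else {}
                _assign(node, step, child)
            node = child
        _assign(node, steps[-1], value)
    return result
```

With the sample data from the first comment, `flatten_deep(complex_data)` yields keys such as `person.addresses[0].street`, and `unflatten_deep(flatten_deep(complex_data))` compares equal to `complex_data`.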

Do you want to work on a Pull Request? That would be great!

@NOBB2333
Author

I think your suggestions are very good, but I still have some questions:

  • deep=True is OK.
  • mydict.mylist[0].mysublist[0]: this is in fact what pandas uses. I did not use it because it makes the structured data harder to work with. One of my requirements is exporting to xlsx, where this would be very long and unreadable.
  • I don't understand this one. If the previous requirement (mydict.mylist[0].mysublist[0]) is used, it is hardly needed.
  • unflatten is not implemented yet. I have recently run into some parsing problems and would like some help.
  • test cases will be added once everything is complete.

Problem clarification

Why did I choose "\n" to join the list objects? In fact it is not really joining. As I said, mydict.mylist[0].mysublist[0] is not good to use when a list is too long. My solution applies when the list is the last-level object and its children are pure dict objects, so they all mean the same thing. For example:
a company has 50 employees. With this approach, parsing and using the data is easy, whereas mydict.mylist[0].mysublist[0] would be very long. If the list is not at the last level, this joining is not used.

Problem

I would like some help:

  • Because I am using json.loads, I find that some values are parsed as objects, such as True, False, None and null.
    My current approach is to call data.replace("True", ' "True" ') before loads, but I hit a case I can't solve: when a key or value also contains one of these words, parsing fails. For example:
    {"keu ":"the result is a True value"}
    I have not been able to solve this for several days, so "\n".join(list) always raises an error (an xxxx object cannot be joined with str). I am trying to get some inspiration here.
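
One possible way around the string replacement, sketched under assumptions: if the text is valid JSON, `json.loads` already maps `true`, `false` and `null` to `True`, `False` and `None`; if it is a Python-style literal (single quotes, `True`, `None`), `ast.literal_eval` can parse it without touching the words inside strings. The join error goes away when each item is converted with `str()` first. The helper names `parse_payload` and `join_values` are hypothetical.

```python
import ast
import json


def parse_payload(text):
    """Parse JSON text, falling back to a Python literal (hypothetical helper)."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Handles Python-style text such as {'ok': True, 'x': None}.
        return ast.literal_eval(text)


def join_values(items, joiner="\n"):
    """Join any values into one string by converting each item to str first."""
    return joiner.join(str(item) for item in items)
```

Text that mixes both styles, for example single quotes together with `null`, is not handled by this sketch and would still raise an error.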

Beyond that, this may really be a niche demand. Here is my project background:
the reason is that other services always return JSON, and this JSON is not consistently formatted.
Example: https://openapi.qcc.com/dataApi/213, a website for searching company information.
I added one list object at json_data['Result']['PartnerList']['KeyNo'] to make it more realistic.
In ___Json结果展开解析.txt, the dict json_data has 4 results in ChangeList, but json_data['Result']['PartnerList']['KeyNo'] has 2.
To save the data, I would have to save the whole JSON to the DB or parse it again, so I join the last-level lists with "\n".

In any case, I'm happy to work on a Pull Request.

@fabiocaccamo
Owner

@NOBB2333 I'm sorry, but I can't quite understand what you mean. Could you try using Google Translate or a similar service, please?
