XMLParseError: KeyError: 'OBS_STATUS' when reading LSD data #171

Open
edkry opened this issue Apr 4, 2024 · 2 comments
Labels
data-source Issues related to specific web services/data source(s) reader Handle standard SDMX message types xml SDMX-ML format

edkry commented Apr 4, 2024

import sdmx

lsd = sdmx.Client("LSD", backend="memory")
data_msg = lsd.data("S1R003_M8020420", key={"geo": "LT"}, params={"startPeriod": "2020"})

This code fails with:

--- SS without structure ---
1 (94671887778816) False

--- <class 'sdmx.message.StructureMessage'> ---
2 (140455248262432) <sdmx.StructureMessage>
  <Header>
    id: 'DA6C4E4BA2324FDCA027ADE732437137'
    prepared: '2024-04-04T10:12:55.918000+03:00'
    receiver: <Agency not_supplied>
    sender: <Agency unknown>
    source: 
    test: True

--- _stash ---
235 (140455247520064) {<class 'sdmx.model.common.DataAttribute'>: {'DS_LAST_UPDATE': DataAttribute(annotations=[], id='DS_LAST_UPDATE', uri=None, urn='urn:sdmx:org.sdmx.infomodel.datastructure.DataAttribute=LSD:M8020420(1.0).DS_LAST_UPDATE', concept_identity=<Concept DS_LAST_UPDATE: Paskutinio atnaujinimo data>, local_representation=<Representation: None, [Facet(type=FacetType(is_sequence=None, min_length=None, max_length=None, min_value=None, max_value=None, start_value=None, end_value=None, interval=None, time_interval=None, decimals=None, pattern=None, start_time=None, end_time=None, sentinel_values=None), value=None, value_type=<FacetValueType.string: 1>)]>, related_to=<sdmx.model.v21.NoSpecifiedRelationship object at 0x7fbe491f97b0>, usage_status=<UsageStatus.conditional: 2>, concept_role=None), 'DS_REGIONAL': DataAttribute(annotations=[], id='DS_REGIONAL', uri=None, urn='urn:sdmx:org.sdmx.infomodel.datastructure.DataAttribute=LSD:M8020420(1.0).DS_REGIONAL', concept_identity=<Concept DS_REGIONAL: Rodiklio duomenų rinkinys turi administracinių vienetų dimensiją?>, local_representation=<Representation: DS_REGIONAL, []>, related_to=<sdmx.model.v21.NoSpecifiedRelationship object at 0x7fbe491f9840>, usage_status=<UsageStatus.conditional: 2>, concept_role=None), 'DS_TIME_FORMAT': DataAttribute(annotations=[], id='DS_TIME_FORMAT', uri=None, urn='urn:sdmx:org.sdmx.infomodel.datastructure.DataAttribute=LSD:M8020420(1.0).DS_TIME_FORMAT', concept_identity=<Concept DS_TIME_FORMAT: Periodiškumas>, local_representation=<Representation: DS_TIME_FORMAT, []>, related_to=<sdmx.model.v21.NoSpecifiedRelationship object at 0x7fbe491f9b40>, usage_status=<UsageStatus.conditional: 2>, concept_role=None)}}

--- <class 'sdmx.model.common.Agency'> ---
LSD (140455248258784) LSD

--- <class 'sdmx.model.common.Codelist'> ---
MATVNT (140455248262768) MATVNT
Vietove (140455248259120) Vietove
OBS_STATUS (140455247316832) OBS_STATUS
DS_REGIONAL (140455247324656) DS_REGIONAL
DS_TIME_FORMAT (140455247326816) DS_TIME_FORMAT
OSP_MASYVO_STATUSAS (140455247327536) OSP_MASYVO_STATUSAS

--- <class 'sdmx.model.common.Representation'> ---
238 (140455247321104) <Representation: OBS_STATUS, []>

--- <class 'sdmx.model.common.ConceptScheme'> ---
M8020420 (140455247329936) M8020420

--- <class 'sdmx.model.v21.DataStructureDefinition'> ---
M8020420 (140455247330224) M8020420

--- current DSD ---
M8020420 (140455247330224) M8020420

--- Name ---
200 (140455246333568) ('lt', 'Duomenų struktūros apibrėžimas')
201 (140455246358976) ('en', 'Data Structure Definition')

--- <class 'sdmx.model.v21.PrimaryMeasureRelationship'> ---
241 (140455247321872) <sdmx.model.v21.PrimaryMeasureRelationship object at 0x7fbe491f9b10>


Ignore:
 {94671887823840, 140455248258784}
<str:Attribute xmlns:str="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/structure" xmlns:footer="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message/footer" xmlns:mes="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message" xmlns:com="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" id="OBS_STATUS" urn="urn:sdmx:org.sdmx.infomodel.datastructure.DataAttribute=LSD:M8020420(1.0).OBS_STATUS" assignmentStatus="Conditional">
              <str:ConceptIdentity/><str:LocalRepresentation/><str:AttributeRelationship/></str:Attribute>



...


XMLParseError: KeyError: 'OBS_STATUS'

Also, if I look at the DataStructureDefinition, its dimensions, attributes and measures are all empty:

lsd = sdmx.Client("LSD")
flow_msg = lsd.dataflow()  # assumed: flow_msg was not defined in the original snippet
msg = lsd.dataflow(resource=flow_msg.dataflow.S3R0241_M3031021_4)
flow = msg.dataflow.S3R0241_M3031021_4
dsd = flow.structure
dsd.dimensions.components  # empty
dsd.attributes.components  # empty
dsd.measures.components  # empty

However, a direct API call to https://osp-rs.stat.gov.lt/ords/ipospp/ospp/rest_xml/datastructure/LSD/M3031021_4/ does return a data structure.


khaeru commented Apr 4, 2024

Hi, thanks for the report!

I get this file from the URL you provided: rest_flow_M3031021_4_20240404123219.xml.txt

The part causing the exception is shown at the bottom of the error trace. It corresponds to this XML:

<str:Attribute id="OBS_STATUS" urn="urn:sdmx:org.sdmx.infomodel.datastructure.DataAttribute=LSD:M3031021_4(1.0).OBS_STATUS" assignmentStatus="Conditional">
  <str:ConceptIdentity>
    <Ref id="OBS_STATUS" maintainableParentID="M3031021_4" maintainableParentVersion="1.0" agencyID="LSD" package="conceptscheme" class="Concept"/>
  </str:ConceptIdentity>
</str:Attribute>

To be clear, this <Ref ... should refer to a Concept in the ConceptScheme with ID "M3031021_4".

This ConceptScheme is also in the message:

<str:ConceptScheme id="M3031021_4" urn="urn:sdmx:org.sdmx.infomodel.conceptscheme.ConceptScheme=LSD:M3031021_4(1.0)" agencyID="LSD" version="1.0" isFinal="true">
  <com:Name xml:lang="lt">Schema M3031021_4</com:Name>
  <com:Name xml:lang="en">Concept Scheme for M3031021_4</com:Name>
  <str:Concept id="Lytis" urn="urn:sdmx:org.sdmx.infomodel.conceptscheme.Concept=LSD:M3031021_4(1.0).Lytis">
    <com:Name xml:lang="lt">Lytis</com:Name>
    <com:Name xml:lang="en">Sex</com:Name>
  </str:Concept>
  <str:Concept id="darbo_vieta" urn="urn:sdmx:org.sdmx.infomodel.conceptscheme.Concept=LSD:M3031021_4(1.0).darbo_vieta">
    <com:Name xml:lang="lt">Pagrindinė darbo vieta</com:Name>
    <com:Name xml:lang="en">The main place of work</com:Name>
  </str:Concept>
  <str:Concept id="MATVNT" urn="urn:sdmx:org.sdmx.infomodel.conceptscheme.Concept=LSD:M3031021_4(1.0).MATVNT">
    <com:Name xml:lang="lt">Matavimo vienetai</com:Name>
    <com:Name xml:lang="en">Measure unit</com:Name>
  </str:Concept>
  <str:Concept id="TIME_PERIOD" urn="urn:sdmx:org.sdmx.infomodel.conceptscheme.Concept=LSD:M3031021_4(1.0).TIME_PERIOD">
    <com:Name xml:lang="lt">Laikotarpis</com:Name>
    <com:Name xml:lang="en">Time period</com:Name>
  </str:Concept>
  <str:Concept id="DS_LAST_UPDATE" urn="urn:sdmx:org.sdmx.infomodel.conceptscheme.Concept=LSD:M3031021_4(1.0).DS_LAST_UPDATE">
    <com:Name xml:lang="lt">Paskutinio atnaujinimo data</com:Name>
    <com:Name xml:lang="en">Last update date</com:Name>
  </str:Concept>
  <str:Concept id="DS_REGIONAL" urn="urn:sdmx:org.sdmx.infomodel.conceptscheme.Concept=LSD:M3031021_4(1.0).DS_REGIONAL">
    <com:Name xml:lang="lt">Rodiklio duomenų rinkinys turi administracinių vienetų dimensiją?</com:Name>
    <com:Name xml:lang="en">The indicator set has an administrative unit dimension?</com:Name>
  </str:Concept>
  <str:Concept id="OBS_VALUE" urn="urn:sdmx:org.sdmx.infomodel.conceptscheme.Concept=LSD:M3031021_4(1.0).OBS_VALUE">
    <com:Name xml:lang="lt">Stebėjimo vertė</com:Name>
    <com:Name xml:lang="en">Observation value</com:Name>
  </str:Concept>
  <str:Concept id="OSP_MASYVO_STATUSAS" urn="urn:sdmx:org.sdmx.infomodel.conceptscheme.Concept=LSD:M3031021_4(1.0).OSP_MASYVO_STATUSAS">
    <com:Name xml:lang="lt">Atnaujinamas</com:Name>
    <com:Name xml:lang="en">Updated</com:Name>
  </str:Concept>
  <str:Concept id="OBS_DECIMALS" urn="urn:sdmx:org.sdmx.infomodel.conceptscheme.Concept=LSD:M3031021_4(1.0).OBS_DECIMALS">
    <com:Name xml:lang="lt">Tikslumas po kablelio</com:Name>
    <com:Name xml:lang="en">Decimal precision</com:Name>
  </str:Concept>
  <str:Concept id="DS_TIME_FORMAT" urn="urn:sdmx:org.sdmx.infomodel.conceptscheme.Concept=LSD:M3031021_4(1.0).DS_TIME_FORMAT">
    <com:Name xml:lang="lt">Periodiškumas</com:Name>
    <com:Name xml:lang="en">Periodicity</com:Name>
  </str:Concept>
</str:ConceptScheme>

As you can see, the concept scheme does not contain a Concept with ID "OBS_STATUS". This is the proximate cause of the exception.

The underlying cause is that this web service is returning invalid SDMX-ML: the DataAttribute.concept_identity refers to a Concept that does not exist. I would suggest you contact the provider that operates the web service, and ask them to update the invalid SDMX-ML. (If it helps, you can link directly to this comment of mine.)
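For a report to the provider, it may help to list every dangling reference in one pass. The following is only a rough sketch using the standard library, not part of sdmx1: the function name `dangling_concept_refs` and the trimmed `SAMPLE` message are hypothetical, and it assumes that each `<str:ConceptIdentity>` holds a `<Ref>` whose `maintainableParentID` names a ConceptScheme included in the same message.

```python
import xml.etree.ElementTree as ET

# Namespace of SDMX-ML 2.1 structure elements, in ElementTree's "{uri}" form
STR = "{http://www.sdmx.org/resources/sdmxml/schemas/v2_1/structure}"

# Hypothetical, heavily trimmed message for illustration only
SAMPLE = """\
<mes:Structure
    xmlns:mes="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message"
    xmlns:str="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/structure">
  <str:ConceptScheme id="M3031021_4" agencyID="LSD" version="1.0">
    <str:Concept id="OBS_VALUE"/>
    <str:Concept id="DS_REGIONAL"/>
  </str:ConceptScheme>
  <str:Attribute id="DS_REGIONAL">
    <str:ConceptIdentity>
      <Ref id="DS_REGIONAL" maintainableParentID="M3031021_4" agencyID="LSD"/>
    </str:ConceptIdentity>
  </str:Attribute>
  <str:Attribute id="OBS_STATUS">
    <str:ConceptIdentity>
      <Ref id="OBS_STATUS" maintainableParentID="M3031021_4" agencyID="LSD"/>
    </str:ConceptIdentity>
  </str:Attribute>
</mes:Structure>
"""


def dangling_concept_refs(xml_text):
    """Return (scheme_id, concept_id) pairs that no ConceptScheme defines."""
    root = ET.fromstring(xml_text)

    # Map each ConceptScheme ID to the set of Concept IDs it contains
    schemes = {
        cs.get("id"): {c.get("id") for c in cs.findall(f"{STR}Concept")}
        for cs in root.iter(f"{STR}ConceptScheme")
    }

    missing = []
    for identity in root.iter(f"{STR}ConceptIdentity"):
        ref = identity.find("Ref")  # <Ref> carries no namespace
        if ref is None:
            continue
        scheme_id = ref.get("maintainableParentID")
        concept_id = ref.get("id")
        if concept_id not in schemes.get(scheme_id, set()):
            missing.append((scheme_id, concept_id))
    return missing


print(dangling_concept_refs(SAMPLE))
```

Running the same function on the text of the file downloaded from the service should list OBS_STATUS among the missing concepts, provided the message follows the layout assumed above.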

As for the current package (sdmx1), one thing we could change is to be tolerant of this particular kind of invalid SDMX-ML. For example, we could still parse the DataAttribute from the message but leave its concept_identity = None, and emit or log a warning about the malformed SDMX-ML instead of raising the exception as we do currently.
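To illustrate the idea only, here is a minimal sketch of such a tolerant lookup. It is not the sdmx1 reader code: `resolve_concept`, its `strict` flag and the plain dict of concepts are all hypothetical stand-ins for whatever the reader uses internally.

```python
import logging

log = logging.getLogger(__name__)


def resolve_concept(concepts, concept_id, strict=False):
    """Look up concept_id; warn and return None if it is undefined.

    With strict=True the KeyError propagates, as in the current behaviour.
    """
    try:
        return concepts[concept_id]
    except KeyError:
        if strict:
            raise
        log.warning(
            "Malformed SDMX-ML: reference to undefined Concept %r", concept_id
        )
        return None


# Hypothetical usage: the attribute is still created, just without a concept
concepts = {"OBS_VALUE": "<Concept OBS_VALUE>"}
print(resolve_concept(concepts, "OBS_STATUS"))
```

Keeping a strict mode would let users who want to detect invalid messages still get an exception.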

@khaeru khaeru added xml SDMX-ML format reader Handle standard SDMX message types data-source Issues related to specific web services/data source(s) labels Apr 4, 2024

khaeru commented Apr 30, 2024

> I would suggest you contact the provider that operates the web service, and ask them to update the invalid SDMX-ML.

Hi @edkry, did you get any reply from LSD about the invalid SDMX-ML that their service was returning?
